Bayesian Regression using pymc3

Goal

This post introduces how to use pymc3 for Bayesian regression through the simplest single-variable example.

Libraries

In [63]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import pymc3 as pm
%matplotlib inline

Create data for Bayesian regression

For comparison with non-Bayesian linear regression, the data is generated in the same way as in the earlier post Linear Regression:

\begin{equation*} \mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{b} + \mathbf{e} \end{equation*}

Here $\mathbf{x}$ is a one-dimensional vector, $\mathbf{b}$ is a constant intercept, and $\mathbf{e}$ is white noise.

In [58]:
a = 10       # true slope
b = 4        # true intercept
n = 100      # number of observations
sigma = 3    # noise standard deviation
e = sigma * np.random.randn(n)   # white noise
x = np.linspace(-1, 1, num=n)
y = a * x + b + e

plt.plot(x, y, '.', label='observed y');
plt.plot(x, a * x + b, 'r', label='true y');
plt.legend();

Modeling using pymc3

In the Bayesian setting, the formula above is reformulated as follows:

\begin{equation*} \mathbf{y} \sim \mathcal{N} (\mathbf{A}\mathbf{x} + \mathbf{b}, \sigma^2) \end{equation*}

Here, we regard $\mathbf{y}$ as a random variable following a normal distribution with mean $\mathbf{A}\mathbf{x} + \mathbf{b}$ and variance $\sigma^2$.

In [59]:
model = pm.Model()
with model:
    # weakly informative priors on the slope and intercept
    a_0 = pm.Normal('a_0', mu=1, sigma=10)
    b_0 = pm.Normal('b_0', mu=1, sigma=10)
    # x is observed, so this node is fixed to the data
    x_0 = pm.Normal('x_0', mu=0, sigma=1, observed=x)
    mu_0 = a_0 * x_0 + b_0
    # half-Cauchy prior keeps the noise scale positive
    sigma_0 = pm.HalfCauchy('sigma_0', beta=10)

    # likelihood of the observations
    y_0 = pm.Normal('y_0', mu=mu_0, sigma=sigma_0, observed=y)

    trace = pm.sample(500)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma_0, b_0, a_0]
Sampling 4 chains: 100%|██████████| 4000/4000 [00:01<00:00, 2748.91draws/s]
The acceptance probability does not match the target. It is 0.880262225775949, but should be close to 0.8. Try to increase the number of tuning steps.
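
The warning at the end is benign here, but as it suggests, increasing the number of tuning steps (and, optionally, the acceptance target) usually makes it go away. A minimal sketch reusing the model defined above; the tune and target_accept values are illustrative, not tuned:

with model:
    # longer warm-up and a stricter acceptance target;
    # 2000 tuning steps and target_accept=0.9 are illustrative values
    trace = pm.sample(500, tune=2000, target_accept=0.9)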
In [60]:
pm.traceplot(trace);
In [61]:
pm.summary(trace)
Out[61]:
         mean       sd        mc_error  hpd_2.5   hpd_97.5   n_eff        Rhat
a_0      10.033580  0.529140  0.011378  8.924638  11.006548  2367.657604  1.001081
b_0       4.304798  0.312978  0.006715  3.694924   4.881839  2080.493833  1.000428
sigma_0   3.046738  0.226399  0.004346  2.648484   3.520783  2335.392735  0.999326
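
The posterior samples behind this table are plain NumPy arrays, so the same statistics can be computed by hand. A quick sketch using the trace from above:

# each variable in the trace is a flat array of posterior draws
a_samples = trace['a_0']
b_samples = trace['b_0']
print(f'a_0: mean={a_samples.mean():.3f}, sd={a_samples.std():.3f}')
print(f'b_0: mean={b_samples.mean():.3f}, sd={b_samples.std():.3f}')

Both posterior means sit close to the true values a = 10 and b = 4 used to generate the data.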

Linear Regression

In [65]:
reg = LinearRegression()
reg.fit(x.reshape(-1, 1), y);
print(f'Coefficients A: {reg.coef_[0]:.3}, Intercept b: {reg.intercept_:.2}')
Coefficients A: 10.1, Intercept b: 4.3
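
As a cross-check, the same least-squares fit can be reproduced with np.polyfit; a one-line sketch on the data generated above:

# polyfit returns coefficients from highest degree to lowest
slope, intercept = np.polyfit(x, y, deg=1)
print(f'Coefficients A: {slope:.3}, Intercept b: {intercept:.2}')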

Plot Deterministic and Bayesian Regression Lines

In [83]:
plt.plot(x, y, '.', label='observed y', c='C0')
plt.plot(x, a * x + b, label='true y', lw=3., c='C3')
pm.plot_posterior_predictive_glm(trace, samples=30, 
                                 eval=x,
                                 lm=lambda x, sample: sample['b_0'] + sample['a_0'] * x, 
                                 label='posterior predictive regression', c='C2')
plt.plot(x, reg.coef_[0] * x + reg.intercept_ , label='deterministic linear regression', ls='dotted',c='C1', lw=3)
plt.legend(loc=0);
plt.title('Posterior Predictive Regression Lines');
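
In newer pymc3 releases plot_posterior_predictive_glm is deprecated, but the same overlay can be drawn by hand from the trace. A minimal sketch, assuming the trace and data from above:

# draw 30 random posterior samples and plot the implied lines
a_post, b_post = trace['a_0'], trace['b_0']
idx = np.random.randint(len(a_post), size=30)

plt.plot(x, y, '.', label='observed y', c='C0')
for i in idx:
    plt.plot(x, a_post[i] * x + b_post[i], c='C2', alpha=0.2)
plt.plot(x, a * x + b, label='true y', lw=3., c='C3')
plt.legend();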
