To write the multiple logistic regression model using PyMC3, we take advantage of its vectorization capabilities, which let us make only minor modifications to the previous simple logistic model (model_0):
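with pm.Model() as model_1:
    α = pm.Normal('α', mu=0, sd=10)
    # One coefficient per predictor, declared as a single vector
    β = pm.Normal('β', mu=0, sd=2, shape=len(x_n))
    # Vectorized linear predictor for all observations at once
    μ = α + pm.math.dot(x_1, β)
    θ = pm.Deterministic('θ', 1 / (1 + pm.math.exp(-μ)))
    # Decision boundary: value of the second predictor at which θ = 0.5
    bd = pm.Deterministic('bd', -α/β[1] - β[0]/β[1] * x_1[:,0])
    yl = pm.Bernoulli('yl', p=θ, observed=y_1)
    trace_1 = pm.sample(2000)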
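To see what the dot product and the bd expression are doing, here is a minimal NumPy sketch that runs outside PyMC3. The array X, the scalar alpha, and the vector beta are made-up stand-ins for x_1, α, and a single posterior draw of β. They are assumptions chosen for illustration, not values from the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the data and for one posterior draw;
# none of these values come from the model above.
X = rng.normal(size=(5, 2))      # 5 observations, 2 predictors (like x_1)
alpha = 0.5                      # intercept (like α)
beta = np.array([2.0, -1.5])     # one coefficient per predictor (like β)

# Vectorized linear predictor: one dot product covers every row of X.
mu = alpha + X @ beta
# The same quantity written out row by row, for comparison.
mu_loop = np.array([alpha + sum(b * x for b, x in zip(beta, row)) for row in X])

# Logistic (inverse logit) function, as used for θ.
theta = 1 / (1 + np.exp(-mu))

# Decision boundary: set θ = 0.5, that is μ = 0, and solve for the second predictor.
x0 = X[:, 0]
bd = -alpha / beta[1] - beta[0] / beta[1] * x0

# Points lying exactly on the boundary have a linear predictor of zero,
# and therefore a predicted probability of 0.5.
mu_on_bd = alpha + np.column_stack([x0, bd]) @ beta
theta_on_bd = 1 / (1 + np.exp(-mu_on_bd))
```

The vectorized and the row-by-row versions of the linear predictor agree, and every point placed on bd gets a probability of 0.5, which is what makes that line the boundary between the two classes.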
As we did for a single predictor variable, we are going to plot the data and the decision boundary:
idx = np.argsort(x_1[:,0])
bd = trace_1['bd'].mean(0)[idx]
plt.scatter(x_1[:,0], x_1[:,1], c=[f'C{x}' for x in y_1])
plt.plot(x_1[:,0][idx], bd, color='k');
az.plot_hpd(x_1[:,0], trace_1['bd'], color='k')
plt.xlabel(x_n[0])
plt.ylabel(x_n[1])
Figure 4.5
The decision boundary is a straight line, as we have already seen. Do not get confused by the curved aspect of the 94% HPD band. The apparent curvature is the result of having multiple lines pivoting around a central region (roughly around the means of the two plotted predictors).
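The pivoting effect can be reproduced with a small simulation that does not involve PyMC3 at all. The following is only a sketch under stated assumptions: the pivot point, the spread of slopes and offsets, and the number of draws are hypothetical, and a simple percentile interval is used as a stand-in for the HPD interval:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pivot point and number of draws, chosen only for illustration.
x_pivot, y_pivot = 5.0, 3.0
n_draws = 4000

# Each draw is a straight line passing close to the pivot point:
# y = y_pivot + offset + slope * (x - x_pivot)
slopes = rng.normal(loc=1.0, scale=0.3, size=n_draws)
offsets = rng.normal(loc=0.0, scale=0.05, size=n_draws)

x = np.linspace(3.0, 7.0, 41)  # the middle grid point coincides with x_pivot
lines = y_pivot + offsets[:, None] + slopes[:, None] * (x - x_pivot)

# Pointwise 94% interval across draws (percentile interval, not HPD).
lower, upper = np.percentile(lines, [3, 97], axis=0)
width = upper - lower

# Every individual draw is linear in x: its second differences are zero
# up to floating-point error.
max_line_curvature = np.abs(np.diff(lines, n=2, axis=1)).max()

# The band, however, is narrowest near the pivot and widens away from it.
narrowest = x[np.argmin(width)]
```

Each sampled line is perfectly straight, yet the interval computed across all of them is narrow near the pivot and wide at the edges, which is what gives the band its curved look.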