Posterior predictive checks (PPCs) help assess how well a model fits the data. They do so by generating data from the model using parameter values drawn from the posterior. We use the pm.sample_ppc function for this purpose and obtain n samples (here, n=500) for each observation (the GLM module automatically names the outcome 'y'):
ppc = pm.sample_ppc(trace_NUTS, samples=500, model=logistic_model)
ppc['y'].shape
(500, 29170)
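To make the mechanics concrete, here is a minimal sketch of what posterior predictive sampling does conceptually. It is not PyMC3's implementation: it assumes a hypothetical one-feature logistic model, and the "posterior draws" are stand-ins generated from normal distributions purely for illustration. The names x, posterior_draws, and sample_posterior_predictive are all illustrative.

```python
import math
import random

random.seed(42)


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data: five observations of a single feature.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Stand-in for posterior draws of (intercept, slope).
# In practice these would come from the sampler's trace.
posterior_draws = [
    (random.gauss(0.0, 0.1), random.gauss(1.5, 0.2)) for _ in range(500)
]


def sample_posterior_predictive(draws, features):
    """For each posterior draw, simulate one binary outcome per observation."""
    simulated = []
    for intercept, slope in draws:
        row = [
            int(random.random() < sigmoid(intercept + slope * xi))
            for xi in features
        ]
        simulated.append(row)
    return simulated


ppc_y = sample_posterior_predictive(posterior_draws, x)
shape = (len(ppc_y), len(ppc_y[0]))  # (number of draws, number of observations)
print(shape)
```

The result has one row per posterior draw and one column per observation, which mirrors the (samples, observations) layout of ppc['y'].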
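We can evaluate the in-sample fit using the AUC score, for example, to compare different models. Averaging the simulated outcomes across samples yields a predicted probability for each observation, which serves as the score: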
roc_auc_score(y_score=np.mean(ppc['y'], axis=0),
              y_true=data.income)
0.8294958565103577
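The scoring step can be illustrated with a small, self-contained sketch. It assumes a tiny hypothetical matrix of simulated outcomes (4 draws for 5 observations) and computes the AUC by hand as the share of positive/negative pairs that are ranked correctly, rather than calling scikit-learn. The data and the helper roc_auc are illustrative only.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical simulated outcomes: 4 posterior predictive draws x 5 observations.
ppc_y = [
    [0, 0, 1, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 1, 0, 1],
]
# Hypothetical observed labels for the same 5 observations.
y_true = [0, 0, 1, 0, 1]

# Column means give the predicted probability of a positive outcome
# for each observation (the analogue of averaging over axis 0).
n_draws = len(ppc_y)
y_score = [sum(col) / n_draws for col in zip(*ppc_y)]

auc = roc_auc(y_true, y_score)
print(y_score)        # [0.25, 0.25, 0.5, 0.75, 1.0]
print(round(auc, 4))  # 0.8333
```

Five of the six positive/negative pairs are ordered correctly here, so the AUC is 5/6. Because the score is computed on the data used to fit the model, it measures in-sample fit only.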