How to do it...

Execute the following steps to split the dataset into training and test sets.

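1. Import the function from scikit-learn:

from sklearn.model_selection import train_test_split

2. Split the data into training and test sets (an 80/20 split; rows are shuffled by default):

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

3. Split the data without shuffling, so that the last 20% of the rows form the test set:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

4. Split the data with stratification on the target:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

5. Verify that the class proportions of the target are preserved in both sets:

y_train.value_counts(normalize=True)
y_test.value_counts(normalize=True)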

With the stratified split, the share of defaults is approximately 22.12% in both the training and test sets.
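To see what the `shuffle` and `stratify` arguments change, the splits can be reproduced on a small synthetic dataset. The following is only a sketch: the data, the column name `feature`, and the 22% positive rate are made-up stand-ins, not the dataset used in this recipe.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 1,000 rows, exactly 220 positives (22%),
# placed in random order. This is NOT the recipe's dataset.
rng = np.random.default_rng(42)
X = pd.DataFrame({"feature": rng.normal(size=1000)})
labels = np.concatenate([np.ones(220, dtype=int), np.zeros(780, dtype=int)])
y = pd.Series(rng.permutation(labels), name="default")

# Without shuffling, the test set is simply the last 20% of the rows.
X_train_ns, X_test_ns, y_train_ns, y_test_ns = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# With stratification, each set keeps the class proportions of y.
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Share of positives in each set of the stratified split.
train_rate = y_train_s.mean()
test_rate = y_test_s.mean()
print(f"train: {train_rate:.4f}, test: {test_rate:.4f}")
```

Because `shuffle=False` keeps the original row order, it is the variant to consider when the rows are ordered in time and the test set should contain only the most recent observations.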
