How to do it...

Execute the following steps to split the dataset into training and test sets.

  1. Import the train_test_split function from scikit-learn:
from sklearn.model_selection import train_test_split
  2. Split the data into training and test sets (rows are shuffled by default):
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
  3. Alternatively, split the data without shuffling, so that the last 20% of the rows form the test set:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
  4. Alternatively, split the data with stratification, so that both sets keep the class proportions of y:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
  5. After the stratified split, verify that the ratio of the target is preserved:
y_train.value_counts(normalize=True)
y_test.value_counts(normalize=True)

In both sets, the percentage of defaults is ~22.12%.
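
The difference between the stratified and the unshuffled split can be illustrated with a minimal, self-contained sketch. It does not use the recipe's dataset: the toy X and y below (1,000 rows sorted by class, 22% positives) and the variable suffixes _s and _u are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data (not the recipe's dataset): rows are sorted by class,
# with 780 negatives followed by 220 positives (22% positives overall)
X = pd.DataFrame({"feature": np.arange(1000)})
y = pd.Series([0] * 780 + [1] * 220, name="default")

# Stratified split: class proportions of y are preserved in both sets
_, _, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Unshuffled split: the first 80% of rows go to training, the last 20% to test
_, X_test_u, y_train_u, y_test_u = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Share of positives in each set
print("stratified:", y_train_s.mean(), y_test_s.mean())
print("unshuffled:", y_train_u.mean(), y_test_u.mean())

# The unshuffled test set is exactly the tail of the original data
print("unshuffled test rows:", X_test_u.index.min(), "to", X_test_u.index.max())
```

On this sorted toy data, the stratified split keeps the share of positives at 22% in both sets, while the unshuffled test set consists only of the trailing positive rows. Splitting without shuffling is therefore best reserved for cases where the row order carries meaning, such as time-ordered observations.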
