Building our first decision tree

I think we are ready for a more complex example. As promised earlier, let's now move into the medical domain.

Let's consider an example where several patients have suffered from the same illness, such as a rare form of basorexia. Let's further assume that the true causes of the disease remain unknown to this day and that all of the information that is available to us consists of a bunch of physiological measurements. For example, we might have access to the following information:

A patient's blood pressure (BP)
A patient's cholesterol level (cholesterol)
A patient's gender (sex)
A patient's age (age)
A patient's blood sodium concentration (Na)
A patient's blood potassium concentration (K)

Based on all of this information, let's suppose a doctor made recommendations to their patient to treat their disease using one of four possible drugs, drug A, B, C, or D. We have data available for 20 different patients (the output has been trimmed down):

In [1]: data = [
...        {'age': 33, 'sex': 'F', 'BP': 'high', 'cholesterol': 'high',
...         'Na': 0.66, 'K': 0.06, 'drug': 'A'},
...        {'age': 77, 'sex': 'F', 'BP': 'high', 'cholesterol': 'normal',
...         'Na': 0.19, 'K': 0.03, 'drug': 'D'},
...
...
...
...        {'age': 38, 'sex': 'M', 'BP': 'high', 'cholesterol': 'normal',
...         'Na': 0.78, 'K': 0.05, 'drug': 'A'}
...     ]

Table of Contents for Building our first decision tree

Create new playlist

Sign In

Sign Up

Table of Contents for
Building our first decision tree