String columns

Now it is time to explore the different types of columns in our dataset. The easiest place to start is with the columns that contain strings. These columns behave like ID columns, since they hold unique values:

val stringColumns = loanDataHf.names().indices
  .filter(idx => loanDataHf.vec(idx).isString)
  .map(idx => loanDataHf.name(idx))
println(s"String columns:${table(stringColumns, 4, None)}")

The output is shown in the following screenshot:

The question is whether the url feature contains any useful information that we can extract. We can explore the data directly in H2O Flow and look at some sample values of the feature, as shown in the following screenshot:

We can see directly that the url feature contains only links to the Lending Club site, built from the application ID that we already dropped. Hence, we can drop it as well.
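The same conclusion can also be checked programmatically instead of by eye. The following is a minimal sketch in plain Scala, independent of H2O. It assumes we have pulled a few (id, url) pairs out of the frame. The object UrlColumnCheck, its helper names, and the example.com sample values are hypothetical and serve only as illustration; they are not real Lending Club data:

```scala
// A sketch, not the book's code: checks whether a URL column adds
// anything beyond the ID that is embedded in each URL.
object UrlColumnCheck {
  // Replace every run of digits with a placeholder, so URLs that differ
  // only in an embedded numeric ID collapse to the same template.
  def template(url: String): String = url.replaceAll("\\d+", "<id>")

  // True when all URLs share one template and the only digit run in
  // each URL is that row's own ID.
  def isRedundant(rows: Seq[(String, String)]): Boolean = {
    val templates = rows.map { case (_, url) => template(url) }.distinct
    val embedsOwnId = rows.forall { case (id, url) =>
      "\\d+".r.findAllIn(url).toList == List(id)
    }
    templates.size == 1 && embedsOwnId
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample rows (made-up IDs and host).
    val rows = Seq(
      ("1001", "https://example.com/loan?loan_id=1001"),
      ("1002", "https://example.com/loan?loan_id=1002")
    )
    println(s"url column redundant: ${isRedundant(rows)}")
  }
}
```

If the check holds on a representative sample, the column carries no information beyond the ID, which supports the decision to drop it.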
