Data overview

First, we are going to analyze the types of variables in the dataset. For that, we can use the class function, which tells us whether an object is a number, a character string, a matrix, and so on. For example, the class of the bank identifier, ID_RSSD, can be obtained as follows:

class(Model_database$ID_RSSD)

## [1] "integer"

The result indicates that this variable is stored as an integer, that is, a number without decimals.
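To make the idea concrete, here is a minimal sketch of what class returns for a few common kinds of values. The objects below are toy values invented for illustration; they are not taken from Model_database.

```r
# Toy values (hypothetical, not from Model_database) to illustrate class()
toy_id    <- 5L          # whole number stored as an integer
toy_ratio <- 5.5         # number that can hold decimals
toy_name  <- "bank"      # text
toy_date  <- Sys.Date()  # calendar date

class(toy_id)     # "integer"
class(toy_ratio)  # "numeric"
class(toy_name)   # "character"
class(toy_date)   # "Date"
```

The L suffix in 5L is what makes R store the value as an integer rather than as a numeric.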

We can calculate the same information for all the variables and store it using the following code:

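# Apply class() to every column and store the result as a data frame
classes <- as.data.frame(sapply(Model_database, class))
# Add the variable names as the first column
classes <- cbind(colnames(Model_database), classes)
# Give the two columns descriptive names
colnames(classes) <- c("variable", "class")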

Here, sapply applies the class function to each column of the dataset. The variable names are then combined with their classes in a single data frame and, finally, the columns of the resulting data frame are renamed. The first rows look as follows:

head(classes)

##          variable   class
## ID_RSSD   ID_RSSD integer
## UBPR1795 UBPR1795 numeric
## UBPR4635 UBPR4635 numeric
## UBPRC233 UBPRC233 numeric
## UBPRD582 UBPRD582 numeric
## UBPRE386 UBPRE386 numeric
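As a side note, the same table of classes could also be built with vapply, which checks that every column yields exactly one class name. The following is only a sketch under stated assumptions: toy is a small hypothetical data frame standing in for Model_database, and the NAME and DATE columns and all values are invented for illustration.

```r
# Hypothetical toy data standing in for Model_database
toy <- data.frame(
  ID_RSSD  = c(37L, 242L),
  UBPR1795 = c(1.5, 2.25),
  NAME     = c("Bank A", "Bank B"),
  stringsAsFactors = FALSE
)
toy$DATE <- as.Date(c("2015-12-31", "2016-12-31"))

# vapply() requires each result to be a single character string;
# class(x)[1] keeps the first class when an object has several
toy_classes <- data.frame(
  variable = names(toy),
  class    = unname(vapply(toy, function(x) class(x)[1], character(1))),
  stringsAsFactors = FALSE
)

toy_classes
```

Taking only the first class is a design choice: it avoids a ragged result if some column carries more than one class, at the cost of hiding the additional classes.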

This dataset contains four different types of variables:

table(classes$class)

## character      Date   integer   numeric
##       462         1         4      1027

From the previous steps, we know that the only variable with the Date class is the one that holds the date of the financial statements.
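If we wanted to locate the Date variable by its class rather than by name, one possible approach is to test each column with inherits. This is a minimal sketch on hypothetical data: toy stands in for Model_database, and DATE is an assumed column name, not one confirmed by the dataset.

```r
# Hypothetical toy data; DATE is an assumed column name
toy <- data.frame(ID_RSSD = c(37L, 242L), UBPR1795 = c(1.5, 2.25))
toy$DATE <- as.Date(c("2015-12-31", "2016-12-31"))

# inherits() tests class membership, so it also works for objects
# that carry more than one class
is_date   <- vapply(toy, inherits, logical(1), what = "Date")
date_vars <- names(toy)[is_date]

date_vars
```

The same pattern works for any other class, for instance by replacing "Date" with "character" to list the text variables.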
