Preparing data

Once we have chosen a machine learning model, we can prepare the data. In this project, I generated random values for temperature and humidity. You can see this data in the following graph:

We should put the data in a CSV file with three columns: Temperature, Humidity, and Watering. You can see this data in the following screenshot. The Watering column holds the target decision for each row:

Save the data to a CSV file, for instance, Temp-Hum-Water.csv.
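As a minimal sketch of how such a file could be produced, the following Python code generates random readings and writes them with the three column headers. The value ranges, the rule that decides the Watering label, and the helper names generate_rows and write_csv are assumptions made for illustration; the text does not say how the original data was labeled.

```python
import csv
import random


def generate_rows(n, seed=0):
    """Return n (temperature, humidity, watering) tuples with random readings."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        temperature = round(rng.uniform(15.0, 40.0), 1)  # assumed range, degrees C
        humidity = round(rng.uniform(20.0, 90.0), 1)     # assumed range, percent
        # Hypothetical labeling rule (not from the text): water when hot and dry.
        watering = 1 if (temperature > 30.0 and humidity < 50.0) else 0
        rows.append((temperature, humidity, watering))
    return rows


def write_csv(path, rows):
    """Write a header row followed by the data rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Temperature", "Humidity", "Watering"])
        writer.writerows(rows)
```

Calling write_csv("Temp-Hum-Water.csv", generate_rows(100)) would then create a file with a header line and 100 data rows.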

We also need to create a data schema. The schema file should be named <data-file_name>.schema. For our case, we create a file named Temp-Hum-Water.csv.schema with the following content:

{
  "version": "1.0",
  "targetAttributeName": "Watering",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {
      "attributeName": "Temperature",
      "attributeType": "NUMERIC"
    },
    {
      "attributeName": "Humidity",
      "attributeType": "NUMERIC"
    },
    {
      "attributeName": "Watering",
      "attributeType": "CATEGORICAL"
    }
  ]
}
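
Before uploading, it can help to confirm that the schema and the CSV file agree. The sketch below compares the attribute names in the schema with the CSV header and checks that the target attribute is listed. The helper name check_schema is hypothetical, and this is only a local sanity check, not the validation that Amazon Machine Learning itself performs.

```python
import csv
import json


def check_schema(schema_path, csv_path):
    """Return a list of problems found; an empty list means the files agree."""
    with open(schema_path) as f:
        schema = json.load(f)
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))  # first line of the CSV is the header

    names = [a["attributeName"] for a in schema["attributes"]]
    problems = []
    if names != header:
        problems.append(f"columns differ: schema {names}, CSV {header}")
    if schema["targetAttributeName"] not in names:
        problems.append("target attribute is not listed in attributes")
    return problems
```

For the files in this section, check_schema("Temp-Hum-Water.csv.schema", "Temp-Hum-Water.csv") should return an empty list when the header row matches the three attribute names.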

The next step is to upload the data and schema files to Amazon S3. Currently, Amazon Machine Learning can work with data from Amazon S3 and Amazon Redshift. For this demo, we use Amazon S3 to store the data and schema files.

Now you can upload the Temp-Hum-Water.csv and Temp-Hum-Water.csv.schema files to Amazon S3. You can see my data in the following screenshot:
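
If you prefer to upload from a script rather than the web console, a sketch along these lines may work. It assumes the boto3 library is installed and AWS credentials are configured. The helper name upload_dataset and the bucket name in the comment are placeholders, not values from this project.

```python
import os


def upload_dataset(s3_client, bucket, paths, prefix=""):
    """Upload each local file to the bucket under prefix + file name; return the keys."""
    keys = []
    for path in paths:
        key = prefix + os.path.basename(path)
        # upload_file(Filename, Bucket, Key) is a method of the boto3 S3 client.
        s3_client.upload_file(path, bucket, key)
        keys.append(key)
    return keys


# Example call (needs boto3 and credentials; the bucket name is a placeholder):
#   import boto3
#   upload_dataset(boto3.client("s3"), "my-example-bucket",
#                  ["Temp-Hum-Water.csv", "Temp-Hum-Water.csv.schema"])
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object before touching a real bucket.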

We will build a machine learning model with Amazon Machine Learning in the next section.
