Loading the data

Download the energy efficiency dataset from https://archive.ics.uci.edu/ml/datasets/Energy+efficiency.

The dataset is in Excel's XLSX format, which Weka cannot read. We can convert it into comma-separated values (CSV) format by clicking on File | Save As and picking .csv in the save dialog, as shown in the following screenshot. Confirm that only the active sheet should be saved (all of the others are empty), and confirm again to continue even though some formatting features will be lost. The file is then ready to be loaded by Weka:

Open the file in a text editor and check that it was converted correctly. Minor issues in the export can cause problems later. For instance, in my export, each line ended with a double semicolon, as follows:

X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2;; 
0,98;514,50;294,00;110,25;7,00;2;0,00;0;15,55;21,33;; 
0,98;514,50;294,00;110,25;7,00;3;0,00;0;15,55;21,33;; 

To remove the doubled semicolon, you can use the Find and Replace function: find ;; and replace it with ;.
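If you would rather script this step than edit the file by hand, the same replacement can be sketched in a few lines of plain Java. This is only an illustration, not part of the Weka workflow itself; the class name SeparatorFix and its method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SeparatorFix {

  // Mirrors one pass of the editor's Find and Replace: every ";;" becomes ";".
  static String collapseDoubledSemicolons(String line) {
    return line.replace(";;", ";");
  }

  // Applies the replacement to every line of the file.
  static List<String> fixAll(List<String> lines) {
    return lines.stream()
        .map(SeparatorFix::collapseDoubledSemicolons)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // prints X1;X2;Y1;Y2;
    System.out.println(collapseDoubledSemicolons("X1;X2;Y1;Y2;;"));
  }
}
```

To run it against the exported file, the lines could be read with Files.readAllLines and written back with Files.write from java.nio.file.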

The second problem was that my file had a long list of empty lines at the end of the document, which can be deleted, as follows:

0,62;808,50;367,50;220,50;3,50;5;0,40;5;16,64;16,03;; 
;;;;;;;;;;; 
;;;;;;;;;;; 
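Deleting the empty lines can also be automated. The following is a minimal sketch in plain Java, assuming that an empty row is one that contains nothing but semicolons and whitespace; the class name EmptyRowFilter and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class EmptyRowFilter {

  // A row counts as empty when it holds nothing but separators and whitespace.
  static boolean isEmptyRow(String line) {
    return line.replace(";", "").trim().isEmpty();
  }

  // Keeps only the rows that carry at least one value.
  static List<String> dropEmptyRows(List<String> lines) {
    return lines.stream()
        .filter(line -> !isEmptyRow(line))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "0,62;808,50;16,64;16,03;;",
        ";;;;;;;;;;;",
        ";;;;;;;;;;;");
    System.out.println(dropEmptyRows(lines).size()); // prints 1
  }
}
```

Filtering on content rather than on line position means that empty rows are removed wherever they appear, not only at the end of the file.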

Now, we are ready to load the data. Let's open a new file and write a simple data import function by using Weka's converter for reading files in a CSV format, as follows:

import weka.core.Instances;
import weka.core.converters.CSVLoader;
import java.io.File;
import java.io.IOException;

public class EnergyLoad {

  public static void main(String[] args) throws IOException {

    // load CSV
    CSVLoader loader = new CSVLoader();
    // The separator must match the file: use ";" if your export is
    // semicolon-separated, as in the sample lines shown earlier.
    loader.setFieldSeparator(",");
    loader.setSource(new File("data/ENB2012_data.csv"));
    Instances data = loader.getDataSet();

    // print the loaded dataset to verify the import
    System.out.println(data);
  }
}

The data is loaded! Let's move on.
