Data transformation – Glue

Data transformation is the process of taking raw data in one format and mapping it to a new structure or format of our choosing. It is a fundamental part of data processing and, on large datasets, often requires substantial storage space as well as large-scale computation. We can speed up the transformation by parallelizing the processing, which is something that Glue does for us out of the box.

In this section, we are going to create our own transformation process using Glue. We will take what we already know about our data, which can be found in our data catalog, and create a target structure. Along the way, it will become clear which fields we want to map to the new structure, and we'll also pick up a tip about a file format that will make our future analytics queries more efficient.
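
To make the idea of mapping fields to a target structure concrete, here is a minimal sketch in plain Python. It does not use the Glue API; the field names (cust_id, ts, amt), the target names, and the helper transform_record are all hypothetical and only illustrate the kind of source-to-target mapping described above.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical mapping: (source field, target field, cast for the target type).
# These names are assumptions for illustration, not taken from a real catalog.
FIELD_MAPPING: List[Tuple[str, str, Callable[[Any], Any]]] = [
    ("cust_id", "customer_id", int),
    ("ts", "event_time", str),
    ("amt", "amount", float),
]


def transform_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw record to the target structure, casting each value."""
    target: Dict[str, Any] = {}
    for source, dest, cast in FIELD_MAPPING:
        value = raw.get(source)
        # Missing source fields become None instead of raising an error.
        target[dest] = cast(value) if value is not None else None
    return target


raw = {"cust_id": "42", "ts": "2020-01-01T00:00:00", "amt": "19.90", "unused": "x"}
print(transform_record(raw))
# {'customer_id': 42, 'event_time': '2020-01-01T00:00:00', 'amount': 19.9}
```

Fields that are not listed in the mapping, such as unused here, are simply dropped. Because each record is transformed independently of the others, this kind of work is easy to split across many workers, which is what makes parallel processing effective for it.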
