When to transform your data: before or after loading into BigQuery?

There are a couple of phases in the ETL process when an analyst might want to transform their data. BigQuery is a sophisticated data warehouse and already implements many functions that are useful for transformation. That said, BigQuery does not have every function that you might find in other data warehouses or programming languages. In addition, BigQuery's automatic type detection can sometimes coerce a value into an unwanted data type. The analyst therefore needs to decide how much transformation to apply to the data before loading it into BigQuery.
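To make the type-detection concern concrete, here is a minimal sketch of one way a value can change meaning when a column of digits is read as a number. The sample data, the column names, and the `naive_infer` helper are all hypothetical and only imitate a simple type guesser; they are not BigQuery's actual detection logic. The schema list follows the JSON layout used for BigQuery schema files, and declaring a column as STRING there is one way to avoid the coercion.

```python
import csv
import io
import json

# Hypothetical sample data: zip_code has a leading zero that matters.
RAW_CSV = "name,zip_code,visits\nAda,02134,3\nLin,10001,7\n"

# Explicit schema in the JSON layout used for BigQuery schema files.
# Declaring zip_code as STRING keeps it from being treated as a number.
SCHEMA = [
    {"name": "name", "type": "STRING", "mode": "NULLABLE"},
    {"name": "zip_code", "type": "STRING", "mode": "NULLABLE"},
    {"name": "visits", "type": "INTEGER", "mode": "NULLABLE"},
]


def naive_infer(value):
    """Imitate a simple type guesser: all-digit text becomes an int."""
    return int(value) if value.isdigit() else value


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

inferred_zip = naive_infer(rows[0]["zip_code"])  # leading zero is lost
preserved_zip = rows[0]["zip_code"]              # kept as text

# Serialized schema that could be saved alongside the data file.
schema_json = json.dumps(SCHEMA, indent=2)
```

In this sketch the guessed value becomes the integer 2134, while the text value stays "02134". Deciding on the schema before loading is a small transformation step that avoids having to repair the column afterwards.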

Chapter 8 covers Google Cloud Dataprep, a serverless service that can load data from a file, transform it, and insert it into BigQuery. ETL jobs developed with Cloud Dataprep can be scheduled to run automatically.

BigQuery supports two SQL dialects: legacy SQL, which was developed by Google, and standard SQL, which is compliant with the SQL 2011 standard. A project can use both legacy SQL and standard SQL. In this chapter, if #legacySQL appears before a query, the query is supported only in legacy SQL. If #standardSQL appears before a query, it is supported only in standard SQL. If neither is present, the query is supported in both dialects.
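As an illustration of the prefix convention, the sketch below builds query text with the dialect tag on its first line. The `with_dialect` helper is hypothetical and only assembles strings; it does not contact BigQuery. The two sample queries read the public `samples.shakespeare` table and differ in how each dialect writes the table reference: square brackets with a colon in legacy SQL, backticks with dots in standard SQL.

```python
# Hypothetical helper: prepend the dialect tag used in this chapter.
DIALECT_PREFIXES = {
    "legacy": "#legacySQL",
    "standard": "#standardSQL",
}


def with_dialect(query, dialect):
    """Return the query text with its dialect tag on the first line."""
    if dialect not in DIALECT_PREFIXES:
        raise ValueError(f"unknown dialect: {dialect!r}")
    return f"{DIALECT_PREFIXES[dialect]}\n{query.strip()}"


# Legacy SQL table reference: [project:dataset.table]
legacy_query = with_dialect(
    "SELECT word FROM [bigquery-public-data:samples.shakespeare] LIMIT 5",
    "legacy",
)

# Standard SQL table reference: `project.dataset.table`
standard_query = with_dialect(
    "SELECT word FROM `bigquery-public-data.samples.shakespeare` LIMIT 5",
    "standard",
)

print(legacy_query)
print(standard_query)
```

Keeping the tag on the first line of the query text makes the intended dialect explicit wherever the query is pasted or stored.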
