Structured data stores

Structured data stores have been around for decades and are the most familiar technology choice for storing data. Most transactional databases, such as Oracle, MySQL, SQL Server, and PostgreSQL, are row-based because they handle frequent data writes from software applications. Organizations often repurpose a transactional database for reporting, where frequent data reads are required but far fewer data writes. To meet these read-heavy requirements, there has been considerable innovation in how structured data stores are queried, such as the columnar file format, which improves read performance for analytics workloads.

Row-based formats store data row by row in a file. Writing by row is typically the fastest way to write data to disk, but it is not necessarily the quickest option for reads, because a query has to skip over a lot of irrelevant data. Column-based formats store all the values of a column together in the file. This leads to better compression because values of the same data type are grouped together. It also typically provides better read performance because a query can skip the columns it does not need.
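The difference between the two layouts can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database stores data on disk; the sample table, its column names, and the helper functions `row_layout` and `column_layout` are hypothetical.

```python
# Hypothetical sample table: (order_id, region, amount). Values are illustrative only.
rows = [
    (1, "EU", 120),
    (2, "US", 80),
    (3, "EU", 45),
    (4, "US", 300),
]


def row_layout(rows):
    """Lay values out record by record: all fields of one row sit together."""
    return [value for row in rows for value in row]


def column_layout(rows):
    """Lay values out column by column: all values of one column sit together."""
    return [value for column in zip(*rows) for value in column]


print(row_layout(rows))
print(column_layout(rows))
```

In the row layout, integers and strings alternate, which suits appending a whole new order in one write. In the column layout, the four region strings end up next to each other, which is the grouping that compression and column skipping rely on.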

Let's look at common choices for the structured data store. Take an example where you need to query the total sales in a given month from an order table that has fifty columns. In a row-based architecture, the query scans the entire table with all fifty columns, but in a columnar architecture, the query scans only the columns it references, such as the order sales column, thus improving query performance. Let's look in more detail at relational databases, focusing on transactional data, and at data warehousing for data analytics needs.
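The fifty-column example can be approximated with a small in-memory model that counts how many values each approach has to touch. This is a rough sketch under stated assumptions: the column positions `MONTH_COL` and `SALES_COL`, the flat sales amount, and the function names are all hypothetical, and a real engine works with disk pages and blocks instead of Python lists. Note that the columnar scan here reads two columns, because filtering by month also requires the month values.

```python
# Hypothetical order table: 50 columns, of which the query needs only two.
NUM_COLUMNS = 50
MONTH_COL = 0  # assumed position of the order month
SALES_COL = 1  # assumed position of the sales amount


def make_rows(num_rows):
    """Build a row-based table: one list of 50 values per order."""
    rows = []
    for i in range(num_rows):
        row = [0] * NUM_COLUMNS
        row[MONTH_COL] = (i % 12) + 1  # months 1..12, cycling
        row[SALES_COL] = 10            # flat sales amount per order
        rows.append(row)
    return rows


def to_columns(rows):
    """Pivot the same data into a column-based layout: one list per column."""
    return [list(col) for col in zip(*rows)]


def monthly_sales_row_scan(rows, month):
    """Row layout: each row is read in full, so all 50 values are touched."""
    total, values_read = 0, 0
    for row in rows:
        values_read += len(row)
        if row[MONTH_COL] == month:
            total += row[SALES_COL]
    return total, values_read


def monthly_sales_column_scan(columns, month):
    """Column layout: only the month and sales columns are read."""
    months, sales = columns[MONTH_COL], columns[SALES_COL]
    values_read = len(months) + len(sales)
    total = sum(s for m, s in zip(months, sales) if m == month)
    return total, values_read


rows = make_rows(1200)
columns = to_columns(rows)
row_total, row_reads = monthly_sales_row_scan(rows, month=3)
col_total, col_reads = monthly_sales_column_scan(columns, month=3)
print(row_total, row_reads)  # 1000 60000
print(col_total, col_reads)  # 1000 2400
```

Both scans return the same total, but in this toy model the row scan touches 25 times as many values as the column scan (50 columns versus 2).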
