Chapter 3 – How Teradata is Designed

“Design is not just what it looks like and feels like. Design is how it works.”
- Steve Jobs

Teradata is Designed Around the Primary Index

CREATE TABLE Employee_Table

( Employee_No

INTEGER

 ,Dept_No

INTEGER

 ,First_Name

VARCHAR (12)

 ,Last_Name

CHAR(20)

 ,Salary

DECIMAL (10,2)

) UNIQUE PRIMARY INDEX (Employee_No);

The Primary Index is defined when a table is created. Choosing the best column is by far the most important decision about the table design that you will make.

The three things to know about real estate are location, location and location. The five things to know about Teradata are: Primary Index, Primary Index, Primary Index, Primary Index and Primary Index. Most executives tend to shy away from the technology portion, but this is a career opportunity because technology has become so exciting! I am here to explain it to you. Stay with me now and you will understand technology better than most the members on your team. That might prove to be quite advantageous.

Why the Primary Index Important to Teradata? Distribution

image

Teradata distributes the rows of a table by running a math formula on the row’s Primary Index column value. There is only one formula. The Primary Index column’s value will determine a row’s physical location

Why Else the Primary Index Important to Teradata? Retrieval

image

Teradata follows the same formula to find a row that it used to distribute the row. A query that uses the Primary Index column is always a Single-AMP retrieve.

Use the Primary Index and Only One Block is Transferred

image

Only AMP 2 is contacted by the Parsing Engine, and only one block is moved into memory for processing.

Why Else the Primary Index Important to Teradata? Sorting

image

Teradata distributes the rows of a table by running a math formula on the row’s Primary Index column value. Then, each AMP sorts the rows by that Primary Index value formula. Why is this important?

AMPs Know Which Block Holds the Requested Data

image

The User wants to see information about Employee_No 3011.
AMP 2. Please move your luggage into memory and retrieve it!

This is important because each AMP sorts their data and tags it. The Parsing Engine tells the AMP it knows who holds the row to retrieve it, and the AMP knows which luggage it should move into memory.

As An Executive Know That Teradata Delivers in One Second

image

For Primary Index queries – You can count on 1 second response times.

You already know that Teradata can handle massive amounts of data because of the parallel processing, but rest assured that part of your user community will need sub-second response times for their applications.

Teradata Has a Different Design for Tables Called Partitioning

image

Some Users will query based on a range, such as wanting to see all orders from March. When these queries begin happening a lot, consider Partitioning the table. All AMPs will be involved, but they only move one block each inside their memory. That is magnitudes of order faster than doing a Full Table Scan!

An All AMPs Retrieve By Way of a Single Partition

image

All AMP are used to satisfy the query, but each AMP only reads a portion of their rows.

Partitioning Uses All AMPs, but a Portion of the Data Moves

image

All AMPs are involved because each AMP holds a portion of the March Data. Yet, instead of moving in all 12 blocks of orders, each AMP only moves in one block. This is the true meaning of performance tuning and design. Find out what type of queries your user base is running and then have your IT team design for it.

What does a Columnar Table look like?

image

The two tables above contain the same Employee data, but the bottom example is a Columnar table. Employee_Normal has 3 rows on each AMP with 5 columns. Employee_Columnar is split into 5 containers and each container has one column.

Teradata Has a Different Design for Tables Called Columnar

image

image

A Columnar Table design breaks up the columns into their own separate blocks. Turn the page and get a visual that is even clearer than this.

A Columnar Table is Like Separating Columns in Excel

image

Imagine if you created five excel spreadsheets with only one column, but ordered the rows in each so that if you needed complete information you could put the entire row back together. That’s a Columnar design.

A Columnar Table is Best for Queries with Few Columns

image

A Columnar design is great for queries that only ask for a few columns, but as you can imagine they are difficult to keep everything in sync and to load. That is the trade-off.

A Comparison of Data for Normal Vs. Columnar

image

The normal table on top is one block containing three rows and five columns. The columnar table below has five blocks each containing three rows with one column. Columnar tables are better when users query just a few columns and not all columns.

Which Move From Disk to Memory Would You Choose?

image

Like we use to say in the 1960s, “It’s all about the movement”. Find out what requirements there are in your system, and then performance tune it by taking advantage of the Primary Index column, a Partitioned Table or Columnar. All computer systems tune by moving the least amount of data from disk into memory.

Teradata has Two More Tricks to Move Less Blocks

1. Have Users use the Table’s Primary Index column in the SQL.

2. Create a table that is Partitioned.

3. Create a table that uses a Columnar Design.

4. Track the most popular tables and keep them in-memory.

5. Use a Secondary Index.

The next couple of pages will describe how Teradata uses an In-Memory strategy and secondary indexes to move less blocks inside memory.

Teradata Tracks the Most Used Data and Keeps it In-Memory

image

In Teradata V14.10, the system tracks the tables the users are accessing the most. Then, with extra memory placed on each AMP, the system keeps the most popular blocks in memory.

Eventually, every system will attempt to load the data from disk one time and then keep the data in memory. Queries are thousands of times faster. There just isn’t enough memory available to do this in mass right now, but Teradata has the ability to do this with the most popular tables.

Teradata Has Secondary Indexes

image

When a secondary index is created on a column, a small table (or suitcase) is created to track the column. The Parsing Engine will first query the small table to find the exact blocks where the real data resides. It is a little like using an Index at the back of a book. It makes sense if you are looking up a specific subject and it is only on a couple pages in the book, but it doesn’t make sense if the subject is scattered throughout the book. Why use an index at the back of the book for the word “The” because it is on every page.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.27.119