Chapter 3 – Space

“In the end, we'll remember not the words of our enemies, but the silence of our friends.”

– Martin Luther King, Jr.

Perm and Spool Space

images

Each AMP has its own disk where it stores the data it has been assigned. Each AMP will hold permanent tables on its disk in Perm Space. Any space left (empty space) is used for Spool space. Perm Space is for tables and Spool Space is used for building query answer sets. Not all users get to create Permanent tables so few users get Perm Space, but all users must have Spool Space if they want to query the Teradata system.

Perm Space is for Permanent Tables

images

PERM Space is where an AMP keeps its tables. Each AMP has the exact same tables, but each AMP is responsible for different rows of those tables. If each AMP works in parallel to retrieve their portion of the rows, and the data is spread evenly, then parallel processing works brilliantly. As AMPs retrieve rows from a table to satisfy a query, they place those rows in Spool. This is the intermediate area. It is like you using a pad of paper to add some numbers together. It is an area that helps come up with a final answer set.

Spool Space is work space that builds a User's Answer Sets

images

Spool space is used by each AMP in order to build the answer set for the user. It is intermediate results. Once the query no longer needs the intermediate results, the Spool space is released. Once the query comes back to the user, the Spool is always deleted.

How is Spool Space like a Hotel Lobby?

images

Teradata systems are configured with enough spool space to accommodate the average amount of users that simultaneously use the system.

If every person simultaneously drove their cars, the roads would come to a standstill, but not everyone drives simultaneously. The key to building good roads and highways is to estimate how many people will be driving at one time. This is the same thing with spool. We determine how many users will be querying at the same time and how large their queries will be. Then, we can configure the system so that there is enough spool space to run well.

Think of Spool Space like Aisles in a Grocery Store

images

Spool space is system wide. When a user runs a query, the intermediate results are stored in empty space anywhere on the AMP's disk. If there is empty space on disk, then that can be used to store the intermediate results. When the user's query is done, the spool space becomes empty.

A Grocery Store could load their property 100% with goods, but then nobody would have room to browse and shop. So the Grocery Store creates aisles that are empty space giving customers an area to shop and load their cart.

Each customer can buy groceries unless they run out of money. Each user is assigned a spool space limit, and if they go over their limit their query is aborted.

When a user purchases their groceries and has not run out of money, they receive what they came for and the aisle is clear for other customers. That is spool space!

Spool is just like an aisle in a Grocery Store, but one produces results and the other results in produce!

When a User is created they are assigned a Spool Space Limit

 CREATE USER MaryJones
      FROM Marketing
   AS
      PASSWORD=abc123
      PERM=0
      SPOOL=5000000
      TEMPORARY = 0
      ACCOUNT=('$Med_Mrkt')
      DEFAULT DATABASE = Marketing ;

Every time a user is created, they are assigned a Spool Space limit. This user, defined as MaryJones, has five million bytes of spool (5 Megabytes). Each time MaryJones runs a query, she is tracked. If the intermediate results or answer set goes over 5 MB, her query is aborted.

MaryJones can run queries (up to 5 MB), but she cannot create Permanent Tables because she is assigned a Perm = 0. She can't create or use Global Temporary Tables because she is assigned a Temporary=0. When MaryJones logs into the system, her default database is Marketing. She will be tracked in CPU and Disk I/O reports by her account. Her priority is Medium.

The most important part of this lesson is that a user is assigned spool space when they are created and that is their query limit. It is similar to a credit card. Go over your limit and transactions fail!

All Spool Space is calculated on a Per AMP Basis

 CREATE USER MaryJones
      FROM Marketing
   AS
      PASSWORD=abc123
      PERM=0
      SPOOL=5000000
      TEMPORARY = 0
      ACCOUNT=('$Med_Mrkt')
      DEFAULT DATABASE = Marketing ;

images

Since MaryJones has been assigned five Megabytes of spool space, and there are five AMPs in the system, MaryJones really gets one Megabyte of spool per AMP. That is her true spool limit.

All space assignments are always divided by the number of AMPs in the entire system. Mary Jones' queries will abort if she goes over one Megabyte of spool on any AMP.

What is a common reason that a User runs out of Spool?

images

A user has a spool limit. That limit is divided by the number of AMPs in the system. This gives the user a Per AMP limit. If one AMP has a lot more data on it than the other AMPs it is called a “Hot AMP”. This is the most likely reason you ran out of spool. You went over your Per AMP limit on the “Hot AMP”. A Non-Unique Primary Index distributes the data unevenly!

You often run out of spool because you have written a bad query and it accesses too much data, but more often than not, it is because the data is skewed.

Why is a Database Assigned Spool Space?

MRKT
Database

5 MB
Spool

Sales
Database

10 MB
Spool

IT
Database

100 MB
Spool

Executive
Database

500 MB
Spool

When a database is assigned spool space, that is the maximum spool limit for every user in that database, but every user can run queries simultaneously.

It does NOT mean that the users in a database all share the spool. Each user in MRKT can have a 5 MB spool limit, and each user can simultaneously run queries.

Why do users in MRKT only get 5 MB of spool each, while the Executive users get 500 MB each? We expect the MRKT users to run simpler queries, and if they go over 5 MB then they probably made a mistake, and it is a good thing the query aborts. We expect the Executive users to run massive queries, so we allow them to run these massive queries.

If spool is like a credit card limit, then everyone in MRKT gets a $5 limit on their card. They can all spend their $5 as they choose, and two people in MRKT could simultaneously shop and spend $5 each. The users in sales all get a $10 limit and each can spend as they choose. The same goes for IT and the Executive users. Each gets their max limit on their individual credit card.

Spool is an individual limit and not a Pool to Share

images

All MRKT users don't share 5 MBs of spool. Each gets their own 5 MBs of spool.

 

Spool is not a pool fool! It is a maximum for each individual. It is just like driving a car on the highway and seeing a maximum speed limit of 80. If there are two cars on the highway, they don't share the 80 and each go 40. The 80 is a maximum speed limit designed so everyone can go up to 80. Each MRKT user gets to stay alive and go 5!

What is the Purpose of Spool Limits?

images

There are two reasons for Spool Limits:

  1. If a user makes a mistake and runs a query that could take weeks to run it will abort as soon as the user goes over their allotted spool limit.
  2. It keeps users from hogging the system.

Spool is assigned to users and the only way a user is aborted is if they go over their spool limit. Marketing, Sales, and DBC have unlimited spool, but the max for each individual user is 5 GBs in Marketing, 10 GBs in Sales, and user DBC's maximum spool is 100 GBs.

Perm Space is for Permanent Tables

 CREATE DATABASE MRKT
   FROM DBC
   AS
   PERM = 100000000000
   SPOOL = 5000000
   NO FALLBACK
   ACCOUNT= '$Med_Mrkt';

The MRKT database has 5 MBs of spool, but it has 100 GBs of PERM.

MRKT can create tables and load up to 100 GBs of data.

This is not 100 GBs per table, but all tables combined can add up to 100 GBs max.

 

A database that contains Perm can create and house tables. There might be 1,000 tables, but in this case, the total combined space cannot go over 100 GBs of Perm.

Perm and Spool is Calculated on a Per AMP Basis

images

Each User or database space limits are calculated on a per AMP basis, so if you are assigned 10 GBs of Perm and the system has 10 AMPs, you are really assigned 1 GB of Perm per AMP!

If data is skewed and you exceed your 1 MB Spool limit on any AMP, you are “out of spool”.

Spool is assigned to every user, but since Teradata is a parallel processing system each AMP is only concerned with itself. Each AMP processes its portion of the data in parallel. Because of this philosophy, the Perm Space of (10 GBs) and the Spool Space (10 MBs) is divided among the total AMPs in the system. This example has 10 AMPs in the system, so the space limits are 1 GB Perm and 1 MB Spool per AMP.

Why did my query Abort and say “Out of Spool”?

images

How is it possible that I ran out of spool?

You ran out of spool because your query used over your limit of 10 MBs of spool.

It is also possible that you have logged onto multiple machines or ran multiple queries, and the combination went over 10 MBs of spool.

It is also very likely that the data you were working with was NOT evenly distributed (skewed), and this is a major cause of Spool errors.

Spool is assigned to users and the only way a user is aborted is if they go over their spool limit. No user has ever failed because they are in the database Mrkt and Mrkt has only 10 MBs of spool. It doesn't work that way. Thousands of users in Mrkt can run queries simultaneously because Mrkt has unlimited amounts of spool, but each user in Marketing is limited to 10 MBs individually.

How come my Join caused me to run “Out of Spool”?

images You might not have put in a Join Condition.

SELECT First_Name, Last_Name, Department_Name
FROM    Employee_Table   as E
               INNER JOIN
               Department_Table as D

images You might have Aliased the table and then fully qualified with the real table name.

SELECT First_Name, Last_Name, Department_Name
FROM    Employee_Table as E
               INNER JOIN
               Department_Table as D
ON   Employee_Table.Dept_No = D.Dept_No ;

images There might be skewed data on one of the tables.

images

images A Lot of NULLs on a table on an Outer Join.

SELECT e.*, d.* from Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No= D.Dept_No ;

Finding out how much Space you have

images

You can use this view in the DBC Data Dictionary to see your space. Notice that there are no quotes around the keyword USER. This is because Teradata knows your login, and the keyword USER means you.

Space per AMP on all tables in a Database shows Skew

images

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.218.171.212