Teradata Material: Partition Primary Index (PPI)

Saturday, 22 February 2014

Partition Primary Index (PPI)

What is Partitioned Primary Index (PPI) in Teradata?

Partitioned primary index is physically splitting the table into a series of subtables, one for every partitioning value. When a single row is accessed, it looks first at the partitioning value to determine the subtable, then at the primary index to calculate the rowhash for the row(s).

For example, we have PPI on a MONTH Column, the rows of particular months are all sorted with in the same partition and whenever data is accessed for particular month, it will retrive the data in a faster way.

It helps to avoid full table scans.

What are the advantages and disadvantages of PPI in Teradata?

Advantages:

Range queries don’t have to utilize a Full Table Scan.
Deletions of entire partitions are lightning fast.
PPI provides an excellent solution instead of using Secondary Indexes
Tables that hold yearly information don’t have to be split into 12 smaller tables to avoid Full Table Scans (FTS). This can make modeling and querying easier.
Fastload and Multiload work with PPI tables, but not with all Secondary Indexes.

Disadvantages:

· A two-byte Partition number is added to the ROW-ID and it is now called a ROW KEY. The two-bytes per row will add more Perm Space to a table.

· Joins to Non-Partitioned Tables can take longer and become more complicated for Teradata to perform.

· Basic select queries utilizing the Primary Index can take longer if the Partition number is not also mentioned in the WHERE clause of the query.

· You can’t have a Unique Primary Index (UPI) if the Partition Number is not at least part of the Primary Index. You must therefore create a Unique Secondary Index to maintain uniqueness.

Teradata Material

Pages

Saturday, 22 February 2014

Partition Primary Index (PPI)

No comments:

Post a Comment