What is Partitioned Primary Index
(PPI) in Teradata?
Partitioned primary index is physically splitting
the table into a series of subtables, one for every partitioning value. When a
single row is accessed, it looks first at the partitioning value to determine
the subtable, then at the primary index to calculate the rowhash for the
row(s).
For example, we have PPI on a MONTH Column, the
rows of particular months are all sorted with in the same partition and
whenever data is accessed for particular month, it will retrive the data in a
faster way.
It helps to avoid full table scans.
What are the advantages and
disadvantages of PPI in Teradata?
Advantages:
- Range queries don’t have to utilize a Full Table Scan.
- Deletions of entire partitions are lightning fast.
- PPI provides an excellent solution instead of using Secondary Indexes
- Tables that hold yearly information don’t have to be split into 12 smaller tables to avoid Full Table Scans (FTS). This can make modeling and querying easier.
- Fastload and Multiload work with PPI tables, but not with all Secondary Indexes.
Disadvantages:
· A two-byte
Partition number is added to the ROW-ID and it is now called a ROW KEY. The
two-bytes per row will add more Perm Space to a table.
· Joins to
Non-Partitioned Tables can take longer and become more complicated for Teradata
to perform.
· Basic select
queries utilizing the Primary Index can take longer if the Partition number is
not also mentioned in the WHERE clause of the query.
· You can’t have a
Unique Primary Index (UPI) if the Partition Number is not at least part of the
Primary Index. You must therefore create a Unique Secondary Index to maintain
uniqueness.
No comments:
Post a Comment