Saturday 22 February 2014

Primary Index Data Distribution in TERADATA

How is data distributed among AMPs based on the Primary Index (PI) in Teradata?

•       Assume a row is to be inserted into a Teradata table
•       The Primary Index Value for the Row is put into the Hash Algorithm
•       The output is a 32-bit Row Hash
•       The Row Hash points to a bucket in the Hash Map: the first 16 bits of the Row Hash are used to locate the bucket
•       The bucket points to a specific AMP
•       The row, along with its Row Hash, is delivered to that AMP
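
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Teradata's actual hash algorithm is not given here, so `zlib.crc32` stands in for it, and the round-robin bucket-to-AMP assignment in `build_hash_map` is a hypothetical layout, not the real Hash Map. The function names are made up for this sketch.

```python
import zlib

NUM_BUCKETS = 1 << 16  # the first 16 bits of the Row Hash select the bucket


def row_hash(pi_value):
    """Return a 32-bit hash of the Primary Index value.

    ASSUMPTION: crc32 is a stand-in, not Teradata's real hash algorithm.
    """
    return zlib.crc32(str(pi_value).encode("utf-8")) & 0xFFFFFFFF


def bucket_of(rh):
    """Take the high-order 16 bits of the 32-bit Row Hash."""
    return rh >> 16


def build_hash_map(num_amps):
    """Map every bucket to an AMP number.

    ASSUMPTION: round-robin assignment, chosen only for illustration.
    """
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for(pi_value, hash_map):
    """Follow the chain: PI value -> Row Hash -> bucket -> AMP."""
    return hash_map[bucket_of(row_hash(pi_value))]


if __name__ == "__main__":
    hash_map = build_hash_map(num_amps=4)
    for value in ["1001", "1002", "1001"]:
        print(value, "-> AMP", amp_for(value, hash_map))
```

Because the hash is deterministic, the same Primary Index value always lands on the same AMP, whichever stand-in hash is used.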

When the AMP receives a row, it places the row into the proper table and checks whether it already holds any other rows in that table with the same row hash. If this is the first row with this particular row hash, the AMP assigns a 32-bit uniqueness value of 1. If this is the second row with that row hash, the AMP assigns a uniqueness value of 2, and so on. The 32-bit row hash and the 32-bit uniqueness value together make up the 64-bit Row ID. Rows in a table are sorted on an AMP by Row ID.
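
As a rough sketch of this bookkeeping, the Python below models one table on one AMP. The `Amp` class and its method names are hypothetical and only mirror the description above; they are not how Teradata implements it internally.

```python
from collections import defaultdict


class Amp:
    """Toy model of a single table on one AMP (hypothetical, illustration only)."""

    def __init__(self):
        # Highest uniqueness value handed out so far for each row hash.
        self.last_uniq = defaultdict(int)
        # Stored rows keyed by their 64-bit Row ID.
        self.rows = {}

    def insert(self, row_hash, row):
        """Assign the next uniqueness value and store the row under its Row ID."""
        self.last_uniq[row_hash] += 1
        uniq = self.last_uniq[row_hash]
        # Row ID = 32-bit row hash (high half) + 32-bit uniqueness value (low half).
        row_id = (row_hash << 32) | uniq
        self.rows[row_id] = row
        return row_id

    def scan(self):
        """Return the rows in Row ID order, the order in which the table is kept."""
        return [self.rows[row_id] for row_id in sorted(self.rows)]


if __name__ == "__main__":
    amp = Amp()
    first = amp.insert(0xABCD1234, "row A")
    second = amp.insert(0xABCD1234, "row B")  # same row hash as row A
    print(hex(first), hex(second))
```

Two rows with the same row hash end up with Row IDs that differ only in the low 32 bits.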

The uniqueness value is useful for NUPIs: several rows can share the same NUPI value, and therefore the same row hash, so the uniqueness value is what distinguishes them.

Access through either a UPI or a NUPI is always a one-AMP operation, because rows with the same Primary Index value are stored on the same AMP.
