Partitioning with DB2 Multisystem

 

Partitioning is the process of distributing a file across the nodes in a node group.

Partitioning is done using the hash algorithm. When a new record is added, the hash algorithm is applied to the data in the partitioning key. The result of the hash algorithm, a number between 0 and 1023, is then applied to the partitioning map to determine the node on which the record resides.

The partition map is also used for query optimization, updates, deletes, and joins. The partition map can be customized to force certain key values to certain nodes.

For example, during I/O, the system applies the hash algorithm to the values in the partitioning key fields. The result is applied to the partition map stored in the file to determine which node stores the record.

The following example shows how these concepts relate to each other.

Employee number is the partitioning key and a record is entered into the database for an employee number of 56 000. The value of 56 000 is processed by the hash algorithm and the result is a partition number of 733. The partition map, which is part of the node group object and is stored in the distributed file when it is created, contains node number 1 for partition number 733. Therefore, this record is physically stored on the system in the node group that is assigned node number 1. The partitioning key (the PTNKEY parameter) was specified by you when you created the partitioned (distributed) file.

Fields in the partitioning key can be null-capable. However, records that contain a null value within the partitioning key always hash to partition number 0. Files with a significant number of null values within the partitioning key can result in data skew on the partition number 0, because all of the records with null values hash to partition number 0.

After you have created your node group object and a partitioned distributed relational database file, you can use the DSPNODGRP command to view the relationship between partition numbers and node names. In the Displaying node groups using the DSPNODGRP command with DB2® Multisystem topic, you can find more information about displaying partition numbers, node groups, and system names.

When creating a distributed file, the partitioning key fields are specified either on the PTNKEY parameter of the Create Physical File (CRTPF) command or in the PARTITIONING KEY clause of the SQL CREATE TABLE statement. Fields with the data types DATE, TIME, TIMESTAMP, and FLOAT are not allowed in a partitioning key.

 

Parent topic:

Distributed files with DB2 Multisystem

 

Related concepts


How node groups work with DB2 Multisystem
Display Node Group command
Change Node Group Attributes command