Teradata Wiki: Teradata Columnar

The purpose of a Columnar(NoPI) table is to spread the rows evenly across the AMPs. This is why a NoPI table is often used as a staging table.

Columnar Table Fundamentals

Columnar Tables must be a NoPI Table so No Primary Index (NoPI).
The NoPI brings even distribution to the table.
Columnar Tables allow Columns to be Partitioned.
An AMP still holds the entire row, but partitions vertically.
Columns are placed inside their own individual Container.
All Containers have the same amount of rows in the exact order.
Single Columns or Multi-Columns can be placed inside containers.
Each container looks like a small table for I/O purposes.
Add up all the containers and you rebuild the row.
Columnar Tables make sense when users query only certain columns.
When a row is deleted it is NOT Physically Deleted but marked deleted

Normal table vs columnar table
The two tables above contain the same Employee data, but one is a columnar table. Employee_Normal has placed 3 rows on each AMP with 5 columns. The other table Employee_Columnar has 5 Containers each with one column.

Add caption

Example
CREATE Table Employee
(
Emp_Id Integer
,Dept_Id Integer
,First_Name Varchar(20)
,Last_Name Char(20)
,Salary Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN;

Columnar table

This AMP is assigned 3 Employee Rows
All AMPs hold 3 different Employee Rows also
Each Row has 5 Columns
This Columnar Table partitions in 5 separate containers
Each container has a relative row number (1, 2, 3)
Each container has the exact same number of rows

NoPI Table Capabilities:

Are always Multi-Set Tables
Have Secondary Indexes (USI or NUSI)
Have Join Indexes
Be Volatile or Global Temporary Tables
Can COLLECT STATISTICS
Be FALLBACK Protected
Have Triggers
Be Large Objects (LOBs)
Have Primary Key Foreign Key Constraint

NoPI Table Restrictions

No Primary Indexes allowed
No SET Tables
No Partition Primary Index (PPI) tables
No Queue Tables
No Hash Indexes
No Identity Columns
No Permanent Journaling
Can't be the Target Table for any UPDATE, UPSERT or MERGE-INTO Statements

Examples of Columnar tables

Multi-Columnar

CREATE Table Employee
(
Emp_Id Integer
,Dept_Id Integer
,First_Name Varchar(20)
,Last_Name Char(20)
,Salary Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id
,Dept_id
(,First_name, Last_name, Salary));

Row Hybrid Columnar

CREATE Table Employee
(
Emp_Id Integer
,Dept_Id Integer
,First_Name Varchar(20)
,Last_Name Char(20)
,Salary Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id No Auto Compress
,Dept_id
(,First_name, Last_name, Salary)
No Auto Compress);

Columnar Partitions

Crete Table Order_table_PPI_Col
(Order_no integer not null
,Customer_no integer
,Order_date date
,Order_total decimal(10,2)
)
NO PRIMARY INDEX
PARTITION BY(Column
,Range_N(Order_date between date '2015-01-01 AND date '2015-12-31'
EACH INTERVAL '1' Month));

Teradata Columnar or Column Partitioning (CP) is a new physical database design implementation option (starting with Teradata 14.0) that allows single columns or sets of columns of a NoPI table to be stored in separate partitions. Column partitioning can also be applied to join indexes.

Columnar is simply a new approach for organizing the data of a user-defined table or join index on disk.

Columnar offers the ability to partition a table or join index by column. It can be used alone or in combination with row partitioning in multilevel partitioning definitions. Column partitions may be stored using traditional 'ROW' storage or alternatively stored using the new 'COLUMN' storage option. In either case, columnar can automatically compress physical rows where appropriate.

The key benefit in defining row-partitioned (PPI) tables is when queries access a subset of rows based on constraints on one or more partitioning columns. The major advantage of using column partitioning is to improve the performance of queries that access a subset of the columns from a table, either for predicates (e.g., WHERE clause) or projections (i.e., SELECTed columns).

Because sets of one or more columns can be stored in separate column partitions, only the column partitions that contain the columns needed by the query need to be accessed. Just as row-partitioning can eliminate rows that need not be read, column partitioning eliminates columns that are not needed.

The advantages of both can be combined, meaning even less data moved and thus reduced I/O. Fewer data blocks need to be read since more data of interest is packed together into fewer data blocks.

Columnar makes more sense in CPU-rich environments because CPU cycles are needed to "glue" columns back together into rows, for compression and for different join strategies (mainly hash joins).

Benefits of using the Teradata Columnar:

Improved query performance
Reduced disk space
Increased flexibility
Reduced I/O
Ease of use

Social Icons

Pages

Facebook fan

Recent Updates

Interview Questions

Links

Total Pageviews

Columnar

Teradata Columnar

Quick Links

Utilities

SQL