Social Icons

Showing posts with label Teradata Columnar. Show all posts
Showing posts with label Teradata Columnar. Show all posts

Columnar

Teradata Columnar
The purpose of a Columnar(NoPI) table is to spread the rows evenly across the AMPs. This is why a NoPI table is often used as a staging table.

Columnar Table Fundamentals
  • Columnar Tables must be a NoPI Table so No Primary Index (NoPI).
  • The NoPI brings even distribution to the table.
  • Columnar Tables allow Columns to be Partitioned.
  • An AMP still holds the entire row, but partitions vertically.
  • Columns are placed inside their own individual Container.
  • All Containers have the same amount of rows in the exact order.
  • Single Columns or Multi-Columns can be placed inside containers.
  • Each container looks like a small table for I/O purposes.
  • Add up all the containers and you rebuild the row.
  • Columnar Tables make sense when users query only certain columns.
  • When a row is deleted it is NOT Physically Deleted but marked deleted


Normal table vs columnar table
The two tables above contain the same Employee data, but one is a columnar table. Employee_Normal has placed 3 rows on each AMP with 5 columns. The other table Employee_Columnar has 5 Containers each with one column.
Teradata Columnar containers
Add caption


Example 
CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN;


Columnar table 
Teradata Columnar AMP



  • This AMP is assigned 3 Employee Rows
  • All AMPs hold 3 different Employee Rows also
  • Each Row has 5 Columns
  • This Columnar Table partitions in 5 separate containers
  • Each container has a relative row number (1, 2, 3)
  • Each container has the exact same number of rows


NoPI Table Capabilities:
  • Are always Multi-Set Tables
  • Have Secondary Indexes (USI or NUSI)
  • Have Join Indexes
  • Be Volatile or Global Temporary Tables
  • Can COLLECT STATISTICS
  • Be FALLBACK Protected
  • Have Triggers
  • Be Large Objects (LOBs)
  • Have Primary Key Foreign Key Constraint


NoPI Table Restrictions
  • No Primary Indexes allowed
  • No SET Tables
  • No Partition Primary Index (PPI) tables
  • No Queue Tables
  • No Hash Indexes
  • No Identity Columns
  • No Permanent Journaling
  • Can't be the Target Table for any UPDATE, UPSERT or MERGE-INTO Statements



Examples of Columnar tables

Multi-Columnar

CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id
,Dept_id
(,First_name, Last_name, Salary));

Row Hybrid Columnar

CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id No Auto Compress
,Dept_id
(,First_name, Last_name, Salary)
No Auto Compress);

Columnar Partitions

Crete Table Order_table_PPI_Col
(Order_no integer not null
,Customer_no integer
,Order_date date
,Order_total decimal(10,2)
)
NO PRIMARY INDEX
PARTITION BY(Column
,Range_N(Order_date between date '2015-01-01 AND date '2015-12-31'
EACH INTERVAL '1' Month));

Continue Reading...

Teradata Columnar

TeradataWiki-Teradata_columnar
Teradata Columnar or Column Partitioning (CP) is a new physical database design implementation option (starting with Teradata 14.0) that allows single columns or sets of columns of a NoPI table to be stored in separate partitions.  Column partitioning can also be applied to join indexes.

Columnar is simply a new approach for organizing the data of a user-defined table or join index on disk.

Columnar offers the ability to partition a table or join index by column.  It can be used alone or in combination with row partitioning in multilevel partitioning definitions.  Column partitions may be stored using traditional 'ROW' storage or alternatively stored using the new 'COLUMN' storage option.  In either case, columnar can automatically compress physical rows where appropriate.  

The key benefit in defining row-partitioned (PPI) tables is when queries access a subset of rows based on constraints on one or more partitioning columns.  The major advantage of using column partitioning is to improve the performance of queries that access a subset of the columns from a table, either for predicates (e.g., WHERE clause) or projections (i.e., SELECTed columns).

Because sets of one or more columns can be stored in separate column partitions, only the column partitions that contain the columns needed by the query need to be accessed.  Just as row-partitioning can eliminate rows that need not be read, column partitioning eliminates columns that are not needed.

The advantages of both can be combined, meaning even less data moved and thus reduced I/O.  Fewer data blocks need to be read since more data of interest is packed together into fewer data blocks.  

Columnar makes more sense in CPU-rich environments because CPU cycles are needed to "glue" columns back together into rows, for compression and for different join strategies (mainly hash joins).

Benefits of using the Teradata Columnar:
  • Improved query performance
  • Reduced disk space
  • Increased flexibility
  • Reduced I/O
  • Ease of use

Continue Reading...