TP2829

Teradata Physical Database Design Training

In this course, students will delve deeper into the Teradata architecture and cover more advanced topics such as modeling and compression.

Course Details

Duration

1 day

Prerequisites

No prior knowledge is presumed.

Target Audience

Advanced audience who have a desire to learn Teradata Architecture at the most detailed level.

Skills Gained

  • Deeper knowledge and understanding of the architecture of their Teradata system at a very advanced level.
Course Outline
  • Physical Database Design in Layman's Terms
    • What is Parallel Processing?
    • The Basics of a Single Computer
    • Teradata Parallel Processes Data
    • Parallel Architecture
    • The Teradata Architecture
    • All Teradata Tables are spread across ALL AMPS
    • Understand that Teradata can scale to incredible size
    • Teradata Parallel Processing
    • How Teradata Creates Tables
    • Every AMP has the Exact Same Tables
    • All Teradata Tables are spread across All AMPs
    • The Table Header and the Data Rows are Stored Separately
    • An AMP Stores the Rows of a Table inside a Data Block
    • To Read a Data Block an AMP Moves the Block into Memory
    • A Full Table Scan Means All AMPs must Read All Rows
    • The “Achilles Heel” or Slowest Process is Block Transfer
    • Each Table has a Primary Index
    • A Query Using the Primary Index is a Single AMP Retrieve.
    • Using EXPLAIN
    • As Rows are added a Data Block will Eventually Split
    • A Full Table Scan Means All AMPs must Read All Blocks
    • An EXPLAIN that shows a Full Table Scan
    • A Primary Index Query uses a Single AMP and Single Block
    • Using EXPLAIN in a NUPI Query
    • Each AMP Can Have Many Blocks for a Single Table
    • A Full Table Scan Means All AMPs must Read All Blocks
    • Intelligent Memory (Teradata V14.10)
    • Teradata V14.10 Intelligent Memory Gives Data a Temperature
    • Data deemed VeryHot stays in each AMP's Intelligent Memory
    • Intelligent Memory Stays in Memory
    • What is the Goal of a Teradata Physical Database Design?
  • Three Options for Teradata Table Design
    • There are Three Options to Teradata Table Design
    • Teradata V14.10 Intelligent Memory
    • How Teradata Creates Traditional Tables
    • Each Table has a Primary Index
    • A Query Using the Primary Index is a Single AMP Retrieve
    • A Primary Index Query uses a Single AMP and Single Block
    • How Teradata Creates a PPI Table
    • PPI Table Sorting the Rows by Month of Order_Date
    • An All AMPs Retrieve By Way of a Single Partition
    • Creating a PPI Table with CASE_N
    • A Visual of Case_N Partitioning
    • An All AMPs Retrieve By Way of a Single Partition
    • What does a Columnar Table look like?
    • A Comparison of Data for Normal Vs. Columnar
    • A Columnar Table is Best for Queries with Few Columns
    • Factors When Choosing Table Design
    • Teradata Limits for V12, V13 and V14
  • Teradata - The Cold Hard Facts
    • Teradata Parallel Processing
    • Each Table has a Primary Index that is Unique or Non-Unique
    • The Hash Map Determines which AMP will own the Row
    • A Unique Primary Index Spreads the Data Evenly
    • The AMP Adds a Uniqueness Value to Create the Row-ID
    • Each AMP Sorts Their Tables by the Row-ID
    • A Non-Unique Primary Index Skews the Data
    • Comparing the Same Table with Different Primary Indexes
    • Unique Primary Index Queries are a Single AMP Retrieve
    • A Non-Unique Primary Index is also a Single AMP Retrieve
    • Teradata has a No Primary Index Table called a NoPI Table
    • There are Normal Tables and then There are Partitioned Tables
    • A Visual of One Year of Data with Range_N per Month
    • Partitioning is Designed to Eliminate the Full Table Scan
    • A Partition # and Row-ID = Row Key
    • An AMP Stores its Rows Sorted in only Two Different Ways
    • AMPs Moves Their Data Blocks into Memory to Read/Write
    • Most Taxing thing for an AMP is Moving Blocks into Memory
    • Rows are Stored in Data Blocks which are stored in Cylinders
    • Rows for an AMP Stored Inside a Data Block in a Cylinder
    • An AMP’s Master Index is Used to Find the Right Cylinder
    • The Row Reference Array (RRA) Does the Binary Search?
    • A Block Splits into Two Blocks at Maximum Block Size
    • Data Blocks Maximum Block Size has Changed (V14.10)
    • The New Block Split with Teradata V14.10
    • The Block Split with Even More Detail in Teradata V14.10
    • Teradata V14.10 Block Split Defaults
    • There is One Master Index and Thousands of Cylinder Indexes
    • Each Table has a 48-bit TableID
  • How Joins Work Under the Covers
    • The Joining of Two Tables
    • Teradata Moves Joining Rows to the Same AMP
    • Imagine Joining Two NoPI Tables that have No Primary Index
    • Both Tables are Redistributed to Join Rows on the Same AMP
    • How do you join if One Table is Big and One Table is Small?
    • Duplicate the Small Table on Every AMP (like a mirror)
    • What Could You Do If Two Tables Joined 1000 Times a Day?
    • Joining Two Tables with the same PK/FK Primary Index
    • A Join With No Redistribution or Duplication
    • A Performance Tuning Technique for Large Joins
    • The Joining of Two Tables with an Additional WHERE Clause
    • Using a Simple Volatile Table
    • A Volatile Table with a Primary Index
    • Using a Simple Global Temporary Table
    • Two Brilliant Techniques for Global Temporary Tables
    • The Joining of Two Tables Using a Global Temporary Table
    • Teradata V14.10 Join Feature PRPD
  • Secondary Indexes – An Alternate Path to the Data
    • Creating a Unique Secondary Index (USI)
    • What is in a Unique Secondary Index (USI) Subtable?
    • A Unique Secondary Index (USI) Subtable is Hashed
    • How the Parsing Engine uses the USI Subtable
    • A USI is a Two-AMP Operation
    • Creating a Non-Unique Secondary Index (NUSI)
    • What is in a Unique Secondary Index (USI) Subtable?
    • Non-Unique Secondary Index (NUSI) Subtable is AMP Local
    • How the Parsing Engine uses the NUSI Subtable
    • Creating a Value-Ordered NUSI
    • The Hash Map Determines which AMP will own the Row
    • A Unique Primary Index Spreads the Data Evenly
    • A Picture with a Base Table, USI, and NUSI Subtable
    • A Query Using an USI Only Moves Two Blocks
    • A Query Using A NUSI Always Uses All AMPs
    • Two Non-Unique Secondary Indexes (NUSI) on a Table
    • A NUSI BITMAP Query
    • A Brilliant Technique for a Unique Secondary Index
    • The USI for Partitioned Tables Points to the Row Key
    • A Brilliant Technique for a Non-Unique Secondary Index
    • The NUSI for Partitioned Tables Points to the Row Key
    • Multiple Choice DBA
    • What are the Big Four Tactical Queries?
  • Physical Design
    • The Four Stages of Modeling for Teradata
    • The Logical Model
    • The Logical Model can be loaded inside Nexus
    • First, Second and Third Normal Form
    • The Employee_Table and Department_Table can be Joined
    • The Employee_Table and Department_Table Join SQL
    • The Extended Logical Model Template
    • User Access is of Great Importance
    • User Access in Layman’s Terms
    • User Access for Joins in Layman’s Terms
    • The Nexus Shows Users the Table’s Primary Index
    • Data Demographics Tell Us if the Column is Worthy
    • Data Demographics – Distinct Rows
    • Data Demographics – Distinct Rows Query
    • Data Demographics – Max Rows Null
    • Data Demographics – Max Rows Null Query
    • Data Demographics – Max Rows Per Value
    • Data Demographics – Max Rows Per Value
    • Data Demographics – Typical Rows Per Value
    • Typical Rows Per Value – Query 1 (Median)
    • Typical Rows Per Value – Query 2 (Median)
    • Row_Number With Qualify to get the Typical Rows Per Value
    • SQL to Get the Average Rows Per Value for a Column (Mean)
    • Getting the Average Values Per Column
    • Data Demographics – Change Rating
    • Factors When Choosing Teradata Indexes
    • Finding Table Sizes
    • Finding Skew in The Tables in a Database
    • Finding Skew in a Table
    • Display the Distribution of a Column Per AMP
    • Primary Index Data Demographics Candidate Guidelines
    • Primary Index Access Considerations
    • The First Step is to Pick All Potential Primary Index Columns
    • The Second Step is to Pick All Potential Secondary Indexes
    • Answer to 2nd Step to Picking Potential Secondary Indexes
    • Now it is time to choose the Primary and Secondary Indexes
    • 3rd Step is to Picking your Indexes
    • Our Index Picks
  • Denormalization
    • Denormalization
    • Derived Data
    • Repeating Groups
    • Pre-Joining Tables
    • Storing Summary Data with a Trigger
    • Summary Tables or Data Marts the Old Way
    • Aggregate Join Index the New Way
    • New Aggregate Join Index (Teradata V14.10)
    • Horizontal Partitioning the Old Way
    • Horizontal Partitioning the New Way
    • Vertical Partitioning the Old Way
    • A Vertical Partitioning Trick that is Old School
    • Vertical Partitioning the New Way
    • Temporary Tables - A Volatile Table with a Primary Index
    • The Joining of Two Tables Using a Volatile Table
    • Temporary Tables - Global Temporary Tables
    • The Joining of Two Tables Using a Global Temporary Table
    • Creating a Multi-Table Join Index
    • Visual of a Join Index
    • Creating a Single-Table Join Index
    • Conceptual of a Single Table Join Index on an AMP
    • Single Table Join Index Great For LIKE Clause
    • Single Table Join Index with Value Ordered NUSI
    • Sparse Join Index
    • A Global Multi-Table Join Index
    • Creating a Hash Index
  • Things to Consider
    • Teradata Mode vs. ANSI Mode
    • ANSI Mode Transactions
    • Teradata Mode Transactions
    • Multi-Statement Transaction in BTEQ
    • Teradata Mode Default is a Set Table
    • Creating a Set Table with a Unique Primary Index
    • Creating a Set Table with a Unique Secondary Index
    • ANSI mode defaults to a Multiset Table
    • Using a Volatile Table to Get Rid of Duplicate Rows
    • Creating a Volatile Table
    • You Populate a Volatile Table with an INSERT/SELECT
    • The Three Steps to Use a Volatile Table
    • Why Would You Use the ON COMMIT DELETE ROWS?
    • The HELP Volatile Table Command Shows your Volatiles
    • A Volatile Table with a Primary Index
    • The Joining of Two Tables Using a Volatile Table
    • Creating a Secondary Index on a Volatile Table
    • You Can Collect Statistics on Volatile Tables
    • The New Teradata V14 Way to Collect Statistics
    • Creating Partitioned Primary Index (PPI) Volatile Tables
    • CREATING A Global Temporary Table
    • Using a Simple Global Temporary Table
    • Two Brilliant Techniques for Global Temporary Tables
    • The Joining of Two Tables Using a Global Temporary Table
    • Making an exact copy a Table
    • Making a NOT-So-Exact Copy a Table
    • Copying a Table
    • Troubleshooting Copying and Changing the Primary Index
    • Copying only specific columns of a table
    • Copying a Table and Keeping the Statistics
    • Copying a Table with Statistics
    • Copying a table Structure with NO Data but Statistics
    • What to COLLECT STATISTICS On?
    • A Huge Hint that No Statistics Have Been Collected
    • The Basic Syntax for COLLECT STATISTICS
    • The New Teradata V14 Way to Collect Statistics
    • COLLECT STATISTICS Directly From another Table
    • How to Recollect STATISTICS on a Table
  • Deep Dive Overhead for each Row
    • Why Go Deep inside the Overhead of a Row?
    • A Row Layout in Teradata
    • Row Length
    • Row ID
    • How The Row Hash is created for Each Row
    • Unique Primary Indexes have Even Distribution
    • The AMP adds a Uniqueness Value to Its Rows
    • The Row-Hash is 32-bits and so is the Uniqueness Value
    • Non-Unique Primary Indexes have Skewed Data
    • Flag Byte
    • Presence Byte
    • Presence Byte is used to show Null Values in each Row
    • A Close-up look at the Presence Byte for Nulls
    • What Happens when we need more than One Presence Byte?
  • Compression
    • Important Information about Compression
    • Presence Bytes are also used for Compression
    • Why One Byte (8 bits) can represent up to 255 Values
    • Now that you Understand that 8 Bits can Represent 0 – 255
    • The Cost Vs. the Savings
    • The Cost List of Compression
    • A Deeper Dive Into NULL Values
    • Using the DBC Tables in a Compression Experiment
    • We then moved all Eight Tables to another Database
    • Compression Reports with Nexus and SmartCompress
    • We Then Created Two Global Temporary Tables
    • We Then Created and Executed our Macro
    • Report Comparing Compressed and NonCompressed Tables
  • Top SQL Commands Cheat Sheet
    • SELECT All Columns from a Table and Sort
    • Select Specific Columns and Limiting the Rows
    • Changing your Default Database
    • Keywords that describe you
    • Select TOP Rows in a Rank Order
    • A Sample number of rows
    • Getting a Sample Percentage of rows
    • Find Information about a Database
    • Find information about a Table
    • Using Aggregates
    • Performing a Join
    • Performing a Join using ANSI Syntax
    • Using Date, Time and Timestamp
    • Using Date Functions
    • Using the System Calendar
    • Using the System Calendar in a Query
    • Formatting Data
    • Using Rank
    • Using a Derived Table
    • Using a Subquery
    • Correlated Subquery
    • Using Substring
    • Basic CASE Statement
    • Advanced CASE Statement
    • Using an Access Lock in your SQL
    • Collect Statistics
    • CREATING a Volatile Table with a Primary Index
    • CREATING a Volatile Table that is Partitioned (PPI)
    • CREATING a Volatile Table that is deleted after the Query
    • Finding the Typical Rows per Value for specific column
    • Finding out how much Space you have
    • How much Space you have Per AMP
    • Finding your Space
    • Finding Space Skew in Tables in a Database
    • Finding the Number of rows per AMP for a Column
    • Finding Account Information
    • Ordered Analytics