TP2829
Teradata Physical Database Design Training
In this course, students will delve deeper into the Teradata architecture and cover more advanced topics such as modeling and compression.
Course Details
Duration
1 day
Prerequisites
No prior knowledge is presumed.
Target Audience
Advanced audience who have a desire to learn Teradata Architecture at the most detailed level.
Skills Gained
- Deeper knowledge and understanding of the architecture of their Teradata system at a very advanced level.
Course Outline
- Physical Database Design in Layman's Terms
- What is Parallel Processing?
- The Basics of a Single Computer
- Teradata Parallel Processes Data
- Parallel Architecture
- The Teradata Architecture
- All Teradata Tables are spread across ALL AMPS
- Understand that Teradata can scale to incredible size
- Teradata Parallel Processing
- How Teradata Creates Tables
- Every AMP has the Exact Same Tables
- All Teradata Tables are spread across All AMPs
- The Table Header and the Data Rows are Stored Separately
- An AMP Stores the Rows of a Table inside a Data Block
- To Read a Data Block an AMP Moves the Block into Memory
- A Full Table Scan Means All AMPs must Read All Rows
- The “Achilles Heel” or Slowest Process is Block Transfer
- Each Table has a Primary Index
- A Query Using the Primary Index is a Single AMP Retrieve.
- Using EXPLAIN
- As Rows are added a Data Block will Eventually Split
- A Full Table Scan Means All AMPs must Read All Blocks
- An EXPLAIN that shows a Full Table Scan
- A Primary Index Query uses a Single AMP and Single Block
- Using EXPLAIN in a NUPI Query
- Each AMP Can Have Many Blocks for a Single Table
- A Full Table Scan Means All AMPs must Read All Blocks
- Intelligent Memory (Teradata V14.10)
- Teradata V14.10 Intelligent Memory Gives Data a Temperature
- Data deemed VeryHot stays in each AMP's Intelligent Memory
- Intelligent Memory Stays in Memory
- What is the Goal of a Teradata Physical Database Design?
- Three Options for Teradata Table Design
- There are Three Options to Teradata Table Design
- Teradata V14.10 Intelligent Memory
- How Teradata Creates Traditional Tables
- Each Table has a Primary Index
- A Query Using the Primary Index is a Single AMP Retrieve
- A Primary Index Query uses a Single AMP and Single Block
- How Teradata Creates a PPI Table
- PPI Table Sorting the Rows by Month of Order_Date
- An All AMPs Retrieve By Way of a Single Partition
- Creating a PPI Table with CASE_N
- A Visual of Case_N Partitioning
- An All AMPs Retrieve By Way of a Single Partition
- What does a Columnar Table look like?
- A Comparison of Data for Normal Vs. Columnar
- A Columnar Table is Best for Queries with Few Columns
- Factors When Choosing Table Design
- Teradata Limits for V12, V13 and V14
- Teradata - The Cold Hard Facts
- Teradata Parallel Processing
- Each Table has a Primary Index that is Unique or Non-Unique
- The Hash Map Determines which AMP will own the Row
- A Unique Primary Index Spreads the Data Evenly
- The AMP Adds a Uniqueness Value to Create the Row-ID
- Each AMP Sorts Their Tables by the Row-ID
- A Non-Unique Primary Index Skews the Data
- Comparing the Same Table with Different Primary Indexes
- Unique Primary Index Queries are a Single AMP Retrieve
- A Non-Unique Primary Index is also a Single AMP Retrieve
- Teradata has a No Primary Index Table called a NoPI Table
- There are Normal Tables and then There are Partitioned Tables
- A Visual of One Year of Data with Range_N per Month
- Partitioning is Designed to Eliminate the Full Table Scan
- A Partition # and Row-ID = Row Key
- An AMP Stores its Rows Sorted in only Two Different Ways
- AMPs Moves Their Data Blocks into Memory to Read/Write
- Most Taxing thing for an AMP is Moving Blocks into Memory
- Rows are Stored in Data Blocks which are stored in Cylinders
- Rows for an AMP Stored Inside a Data Block in a Cylinder
- An AMP’s Master Index is Used to Find the Right Cylinder
- The Row Reference Array (RRA) Does the Binary Search?
- A Block Splits into Two Blocks at Maximum Block Size
- Data Blocks Maximum Block Size has Changed (V14.10)
- The New Block Split with Teradata V14.10
- The Block Split with Even More Detail in Teradata V14.10
- Teradata V14.10 Block Split Defaults
- There is One Master Index and Thousands of Cylinder Indexes
- Each Table has a 48-bit TableID
- How Joins Work Under the Covers
- The Joining of Two Tables
- Teradata Moves Joining Rows to the Same AMP
- Imagine Joining Two NoPI Tables that have No Primary Index
- Both Tables are Redistributed to Join Rows on the Same AMP
- How do you join if One Table is Big and One Table is Small?
- Duplicate the Small Table on Every AMP (like a mirror)
- What Could You Do If Two Tables Joined 1000 Times a Day?
- Joining Two Tables with the same PK/FK Primary Index
- A Join With No Redistribution or Duplication
- A Performance Tuning Technique for Large Joins
- The Joining of Two Tables with an Additional WHERE Clause
- Using a Simple Volatile Table
- A Volatile Table with a Primary Index
- Using a Simple Global Temporary Table
- Two Brilliant Techniques for Global Temporary Tables
- The Joining of Two Tables Using a Global Temporary Table
- Teradata V14.10 Join Feature PRPD
- Secondary Indexes – An Alternate Path to the Data
- Creating a Unique Secondary Index (USI)
- What is in a Unique Secondary Index (USI) Subtable?
- A Unique Secondary Index (USI) Subtable is Hashed
- How the Parsing Engine uses the USI Subtable
- A USI is a Two-AMP Operation
- Creating a Non-Unique Secondary Index (NUSI)
- What is in a Unique Secondary Index (USI) Subtable?
- Non-Unique Secondary Index (NUSI) Subtable is AMP Local
- How the Parsing Engine uses the NUSI Subtable
- Creating a Value-Ordered NUSI
- The Hash Map Determines which AMP will own the Row
- A Unique Primary Index Spreads the Data Evenly
- A Picture with a Base Table, USI, and NUSI Subtable
- A Query Using an USI Only Moves Two Blocks
- A Query Using A NUSI Always Uses All AMPs
- Two Non-Unique Secondary Indexes (NUSI) on a Table
- A NUSI BITMAP Query
- A Brilliant Technique for a Unique Secondary Index
- The USI for Partitioned Tables Points to the Row Key
- A Brilliant Technique for a Non-Unique Secondary Index
- The NUSI for Partitioned Tables Points to the Row Key
- Multiple Choice DBA
- What are the Big Four Tactical Queries?
- Physical Design
- The Four Stages of Modeling for Teradata
- The Logical Model
- The Logical Model can be loaded inside Nexus
- First, Second and Third Normal Form
- The Employee_Table and Department_Table can be Joined
- The Employee_Table and Department_Table Join SQL
- The Extended Logical Model Template
- User Access is of Great Importance
- User Access in Layman’s Terms
- User Access for Joins in Layman’s Terms
- The Nexus Shows Users the Table’s Primary Index
- Data Demographics Tell Us if the Column is Worthy
- Data Demographics – Distinct Rows
- Data Demographics – Distinct Rows Query
- Data Demographics – Max Rows Null
- Data Demographics – Max Rows Null Query
- Data Demographics – Max Rows Per Value
- Data Demographics – Max Rows Per Value
- Data Demographics – Typical Rows Per Value
- Typical Rows Per Value – Query 1 (Median)
- Typical Rows Per Value – Query 2 (Median)
- Row_Number With Qualify to get the Typical Rows Per Value
- SQL to Get the Average Rows Per Value for a Column (Mean)
- Getting the Average Values Per Column
- Data Demographics – Change Rating
- Factors When Choosing Teradata Indexes
- Finding Table Sizes
- Finding Skew in The Tables in a Database
- Finding Skew in a Table
- Display the Distribution of a Column Per AMP
- Primary Index Data Demographics Candidate Guidelines
- Primary Index Access Considerations
- The First Step is to Pick All Potential Primary Index Columns
- The Second Step is to Pick All Potential Secondary Indexes
- Answer to 2nd Step to Picking Potential Secondary Indexes
- Now it is time to choose the Primary and Secondary Indexes
- 3rd Step is to Picking your Indexes
- Our Index Picks
- Denormalization
- Denormalization
- Derived Data
- Repeating Groups
- Pre-Joining Tables
- Storing Summary Data with a Trigger
- Summary Tables or Data Marts the Old Way
- Aggregate Join Index the New Way
- New Aggregate Join Index (Teradata V14.10)
- Horizontal Partitioning the Old Way
- Horizontal Partitioning the New Way
- Vertical Partitioning the Old Way
- A Vertical Partitioning Trick that is Old School
- Vertical Partitioning the New Way
- Temporary Tables - A Volatile Table with a Primary Index
- The Joining of Two Tables Using a Volatile Table
- Temporary Tables - Global Temporary Tables
- The Joining of Two Tables Using a Global Temporary Table
- Creating a Multi-Table Join Index
- Visual of a Join Index
- Creating a Single-Table Join Index
- Conceptual of a Single Table Join Index on an AMP
- Single Table Join Index Great For LIKE Clause
- Single Table Join Index with Value Ordered NUSI
- Sparse Join Index
- A Global Multi-Table Join Index
- Creating a Hash Index
- Things to Consider
- Teradata Mode vs. ANSI Mode
- ANSI Mode Transactions
- Teradata Mode Transactions
- Multi-Statement Transaction in BTEQ
- Teradata Mode Default is a Set Table
- Creating a Set Table with a Unique Primary Index
- Creating a Set Table with a Unique Secondary Index
- ANSI mode defaults to a Multiset Table
- Using a Volatile Table to Get Rid of Duplicate Rows
- Creating a Volatile Table
- You Populate a Volatile Table with an INSERT/SELECT
- The Three Steps to Use a Volatile Table
- Why Would You Use the ON COMMIT DELETE ROWS?
- The HELP Volatile Table Command Shows your Volatiles
- A Volatile Table with a Primary Index
- The Joining of Two Tables Using a Volatile Table
- Creating a Secondary Index on a Volatile Table
- You Can Collect Statistics on Volatile Tables
- The New Teradata V14 Way to Collect Statistics
- Creating Partitioned Primary Index (PPI) Volatile Tables
- CREATING A Global Temporary Table
- Using a Simple Global Temporary Table
- Two Brilliant Techniques for Global Temporary Tables
- The Joining of Two Tables Using a Global Temporary Table
- Making an exact copy a Table
- Making a NOT-So-Exact Copy a Table
- Copying a Table
- Troubleshooting Copying and Changing the Primary Index
- Copying only specific columns of a table
- Copying a Table and Keeping the Statistics
- Copying a Table with Statistics
- Copying a table Structure with NO Data but Statistics
- What to COLLECT STATISTICS On?
- A Huge Hint that No Statistics Have Been Collected
- The Basic Syntax for COLLECT STATISTICS
- The New Teradata V14 Way to Collect Statistics
- COLLECT STATISTICS Directly From another Table
- How to Recollect STATISTICS on a Table
- Deep Dive Overhead for each Row
- Why Go Deep inside the Overhead of a Row?
- A Row Layout in Teradata
- Row Length
- Row ID
- How The Row Hash is created for Each Row
- Unique Primary Indexes have Even Distribution
- The AMP adds a Uniqueness Value to Its Rows
- The Row-Hash is 32-bits and so is the Uniqueness Value
- Non-Unique Primary Indexes have Skewed Data
- Flag Byte
- Presence Byte
- Presence Byte is used to show Null Values in each Row
- A Close-up look at the Presence Byte for Nulls
- What Happens when we need more than One Presence Byte?
- Compression
- Important Information about Compression
- Presence Bytes are also used for Compression
- Why One Byte (8 bits) can represent up to 255 Values
- Now that you Understand that 8 Bits can Represent 0 – 255
- The Cost Vs. the Savings
- The Cost List of Compression
- A Deeper Dive Into NULL Values
- Using the DBC Tables in a Compression Experiment
- We then moved all Eight Tables to another Database
- Compression Reports with Nexus and SmartCompress
- We Then Created Two Global Temporary Tables
- We Then Created and Executed our Macro
- Report Comparing Compressed and NonCompressed Tables
- Top SQL Commands Cheat Sheet
- SELECT All Columns from a Table and Sort
- Select Specific Columns and Limiting the Rows
- Changing your Default Database
- Keywords that describe you
- Select TOP Rows in a Rank Order
- A Sample number of rows
- Getting a Sample Percentage of rows
- Find Information about a Database
- Find information about a Table
- Using Aggregates
- Performing a Join
- Performing a Join using ANSI Syntax
- Using Date, Time and Timestamp
- Using Date Functions
- Using the System Calendar
- Using the System Calendar in a Query
- Formatting Data
- Using Rank
- Using a Derived Table
- Using a Subquery
- Correlated Subquery
- Using Substring
- Basic CASE Statement
- Advanced CASE Statement
- Using an Access Lock in your SQL
- Collect Statistics
- CREATING a Volatile Table with a Primary Index
- CREATING a Volatile Table that is Partitioned (PPI)
- CREATING a Volatile Table that is deleted after the Query
- Finding the Typical Rows per Value for specific column
- Finding out how much Space you have
- How much Space you have Per AMP
- Finding your Space
- Finding Space Skew in Tables in a Database
- Finding the Number of rows per AMP for a Column
- Finding Account Information
- Ordered Analytics