TP2828
Teradata Basics Training
In this course, students will learn the basics of Teradata architecture with a focus on what’s important to know from an IT and Developer perspective.
Course Details
Duration
1 day
Prerequisites
No prior knowledge is presumed.
Target Audience
- IT Professionals
- Developers
Skills Gained
- Ability to make strategic decisions regarding their Teradata environment.
Course Outline
- The Teradata Architecture
- What is Parallel Processing?
- The Basics of a Single Computer
- Teradata Parallel Processes Data
- Parallel Architecture
- The Teradata Architecture
- All Teradata Tables are spread across ALL AMPS
- Teradata Systems can Add AMPs for Linear Scalability
- Understand that Teradata can scale to incredible size
- AMPs and Parsing Engines (PEs) live inside SMP Nodes
- Each Node is attached via a Network to a Disk Farm
- Two SMP Nodes Connected Become One MPP System
- There are Many Nodes in a Teradata Cabinet
- Inside a Teradata Node
- The Boardless BYNET and the Physical BYNET
- The Parsing Engine
- The AMPs Responsibilities
- This is the Visual You Want to Know in order to Understand Teradata
- The Primary Index
- The Primary Index is defined when the table is CREATED
- A Unique Primary Index (UPI)
- Primary Index in the WHERE Clause - Single-AMP Retrieve
- Using EXPLAIN
- A Non-Unique Primary Index (NUPI)
- Primary Index in the WHERE Clause - Single-AMP Retrieve
- Using EXPLAIN in a NUPI Query
- Primary Index in the WHERE Clause - Single-AMP Retrieve
- A Full Table Scan is likely on a table with NO Primary Index
- An EXPLAIN that shows a Full Table Scan
- What happens when you forget the Primary Index?
- Why create a table with No Primary Index (NoPI)?
- Hashing of the Primary Index
- The Hashing Formula Facts
- The Hash Map determines which AMP will own the Row
- Placing rows on the AMP
- Placing rows on the AMP Continued
- A Review of the Hashing Process
- Non-Unique Primary Indexes have Skewed Data
- The Uniqueness Value
- The Row Hash and Uniqueness Value make up the Row-ID
- Two Reasons why each AMP Sorts their rows by the Row-ID
- AMPs sort their rows by Row-ID to Group like Data
- AMPs sort their rows by Row-ID to do a Binary Search
- Null Values all Hash to the Same AMP
- Teradata - The Cold Hard Facts
- All Teradata Tables are spread across All AMPs
- The Table Header and the Data Rows are Stored Separately
- An AMP Stores the Rows of a Table inside a Data Block
- To Read a Data Block, an AMP Moves the Block into Memory
- Nothing is done on disk and everything is done in Memory
- Most Taxing thing for an AMP is Moving Blocks into Memory
- A Full Table Scan Means All AMPs must Read All Rows
- The “Achilles Heel and slowest process is Block Transfer
- Each Table has a Primary Index
- A Query Using the Primary Index is a Single AMP Retrieve.
- As Rows are added a Data Block will Eventually Split
- A Full Table Scan Means All AMPs must Read All Blocks
- A Primary Index Query uses a Single AMP and Single Block
- Each AMP Can Have Many Blocks for a Single Table
- A Full Table Scan Means All AMPs must Read All Blocks
- Synchronized Scan (Sync Scan)
- EXPLAIN Using a Synchronized Scan
- Intelligent Memory (Teradata V14.10)
- Teradata V14.10 Intelligent Memory Gives Data a Temperature
- Data deemed VeryHot stays in each AMP's Intelligent Memory
- Intelligent Memory Stays in Memory
- What is the Goal of a Teradata Physical Database Design?
- Inside the AMPs Disk
- Rows are Stored in Data Blocks which are stored in Cylinders
- An AMP's rows are stored inside a Data Block in a Cylinder
- An AMP’s Master Index is used to find the Right Cylinder
- The Row Reference Array (RRA) Does the Binary Search?
- A Block Splits into Two Blocks at Maximum Block Size
- Data Blocks Maximum Block Size has Changed (V14.10)
- The New Block Split with Teradata V14.10
- The Block Split with Even More Detail in Teradata V14.10
- Teradata V14.10 Block Split Defaults
- There is One Master Index and Thousands of Cylinder Indexes
- Blocks Continue to Split as Tables Grow Larger
- FYI – Some Advanced Information about Data Block Headers
- A top down view of Cylinders
- There are Hot, Warm, and Cold Cylinders
- Cylinders are used for Perm, Spool, Temp, and Journals
- Each AMP has Their Own Master Index
- Each Cylinder on an AMP has a Cylinder Index
- A More Detailed Illustration of the Master Index
- A Real-World View of the Master Index
- An Even More Realistic View of an AMP’s Master Index
- The Cylinder Index
- An Even More Realistic View of a Cylinder Index
- How a Query using the Primary Index works
- How the AMPs Do a Full Table Scan
- How an AMP Reads Using a Primary Index
- Partition Primary Index (PPI) Tables
- The Concept behind Partitioning a Table
- Creating a PPI Table with Simple Partitioning
- A Visual Display of Simple Partitioning
- Creating a PPI Table with RANGE_N Partitioning per Month
- A Visual of One Year of Data with Range_N per Month
- A Partition # and Row-ID = Row Key
- An AMP Stores its Rows Sorted in only Two Different Ways
- Creating a PPI Table with RANGE_N Partitioning per Day
- A Visual of Range_N Partitioning Per Day
- Creating a PPI Table with RANGE_N Partitioning per Week
- A Visual of Range_N Partitioning Per Week
- A Clever Range_N Option
- Creating a PPI Table with CASE_N
- A Visual of Case_N Partitioning
- How many partitions do you see?
- Number of PPI Partitions Allowed
- How many partitions do you see?
- NO CASE and UNKNOWN Partitions Together
- A Visual of Case_N Partitioning
- Combining Older Data and Newer Data in PPI
- A Visual for Combining Older Data and Newer Data in PPI
- The SQL on Combining Older Data and Newer Data in PPI
- Multi-Level Partitioning Combining Range_N and Case_N
- A Visual of Multi-Level Partitioning
- The SQL on a Multi-Level Partitioned Primary Index
- NON-Unique Primary Indexes (NUPI) in PPI
- PPI Table with a Unique Primary Index (UPI)
- Tricks for Non-Unique Primary Indexes (NUPI)
- Character Based PPI for RANGE_N
- A Visual for Character-Based PPI for RANGE_N
- The SQL on Character-Based PPI for RANGE_N
- Character-Based PPI for CASE_N
- Dates and Character-Based Multi-Level PPI
- TIMESTAMP Partitioning
- Using CURRENT_DATE to define a PPI
- ALTER to CURRENT_DATE the next year
- ALTER to CURRENT_DATE with Save
- Altering a PPI Table to Add or Drop Partitions
- Deleting a Partition
- Deleting a Partition and saving its contents
- Using the PARTITION Keyword in your SQL
- SQL for RANGE_N
- SQL for CASE_N
- Columnar Tables
- Columnar Tables have NO Primary Index
- This is NOT a NoPI Table
- NoPI Tables Spread rows across all-AMPs Evenly
- NoPI Tables used as Staging Tables for Data Loads
- NoPI Table Capabilities
- NoPI Table Restrictions
- What does a Columnar Table look like?
- Comparing Normal Table vs. Columnar Tables
- Columnar Table Fundamentals
- Columnar can move just One Container to Memory
- Containers on AMPs match up perfectly to rebuild a Row
- Indexes can be used on Columns (Containers)
- Indexes can be used on Columns (Containers)
- Visualize a Columnar Table
- Single-Column vs. Multi-Column Containers
- Comparing Normal Table vs. Columnar Tables
- Columnar Row Hybrid CREATE Statemen
- Review of Row-Based Partition Primary Index (PPI)
- Visual of Row Partitioning (PPI Tables) by Month
- CREATE Statement for both Row and Column Partition
- Visual of Row Partitioning (PPI Tables) and Columnar
- How to Load into a Columnar Table
- Columnar NO AUTO COMPRESS
- Auto Compress in Columnar Tables
- Auto Compress Techniques in Columnar Tables
- When and When NOT to use Columnar Tables
- Did you know?
- Space
- When your System Arrives, there is only User named DBC
- USER DBC
- First Assignment is to create another User just under DBC
- USER DBC
- Perm and Spool Space
- Perm Space is for Permanent Tables
- Spool Space is work space that builds a User’s Answer Sets
- Spool Space is in an AMP’s Memory and on its Disk
- Users are Assigned Spool Space Limits
- What is the Purpose of Spool Limits?
- Why did my query Abort and say “Out of Spool”?
- How can Skewed Data cause me to run “Out of Spool”?
- Why did my Join cause me to run “Out of Spool”?
- Finding out how much Space you have
- Space per AMP on all tables in a Database shows Skew
- What does my system look like when it first arrives?
- DBC owns all the PERM Space in the system on day one
- DBC’s First Assignment is Spool Space
- DBC’s 2nd Assignment is to CREATE Users and Databases
- The Teradata Hierarchy Begins
- The Teradata Hierarchy Continues
- Differences between PERM and SPOOL
- Databases, Users, and Views
- What are Similarities between a DATABASE and a USER?
- What is the Difference between a DATABASE and a USER?
- Objects that take up PERM Space
- The User Environment
- DBC is the only user when the system first arrives
- DBC will Create Databases and Give them Space
- DBC will create some initial Users
- A Typical Teradata Environment
- What are Similarities between a DATABASE and a USER?
- Roles
- Create a Role and then Assign that Role Its Access Rights
- Create a User and Assign them a Default Role
- Granting Access Rights
- There are Three Types of Access Rights
- Description of the Three Types of Access Rights
- Profiles
- Creating a Profile and a User
- ProfileInfoVX, RoleMembers, RoleInfo and UserRoleRights
- Accounts and their Associated Priorities
- Creating a User with Multiple Account Priorities
- Account String Expansion (ASE)
- The DBC.AMPUsage View
- Teradata TASM provides a User Traffic System
- Teradata Viewpoint
- Secondary Indexes
- Creating a Unique Secondary Index (USI)
- What is in a Unique Secondary Index (USI) Subtable?
- A Unique Secondary Index (USI) Subtable is hashed
- How the Parsing Engine uses the USI Subtable
- A USI is a Two-AMP Operation
- Creating a Non-Unique Secondary Index (NUSI)
- What is in a Unique Secondary Index (USI) Subtable?
- Non-Unique Secondary Index (NUSI) Subtable is AMP Local
- How the Parsing Engine uses the NUSI Subtable
- Creating a Value-Ordered NUSI
- The Hash Map Determines which AMP will own the Row
- A Unique Primary Index Spreads the Data Evenly
- A Picture with a Base Table, USI, and NUSI Subtable
- A Query Using an USI Only Moves Two Blocks
- A Query Using A NUSI Always Uses All AMPs
- Two Non-Unique Secondary Indexes (NUSI) on a Table
- A NUSI BITMAP Query (1 of 3)
- A NUSI BITMAP Theory (2 of 3)
- A NUSI Bitmap in Action (3 of 3)
- A Brilliant Technique for a Unique Secondary Index
- The USI for Partitioned Tables Points to the Row Key
- A Brilliant Technique for a Non-Unique Secondary Index
- The NUSI for Partitioned Tables Points to the Row Key
- How the PE Decides on the NUSI or the Full Table Scan
- Multiple Choice DBA
- What are the Big Four Tactical Queries?
- Temporal Tables Create Functions
- Three types of Temporal Tables
- CREATING a Bi-Temporal Table
- PERIOD Data Types
- Bi-Temporal Data Type Standards
- A Look at the Temporal Results
- Normal SQL for Bi-Temporal Tables
- NONSEQUENCED SQL for Temporal Tables
- AS OF SQL for Temporal Tables
- NONSEQUENCED for Both
- Creating Views for Temporal Tables
- How Joins Work Internally
- The Joining of Two Tables
- Teradata Moves Joining Rows to the Same AMP
- Imagine Joining Two NoPI Tables that have No Primary Index
- Both Tables are redistributed to Join Rows on the Same AMP
- How do you join if One Table is Big and One Table is Small?
- Duplicate the Small Table on Every AMP (like a mirror)
- What Could You Do If Two Tables Joined 1000 Times a Day?
- Joining Two Tables with the same PK/FK Primary Index
- A Join with No Redistribution or Duplication
- A Performance Tuning Technique for Large Joins
- The Joining of Two Tables with an Additional WHERE Clause
- Using a Simple Volatile Table
- A Volatile Table with a Primary Index
- Using a Simple Global Temporary Table
- Two Brilliant Techniques for Global Temporary Tables
- The Joining of Two Tables Using a Global Temporary Table
- Teradata V14.10 Join Feature PRPD
- Join Indexes
- Creating a Multi-Table Join Index
- Visual of a Join Index
- Outer Join Multi-Table Join Index
- Visual of a Left Outer Join Index
- Compressed Multi-Table Join Index
- A Visual of a Compressed Multi-Table Join Index
- Creating a Single-Table Join Index
- Conceptual of a Single Table Join Index on an AMP
- Single Table Join Index Great For LIKE Clause
- Single Table Join Index with Value Ordered NUSI
- Aggregate Join Indexes
- Compressed Single-Table Join Index
- Aggregate Join Index
- New Aggregate Join Index (Teradata V14.10)
- Sparse Join Index
- A Global Multi-Table Join Index
- Creating a Hash Index
- Join Index Details
- Collect Statistics
- The Teradata Parsing Engine (Optimizer) is Cost Based
- The Purpose of Collect Statistics
- When Teradata Collects Statistics it creates a Histogram
- The Interval of the Collect Statistics Histogram
- What to COLLECT STATISTICS On?
- Why Collect Statistics?
- How do you know if Statistics were collected on a Table?
- A Huge Hint that No Statistics Have Been Collected
- The Basic Syntax for COLLECT STATISTICS
- The New Teradata V14 Way to Collect Statistics
- Where Does Teradata Keep the Collected Statistics?
- The Official Syntax for COLLECT STATISTICS
- How to Recollect STATISTICS on a Table
- Teradata Always Does a Random AMP Sample
- Random Sample is kept in the Table Header in FSG Cache
- Multiple Random AMP Samplings
- How a Random AMP gets a Table Row count
- Random AMP Estimates for NUSI Secondary Indexes
- USI Random AMP Samples are Not Considered
- There’s No Random AMP Estimate for Non-Indexed Columns
- The PE's Plan if No Statistics Were Collected?
- Stale Statistics Detection and Extrapolation
- Extrapolation for Future Dates
- How to Copy a Table with Data and the Statistics?
- How to Copy a Table with NO Data and the Statistics?
- COLLECT STATISTICS Directly From another Table
- When to COLLECT STATISTICS Using only a SAMPLE
- How to Collect Statistics on a PPI Table on the Partition
- Teradata V12 and V13 Statistics Enhancements
- Teradata V14 Statistics Enhancements
- Teradata V14 Summary Statistics
- Teradata V14 MaxValueLength
- Teradata V14 MaxIntervals
- Teradata V14 Sample N Percent
- Teradata V14.10 Statistics Collection Improvements
- Teradata V14.10 Statistics Collection Improvements
- Teradata V14.10 AutoStats feature
- Teradata Statistics Wizard
- Temporary Tables
- There are three types of Temporary Tables
- CREATING A Derived Table
- Naming the Derived Table
- Aliasing the Column Names in the Derived Table
- Most Derived Tables Are Used To Join To Other Tables
- Multiple Ways to Alias the Columns in a Derived Table
- Column Aliasing Can Default for Normal Columns
- CREATING A Derived Table using the WITH Command
- The Same Derived Query shown Three Different Ways
- Clever Tricks on Aliasing Columns in a Derived Table
- A Derived Table lives only for the lifetime of a single query
- WITH RECURSIVE Derived Table
- Defining the WITH Recursive Derived Table
- Looping Through the Recursive Derived Table
- Looping Through a Second Time
- Looping Through a Third Time
- Looping Through and Adding Nothing Ends the Loop
- Looping Through the WITH Recursive Derived Table
- Creating a Volatile Table
- You Populate a Volatile Table with an INSERT/SELECT
- The Three Steps to Use a Volatile Table
- Why Would You Use the ON COMMIT DELETE ROWS?
- The HELP Volatile Table Command Shows your Volatiles
- A Volatile Table with a Primary Index
- The Joining of Two Tables Using a Volatile Table
- You Can Collect Statistics on Volatile Tables
- The New Teradata V14 Way to Collect Statistics
- Creating Partitioned Primary Index (PPI) Volatile Tables
- Using a Volatile Table to Get Rid of Duplicate Rows
- Using a Simple Global Temporary Table
- Two Brilliant Techniques for Global Temporary Tables
- The Joining of Two Tables Using a Global Temporary Table
- CREATING A Global Temporary Table
- Teradata Load Utilities Introduction
- The Teradata Utilities
- Block Level Utilities
- Row Level Utilities
- Fast Path Inserts Using Insert/Select
- Fast Path Deletes
- Freespace Percent
- Referential Integrity and Load Utility Solutions
- Teradata has a No Primary Index Table called a NoPI Table
- This is NOT Necessarily a NoPI Table
- NoPI Tables Spread rows across all-AMPs Evenly
- NoPI Tables used as Staging Tables for Data Loads
- NoPI Table Capabilities
- Why Would a NoPI Table have a Row-ID?
- BTEQ – Batch Teradata Query Tool
- How to Logon to BTEQ in Interactive Mode
- Running Queries in BTEQ in Interactive Mode
- BTEQ Commands vs BTEQ SQL Statements
- WITH BY Command for Subtotals
- WITH Command for a Grand Total
- WITH and WITH BY Together for Subtotals and Grand Totals
- How to Logon to BTEQ in a SCRIPT
- Running Queries in BTEQ through a Batch Script
- Running a BTEQ Batch Script through the Command Prompt
- Running a BTEQ Batch Script through the Run Command
- Using Nexus to Build Your BTEQ Scripts
- Using Nexus to Build Your BTEQ Scripts
- FastLoad
- Block Level Utility Limits
- FastLoad has Two Phases
- FastLoad Phase 1
- FastLoad Phase 2
- A Sample FastLoad Script Created by Nexus SmartScript
- Executing the FastLoad Script
- The Nexus SmartScript Easily Builds Your Utilities
- The Nexus SmartScript FastLoad Builder
- Create and Execute Your FastLoad Scripts with Nexus
- MultiLoad
- Block Level Utility Limits
- MultiLoad has Five Phases
- MultiLoad has IMPORT and DELETE Tasks
- A Sample MultiLoad Script Created by Nexus SmartScript
- TPump
- TPump is NOT a Block Level Utility and has No Limits
- Limitations of TPump
- A Sample TPump Script Created by Nexus SmartScript
- FastExport
- New Rules for Block Utilities
- FastExport by Default places Null Indicators in Output
- What is TPT?
- TPT Producers Create Streams and Consumers Write Them
- The Four Major Operators of TPT
- TPT can read from multiple source files in Parallel
- TPT can have more Operators than Consumers
- TPT Operators and their Equivalent Load Utility
- How to Run a TPT Script
- Top SQL Commands Cheat Sheet
- SELECT All Columns from a Table and Sort
- Select Specific Columns and Limiting the Rows
- Changing your Default Database
- Keywords that describe you
- Select TOP Rows in a Rank Order
- A Sample number of rows
- Getting a Sample Percentage of rows
- Find Information about a Database
- Find information about a Table
- Using Aggregates
- Performing a Join
- Performing a Join using ANSI Syntax
- Using Date, Time and Timestamp
- Using Date Functions
- Using the System Calendar
- Using the System Calendar in a Query
- Formatting Data
- Using Rank
- Using a Derived Table
- Using a Subquery
- Correlated Subquery
- Using Substring
- Basic CASE Statement
- Advanced CASE Statement
- Using an Access Lock in your SQL
- Collect Statistics
- CREATING a Volatile Table with a Primary Index
- CREATING a Volatile Table that is Partitioned (PPI)
- CREATING a Volatile Table that is deleted after the Query
- Finding the Typical Rows per Value for specific column
- Finding out how much Space you have
- How much Space you have Per AMP
- Finding your Space
- Finding Space Skew in Tables in a Database
- Finding the Number of rows per AMP for a Column
- Finding Account Information
- Ordered Analytics