Advanced Data Warehousing Training

Course #:WA1819

Advanced Data Warehousing Training

This 3-day course introduces experienced students to best industry practices for dealing with difficult data warehouse data structures, databases and processes.

This class is for experienced data warehouse architects and database designers. If you already have hands-on experience and want to refine your data warehousing skills, this is the class for you. The class will describe the most challenging data warehouse design problems the world of data warehousing has faced. You’ll also have the opportunity to share experiences gained on previous projects with other attendees. This class is intense and fast-paced.

What you will learn
  • How to decide the best architecture
  • Proven analytical modeling concepts
  • Use of surrogate keys
  • Deciding what is a dimension, dividing and combining dimensions, leveled dimensions, abstract dimensions, ragged dimensions
  • How to handle changing dimensions
  • How to handle snapshot data
  • How to handle complex dimensions
  • How and when to use aggregates
  • How to handle multiple units of measurement, such as currency reporting, multi-time-zone reporting, etc.
  • How to handle multi-valued dimensions
  • How to handle restatement
  • How to handle snowflake structures
  • Traversing unbalanced hierarchies
  • Experienced developers and administrators involved in data warehousing
  • Experienced business and technical data warehouse team members
  • Our “Data Modeling for the Data Warehouse”, or our "Designing the Data Warehouse", or equivalent experience
  • An understanding of data modeling, especially entity-relationship modeling
  3 days

Outline of Advanced Data Warehousing Training

1. Data Warehouse Architectures

  • Centralized
  • Functional
  • Federated
  • Imbedded data marts
  • Dependent data marts
  • Independent data marts

2. Analytical Modeling Primer

  • Facts and dimensions
  • Different schemas
  • The star schema
  • The snowflake schema
  • The normalized model
  • Pros and cons
  • Query types :
  • Inside out
  • Outside in
  • Joins types :
  • Standard joins
  • Fact-to-fact joins
  • Heterogeneous products
  • Subtyping
  • Heterogeneous products or customers

3. Dimensions

  • Definition of dimensions
  • Dimension levels
  • Dimensional hierarchies
  • Multiple dimensions
  • Behavioral dimensions
  • Ragged dimensions
  • Complex dimensions
  • Huge dimensions

4. Keys

  • Natural vs. surrogate keys
  • Pros and cons of surrogate keys
  • Effect of surrogate keys on ETL

5. Time and History

  • Definition of time series and history
  • Methods for handling time and history
  • Actual date/time stamp
  • Current record only
  • Simple history
  • Delimited history
  • Complex history
  • Period vs. date
  • The calendar
  • Simple calendar
  • Complex calendar
  • Building the calendar
  • Bucketizing time
  • Recency, frequency, periodicity

6. Changing dimensions

  • Slowly and rapidly changing
  • Overwriting
  • Versioned
  • Using Period ID
  • Current and last

7. Dimensions and Hierarchies

  • Dimension hierarchies and networks
  • Alternatives for handling hierarchies
  • Flattened, recursive
  • "Bill of materials", snowflaked

8. Value Banding

  • Importance of
  • Recency, frequency, monetary

9. Aggregation

  • How to improve performance
  • Best practices
  • Rollups and cubes
  • Types of aggregates
  • Aggregate navigation
  • Automatic summary tables
  • Collections of data

10. Operational Data Store (ODS)

  • What it is
  • What it is not
  • Kinds and uses of ODS

11. Case Exercises

We regularly offer classes in these and other cities. Atlanta, Austin, Baltimore, Calgary, Chicago, Cleveland, Dallas, Denver, Detroit, Houston, Jacksonville, Miami, Montreal, New York City, Orlando, Ottawa, Philadelphia, Phoenix, Pittsburgh, Seattle, Toronto, Vancouver, Washington DC.