What you will learn
 
  • How to decide the best architecture
  • Proven analytical modeling concepts
  • Use of surrogate keys
  • Deciding what is a dimension, dividing and combining dimensions, leveled dimensions, abstract dimensions, ragged dimensions
  • How to handle changing dimensions
  • How to handle snapshot data
  • How to handle complex dimensions
  • How and when to use aggregates
  • How to handle multiple units of measurement, such as currency reporting, multi-time-zone reporting, etc.
  • How to handle multi-valued dimensions
  • How to handle restatement
  • How to handle snowflake structures
  • Traversing unbalanced hierarchies
 
Audience
 
  • Experienced developers and administrators involved in data warehousing
  • Experienced business and technical data warehouse team members
 
Prerequisites
 
  • Our “Data Modeling for the Data Warehouse”, or our "Designing the Data Warehouse", or equivalent experience
  • An understanding of data modeling, especially entity-relationship modeling
 
Duration
  3 days

Outline for Advanced Data Warehousing

1. Data Warehouse Architectures

  • Centralized
  • Functional
  • Federated
  • Imbedded data marts
  • Dependent data marts
  • Independent data marts

2. Analytical Modeling Primer

  • Facts and dimensions
  • Different schemas
  • The star schema
  • The snowflake schema
  • The normalized model
  • Pros and cons
  • Query types :
  • Inside out
  • Outside in
  • Joins types :
  • Standard joins
  • Fact-to-fact joins
  • Heterogeneous products
  • Subtyping
  • Heterogeneous products or customers

3. Dimensions

  • Definition of dimensions
  • Dimension levels
  • Dimensional hierarchies
  • Multiple dimensions
  • Behavioral dimensions
  • Ragged dimensions
  • Complex dimensions
  • Huge dimensions

4. Keys

  • Natural vs. surrogate keys
  • Pros and cons of surrogate keys
  • Effect of surrogate keys on ETL

5. Time and History

  • Definition of time series and history
  • Methods for handling time and history
  • Actual date/time stamp
  • Current record only
  • Simple history
  • Delimited history
  • Complex history
  • Period vs. date
  • The calendar
  • Simple calendar
  • Complex calendar
  • Building the calendar
  • Bucketizing time
  • Recency, frequency, periodicity

6. Changing dimensions

  • Slowly and rapidly changing
  • Overwriting
  • Versioned
  • Using Period ID
  • Current and last

7. Dimensions and Hierarchies

  • Dimension hierarchies and networks
  • Alternatives for handling hierarchies
  • Flattened, recursive
  • "Bill of materials", snowflaked

8. Value Banding

  • Importance of
  • Recency, frequency, monetary

9. Aggregation

  • How to improve performance
  • Best practices
  • Rollups and cubes
  • Types of aggregates
  • Aggregate navigation
  • Automatic summary tables
  • Collections of data

10. Operational Data Store (ODS)

  • What it is
  • What it is not
  • Kinds and uses of ODS

11. Case Exercises