What you will learn

Among the most important things you will learn at this workshop is how to design the overall process of the data warehouse. The seminar will teach how to effectively accomplish this, while ensuring a quality data warehouse design.

This course will cover five main areas:

  • What is unique about a Data Warehouse project?
  • What are valid different architectures or topologies for a data warehouse?
  • How do you design the Extract-Transform-Load (ETL) process?
  • What are the main considerations in designing the data structures?
  • How do you deliver productive BI tools and applications?

This is a workshop, not just lecture. Students will perform a number of exercises throughout the life cycle of the data warehouse to drive home complete understanding of the data warehouse development process and its issues.

3 days

Outline for Designing the Data Warehouse Workshop

1. Introduction to Data Warehousing

  • Scope and levels of modeling
  • Kinds of data
  • The framework for data modeling
  • Challenges in data management
  • Five major characteristics of data warehouse
  • Data Models:
  • Corporate
  • High level
  • Detailed level
  • Operational
  • Decision support
  • Types and technologies of data warehousing

2. Data Warehouse Architectures

  • Centralized DW
  • Functional DW
  • Federated DW
  • Independent Data Marts
  • Dependent Data Marts

3. Data Warehouse Methodology

  • Explanation of methodology steps
  • Iterative nature of development

4. Information Gathering

  • Facilitated sessions
  • Interviews
  • Information gathering techniques
    • Events
    • Objectives
    • Queries
    • Goals
    • Decisions
    • Problems

5. Data Store Layer

  • Building the Data Warehouse Model
  • Facts, dimensions
  • Summarized data
  • Levels of Data In the Enterprise
  • Base grains
  • Intermediate Summaries
  • Specialized summaries

6. Modeling Time and History

  • Short term and long term view
  • Four ways of handling time and date
  • Time-series data
  • Capturing business changes
  • Importance of representing the business time dimension

7. ETL Layer

  • Defining transformation requirements
  • Defining transformation rules
  • The transformation requirements spreadsheet
  • Building transformation processes
  • Enforcing controls in the ETL process
  • Designing the transformation process
  • Complete coverage transformation types
  • Dealing with change data
  • Supporting surrogate keys
  • Near-real time transformation

8. BI Layer

  • Designing the BI interface
  • Matching the BI interface to the user
  • Types of BI technologies and design
  • Types of reporting
  • OLAP in all its forms:
    • MOLAP
    • HOLAP
    • DOLAP
    • ROLAP
  • Data sparsity and density
  • Data explosion due to calculations, rollups and summaries

9. Data Warehouse Technology

  • Categories of warehouse tools
  • Review of major products

10. Important Considerations and Issues

  • System load
  • Denormalization and performance
  • Archiving and purging
  • Data distribution and replication
  • Change control
  • Copy management
  • Alternative Models For Copied Data

11. Managing Data Warehouse Projects

  • Data warehouse project structure
  • Managing multiple data warehouse projects
  • Data distribution issues

12. Summary and Conclusion

  • Selected warehouse projects
  • Critical Success Factors

13. Case Studies

  • Selected mini-exercises
  • Complete group case study (moderately sized)
  • Complete individual case study (large)

14. Glossary

15. Explanatory Texts