Duration: 1 Hour
Presenter: Dr. Gunnar Kleemann
Categories: Data Science, Programming
Data Engineering involves the design, installation, and maintenance of systems that handle large amounts of data. Organizations depend on these systems to capture, store, and analyze data to gain insights into their customers and operations. Data engineers are responsible for building and optimizing the infrastructure that supports these systems. This requires a deep understanding of databases, data warehouses, and data pipelines, as well as programming and data modeling skills.
Moving from a legacy data ecosystem to a modern, scalable one is called digital transformation, and it relies heavily on data engineering skills and practices.
Join us as experienced data scientist, consultant, and instructor Dr. Gunnar Kleemann describes how data engineering is distinct from data science and discusses why skilled data engineers are fundamental to the modern data ecosystem. In this 1-hour session, Gunnar covers:
- The data lifecycle and common data challenges
- Data engineering pipelines and tool choices (on-prem versus cloud)
- Differentiating data engineering from data science
- The role of the data engineer and the skills they need
- Data engineer upskilling pathways
At the conclusion, Gunnar will welcome your questions.
Prerequisites:
It is recommended that attendees have:
- Basic Python programming experience, including use of Jupyter notebooks
- An understanding of basic UNIX/Linux administration and command line use
Cost: Free!
About the Presenter:
Dr. Gunnar Kleemann has been a professional instructor for 27 years and runs Austin Capital Data, a data science consultancy. He serves as a corporate consultant, boot camp instructor, and faculty lecturer for the UC Berkeley Master of Information and Data Science (MIDS) program.
Gunnar’s data science consulting work includes building customized tools that enable high-throughput analytics and digital transformation in areas such as predictive factory maintenance, genetic pathway discovery, drug discovery, automatic document analytics, and automated behavioral analysis. Gunnar teaches Python, statistics, and data engineering for data science applications at Accelebrate.