Module 1 - Introduction to Data Engineering
Topics:
The role of a data engineer
Data engineering challenges
Introduction to BigQuery
Data lakes and data warehouses
Transactional databases versus data warehouses
Partnering effectively with other data teams
Managing data access and governance
Building production-ready pipelines
Google Cloud customer case study
Objectives:
Discuss the role of a data engineer.
Discuss the benefits of doing data engineering in the cloud.
Discuss the challenges of data engineering practice and how building data pipelines in the cloud helps address them.
Review the purposes of a data lake and a data warehouse, and when to use each.
Module 2 - Building a Data Lake
Topics:
Introduction to data lakes
Data storage and ETL options on Google Cloud
Building a data lake by using Cloud Storage
Securing Cloud Storage
Storing all sorts of data types
Cloud SQL as your OLTP system
Objectives:
Discuss why Cloud Storage is well suited to building a data lake on Google Cloud.
Explain how to use Cloud SQL for a relational data lake.
Module 3 - Building a Data Warehouse
Topics:
The modern data warehouse
Introduction to BigQuery
Getting started with BigQuery
Loading data into BigQuery
Exploring schemas
Schema design
Nested and repeated fields
Optimizing with partitioning and clustering
Objectives:
Discuss the requirements of a modern data warehouse.
Explain why BigQuery is the scalable data warehousing solution on Google Cloud.
Discuss the core concepts of BigQuery and review the options for loading data into BigQuery.