The two main components of any data pipeline are data lakes and data warehouses. This course highlights use cases for each type of storage and dives into the available data lake and data warehouse solutions on Google Cloud in technical detail. The course also describes the role of a data engineer and the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment.

Modernizing Data Lakes and Data Warehouses with Google Cloud (MDLDW) Objectives

  • Differentiate between data lakes and data warehouses.
  • Explore use cases for each type of storage and the available data lake and data warehouse solutions on Google Cloud.
  • Discuss the role of a data engineer and the benefits of a successful data pipeline to business operations.
  • Examine why data engineering should be done in a cloud environment.

Key Point of Training Programs

  • Modernizing Data Lakes and Data Warehouses with Google Cloud (MDLDW) Prerequisites

    Who should attend
    This course is intended for developers who are responsible for querying datasets, visualizing query results, and creating reports.

    Specific job roles include:

    Data engineers
    Data analysts
    Database administrators
    Big data architects

    Certifications
    This course is part of the following certification:

    Google Cloud Certified Professional Data Engineer

    Prerequisites
    Basic proficiency with a common query language such as SQL.

  • Modernizing Data Lakes and Data Warehouses with Google Cloud (MDLDW) Course Format

    Live Virtual Course

  • Modernizing Data Lakes and Data Warehouses with Google Cloud (MDLDW) Outline

    Module 1 - Introduction to Data Engineering
    Topics:

    The role of a data engineer
    Data engineering challenges
    Introduction to BigQuery
    Data lakes and data warehouses
    Transactional databases versus data warehouses
    Partnering effectively with other data teams
    Managing data access and governance
    Build production-ready pipelines
    Google Cloud customer case study

    Objectives:

    Discuss the role of a data engineer.
    Discuss benefits of doing data engineering in the cloud.
    Discuss challenges of data engineering practice and how building data pipelines in the cloud helps to address these.
    Review and understand the purpose of a data lake versus a data warehouse, and when to use each.

    Module 2 - Building a Data Lake
    Topics:

    Introduction to data lakes
    Data storage and ETL options on Google Cloud
    Building a data lake by using Cloud Storage
    Securing Cloud Storage
    Storing all sorts of data types
    Cloud SQL as your OLTP system

    Objectives:

    Discuss why Cloud Storage is a great option to build a data lake on Google Cloud.
    Explain how to use Cloud SQL for a relational data lake.

    Module 3 - Building a Data Warehouse
    Topics:

    The modern data warehouse
    Introduction to BigQuery
    Getting started with BigQuery
    Loading data into BigQuery
    Exploring schemas
    Schema design
    Nested and repeated fields
    Optimizing with partitioning and clustering

    Objectives:

    Discuss the requirements of a modern data warehouse.
    Explain why BigQuery is the scalable data warehousing solution on Google Cloud.
    Discuss the core concepts of BigQuery and review options for loading data into BigQuery.
