Transferring Data with ETL
This course covers creating and orchestrating ETL pipelines using industry-standard tools and best practices: Python, SQL, Apache Spark, and Apache Airflow.
ETL stands for extract, transform, and load. It’s a set of processes that combines data from various sources and loads it into data warehouses or other data repositories. ETL is crucial for providing the data used for business intelligence and analytics.
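To make the three stages concrete, here is a minimal sketch of an ETL step in Python with pandas. The file, table, and column names (orders.csv, warehouse.db, order_date, amount) are hypothetical, and SQLite stands in for a real data warehouse:

```python
import sqlite3

import pandas as pd

# Extract: read raw records from a source system (a hypothetical CSV export).
raw = pd.read_csv("orders.csv")

# Transform: drop incomplete rows and aggregate revenue per day.
raw = raw.dropna(subset=["amount"])
raw["order_date"] = pd.to_datetime(raw["order_date"])
daily = (
    raw.groupby(raw["order_date"].dt.date)["amount"]
    .sum()
    .reset_index(name="daily_revenue")
)

# Load: write the result into a destination table
# (SQLite is a stand-in for a production warehouse).
with sqlite3.connect("warehouse.db") as conn:
    daily.to_sql("daily_revenue", conn, if_exists="replace", index=False)
```

In practice, the extract step would pull from a live database and the load step would target a warehouse, but the extract-transform-load shape stays the same.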
In this course, you’ll experiment with extracting data from various database solutions such as MySQL, PostgreSQL, and MongoDB. You’ll use query and scripting languages like SQL and Python, together with Apache Spark and Python’s pandas library, to process data and load it into data repositories or cloud solutions like Google Cloud Platform (GCP). Finally, you’ll learn how to schedule your ETL pipelines with cron jobs, or automate and monitor them using open-source tools like Apache Airflow, as sketched below.
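As a preview of the orchestration material, here is a minimal sketch of an ETL pipeline expressed as an Airflow DAG, assuming Airflow 2.4 or later. The DAG id, task names, and daily schedule are illustrative, and the task bodies are left as stubs:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    ...  # pull rows from the source database


def transform():
    ...  # clean and reshape the extracted data


def load():
    ...  # write the result to the warehouse


with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # the equivalent cron expression would be "0 0 * * *"
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Declare the dependency chain: extract runs first, then transform, then load.
    extract_task >> transform_task >> load_task
```

Unlike a bare cron job, a DAG like this gives you retries, per-task logs, and a UI for monitoring each run, which is the main reason the course covers both approaches.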
After completing this course, you’ll have a strong grasp of various methods, tools, and techniques for transferring data from a source to its destination using ETL pipelines.