Getting Started with Apache Spark on Databricks

Duration	1h 52m
Level	Beginner
Course Creator	Janani Ravi
Last Updated	25-Oct-21

Pluralsight

Category: Data Engineering

This course will introduce you to analytical queries and big data processing using Apache Spark on Azure Databricks. You will learn how to work with Spark transformations, actions, visualizations, and functions using the Databricks Runtime.

Description

Azure Databricks lets you process and query big data using the Apache Spark unified analytics engine. With Azure Databricks you can set up an Apache Spark environment in minutes, autoscale your processing, and collaborate on and share projects in an interactive workspace. In this course, Getting Started with Apache Spark on Databricks, you will learn the components of the Apache Spark analytics engine, which lets you process batch as well as streaming data using a unified API.

First, you will learn how the Spark architecture is configured for big data processing. You will then learn how the Databricks Runtime makes it easy to work with Apache Spark on the Azure cloud platform, and explore the basic concepts and terminology of the technologies used in Azure Databricks.

Next, you will learn the workings and nuances of Resilient Distributed Datasets (RDDs), the core data structure used for big data processing in Apache Spark. You will see that RDDs are the data structures on top of which Spark DataFrames are built. You will study the two types of operations that can be performed on DataFrames, transformations and actions, and understand the difference between them. You will also learn how Databricks lets you explore and visualize your data using the display() function, which leverages native Python libraries for visualizations.

Finally, you will get hands-on experience with big data processing operations such as projection, filtering, and aggregation. Along the way, you will learn how to read data from an external source such as Azure Cloud Storage and how to use built-in functions in Apache Spark to transform your data.

When you finish this course, you will have the skills to work with basic transformations, visualizations, and aggregations using Apache Spark on Azure Databricks.
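The distinction between transformations and actions mentioned in the description can be illustrated with a small sketch. This is a plain-Python analogy, not Spark or Databricks code: the `LazyDataset` class and its methods are hypothetical names invented for illustration, and they only imitate the general idea that transformations are recorded lazily while actions trigger the actual work.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for a distributed dataset (hypothetical, not the Spark API)."""

    def __init__(self, source: Iterable[Any],
                 steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = source
        self._steps = steps  # recorded transformations, not yet executed

    # Transformations: return a new LazyDataset and do no work.
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    # Internal helper: chain the recorded steps over the source rows.
    def _run(self) -> Iterator[Any]:
        rows: Iterator[Any] = iter(self._source)
        for kind, fn in self._steps:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return rows

    # Actions: execute the recorded plan and return a result.
    def collect(self) -> List[Any]:
        return list(self._run())

    def count(self) -> int:
        return sum(1 for _ in self._run())


calls = []  # records every time the mapped function actually runs


def double(x: int) -> int:
    calls.append(x)
    return x * 2


ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 2)
print(len(calls))    # 0: building the pipeline ran nothing
print(ds.collect())  # [4, 6, 8]: the action executes the plan
print(ds.count())    # 3: each action re-runs the recorded steps
```

In this toy model, chaining `map` and `filter` only builds up a plan; the mapped function is first called when `collect()` or `count()` runs. Spark itself adds planning, optimization, and distributed execution on top of this idea, none of which the sketch attempts to model.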
Author Name: Janani Ravi
Author Description:
Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing …

Course Overview	2 mins
Overview of Apache Spark on Databricks	34 mins
Transformations, Actions, and Visualizations	41 mins
Modify Data Using Spark Functions	34 mins

Related Products

Explore fundamentals of large-scale analytics

Automate Data Pipelines

Data Ingestion with Kafka and Kafka Streaming

Introduction to Azure Data Lake Storage Gen2

Use Azure Synapse serverless SQL pool to query files in a data lake

Use Delta Lake in Azure Synapse Analytics
