Applying the Lambda Architecture with Spark, Kafka, and Cassandra
This course introduces how to build robust, scalable, real-time big data systems using a variety of Apache Spark’s APIs, including the Streaming, DataFrame, SQL, and DataSources APIs, integrated with Apache Kafka, HDFS and Apache Cassandra.
This course aims to get beyond the hype in the big data world and focus on what really works for building robust, highly scalable batch and real-time systems. In this course, Applying the Lambda Architecture with Spark, Kafka, and Cassandra, you’ll combine technologies that fit well together and that were designed by organizations ranging from companies with some of the most demanding data requirements (such as Facebook, Twitter, and LinkedIn) to those leading the way in the design of data processing frameworks such as Apache Spark, which plays an integral role throughout this course. You’ll look at each component individually and work out the details of its architecture that make it a good fit for a system based on the Lambda Architecture. You’ll then build out a full application from scratch, starting with a small application that simulates the production of data in a stream, moving on to global state, non-associative calculations, and application upgrades and restarts, and finally presenting real-time and batch views in Cassandra. When you’re finished with this course, you’ll be ready to hit the ground running with these technologies to build better data systems than ever.
Author Name: Ahmad Alkilani
Author Description:
Ahmad Alkilani is a Data Architect specializing in the implementation of high-performance compute platforms, data warehouses, and BI systems. He is the author of ForestFlow, an LFAI policy-based machine learning model server. Ahmad has over 16 years of broad IT experience, from traditional ODBMS to large-scale big data systems and NoSQL databases. He enjoys speaking at various user groups and national conferences. When not tinkering with new code or consulting on projects, Ahmad takes pleasure in spen…
Table of Contents
- Course Overview (2mins)
- A Modern Big Data Architecture (51mins)
- Batch Layer with Apache Spark (64mins)
- Speed Layer with Spark Streaming (60mins)
- Advanced Streaming Operations (70mins)
- Streaming Ingest with Kafka and Spark Streaming (82mins)
- Persisting with Cassandra (32mins)