Streaming Analytics

Description

“Big data” analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those same techniques, using your own Windows system right at home. It's easier than you might think, and you'll be learning from an ex-engineer and senior manager from Amazon and IMDb.

Spark works best with the Scala programming language, and this course includes a crash course in Scala to get you up to speed quickly. For those more familiar with Python, however, a Python version of this class is also available: "Taming Big Data with Apache Spark and Python - Hands On".

In this course you'll master the art of framing data analysis problems as Spark problems through over 20 hands-on examples, and then scale them up to run on cloud computing services.

  • Learn the concepts of Spark's Resilient Distributed Datasets

  • Get a crash course in the Scala programming language

  • Develop and run Spark jobs quickly using Scala

  • Translate complex analysis problems into iterative or multi-stage Spark scripts

  • Scale up to larger data sets using Amazon's Elastic MapReduce service

  • Understand how Hadoop YARN distributes Spark across computing clusters

  • Practice using other Spark technologies, like Spark SQL, DataFrames, DataSets, Spark Streaming, and GraphX
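To give a flavour of the "framing problems as Spark problems" approach the course teaches, here is a minimal word-count sketch in plain Scala. The Scala collection API (`flatMap`, `map`, `groupBy`) mirrors Spark's RDD API, so the same chain of transformations translates almost line for line to a real Spark job; in Spark you would replace the `List` with `sc.textFile("input.txt")` and `groupBy` + sum with `reduceByKey(_ + _)`. The sample lines are illustrative, not from the course.

```scala
// Sample input standing in for lines read from a file or HDFS.
val lines = List(
  "spark makes big data simple",
  "big data needs spark"
)

// Split each line into words, pair each word with a count of 1,
// then sum the counts per word -- the classic map/reduce shape.
val counts: Map[String, Int] =
  lines
    .flatMap(_.split(" "))   // RDD equivalent: flatMap
    .map(word => (word, 1))  // RDD equivalent: map
    .groupBy(_._1)           // RDD equivalent: reduceByKey(_ + _)
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

println(counts)
```

Because the transformation chain is identical in shape, practising it on local collections first makes the jump to distributed RDDs much smaller.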

What you will learn
  • Understand basic object-oriented programming (OOP) concepts and the REPL in Scala

  • Run Spark operations in the Spark shell and interpret their output

  • Use RDD functions to load, transform, analyse, and save data

  • Process streaming data in real time with Spark Streaming

  • Query data with Spark SQL and analyse graph data with Spark's GraphX library
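The first outcome above covers Scala's object-oriented basics. A tiny sketch of the kind of Scala the crash course builds toward: a trait, a case class implementing it, and pattern matching. The names (`Sensor`, `Reading`) are hypothetical examples, not from the course material.

```scala
// A trait plays the role of an interface.
trait Sensor {
  def describe: String
}

// Case classes get equality, toString, and pattern matching for free --
// the same features that make them convenient as records in Spark Datasets.
case class Reading(name: String, value: Double) extends Sensor {
  def describe: String = s"$name = $value"
}

val r = Reading("temperature", 21.5)

// Pattern matching deconstructs the case class with an optional guard.
val label = r match {
  case Reading(n, v) if v > 30 => s"$n is high"
  case Reading(n, _)           => s"$n is normal"
}

println(r.describe)
println(label)
```

These constructs (traits, case classes, pattern matching) are exactly what the Scala REPL lessons let you experiment with interactively.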


Requirements
  • Some prior programming or scripting experience is required. A crash course in Scala is included, but you need to know the fundamentals of programming in order to pick it up.
  • You will need a desktop PC and an Internet connection. The course is created with Windows in mind, but users comfortable with MacOS or Linux can use the same tools.
  • The software needed for this course is freely available, and I'll walk you through downloading and installing it.

Lessons

  • 36 Lessons
  • 37:22:08 Hours
  • Introduction to Big Data Analytics I – 00:49:59
  • Introduction to Big Data Analytics II – 00:59:41
  • Fundamentals Architecture of Hadoop and Ecosystems – 00:57:23
  • Linux Basic Commands – 00:52:44
  • Hadoop Commands Basic – 00:56:46
  • Hadoop MapReduce – 00:50:45
  • Hadoop WordCount MapReduce – 01:03:38
  • Assignment I
  • Part 1 Revision – 01:29:59
  • Spark Eco System – 00:50:33
  • Introduction to Scala – 00:34:05
  • Scala REPL – 00:57:02
  • Scala REPL Collection – 00:50:04
  • Scala REPL Array Collection – 00:33:10
  • Running Spark in IDE Environment Setup – 01:22:12
  • Transformation Functions and SBT Layout – 01:27:59
  • SBT Word Count – 00:28:08
  • Transformation Persistence and Spark Submit – 01:30:02
  • Spark Submit Word Count – 01:29:45
  • Transformation Functions and SBT Layout
  • Introduction to SparkSQL – 01:18:11
  • Spark Aggregate Function in Structured Data – 01:11:07
  • Spark Broadcast and Accumulator Variables – 01:26:54
  • WordCount Average of Key Values – 01:26:58
  • WordCount and Swap Key Value – 01:26:27
  • Word Count - Saved document with Streaming Data – 01:28:17
  • Load Json and Parquet File – 01:29:01
  • SparkSQL Commands – 01:20:29
  • SparkSQL Join in Spark Shell and IDE – 01:27:47
  • Introduction to Spark MLlib – 00:57:51
  • Introduction to Spark MLlib continued – 00:28:52
  • Spark MLlib Part 1 – 01:31:09
  • Spark MLlib Part 2 – 00:57:51
  • Introduction to Spark Streaming – 01:27:54
  • Streaming Network Word Count – 01:19:25

About instructor

Instructor
Name : Surenther I AP KCE
Reviews : 11
Students : 309
Courses : 10

Reviews

4 out of 5, based on 1 review

Guna Arul