Big Data Analytics
Description
Design distributed systems that manage "big data" using Hadoop and related technologies.
Use HDFS and MapReduce for storing and analyzing data at scale.
Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
Analyze relational data using Hive and MySQL.
Analyze non-relational data using HBase, Cassandra, and MongoDB.
Query data interactively with Drill, Phoenix, and Presto.
Choose an appropriate data storage technology for your application.
Understand how Hadoop clusters are managed by YARN, Tez, Mesos, ZooKeeper, Zeppelin, Hue, and Oozie.
Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume.
Consume streaming data using Spark Streaming, Flink, and Storm.
What will you learn
- Hadoop with MapReduce, HDFS, Spark, Flink, Hive, HBase, MongoDB, Cassandra, Kafka, and more
Requirements
- Absolutely no experience is required. We will start from the basics and gradually build up your knowledge with clear, concise, step-by-step instructions.
Lessons
- 25 Lessons
- 05:10:09 Hours
- Syllabus - 17FD22
- Modules Overview 02:09:44
- Big Data Overview 01:19:59
- Data Analytics lifecycle
- Data Discovery
- Data modeling
- Data Analytics Lifecycle - Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
- Backbone.js in MongoDB
- Big Data Management
- Big Data Key points
- Hadoop and Replication Factor 00:47:02
- HDFS & HDFS Commands
- Hadoop YARN
- MapReduce
- MongoDB
- MongoDB query language
- Node.js in MongoDB
- Building a social network
- Building a social network with MongoDB 00:00:00
- MongoDB Short Notes 00:32:12
- User interface 00:09:48
- Cassandra
- CQL data types
- CRUD operations 00:11:24
- Hive: HQL, UDFs, and data types