Stay home, stay safe: due to the COVID-19 outbreak, all our courses will be conducted online. For more information, please contact 9619341713 / 7045481713.

  • +91-9619341713 (Airoli) / 7045481713 (Mahape) / 7045911713 (Kalyan)

Learn Basic Big Data Hadoop

Learn Basic & Advanced Big Data Hadoop from Scratch, for Beginners

Course Contents

Course Overview

  Big data refers to collections of large datasets that cannot be processed using traditional computing techniques. It is not a single technique or tool; rather, it has become a complete subject involving various tools, techniques and frameworks.

Requirements

  • Programming
  • Quantitative Skills
  • Multiple Technologies
  • Understanding of Business & Outcomes
  • Interpretation of Data

Getting Started with Hadoop: Basic Introduction
  • Limitations of existing solutions for Big Data problems
  • How Hadoop solves the Big Data problem
  • Hadoop Eco-System Components
  • Hadoop Architecture
  • Concept of the Hadoop Distributed File System (HDFS)
  • Design of HDFS
  • Common challenges
  • Best practices for scaling with your data
  • Configuring HDFS
  • Interacting with HDFS
  • HDFS permission and Security
  • Additional HDFS Tasks
  • Data Flow (Anatomy of a File Read, Anatomy of a File Write, Coherency Model)
  • Hadoop Archives
  • What is MapReduce?
  • Data Types used in Hadoop
  • Concept of Mappers
  • Concept of Reducers
  • The Execution Framework architecture
  • Concept of Partitioners
  • Concept of Combiners
  • Hadoop Cluster Architecture
  • MapReduce types
  • Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs)
  • Output Formats (Text Output, Binary Output, Multiple Outputs)
  • Writing Programs for MapReduce
  • Installation of Hadoop
  • Running a sample program
  • HDFS Storage
  • NameNode HA & NodeManager
  • Cluster specification
  • Hadoop Configuration (Environment Settings, Hadoop Daemon Properties, Addresses and Ports)
  • Basic Linux and HDFS Commands
  • Setup a Hadoop Cluster
  • What is Pig?
  • Installing and Running Pig
  • Grunt
  • Pig's Data Model
  • Pig Latin
  • Developing & Testing Pig Latin Scripts
  • Writing Evaluation and Filter Functions
  • Load & Store Functions
  • What is Hive?
  • Hive Architecture
  • Running Hive
  • Hive's Data Model
  • Comparison with Traditional Database (Schema on Read versus Write, Updates, Transactions and Indexes)
  • HiveQL (Data Types, Operators and Functions)
  • Tables (Managed and External Tables, Partitions and Buckets, Storage Formats, Importing Data)
  • Altering Tables, Dropping Tables
  • Querying Data (Sorting and Aggregating, MapReduce Scripts, Joins, Subqueries & Views)
  • Map-side and Reduce-side Joins to optimize queries
  • User Defined Functions
  • Appending Data into existing Hive Table
  • Custom Map/Reduce in Hive
  • Perform Data Analytics using Pig and Hive
  • What is HBase?
  • Client API- Basics
  • Client API- Advanced Features
  • Client API - Administrative Features
  • Available Clients
  • Architecture
  • MapReduce Integration
  • Advanced Usage
  • Advanced Indexing
  • Implementing HBase
  • What is Sqoop?
  • Database Imports
  • Importing Large Objects
  • Performing Exports
  • Exports: A Deeper Look
  • What is ZooKeeper?
  • The ZooKeeper Service (Data Model, Operations, Implementation, Consistency, Sessions, States)
  • Building Applications with ZooKeeper (ZooKeeper in Production)
  • What is Oozie?
  • Oozie Installation
  • Running an Oozie Example
  • The Oozie Web Console
  • Expression Language Functions
  • Oozie Workflow Examples (Java Code, Pig, Hive)
  • Control Flow Nodes
  • Action Node Properties (MapReduce, Hive, Pig, Java)
  • What is Ambari?
  • Why Ambari is needed?
  • Hands on with examples
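The mapper, shuffle and reducer stages listed above can be previewed before installing Hadoop. The following is a minimal, single-JVM word-count sketch in plain Java, an illustration only: a real job would use the `org.apache.hadoop.mapreduce` Mapper/Reducer APIs, and a combiner would pre-aggregate map output locally before the shuffle.

```java
import java.util.*;
import java.util.stream.*;

// A single-JVM sketch of the MapReduce word-count flow (illustration only;
// real Hadoop jobs use the org.apache.hadoop.mapreduce APIs).
public class WordCountSketch {

    // Map phase: split each input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle: group intermediate pairs by key (Hadoop does this between
    // the map and reduce phases; a combiner would pre-sum counts here).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Reduce phase: sum the counts for each word.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> out = new TreeMap<>();
        grouped.forEach((word, counts) ->
                out.put(word, counts.stream().mapToInt(Integer::intValue).sum()));
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : List.of("big data hadoop", "hadoop hdfs", "big data"))
            pairs.addAll(map(line));
        System.out.println(reduce(shuffle(pairs)));
        // {big=2, data=2, hadoop=2, hdfs=1}
    }
}
```

In a real cluster the shuffle step is also where partitioners decide which reducer receives each key.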
Getting Started with Hadoop Administration: Basic Introduction
  • Big Data
  • The 3 Vs (Volume, Velocity, Variety)
  • Role of Hadoop in Big data
  • Hadoop and its ecosystem
  • Overview of other Big Data Systems
  • Requirements in Hadoop
  • Use Cases of Hadoop
  • Defining key design assumptions and architecture
  • Configuring and setting up the file system
  • Issuing commands from the console
  • Reading and writing files
  • Introducing the computing daemons
  • Dissecting a MapReduce job
  • Selecting appropriate hardware
  • Designing a scalable cluster
  • Sqoop Installations and Basics
  • Importing Data from Oracle to HDFS
  • Advanced Imports
  • Real-Time Use Case
  • Exporting Data from HDFS to Oracle
  • Running Sqoop in Cloudera
  • Installing Hadoop daemons
  • Optimizing the network architecture
  • Setting basic configuration parameters
  • Configuring block allocation, redundancy and replication
  • Installing and setting up the MapReduce environment
  • Delivering redundant load balancing via Rack Awareness
  • Starting and stopping Hadoop daemons
  • Monitoring HDFS status
  • Adding and removing data nodes
  • Managing MapReduce jobs
  • Tracking progress with monitoring tools
  • Commissioning and decommissioning compute nodes
  • Importing and exporting relational information with Sqoop
  • Coping with inevitable hardware failures
  • Securing your Hadoop cluster
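The Rack Awareness topic above can be sketched with a toy placement routine. HDFS's default policy for replication factor 3 places the first replica on the writer's node, the second on a node in a different rack, and the third on another node in that second rack; the node and rack names below are hypothetical, and the real logic lives in the NameNode's default block placement policy.

```java
import java.util.*;

// Simplified sketch of HDFS's default rack-aware replica placement
// (replication factor 3). Node/rack names are hypothetical; the real
// policy is implemented inside the NameNode.
public class RackAwareSketch {

    // racks: rack name -> nodes in that rack
    static List<String> placeReplicas(String writerNode,
                                      Map<String, List<String>> racks) {
        List<String> replicas = new ArrayList<>();
        replicas.add(writerNode);                      // replica 1: writer's node
        String writerRack = rackOf(writerNode, racks);

        // Replica 2: first available node on a different rack.
        String remoteRack = null;
        for (Map.Entry<String, List<String>> e : racks.entrySet()) {
            if (!e.getKey().equals(writerRack) && !e.getValue().isEmpty()) {
                remoteRack = e.getKey();
                replicas.add(e.getValue().get(0));
                break;
            }
        }

        // Replica 3: a different node on that same remote rack, if any.
        if (remoteRack != null) {
            for (String node : racks.get(remoteRack)) {
                if (!replicas.contains(node)) { replicas.add(node); break; }
            }
        }
        return replicas;
    }

    static String rackOf(String node, Map<String, List<String>> racks) {
        for (Map.Entry<String, List<String>> e : racks.entrySet())
            if (e.getValue().contains(node)) return e.getKey();
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> racks = new LinkedHashMap<>();
        racks.put("rack1", List.of("node1", "node2"));
        racks.put("rack2", List.of("node3", "node4"));
        System.out.println(placeReplicas("node1", racks));
        // [node1, node3, node4] -- one local copy, two on a different rack
    }
}
```

This layout survives the loss of an entire rack while keeping only one copy's worth of cross-rack write traffic.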
Getting Started with Spark: Basic Introduction
  • Limitations of MapReduce in Hadoop
  • Batch vs. Real-time analytics
  • Applications of stream processing
  • How to install Spark
  • Spark vs. Hadoop Eco-system
  • Features of Scala
  • Basic data types and literals used
  • List the operators and methods used in Scala
  • Concepts of Scala
  • Features of RDDs
  • How to create RDDs
  • RDD operations and methods
  • How to run a Spark project with SBT
  • Explain RDD functions and describe how to write different code in Scala
  • Explain the importance and features of SparkSQL
  • Describe methods to convert RDDs to DataFrames
  • Explain concepts of SparkSQL
  • Describe the concept of hive integration
  • Concepts of Spark Streaming
  • Describe basic and advanced sources
  • Explain how stateful operations work
  • Explain window and join operations
  • Explain the use cases and techniques of Machine Learning (ML)
  • Describe the key concepts of Spark ML
  • Explain the concepts of an ML Dataset, ML algorithms, and model selection via cross-validation
  • Explain the key concepts of Spark GraphX programming
  • Limitations of the Graph Parallel system
  • Describe the operations with a graph
  • Graph system optimizations
  • Scala - Environment Setup
  • Scala - Basic Syntax
  • Scala - Data Types
  • Scala - Variables
  • Scala - Classes & Objects
  • Scala - Access Modifiers
  • Scala - Operators
  • Scala - IF ELSE
  • Scala - Loop Statements
  • Scala - Functions
  • Scala - Closures
  • Scala - Strings
  • Scala - Arrays
  • Scala - Collections
  • Scala - Traits
  • Scala - Pattern Matching
  • Scala - Regular Expressions
  • Scala - Exception Handling
  • Scala - Extractors
  • Scala - Files I/O
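The RDD operations covered in this module (map, filter, reduce) can be previewed locally before touching a cluster. The sketch below is only an analogy using Java streams: real Spark code would run the same pipeline through the Scala or Java RDD API (e.g. `rdd.map`, `rdd.filter`, `rdd.reduce`) distributed across executors.

```java
import java.util.*;

// Local analogy for a Spark RDD pipeline: map -> filter -> reduce.
// Real Spark code would use SparkContext/SparkSession and RDDs instead.
public class RddStyleSketch {

    static int sumOfEvenSquares(List<Integer> data) {
        return data.stream()
                .map(x -> x * x)          // like rdd.map(x => x * x) in Scala
                .filter(x -> x % 2 == 0)  // like rdd.filter(_ % 2 == 0)
                .reduce(0, Integer::sum); // like rdd.reduce(_ + _)
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvenSquares(List.of(1, 2, 3, 4, 5))); // 20
    }
}
```

The key conceptual difference in Spark is that map and filter are lazy transformations, and nothing executes until an action such as reduce is called.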