Data Engineering: Module 3
- Course Duration: 2 Days
- Course Start: Enrollment Monthly
Course Description:
A Data Engineer is someone with specialized skills in creating software solutions around data. Their skills are predominantly based around Hadoop, Spark, and the open source Big Data ecosystem projects. Data Engineers come from a Software Engineering background and program in Java, Scala, or Python.
Data Engineers are typically general Software Engineers who have recognized the need to specialize in Big Data. The field changes quickly, and the body of knowledge a Data Engineer needs is too large to keep up with alongside other general software topics.
A qualified Data Engineer’s value lies in knowing the right tool for the job. They understand the subtle differences between use cases and between technologies, and they can create data pipelines. This course will teach you the right skills for the job.
Learning Outcomes:
By the end of this course, you should be able to:
- Explore Hadoop fundamentals
- Understand Hadoop Architectures and Concepts
- Build a Hadoop ecosystem
- Explore advanced Hadoop concepts
Key Objectives:
- Learn to set up and use Hadoop and Spark
- Learn to use Pandas for data analysis on Hadoop
- Visualise data using Python libraries deployed on Hadoop
- Implement Machine Learning algorithms on Spark
Course Outline:
Module 1: Introduction to Hadoop
Module 2: Hadoop Architecture and Concepts
Module 3: MapReduce
Module 4: Introduction to Hadoop Ecosystem
Module 5: Advanced Spark Concepts
Module 6: Data Ingestion