Data Engineering: Module 3

  • Course Duration: 2 Days
  • Course Start: Enrollment Monthly

Course Description:

A Data Engineer specializes in building software solutions around data. Their skills center on Hadoop, Spark, and the other open-source projects in the Big Data ecosystem. Data Engineers typically come from a Software Engineering background and program in Java, Scala, or Python.
Data Engineers move from general Software Engineering to specialize in Big Data for two reasons: the field changes quickly, and it demands so much knowledge that there isn’t enough time to keep up with both Big Data and general software topics.

A qualified Data Engineer’s value lies in knowing the right tool for the job. They understand the subtle differences between use cases and between technologies, and they can create data pipelines. This course will give you the right skills for the job.

Learning Outcomes:

By the end of this course, you should be able to:

  • Explore Hadoop fundamentals
  • Understand Hadoop Architectures and Concepts
  • Build a Hadoop ecosystem
  • Explore advanced Hadoop concepts

Key Objectives:

  • Learn to set up and use Hadoop and Spark
  • Learn to use pandas for data analysis on Hadoop
  • Visualise data using Python libraries deployed on Hadoop
  • Implement Machine Learning algorithms on Spark

Course Outline:

Module 1: Introduction to Hadoop

Module 2: Hadoop Architecture and Concepts

Module 3: MapReduce

Module 4: Introduction to Hadoop Ecosystem

Module 5: Advanced Spark Concepts

Module 6: Data Ingestion

Our Partners

Institutions we have partnered with or worked with previously

MapR Technologies
Kaggle
Dataiku
Nita
Kenya Tourism Board
Barclays
British American Tobacco
Coop Bank
Craft Silicon
CRDB Bank
ICPAK
IPSOS
KAM
Lapfund
National Land Commission
NSSF Uganda
Reinsurance
Safaricom
URA