Focus: Big Data Platforms


Lecturers: Prof. Dott. Ing. Roberto V. Zicari, Todor Ivanov, Marten Rosselli and Dr. Karsten Tolle


Course start/end: Wednesday, 15.04.2015 to Wednesday, 15.07.2015

Time and Location:

Tuesday, 10:15 – 11:45, Robert-Mayer-Straße 11-15, Room SR 11 (Informatikgebäude)

Wednesday, 10:15 – 11:45, Robert-Mayer-Straße 11-15, Room SR 307 (Informatikgebäude)

Languages: The languages of the lecture are English and German.
Credit Points: Students can receive 6 CPs. Link in QIS/LSF

Course Description: The Big Data landscape evolves continuously as new technologies emerge and existing ones mature. This comprehensive course covers the Hadoop architecture and the Hadoop ecosystem of tools. These technologies form the foundation of the Big Data movement and enable scalable management and processing of vast quantities of data.


Announcements

21.04.2015 – THE DATANAUTS CONTEST – Become a TEDx speaker by submitting an idea by April 30th, 2015.

28.07.2015 – All student presentations are available for download at the bottom of this page.


Assignments

29.04.2015 – Assignments 1 + 2

03.06.2015 – Assignment 3

30.06.2015 – Assignment 4


Course Schedule (preliminary)

Introduction – Prof. Dott. Ing. Roberto V. Zicari
Hadoop – Todor Ivanov
GraphDBs – Dr. Karsten Tolle
NoSQL – Marten Rosselli
Presentations – Students

 

Date Title Materials
15.04.2015 Introduction to Big Data
21.04.2015
  • Course Introduction
  • The Motivation for Hadoop
  • Hadoop Basic Concepts
  • Hadoop Solutions

[wpdm_package id=’8667′]

 

[wpdm_package id=’8791′]

22.04.2015 & 28.04.2015
  • The Hadoop Ecosystem
  • Managing Your Hadoop Solution
  • Introduction to MapReduce
  • Hadoop Clusters

[wpdm_package id=’8718′]

[wpdm_package id=’8794′]

29.04.2015
  • Writing a MapReduce Program in Java
  • Unit Testing MapReduce Programs

[wpdm_package id=’8881′]

[wpdm_package id=’8883′]
05.05.2015 & 06.05.2015
  • Hadoop Tools for Data Acquisition: Sqoop and Flume
  • Creating Workflows with Oozie
  • Introduction to Pig

[wpdm_package id=’9051′]

[wpdm_package id=’9055′]

[wpdm_package id=’9210′]

12.05.2015 [wpdm_package id=’9389′]
13.05.2015 [wpdm_package id=’9440′]
19.05.2015
  • Basic Data Analysis with Pig
  • Processing Complex Data with Pig

[wpdm_package id=’9241′]

[wpdm_package id=’9243′]

20.05.2015
  • Multi-Dataset Operations with Pig
  • Extending Pig
  • Pig Troubleshooting and Optimization

[wpdm_package id=’9249′]

[wpdm_package id=’9246′]

26.05.2015 Student Presentations 1
27.05.2015 Student Presentations 2
02.06.2015 Student Presentations 3
03.06.2015
  • Introduction to Hive
  • Relational Data Analysis with Hive
  • Hive Data Management

[wpdm_package id=’9803′]

[wpdm_package id=’9811′]

09.06.2015
  • Text Processing with Hive
  • Hive Optimization
  • Extending Hive

[wpdm_package id=’9805′]

[wpdm_package id=’9813′]

10.06.2015 In-Memory HANA Platform
 [wpdm_package id=’10210′]
 Required preparation for next lecture: [wpdm_package id=’10212′] [wpdm_package id=’10214′] [wpdm_package id=’10216′]
16.06.2015 No Lecture
17.06.2015 No Lecture
23.06.2015 In-Memory HANA Platform – Exercise
 [wpdm_package id=’10507′][wpdm_package id=’10509′]
24.06.2015 In-Memory HANA Platform – Exercise
 [wpdm_package id=’10554′][wpdm_package id=’10688′]
30.06.2015 Accenture Guest Lecture Dr. Uwe Pleban: Predictive Analytics
01.07.2015
  • Introduction to Impala

[wpdm_package id=’10714′]

[wpdm_package id=’10718′]

07.07.2015 Accenture Guest Lecture Marco Seravalli: HBase
 Please bring your notebook with the Cloudera Virtual Machine. [wpdm_package id=’11081′]
 Material for Exercises:

[wpdm_package id=’10899′]

[wpdm_package id=’10905′]

08.07.2015 Student Presentations 3
14.07.2015 Student Presentations 4
15.07.2015 Student Presentations 5
[wpdm_package id=’11475′]

 

Additional Materials


Hadoop Ecosystem:

Big Data Engineering Lecture (in German) – Lars George (EMEA Chief Architect @ Cloudera)

IBM Big Data University – free online courses on Big Data technologies

Awesome Big Data – A curated list of awesome big data frameworks, resources and other awesomeness.