Goethe University Frankfurt
(C) Big Data Laboratory

Principles of E-Commerce I (Summer Semester 2015)

Focus: Big Data Platforms


Lecturers: Prof. Dott. Ing. Roberto V. Zicari, Todor Ivanov, Marten Rosselli and Dr. Karsten Tolle


Course start/end: Wednesday, 15.04.2015 to Wednesday, 15.07.2015

Time and Location:

Tuesday, 10:15 – 11:45, Robert-Mayer-Straße 11-15, Room SR 11 (Informatikgebäude)

Wednesday, 10:15 – 11:45, Robert-Mayer-Straße 11-15, Room SR 307 (Informatikgebäude)

Languages: The lecture is held in English and German.
Credit Points: Students can receive 6 CPs. See the course entry in QIS/LSF.

Course Description: The Big Data landscape evolves continuously as new technologies emerge and existing ones mature. This comprehensive course covers the Hadoop architecture and the Hadoop ecosystem of tools. These technologies form the foundation of the Big Data movement and enable scalable management and processing of vast quantities of data.


Announcements

21.04.2015 – THE DATANAUTS CONTEST – Become a TEDx speaker by submitting an idea until April 30th, 2015.

28.07.2015 – All student presentations are available for download at the bottom of this page.


Assignments

29.04.2015 – Assignments 1 + 2

03.06.2015 – Assignment 3

30.06.2015 – Assignment 4


Course Schedule (preliminary)

Introduction – Prof. Dott. Ing. Roberto V. Zicari
Hadoop – Todor Ivanov
GraphDBs – Dr. Karsten Tolle
NoSQL – Marten Rosselli
Presentations – Students

 

Date Title Materials
15.04.2015 Introduction to Big Data
21.04.2015
  • Course Introduction
  • The Motivation for Hadoop
  • Hadoop Basic Concepts
  • Hadoop Solutions

 

22.04.2015 & 28.04.2015
  • The Hadoop Ecosystem
  • Managing Your Hadoop Solution
  • Introduction to MapReduce
  • Hadoop Clusters


29.04.2015
  • Writing a MapReduce Program in Java
  • Unit Testing MapReduce Programs

05.05.2015 & 06.05.2015
  • Hadoop Tools for Data Acquisition: Sqoop and Flume
  • Creating Workflows with Oozie
  • Introduction to Pig


12.05.2015
13.05.2015

19.05.2015
  • Basic Data Analysis with Pig
  • Processing Complex Data with Pig


20.05.2015
  • Multi-Dataset Operations with Pig
  • Extending Pig
  • Pig Troubleshooting and Optimization


26.05.2015 Student Presentations 1
27.05.2015 Student Presentations 2
02.06.2015 Student Presentations 3
03.06.2015
  • Introduction to Hive
  • Relational Data Analysis with Hive
  • Hive Data Management


09.06.2015
  • Text Processing with Hive
  • Hive Optimization
  • Extending Hive

10.06.2015 In-Memory HANA Platform
 
Required preparation for next lecture:
16.06.2015 No Lecture
17.06.2015 No Lecture
23.06.2015 In-Memory HANA Platform – Exercise
 
24.06.2015 In-Memory HANA Platform – Exercise
 
30.06.2015 Accenture Guest Lecture Dr. Uwe Pleban: Predictive Analytics
01.07.2015
  • Introduction to Impala

07.07.2015 Accenture Guest Lecture Marco Seravalli: HBase
Please bring your notebook with the Cloudera Virtual Machine installed.
Material for Exercises:

08.07.2015 Student Presentations 3
14.07.2015 Student Presentations 4
15.07.2015 Student Presentations 5

 

Additional Materials


Hadoop Ecosystem:

Big Data Engineering Lecture (in German) – Lars George (EMEA Chief Architect @ Cloudera)

IBM Big Data University – free online courses on Big Data technologies

Awesome Big Data – A curated list of awesome big data frameworks, resources and other awesomeness.