Goethe University Frankfurt
(C) Big Data Laboratory.

Web Business: Data Challenges WS 2016

(Winter Semester 2016/2017)

B-WB, M-WB, M-PoE

Lecturers: Prof. Dott. Ing. Roberto V. Zicari, Dr. Karsten Tolle, Todor Ivanov, Marten Rosselli, and Kim Hee.


Course start: Thursday 27. Oct. 2016 (kickoff)

Time and Location:

Thursday  14:00 – 16:00   Hörsaaltrakt Bockenheim – H III

Friday       10:00 – 12:00   Hörsaaltrakt Bockenheim – H III

Languages: The languages of the lecture are English and German.

Credit Points: Students can receive 6 CPs. Link in QIS/LFS

Course Description: Students will take part in two Data Challenges, one offered by Deutsche Bahn AG and one offered by ING-DiBa Bank.

Eligibility: Bachelor Students, Master Students, and PhD students across multiple disciplines are encouraged to attend the kickoff and to sign up for one Data Challenge.

Students in Computer Science, Mathematics, Data Science, Information Systems, Business Computer Science, and other fields will form teams of 2 to explore the questions posed. Team members are required to attend the kick-off lecture to sign up for this project.

Course Registration: You have to register for the Data Challenge WS 2016 using the form below. Registration in teams of 2 is preferred. The registration deadline is Thursday, 20.10.2016.


Important Note: This project has two phases. Phase One takes place in Fall 2016. Successful teams will be selected to continue in Phase Two, which will be scheduled in Spring 2017.


Data Challenges:

Among the teams that successfully complete both phases of the project, winners will be awarded a prize.


Project Description:

The project consists of two phases: Phase 1 will be held during the Fall Semester 2016, and Phase 2 will take place during Spring 2017. The proposed timeline and details of these phases are:

Phase 1:


-Teams will be asked to address one of the Data Challenges offered.

Specifics will be addressed in the introductory lectures. Teams will then work independently to create a proposal for a novel idea that addresses the chosen data challenge.

-Deliverable: A mid-term presentation of the project idea, in which teams are required to:

  1. clearly state their objectives,
  2. give a general description of how they intend to implement the idea using the data available for the chosen challenge.

Phase 2:

Teams that gave a successful presentation in Phase 1 will then be asked to implement the idea and present it at the end of Phase 2 in mid-February 2017. (Exact dates and the detailed agenda will be reviewed at the kickoff.)


Course Schedule (preliminary)

Date | Topic | Location / Materials
27.10.2016 – 14:00-16:00 | |
03.11.2016 – 14:00-16:00 | Challenge presentation by Deutsche Bahn | RIS-ML and passenger information (Reisendeninformation) (slides)
04.11.2016 – 10:00-12:00 | |
10.11.2016 – 14:00-16:00 | How to: business understanding, requirement analysis, … |
11.11.2016 – 10:00-12:00 | How to handle data? – Data Tools |
17.11.2016 – 14:00-16:00 | Innovation methods |
18.11.2016 – 10:00-12:00 | Mobile Application Development, Q&A session |
24.11.2016 – 14:00-16:00 | Q&A session |
25.11.2016 – 10:00-12:00 | Deadline for Submission 1 / Q&A session |
01.12.2016 – 14:00-16:00 | Phase 1 – presentations – ING-DiBa Challenges |
02.12.2016 – 10:00-12:00 | Phase 1 – presentations – Deutsche Bahn Challenge |
08.12.2016 – 14:00-16:00 | Defining Phase 2 / Q&A hour |
09.12.2016 – 10:00-12:00 | Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
15.12.2016 – 14:00-16:00 | Status check for Deutsche Bahn teams | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
16.12.2016 – 10:00-12:00 | Status check for ING-DiBa teams | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
22.12.2016 | Holiday |
23.12.2016 | Holiday |
12.01.2017 – 14:00-16:00 | Deutsche Bahn teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
13.01.2017 – 10:00-12:00 | ING-DiBa teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
19.01.2017 – 14:00-16:00 | Deutsche Bahn teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
20.01.2017 – 10:00-12:00 | ING-DiBa teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
26.01.2017 – 14:00-16:00 | Status check for Deutsche Bahn teams | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
27.01.2017 – 10:00-12:00 | Status check for ING-DiBa teams | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
02.02.2017 – 14:00-16:00 | Deutsche Bahn teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
03.02.2017 – 10:00-12:00 | ING-DiBa teams – Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
09.02.2017 – 14:00-16:00 | Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
10.02.2017 – 10:00-12:00 | Deadline for Submission 2 / Mentoring hours | DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)
16.02.2017 – 14:00-16:00 | Final presentations for Deutsche Bahn Challenge | Hörsaaltrakt Bockenheim – H III
17.02.2017 – 10:00-12:00 | Final presentations for ING-DiBa Challenge | Hörsaaltrakt Bockenheim – H I

 


Resources

Ethics and Data

Legal Implications of Data

Data Privacy

Elevator Pitch

Elevator Pitch: 5-Minute Presentation

Machine Learning

  • Machine Learning Course at Stanford by Andrew Ng, Chief Scientist of Baidu; Chairman and Co-Founder of Coursera; Stanford CS faculty.
  • Non-technical 5-part series on introductory machine learning by Alex Castrounis, Product Leader and Technologist.
    • Part 1 – definition of machine learning and most widely used machine learning algorithms.
    • Part 2 – model performance, data selection, pre-processing, splitting, feature selection, and feature engineering.
    • Part 3 – model variance, bias, overfitting, model complexity, dimensionality reduction, model evaluation, performance, tuning, validation, ensemble learning, and resampling methods.
    • Part 4 – model performance and error analysis.
    • Part 5 – unsupervised learning, predictive analytics, artificial intelligence, statistical learning, and data mining.

Open Source Tools

  • Apache Hadoop is a project developing open-source software for reliable, scalable, distributed computing.
  • Apache Spark is a fast and general engine for large-scale data processing.
  • Apache Flink is an open-source platform for distributed stream and batch data processing.

Advanced AI Tools

  • TensorFlow is an open source software library for numerical computation using data flow graphs.
  • The Microsoft Cognitive Toolkit: A free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain.

Making an App

Chat Bot 

Entrepreneurship