(Winter Semester 2016/2017)
B-WB, M-WB, M-PoE
Course start: Thursday, 27.10.2016 (kickoff)
Time and Location:
Thursday 14:00 – 16:00 Hörsaaltrakt Bockenheim – H III
Friday 10:00 – 12:00 Hörsaaltrakt Bockenheim – H III
Languages: The lecture is held in English and German.
Credit Points: Students can receive 6 CPs. Link in QIS/LFS
Course Description: Students will take part in two Data Challenges, one offered by Deutsche Bahn AG and one offered by ING-DiBa Bank.
Eligibility: Bachelor, Master, and PhD students across multiple disciplines are encouraged to attend the kickoff and to sign up for one Data Challenge.
Students in Computer Science, Mathematics, Data Science, Information Systems, Business Computer Science, and other fields will form teams of two to explore the questions posed. Team members are required to attend the kickoff lecture to sign up for this project.
Course Registration: You have to register for the Data Challenge WS 2016 in the form below. Registration in teams of two is preferred. The deadline for registration is Thursday, 20.10.2016.
Important Note: This project has two phases, with Phase One taking place in Fall 2016. Successful teams will be selected to continue in Phase Two, which will be scheduled in Spring 2017.
- Deutsche Bahn Data Challenge: Mobility of the future (Download pdf)
- ING-DiBa Data Challenge: Future of Financial Data (Download pdf in English, Download pdf in German)
Among the teams that successfully complete both phases of the project, winners will be awarded a prize.
The project consists of two phases: Phase I will be held during the Fall Semester 2016. Phase II will take place during Spring 2017. The proposed timeline and details of these stages are:
- Teams will be asked to address one of the Data Challenges offered. Specifics will be addressed at the introductory lectures. Teams will then work independently to create a proposal for a novel idea that satisfies the chosen data challenge.
- Deliverable: A mid-term presentation of the project idea, in which teams are required to:
- clearly state their objectives,
- give a general description of how they intend to implement the idea using the data available for the chosen challenge.
Teams that give a successful presentation in Phase I will then be asked to implement the idea and present it at the end of Phase II in mid-February 2017. (Exact dates and the detailed agenda will be reviewed at the kickoff.)
Course Schedule (preliminary)
|27.10.2016 – 14:00-16:00|
|03.11.2016 – 14:00-16:00||Challenge presentation by Deutsche Bahn||RIS-ML and passenger information (slides)|
|04.11.2016 – 10:00-12:00|
|10.11.2016 – 14:00-16:00||How to: business understanding, requirement analysis, …|
|11.11.2016 – 10:00-12:00||How to handle data? – Data Tools|
|17.11.2016 – 14:00-16:00||Innovation methods|
|18.11.2016 – 10:00-12:00||Mobile Application Development, Q&A session|
|24.11.2016 – 14:00-16:00||Q&A session|
|25.11.2016 – 10:00-12:00||Deadline of Submission 1 / Q&A session|
|01.12.2016 – 14:00-16:00||Phase 1 – presentations – ING-DiBa Challenge|
|02.12.2016 – 10:00-12:00||Phase 1 – presentations – Deutsche Bahn Challenge|
|08.12.2016 – 14:00-16:00||Defining Phase 2 / Q&A hour|
|09.12.2016 – 10:00-12:00||Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|15.12.2016 – 14:00-16:00||Status check for Deutsche Bahn teams||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|16.12.2016 – 10:00-12:00||Status check for ING-DiBa teams||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|12.01.2017 – 14:00-16:00||Deutsche Bahn teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|13.01.2017 – 10:00-12:00||ING-DiBa teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|19.01.2017 – 14:00-16:00||Deutsche Bahn teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|20.01.2017 – 10:00-12:00||ING-DiBa teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|26.01.2017 – 14:00-16:00||Deutsche Bahn teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|27.01.2017 – 10:00-12:00||ING-DiBa teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|02.02.2017 – 14:00-16:00||Deutsche Bahn teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|03.02.2017 – 10:00-12:00||ING-DiBa teams – Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|09.02.2017 – 14:00-16:00||Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|10.02.2017 – 10:00-12:00||Deadline of Submission 2 / Mentoring hours||DBIS, Room 501, Robert-Mayer-Str. 10 (5th floor)|
|xx.02.2017||Final presentation for each challenge|
Ethics and Data
- Perspectives on Big Data, Ethics, and Society. May 23, 2016 / By Jacob Metcalf, Emily F. Keller, and Danah Boyd
- Council for Big Data, Ethics, and Society – In collaboration with the National Science Foundation, the Council for Big Data, Ethics, and Society was started in 2014 to provide critical social and cultural perspectives on big data initiatives. The Council brings together researchers from diverse disciplines, from anthropology and philosophy to economics and law, to address issues such as security, privacy, equality, and access in order to help guard against the repetition of known mistakes and inadequate preparation. Through public commentary, events, white papers, and direct engagement with data analytics projects, the Council will develop frameworks to help researchers, practitioners, and the public understand the social, ethical, legal, and policy issues that underpin the big data phenomenon.
- Ethical Issues in the Big Data Industry, MIS Quarterly Executive
- The Social, Cultural, & Ethical Dimensions of “Big Data”, March 17, 2014 – New York, NY
Legal Implications of Data
- An intro to the legal implications of Big Data (short video)
- Privacy in the Age of Big Data, The Stanford Law Review
- Stanford explores case for code of ethics to tackle big data’s deluge in higher education
- Big Data and Large Numbers of People: the Need for Group Privacy by Prof. Luciano Floridi, Oxford Internet Institute, University of Oxford
- Navigating US-EMEA Data Privacy Rules By Kevin Petrie, Technology Evangelist at Attunity
- Big Data Privacy Isn’t Just for Data Geeks and Privacy Freaks Anymore by Tamara Dull, Director of Emerging Technologies for SAS Best Practices
- Privacy considerations & responsibilities in the era of Big Data & Internet of Things by Ramkumar Ravichandran, Director, Analytics, Visa Inc.
- Enabling Big Data through Europe’s New Data Protection Regulation by Viktor Mayer-Schönberger, Professor of Internet Governance and Regulation, University of Oxford & Yann Padova, Former Secretary General of the French Data Protection Authority (CNIL), now Commissioner with the French Energy Regulator (CRE).
- Pitch Canvas might help to prepare your presentation
- Elevator Pitch Guide – short video
- How to Structure and Deliver an Elevator Pitch in 1 Minute
- Crafting an Elevator Pitch
Elevator Pitch – 5-Minute Presentation
- Video: 2014 Elevator Pitch Winner, University of Dayton Business Plan Competition
- Video: 2015 Elevator Pitch Winner, University of Dayton Business Plan Competition
- Video: 16th Annual UC Berkeley Startup Competition Finals
- Machine Learning Course at Stanford by Andrew Ng, Chief Scientist of Baidu; Chairman and Co-Founder of Coursera; Stanford CS faculty.
- Non-technical 5-part series on introductory machine learning by Alex Castrounis, Product Leader and Technologist.
- Part 1 – definition of machine learning and most widely used machine learning algorithms.
- Part 2 – model performance, data selection, pre-processing, splitting, feature selection and feature engineering.
- Part 3 – model variance, bias, overfitting, model complexity, dimensionality reduction, model evaluation, performance, tuning, validation, ensemble learning, and resampling methods.
- Part 4 – model performance and error analysis.
- Part 5 – unsupervised learning, predictive analytics, artificial intelligence, statistical learning, and data mining.
Open Source Tools
- Apache Hadoop is a project developing open-source software for reliable, scalable, distributed computing.
- Apache Spark is a fast and general engine for large-scale data processing.
- Apache Flink is an open-source platform for distributed stream and batch data processing.
Advanced AI Tools
- TensorFlow is an open source software library for numerical computation using data flow graphs.
- The Microsoft Cognitive Toolkit: A free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain.
- Hartley Brody: Facebook Messenger Bot Tutorial: Step-by-Step Instructions for Building a Basic Facebook Chat Bot.
- The Top 10 Mistakes of Entrepreneurs video, Guy Kawasaki, former chief evangelist of Apple and co-founder of Garage Technology Ventures.