Web Business: Data Challenges SS 2018

Data Challenges 2018

(Summer Semester 2018)

B-WB, M-WB, M-PoE

Lecturers: Prof. Dott. Ing. Roberto V. Zicari, Dr. Karsten Tolle, Kim Hee, Todor Ivanov and Naveed Mushtaq

Mentors:

Name	Topics
Daniel Amthor	technology, business models, smart cities
Adam Azani	startups, innovation, business models
Prof. Nils Bertschinger	machine learning
Björn Braun	new technologies, design thinking / UCD, market research, business model design
Jonas De Paolis	winner of ING-DiBa Data Challenge 2017, technology, business models
Sead Izberovic	data security, data management, technologies
Prof. Dr. Udo Kebschull	data security, networks
Alex Klein	technology, web applications, software engineering, startups, innovation
Klara Kletzka	social, societal innovation
Patrick Klose	winner of the DB Data Challenge 2017; Artificial Intelligence (esp. Reinforcement Learning) and Software Development
Hevin Özmen	urban transportation
Nicolas Pfeuffer	winner of DB Data Challenge 2017; Artificial Intelligence (esp. Conversational Agents) and Software Development
Dr. Manfred Spindler	business angels Frankfurt, business models, innovation
Prof. Roser Valenti	innovation, research
Rut Waldenfels	transport, business models

Course start: Thursday, 19 April 2018 (kickoff)

Time and Location:

Thursday 14:00 – 16:00, Hörsaaltrakt Bockenheim – H III, Gräfstraße 50-54, Frankfurt (Campus Bockenheim), Goethe University Frankfurt. See map here: http://www.bigdata.uni-frankfurt.de/about/

Friday 10:00 – 12:00, Hörsaaltrakt Bockenheim – H II, Gräfstraße 50-54, Frankfurt (Campus Bockenheim), Goethe University Frankfurt. See map here: http://www.bigdata.uni-frankfurt.de/about/

Languages: The languages of the lecture are English and German.

Credit Points: Students can receive 6 CPs. Link in QIS/LFS

Course Description: Students take part in Data Challenges offered by two partners: Deutsche Bahn AG and Procter & Gamble (P&G).

Eligibility: Bachelor's, Master's, and PhD students across multiple disciplines are encouraged to attend the kickoff and to sign up for one Data Challenge.

Students in Computer Science, Data Science, Information Systems, Business Computer Science, Mathematics, Economics, Marketing, Psychology, and other disciplines will form teams of two to explore the questions posed. Team members are required to attend the kickoff lecture to sign up for this project.

Important Note: This project has two phases, Phase One and Phase Two, both taking place in Summer Semester 2018.

Among the teams that successfully complete both phases of the project, winners will be awarded a prize.

Deutsche Bahn and Frankfurt Big Data Lab Data Challenges 2018:

„Smart Cities, Smart Life“

The design of future urban life will depend fundamentally on how mobility is organized in urban regions that are growing, or shrinking, ever faster.

Deutsche Bahn AG, as one of the few comprehensive mobility suppliers, wants to shape the current development towards networked, smart and sustainable mobility.

Against this background, developing and building digital, networked ways to offer comfortable, affordable and environmentally friendly solutions for urban mobility and logistics will become increasingly important.

In our opinion, this raises the following questions:

How will smart, integrated mobility across different modes of transport be organized in the future?
How will smart logistics be shaped, and how could we offer new infrastructure services?
How will new smart mobility become reality through the use of nodes and transfer points and the remodeling of third places?
Which concepts for „mobility on demand“ or specific forms of autonomous driving could actually be demonstrated as prototypes?

Using the open data pool of Deutsche Bahn AG, students of the Frankfurt Big Data Lab should develop ideas and solution patterns that can serve as indications of what smart mobility of the future could look like.

For first insights into current solutions, see the following reference implementations from Deutsche Bahn AG:

DB Challenge Prizes:

Weekend trip to Berlin with a visit to our DB MindBox + Gold Trophy
Voucher for a first-class ICE rail trip to a major city in Germany + Silver Trophy
Value Voucher & All-in-One Charger + Bronze Trophy

Procter & Gamble (P&G) and Frankfurt Big Data Lab Data Challenges 2018:

„Smart Logistics, Smart Supply Chain“

At P&G, everything we do starts by winning with consumers and shoppers.

Our aspiration is to serve the world’s consumers better than our best competitors, in every category and every country where we choose to compete — creating superior shareholder value in the process. P&G is focused on four key areas of transformation to deliver balanced growth and leadership value creation: streamlining and strengthening our product portfolio, improving productivity and our cost structure, building the foundation for stronger top-line growth, and strengthening our organization and culture. We are organizing our portfolio around 10 product categories and about 65 brands — approximately half of which have sales of more than $500 million each year.

In a dynamic manufacturing and retail environment, data and analytics offer unique opportunities to better understand and serve consumers. Shoppers expect personalized experiences – just the right information and inspiration, tailored to their needs, and the products that they want, easily available offline & online. Are you up to the challenge to transform our go-to-market and serve consumers and shoppers in the Frankfurt area best?

Bring your expertise in data and technology, your curiosity to discover the world of fast moving consumer goods, and your innovative ideas to delight consumers. The challenges we want to address are:

How can we leverage historical data and self-improving algorithms to better predict future product demand?

The focus will be on one or a set of German customers. You will have access to historical ordering and shipment data, supply chain details (distribution centers, retailer outlets) as well as consumer demand (product offtake). Other factors (such as weather forecast) might influence the demand for certain products. The algorithm should predict the future order volumes and patterns, and ideally be able to learn from new data being introduced. The outcome will be an improved supply chain, transport efficiency and increased shopper satisfaction via reduced out of stocks.
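One simple way to picture an algorithm that "learns from new data being introduced" is an online forecaster that folds in each new observation as it arrives. The following is a minimal sketch, not P&G's method: it uses simple exponential smoothing as a stand-in technique, the names `OnlineDemandForecaster` and `fit_on_history` are hypothetical, and the order volumes are made-up numbers rather than challenge data.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class OnlineDemandForecaster:
    """Simple exponential smoothing, updated one observation at a time.

    Hypothetical illustration only; a real solution would likely also
    model trend, seasonality and external factors such as weather.
    """
    alpha: float = 0.3             # weight given to the newest observation
    level: Optional[float] = None  # current smoothed demand estimate

    def __post_init__(self) -> None:
        if not 0.0 < self.alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")

    def update(self, observed_units: float) -> None:
        """Fold one new observation (e.g. a weekly order volume) into the estimate."""
        if self.level is None:
            # The first observation initializes the estimate.
            self.level = observed_units
        else:
            self.level = self.alpha * observed_units + (1.0 - self.alpha) * self.level

    def forecast(self) -> float:
        """Return the one-step-ahead forecast (the current smoothed level)."""
        if self.level is None:
            raise ValueError("no observations yet")
        return self.level


def fit_on_history(history: Iterable[float], alpha: float = 0.3) -> OnlineDemandForecaster:
    """Replay historical order volumes in time order through the forecaster."""
    model = OnlineDemandForecaster(alpha=alpha)
    for units in history:
        model.update(units)
    return model


if __name__ == "__main__":
    weekly_orders = [100.0, 120.0]        # made-up numbers, not P&G data
    model = fit_on_history(weekly_orders, alpha=0.5)
    print(model.forecast())               # forecast from the history alone
    model.update(130.0)                   # a new week of data arrives
    print(model.forecast())               # forecast after learning from it
```

Because the model keeps only a single running level, new data can be added without refitting on the full history. A higher `alpha` reacts faster to recent orders, while a lower one smooths out noise.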

How should the retail landscape transform itself to better serve consumer demand in Frankfurt?

Frankfurt is a dynamic and diverse city. We know the current retail landscape (outlets location and types of stores) as well as demographic attributes of the different neighborhoods. Are shopper needs currently being met? How should offline and online shopping offerings evolve to meet the ever-changing demands?

P&G Challenge Prizes:

2-day trip to Geneva incl. a visit to P&G headquarters + Gold Trophy
2-day trip to Cologne incl. tickets for DMEXCO and a visit to a P&G plant + Silver Trophy
P&G Product Prize (e.g. OralB Genius Toothbrush Set) + Bronze Trophy

Project Description:

The project consists of two phases, Phase I and Phase II, both held during the Summer Semester 2018. The proposed timeline and details of these phases are:

Phase I:

-Teams will be asked to address one of the Data Challenges offered.

Specifics will be addressed at the introductory lectures. Teams will then work independently to create a proposal for a novel idea that addresses the chosen data challenge.

-Deliverable: A mid-term presentation of the project idea, in which teams are required to:

clearly state their objectives,
give a general description of how they intend to implement the idea using the data available for the chosen challenge.

Phase II:

Teams that submitted a successful presentation in Phase I will then be asked to implement the idea and present it at the end of Phase II. (Exact dates and the detailed agenda will be reviewed at the kickoff.)

Course Schedule (preliminary)

Date	Topic	Materials
19.04.2018 – 14:00-16:00	Kick Off Meeting Data Challenges: Deutsche Bahn and Procter & Gamble (P&G)	Kickoff_Presentation_v8, Deutsche Bahn Presentation, P&G Presentation
26.04.2018 – 14:00-16:00	Data Challenge presentation by Deutsche Bahn	DB Open Data, RMV Open Data Portal, Public Transport Statistics in Frankfurt
27.04.2018 – 10:00-12:00	Data Challenge presentation by Procter & Gamble (P&G)
03.05.2018 – 14:00-16:00	Jürgen Kohnen (P&G) ‘empowerment training’ + Q & A
04.05.2018 – 10:00-12:00	How to do: business understanding, requirement analysis + How to handle data? – Data Tools + Q & A
11.05.2018 – 10:00-12:00	Meeting the Mentors + Q & A
submission deadline for P&G Challenges: 14.05.2018 – 23:55	Phase I – by e-mail to: dc@dbis.cs.uni-frankfurt.de
17.05.2018 – 14:00-16:00	No Lecture
18.05.2018 – 10:00-12:00	Student presentations – Procter & Gamble (P&G)
submission deadline for Deutsche Bahn Challenges: 21.05.2018 – 23:55	Phase I – by e-mail to: dc@dbis.cs.uni-frankfurt.de
24.05.2018 – 14:00-16:00	Student presentations – Deutsche Bahn
25.05.2018 – 10:00-12:00
01.06.2018 – 10:00-12:00
07.06.2018 – 14:00-16:00
08.06.2018 – 10:00-12:00	P&G Challenge 2 – meeting with Roland (at DBIS)
14.06.2018 – 14:00-16:00	P&G Challenge 1 – meeting with Torben (at DBIS)
15.06.2018 – 10:00-12:00	Meet the Mentors + all student teams
21.06.2018 – 14:00-16:00
22.06.2018 – 10:00-12:00	Milestone check for all teams + DBIS
28.06.2018 – 14:00-16:00
29.06.2018 – 10:00-12:00	Meet the Mentors + all student teams
05.07.2018 – 14:00-16:00
06.07.2018 – 10:00-12:00
12.07.2018 – 14:00-16:00	Final presentations Deutsche Bahn + Award Ceremony at Deutsche Bahn (Skydeck)	Each team will have 15 minutes (+ Q&A) for final presentation.
13.07.2018 – 10:00-12:00	Final presentations Procter & Gamble (P&G) + Award Ceremony at Procter & Gamble (P&G)	Each team will have 15 minutes (+ Q&A) for final presentation.

Resources

UC Berkeley DATA-resources – many course materials on Python, NumPy, pandas, scikit-learn, Matplotlib, TensorFlow, Machine Learning and more

Mobility

Project Shared Streets creates data standards around public resources such as curbs and streets (ranging from pick-up and dropoff volumes by hour to permitted uses), helping both city planners and the private sector innovate more quickly and with a common definition of the urban environment. Autonomous vehicles, whether fleet-operated or privately-owned, will rely heavily on these curated data sources to be good urban citizens, by complying with regulations on permitted uses, times, and speeds. – http://sharedstreets.io
The HubCab project gathered data on 170 million taxi trips by over 13,000 medallion taxis in New York City, with GPS coordinates of all pickup and drop-off points and corresponding times. – http://hubcab.org/#13.00/40.7219/-73.9484
Turo Extras, a set of features which enable Turo hosts to provide additional items along with their cars, ranging from outdoor and recreation equipment to convenience services: https://explore.turo.com/discover-extras/
On Smart Cities and Mobility. Q&A with Praveen Subramani: http://www.odbms.org/2018/05/on-smart-cities-and-mobility-qa-with-praveen-subramani/
On Data and Transportation. Q&A with Carlo Ratti: http://www.odbms.org/2018/04/on-data-and-transportation-qa-with-carlo-ratti/

Ethics and Data

Perspectives on Big Data, Ethics, and Society. May 23, 2016 / By Jacob Metcalf, Emily F. Keller, and Danah Boyd
Council for Big Data, Ethics, and Society – In collaboration with the National Science Foundation, the Council for Big Data, Ethics, and Society was started in 2014 to provide critical social and cultural perspectives on big data initiatives. The Council brings together researchers from diverse disciplines — from anthropology and philosophy to economics and law – to address issues such as security, privacy, equality, and access in order to help guard against the repetition of known mistakes and inadequate preparation. Through public commentary, events, white papers, and direct engagement with data analytics projects, the Council will develop frameworks to help researchers, practitioners, and the public understand the social, ethical, legal, and policy issues that underpin the big data phenomenon
Ethical Issues in the Big Data Industry, MIS Quarterly Executive
The Social, Cultural, & Ethical Dimensions of “Big Data”, March 17, 2014 – New York, NY

Legal Implications of Data

Data Privacy

Big Data and Large Numbers of People: the Need for Group Privacy by Prof. Luciano Floridi, Oxford Internet Institute, University of Oxford
Navigating US-EMEA Data Privacy Rules By Kevin Petrie, Technology Evangelist at Attunity
Big Data Privacy Isn’t Just for Data Geeks and Privacy Freaks Anymore by Tamara Dull, Director of Emerging Technologies for SAS Best Practices
Privacy considerations & responsibilities in the era of Big Data & Internet of Things by Ramkumar Ravichandran, Director, Analytics, Visa Inc.
Enabling Big Data Through Europe’s New Data Protection Regulation by Viktor Mayer-Schönberger, Professor of Internet Governance and Regulation, University of Oxford & Yann Padova, Former Secretary General of the French Data Protection Authority (CNIL), now Commissioner with the French Energy Regulator (CRE).

Elevator Pitch

Elevator Pitch – 5-Minute Presentation

Machine Learning

Machine Learning Course at Stanford by Andrew Ng, Chief Scientist of Baidu; Chairman and Co-Founder of Coursera; Stanford CS faculty.
Non-technical 5-part series on introductory machine learning by Alex Castrounis, Product Leader and Technologist.
- Part 1 – definition of machine learning and most widely used machine learning algorithms.
- Part 2 – model performance, data selection, pre-processing, splitting, feature selection and feature engineering.
- Part 3 – model variance, bias, overfitting, model complexity, dimensionality reduction, model evaluation, performance, tuning, validation, ensemble learning, and resampling methods.
- Part 4 – model performance and error analysis
- Part 5 – unsupervised learning, predictive analytics, artificial intelligence, statistical learning, and data mining.
Free downloadable CRC Press book „Explorations in Artificial Intelligence and Machine Learning“ (link to CRC website, registration required) with 7 chapters:
- An Introduction to Machine Learning
- The Bayesian Approach to Machine Learning
- A Revealing Introduction to Hidden Markov Models
- Introduction to Reinforcement Learning
- Deep Learning for Feature Representation
- Neural Networks and Deep Learning
- AI-Completeness: The Problem Domain of Super-intelligent Machines

Open Source Tools

Apache Hadoop is a project developing open-source software for reliable, scalable, distributed computing.
Apache Spark is a fast and general engine for large-scale data processing.
Apache Flink is an open-source platform for distributed stream and batch data processing.

Advanced AI Tools

TensorFlow is an open source software library for numerical computation using data flow graphs.
The Microsoft Cognitive Toolkit: A free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain.

Making an App

Making a Flask app using a PostgreSQL database and deploying to Heroku

Chat Bot

Hartley Brody: Facebook Messenger Bot Tutorial: Step-by-Step Instructions for Building a Basic Facebook Chat Bot. 15 June 2016

Entrepreneurship

The Top 10 Mistakes of Entrepreneurs (video) by Guy Kawasaki, former chief evangelist of Apple and co-founder of Garage Technology Ventures.