Goethe University Frankfurt
(C) Big Data Laboratory

Introduction to Data Science SS 2017

Lecture: Introduction to Data Science

(Summer Semester 2017)


Lecturers: Dr. Jochen L. Leidner and Kim Hee

Please register now! (This informal registration helps us estimate the number of attendees.)

Course start: Monday, 29 May 2017

Time and Location:

Lecture week 1: May 29-31, 2017, 13:00-16:00; Hörsaaltrakt Bockenheim – H IV
Exercise week 1: June 6-7, 2017, 13:00-16:00; Hörsaaltrakt Bockenheim – H IV
Lecture week 2: June 12-14, 2017, 13:00-16:00; Hörsaaltrakt Bockenheim – H IV
Exercise week 2: June 19-20, 2017, 13:00-16:00; Hörsaaltrakt Bockenheim – H IV
Exam: To be defined

Language: The lecture is held in English.

Credit Points: Students can receive 5 credit points (CPs). See the course entry in QIS/LSF.

Assessment: by written exam.

Eligibility: Master's students in Computer Science, Bioinformatics, and Business Informatics (Wirtschaftsinformatik, Vertiefungsbereich Informatik) are encouraged to attend.

Prerequisites: programming skills, knowledge of Python, algorithms and data structures

Course Description: The goal of this compact course is to give participants a first gentle introduction to, and a solid conceptual grounding in, what has been called ‘data science’, i.e. experimental work that is data-driven and empirical. The focus is on methodology (defining an experimental protocol, devising hypotheses, thinking about how to measure success), but also on more practical topics such as basic machine learning methods (both supervised and unsupervised), natural language processing approaches (such as part-of-speech tagging, named entity recognition/classification/resolution, and parsing), and an introduction to popular tools. The course also demonstrates some practical applications of the techniques shown and deepens the students’ skills via practical exercises.

The lecture is delivered over 4 weeks of calendar time and consists of 2 blocks, each comprising three days of 3-hour lectures followed by 2 days of 2.5-hour exercises/tutorials. It targets Master’s level students. By the end of the course, participants will be able to analyze data sets and to create their own predictive classifiers and visualizations.

Course Schedule (preliminary)


Date                        Topic                                       Materials
29.05.2017, 13:00-16:00     structured and unstructured data
                            profiling data sets
30.05.2017, 13:00-16:00     hypothesis testing
                            descriptive v. predictive analytics
                            machine learning I: clustering
31.05.2017, 13:00-16:00     machine learning II: classification
                            machine learning III: regression
                            Web crawling & mining
06.06.2017, 13:00-16:00     practice tools and techniques I
07.06.2017, 13:00-16:00     practice tools and techniques II
12.06.2017, 13:00-16:00     experimental protocol
                            evaluation measures
                            data science tools
13.06.2017, 13:00-16:00     inter-rater agreement
                            data science economics: value creation
14.06.2017, 13:00-16:00     visualization & presentation
                            planning your data science project
                            data science & ethics
19.06.2017, 13:00-16:00     mini group project I
20.06.2017, 13:00-16:00     mini group project II