Data Science and Big Data Analytics (WS2015)

Teaching Staff: Steffen Herbold, Fabian Korte, Michael Göttsche

Dates, Modules, etc.

  • Lecture: Tuesdays, 14:15-15:45, Room 0.101 (first session Oct. 27th). Please note that the room has changed!
  • Exercise: Thursdays, 13:15-14:45, Room -1.111 (first session in mid-November, exact date to be determined)
  • Exam: Friday, February 19th, 14:00-16:00, Room MN 14 (in the Geology building)
  • Modules: M.Inf.1151
  • The lecture is streamed live via Adobe Connect.


This lecture requires registration during the first session. The maximum number of participants is 30. Registration and active participation in the exercise are mandatory for admission to the final exam.


The main topic of this lecture is data science, i.e., methods to extract information from data with a scientific approach. The lecture takes a practical approach: we concern ourselves directly with what the algorithms do and where they should be applied. The details of the algorithms and the theory behind them are not part of this lecture. Methods considered in this lecture include:

  • k-means clustering
  • Linear regression
  • Logistic regression
  • Naive Bayes
  • Decision trees
  • Text analysis

Additionally, we will consider the analysis of Big Data. In this context, we will consider the following topics:

  • MapReduce
  • Hadoop
  • Languages for Hadoop
  • Mahout
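
To give a rough idea of the MapReduce programming model listed above, the following is a minimal single-process sketch of a word count in plain Python. It is an illustration only and not taken from the course materials; it does not use Hadoop, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this example.

```python
from collections import defaultdict


def map_phase(document):
    # Map step: emit one (word, 1) pair for every word in the document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Shuffle step: group all emitted values by key. In a real framework
    # this grouping happens between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: combine all values of one key into a single result.
    return key, sum(values)


def word_count(documents):
    # Run map over every document, group the pairs, then reduce per key.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big analytics"]))
```

In a distributed setting the map and reduce steps would run in parallel on different machines; here everything runs sequentially to keep the sketch short.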


The materials for this course are distributed via Stud.IP.

2024 © Software Engineering For Distributed Systems Group
