Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Database Systems Seminar II (with TU Darmstadt)

The Database Systems Seminar II is the second iteration of the joint research seminar series with the Data Management Lab at TU Darmstadt (led by Prof. Carsten Binnig). The series started in the winter semester 2020/21. This semester, we continue to present our current research and discuss novel ideas in the following areas:

  • Data management on modern hardware
  • Stream processing
  • Interactive data exploration & ML
  • End-to-end machine learning
  • Natural language interfaces for databases
  • Benchmarking data processing systems
  • Trusted data management

During the seminar we also host invited talks by distinguished speakers from both academia and industry. On this page you can find the schedule, abstracts and the recorded presentations of the talks.

Schedule

03.05.2021 Nils Böschen How to best use your GPU? The answer is OLTP, not OLAP!
17.05.2021 Lawrence Benson Benchmarking Persistent Memory for Database Access
07.06.2021 Tiemo Bang The Full Story of 1000 Cores: An Autopsy of Concurrency Control on Real(ly) Large Multi-Socket Hardware
21.06.2021 Wang Yue Desis: General Distributed Window Aggregation
19.07.2021 Sanjay Krishnan (University of Chicago) Learned Data Synopses: The Good, the Bad, the Ugly

19.07.2021 - Guest Talk: Sanjay Krishnan (University of Chicago)

Learned Data Synopses: The Good, the Bad, the Ugly

Abstract: Summarizing a large dataset with a reduced-size data synopsis has applications from database query optimization to approximate query processing. Increasingly, data synopsis approaches leverage the inherent compression properties of machine learning (ML) models to achieve state-of-the-art results. This talk will deconstruct this trend to understand the key mechanisms behind machine learning's recent success in a historically well-established area of research. (The Good) I present a series of results that suggest ML models are astonishingly accurate at many different types of high-dimensional data summarization. (The Bad) I show that in "medium-dimensional" regimes it is possible to design new classical data synopsis techniques that meet or exceed the performance of ML models. (The Ugly) I discuss the under-appreciated reliability gap between ML models and classical data summarization techniques.

Bio: Sanjay Krishnan is an Assistant Professor of Computer Science at the University of Chicago. His research studies the intersection of machine learning and database systems. Sanjay completed his PhD and Master’s degree in Computer Science at UC Berkeley in 2018. Sanjay's work has received a number of awards, including the 2016 SIGMOD Best Demonstration award, the 2015 IEEE GHTC Best Paper award, and the Sage Scholar award.

Research webpage: http://sanjayk.io/?src=%2F~skr%2F