Projects

I am, and have been, involved in several funded research projects in areas such as big data systems, scientific workflows, stream processing, distributed IoT systems, smart city applications, and software development tools.

Ongoing Projects

I am currently involved in the following funded research projects, which I also (co-)developed:

Casper: Carbon-Aware Scalable Processing in Elastic Clusters (2025-2027, funded by the Engineering and Physical Sciences Research Council - EPSRC)

Aim: The project will investigate how the execution of scalable batch processing applications can be aligned with the availability of low-carbon energy. More specifically, it sets out to pioneer how individual processing steps of large-scale cluster applications (e.g. Spark dataflows or Nextflow workflows) can be managed dynamically based on continuously updated estimates of application performance, cloud resource availability, and energy carbon intensity.

Involvement: I am the project lead and will be working closely with the project's PDRA/RA and PhD student to develop new application performance prediction methods for carbon-aware scheduling and scaling.

Consortium: The industry and academic partners of this individual EPSRC New Investigator Award project at Glasgow are AWS, BBC R&D, and Humboldt University of Berlin.

Link: Funder Tableau tool (APP5942)

C5: Collaborative and Cross-Context Cluster Configuration for Distributed Data-Parallel Processing (2023-2026, funded by the German Research Foundation - DFG)

Aim: Sharing of runtime information (such as job runtimes, resource usage, and application events) across different execution contexts (such as particular cluster resources) presents a significant opportunity for performance modeling and adaptive resource management, allowing resource allocation for large-scale distributed computing jobs to be tuned even when the availability of context-specific runtime data is limited. Therefore, this research project aims to develop new methods for cross-context cluster configuration optimization based on shared performance data and models.

Involvement: I am involved as a researcher co-investigator, working closely with the project's RA and student assistant to develop new collaborative methods for reusing performance data and models across execution contexts for efficient distributed data processing.

Consortium: The research project is an individual project in Prof. Odej Kao's Distributed and Operating Systems group at Technische Universität Berlin, with me as a partner at the University of Glasgow, where I moved after the project proposal was submitted.

Link: Funder database

FONDA: Foundations of Workflows for Large-Scale Scientific Data Analysis (2020-2024, funded by the German Research Foundation - DFG)

Aim: The Collaborative Research Center FONDA (SFB 1404) investigates methods for more productive development, execution, and maintenance of data analysis workflows for large scientific data sets. The long-term goal is to develop methods and tools to achieve substantial reductions in the development time and development costs of Data Analysis Workflows. Towards this goal, the project aims at finding new abstractions, models, and algorithms that can eventually form the basis of a new class of future data analysis workflow infrastructures.

Involvement: I am part of a subproject on "Scheduling and Adaptive Execution of Data Analysis Workflows across Heterogeneous Infrastructures" (between the groups of Prof. Odej Kao at TU Berlin and Prof. Henning Meyerhenke at HU Berlin). I am further part of the Thesis Advisory Committee of three PhD students (primarily supervised by Prof. Ulf Leser at HU Berlin and Prof. Odej Kao at TU Berlin). I was also a guest professor in the Department of Computer Science at HU Berlin, substituting for Prof. Ulf Leser in the context of the project in the winter semester of 2021/2022.

Consortium: The project involves 19 research groups from eight research organizations, mostly located in Berlin and Brandenburg. It is coordinated by Humboldt-Universität zu Berlin and includes Technische Universität Berlin, Freie Universität Berlin, Universität Potsdam, Zuse-Institut Berlin, Heinrich-Hertz-Institut Berlin, and Charité – Berlin University of Medicine, among other institutions.

Link: Project website

Completed Projects

I helped develop and conduct the following funded research projects in the past:

ide3a: International Alliance for Digital e-learning, e-mobility and e-research in Academia (2020-2023, funded by the German Academic Exchange Service - DAAD)

Aim: The project developed new tools for the digitalization and internationalization of teaching and learning, with a particular focus on interdisciplinary research-based learning and short-term mobility in international blended learning formats. It further conducted teaching and research on the digitalization of critical infrastructures like water networks, energy grids, sensor networks, and other interconnected urban systems.

Involvement: As one of the four core partners of the two PIs, I helped teach distributed IoT systems, sensor networks, and urban infrastructure management. I co-led an international winter school on "Smart Sensing" and lectured on "Distributed IoT Systems". I also worked closely with the project's RAs and contributed to research on (co-)simulation and testing of urban infrastructure systems.

Consortium: The project involved a multidisciplinary and international consortium of five European partner universities, led by Technische Universität Berlin and including the Norwegian University of Science and Technology, Politecnico di Milano, Cracow University of Technology, and Dublin City University.

Link: Project website

BIFOLD: The Berlin Institute for the Foundations of Learning and Data (2020-2022, funded by the German Federal Ministry of Education and Research - BMBF)

Aim: The research center conducts foundational research addressing challenges in artificial intelligence and data science, with a particular focus on data management, scalable data processing, and machine learning as well as their intersection. It further aims to educate future talent and create high-impact knowledge exchange.

Involvement: I was part of a subproject on "Adaptive Monitoring and Fault Tolerance for Distributed Analytics Pipelines in Critical IoT-Sensor Systems", investigating new methods for the adaptive use of heterogeneous, distributed resources for the efficient processing of sensor data streams with Prof. Odej Kao and one RA at TU Berlin. The goal was to capture the key characteristics of constantly changing computing environments and workloads and have distributed processing systems adapt appropriately to major changes, so the required quality of service can be provided as far as possible and even in case of failures.

Consortium: The research center is managed by the Technische Universität Berlin and involves researchers from numerous organizations in the Berlin/Brandenburg area of Germany, including Freie Universität Berlin and the German Research Center for Artificial Intelligence.

Link: Project website

OPTIMA: Data Integration for Predictive Control and Operational Optimization of Pumps in Wastewater Systems (2019-2022, funded by the European Regional Development Fund - ERDF)

Aim: The goal of the joint project was to integrate data from various sources (distributed sensors, weather forecasts, historic operational data) to operate wastewater pumps energy-efficiently in normal operation and reduce sewage overflow volumes in extraordinary heavy-load events. For this, the project developed a predictive control system for wastewater pumps that anticipates loads and learns pump-specific behavior.

Involvement: I led a subproject with an RA and a student assistant on data integration and platform development, providing a central reliable system between sensors, forecasting components, and the pump control system. This central data analysis platform was made up of scalable and reliable messaging, stream processing, and data management systems.

Consortium: The project involved a multidisciplinary consortium with two research groups from Technische Universität Berlin, the engineering office Ingenieurgesellschaft Prof. Dr. Sieker mbH, the applied research institute Fraunhofer FOKUS, and the water utility Berliner Wasserbetriebe (as associated partner).

Link: Key publication

Telemed5000: Artificial Intelligence to Improve Remote Patient Management for Heart Failure Patients (2019-2022, funded by the German Federal Ministry for Economic Affairs and Energy - BMWi)

Aim: The aim of the joint project was to develop an intelligent system for the telemedical co-management of several thousand cardiological risk patients.

Involvement: I was an associated partner, working closely with the Operating Systems and Middleware group of Prof. Andreas Polze at Hasso Plattner Institute on the project's core decision support system and machine learning application. The system estimates a risk score from a patient's vital parameters and uses it to sort all cases every day, helping practitioners focus their limited capacity on the most severe cases.

Consortium: The project involved a multidisciplinary consortium, led by Charité – Berlin University of Medicine, and included researchers from several research institutes (Hasso Plattner Institute, Austrian Institute of Technology, Fraunhofer IAIS), enterprises (GETEMED, SYNIOS), and associated partners (Technische Universität Berlin, Deutsches Herzzentrum Berlin).

Link: Project website

WaterGridSense: Condition Monitoring and Prognosis in Water and Wastewater Networks Using Distributed Sensors (2018-2021, funded by the German Federal Ministry of Education and Research - BMBF)

Aim: The project developed a small, self-sufficient, and configurable sensor platform for use in rainwater and wastewater systems. The sensor platform connected wirelessly to LoRaWAN gateways and provided time series data on the condition of the wastewater network. Using scalable and reliable systems, the sensor data was aggregated and analyzed to present status information in a utility's operations and maintenance platform.

Involvement: I led a subproject on "Sensor Data Analysis (Platform, Status Monitoring, and Prediction)" with two RAs and student assistants, developing a novel sensor streaming data analysis platform and a set of monitoring applications. The platform continuously receives raw sensor data from distributed sensors and reliably processes it for display in a monitoring platform for end users. It supports use cases such as detecting sewage overflow events, predicting clogged street drains, and identifying infiltration from unforeseen connections to wastewater systems based on sensor data streams from stationary and floating sensor platforms.

Consortium: The project was a collaborative effort between water utilities (Hamburg Wasser, Berliner Wasserbetriebe), industry (Ingenieurgesellschaft Prof. Dr. Sieker, Walter Tecyard, Funke Kunststoffe, ACO Severin Ahlmann), and research institutions (Technische Universität Berlin, HAW Hamburg).

Link: Project website

Reliable Resource Management of Critical Distributed IoT Sensor Systems (2018-2019, sponsored by the Bundesdruckerei Group)

Aim: The goal of the project was to conceive, implement, and evaluate new approaches for the adaptive and dynamic management of distributed and heterogeneous yet trustworthy resources in complex sensor systems.

Involvement: We developed and demonstrated a novel method for obtaining a hardware fingerprint from an analog sensor that can be used for sensor authentication and hardware integrity checks. The approach exploits the characteristic behavior of analog circuits, which is revealed by applying a fixed-frequency alternating current to a sensor while recording its output voltage. By comparing such a fingerprint against reference measurements recorded prior to a sensor's deployment, it can be determined whether sensing hardware connected to an IoT device has been changed by environmental effects or with malicious intent.

Consortium: This was a collaborative research project between the Innovation Department of the Bundesdruckerei Group and the Complex and Distributed Systems group at Technische Universität Berlin, with an interdisciplinary team that included three experts from the industry partner and three RAs at TU Berlin.

Link: Key publication

Other Previous Projects

I contributed to the following funded research projects as a research assistant:

BBDC: Berlin Big Data Center (2014-2017, funded by the German Federal Ministry of Education and Research - BMBF)

Aim: The project's aim was to develop highly innovative technologies to organize vast amounts of data and to derive informed decisions from such data in order to create economic and social value, with a focus on three exemplary economically, scientifically, and socially relevant application areas: materials science, medicine, and information marketplaces.

Involvement: I contributed to a subproject on "Adaptive Data and Control Flows", aiming to develop an execution engine for distributed data processing that recognizes key characteristics of underlying architectures (CPU, network topology, memory distribution) to implement data partitioning and resource allocation so that jobs stay within specified time bounds. For this, we developed new methods that adaptively place data blocks and execution containers for recurring distributed dataflow jobs to balance local data access with distributed job execution.

Consortium: The competence center was managed by the Technische Universität Berlin and involved numerous groups from research organizations in the Berlin/Brandenburg area of Germany, including Freie Universität Berlin, Humboldt-Universität zu Berlin, Zuse-Institut Berlin, German Research Center for Artificial Intelligence, Fraunhofer FOKUS, and Charité – Berlin University of Medicine, among others.

Link: The project website was discontinued when the project evolved into BIFOLD.

Stratosphere II: Advanced Analytics for Big Data (2014-2017, funded by the German Research Foundation - DFG)

Aim: In its second phase of funding, the Research Unit Stratosphere (FOR 1306) set out to considerably advance the state of the art in designing and building systems for executing complex data analysis programs on massive datasets, building upon the Stratosphere I system (which was also used as the basis for the successful open-source dataflow system Apache Flink) with its runtime system, compiler, optimizer, and scripting languages.

Involvement: I was the RA in a subproject aiming to develop a "Scalable, Massively-Parallel Runtime System with Predictable Performance", investigating new approaches for dynamic resource allocation for distributed dataflow systems. In particular, we developed novel methods for capturing the scale-out behavior of distributed dataflow jobs, selecting similar previous executions of a job as a basis for accurate performance models, and continuously adjusting resource allocations based on scale-out models to ensure jobs meet their runtime targets.

Consortium: The project involved a consortium of five research groups at Technische Universität Berlin, Hasso Plattner Institute (University of Potsdam), and Humboldt-Universität zu Berlin.

Link: Project website

HPI-Stanford Design Thinking Research Program (2013-2014, funded by the 'Hasso-Plattner-Förderstiftung' (Hasso Plattner Trust))

Aim: The program strives to apply rigorous academic methods to understand how and why Design Thinking innovation works and fails. Researchers in the program study the complex interaction between members of multi-disciplinary teams challenged to deliver design innovations. An important feature of the domain is the need for creative collaboration across spatial, temporal, and cultural boundaries. In the context of disciplinary diversity, it is investigated how Design Thinking meshes with traditional engineering and management approaches, specifically, why the structure of successful design teams differs substantially from traditional corporate team structures.

Involvement: I was a student research assistant in the 2013/2014 subproject on "From Problem Prevention to Graceful Recovery: Recovery Tools As Enabler For Trial and Error in Program Design", working on object versioning as a generic approach to preserve access to previous development and application states in explorative programming systems. In close collaboration with SAP Labs, we implemented a working prototype of the approach on top of JavaScript/ECMAScript 6 for the Lively Kernel, a dynamic programming system for the Web designed by Dan Ingalls.

Consortium: There were 14 subprojects (seven at Hasso Plattner Institute, seven at Stanford University) in 2013/2014. I was part of the subproject and team of Prof. Robert Hirschfeld at Hasso Plattner Institute.

Link: Project website