starting point:
A growing number of applications make use of large amounts of data. This includes scientific data processing in research organizations as well as big data applications in industry. At the same time, more and more data-intensive applications emerge in the context of the distributed sensors and devices of the Internet of Things.
We believe there is immense potential in using data-driven methods and machine learning across domains and disciplines. However, if we are not careful, this development will further increase computing's environmental footprint. This footprint is already estimated at 2 to 3 percent of global carbon emissions, rivaling that of aviation, and is projected to rise dramatically over the coming decades, not least because of the trend towards more data-intensive applications.
research agenda:
The central aim of our research is to reduce the carbon footprint of large-scale data processing applications on today's diverse computing infrastructures.
Towards this goal, we work on methods, systems, and tools that support a more resource-efficient and sustainable use of modern distributed computing infrastructures (including, for instance, edge and cloud resources) for data-intensive applications (including, for example, machine learning, data analytics, and stream processing), while still taking other requirements (such as an application's performance and dependability) into account.
Our guiding principles for this are:
- compute when and where green energy is available (which can, for instance, translate to carbon-aware scheduling, scaling, and resource allocation, as in the sketch after this list) as the carbon intensity of power grids and the output of renewable energy sources vary over time and between locations
- allocate resources for high resource utilization and highly utilize allocated resources (which can, for example, translate to "right-sizing", server consolidation, co-location of jobs with complementary resource demands, and bottleneck mitigation) as energy is wasted whenever allocated resources mostly idle
- save computation and communication through distributed and dynamic architectures (which can, for instance, translate to edge computing, adaptively offloading tasks and scaling out to more nodes only when necessary, effective caching, or distributed learning) as it is often possible in complex software systems to achieve the same results with less work
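To make the first principle concrete, here is a minimal sketch of carbon-aware temporal shifting for a delay-tolerant batch job: given an hourly carbon intensity forecast, it picks the start time with the lowest average carbon intensity that still meets the job's deadline. The forecast values and the function best_start_time are illustrative assumptions, not part of our tools.

```python
from datetime import datetime, timedelta

# Hypothetical hourly grid carbon intensity forecast in gCO2e/kWh;
# in practice, such data would come from a grid operator or a data service.
forecast = {
    datetime(2024, 6, 1, hour): intensity
    for hour, intensity in enumerate(
        [300, 280, 260, 240, 180, 120, 90, 80, 110, 170, 230, 290]
    )
}

def best_start_time(forecast, job_duration_h, deadline):
    """Return the start hour with the lowest average carbon intensity
    over the job's runtime such that the job finishes before the deadline."""
    candidates = [
        start for start in sorted(forecast)
        if start + timedelta(hours=job_duration_h) <= deadline
        and all(start + timedelta(hours=i) in forecast
                for i in range(job_duration_h))
    ]

    def avg_intensity(start):
        return sum(
            forecast[start + timedelta(hours=i)] for i in range(job_duration_h)
        ) / job_duration_h

    return min(candidates, key=avg_intensity)

# A 3-hour batch job that must finish by noon is shifted into the
# lowest-carbon window of the forecast.
start = best_start_time(forecast, job_duration_h=3,
                        deadline=datetime(2024, 6, 1, 12))
print(f"Greenest start time within the deadline: {start}")
```

In practice, forecasts would come from grid operators or dedicated data services, and the same idea extends to shifting work between regions and to pausing interruptible jobs.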
Much of the work builds upon years of research on adaptive resource management for data-intensive applications (e.g. for big data analytics, scientific workflows, and machine learning) at and with TU Berlin.
On this basis, we currently focus on investigating and making use of:
- profiling and predicting application and infrastructure performance as well as power consumption
- forecasting of computational loads, carbon emissions, and the availability of renewable energy
- carbon-optimized resource allocation, dynamic scheduling and scaling, and automatic system tuning
Another focus is on techniques and tools that support the monitoring, testing, and benchmarking of the behavior and impact of data-intensive applications.
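As a back-of-the-envelope illustration of the accounting behind such profiling and monitoring, the following sketch estimates the operational carbon footprint of a single job from its average power draw, runtime, the data center's power usage effectiveness (PUE), and the grid's carbon intensity; the numbers and the function name job_carbon_footprint are illustrative assumptions.

```python
def job_carbon_footprint(avg_power_w, runtime_h, pue, carbon_intensity_g_per_kwh):
    """Estimate operational emissions of a single job.

    avg_power_w: average power draw of the allocated resources in watts
    runtime_h: job runtime in hours
    pue: power usage effectiveness of the data center (overhead factor)
    carbon_intensity_g_per_kwh: grid carbon intensity in gCO2e per kWh
    """
    energy_kwh = avg_power_w * runtime_h / 1000   # IT energy in kWh
    total_energy_kwh = energy_kwh * pue           # include facility overhead
    return total_energy_kwh * carbon_intensity_g_per_kwh  # gCO2e

# Illustrative numbers: a 4-hour job drawing 350 W on average, in a data
# center with a PUE of 1.4, on a grid at 250 gCO2e/kWh.
print(f"{job_carbon_footprint(350, 4, 1.4, 250):.0f} gCO2e")
```

Real measurements would come from energy meters or software power models, and the grid's carbon intensity varies over time, which is exactly what the carbon-aware methods above exploit.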
selected results:
- FedZero: Leveraging Renewable Excess Energy in Federated Learning. 15th ACM International Conference on Future Energy Systems (e-Energy'24).
- Our implementation in Flower and the code for our experiments are available on GitHub
- Lotaru: Locally Predicting Workflow Task Runtimes for Resource Management on Heterogeneous Infrastructures. Future Generation Computer Systems 150. 2024.
- Extended version of the work first presented at SSDBM'22, demonstrating heterogeneous cluster scheduling, cloud cost estimation, and carbon-aware scheduling based on predicted runtimes
- Karasu: A Collaborative Approach to Efficient Cluster Configuration for Big Data Analytics. 42nd IEEE International Performance Computing and Communications Conference (IPCCC'23).
- Our prototype implementation in Python with PyTorch and BoTorch is available on GitHub
- Cucumber: Renewable-Aware Admission Control for Delay-Tolerant Cloud and Edge Workloads. 28th International European Conference on Parallel and Distributed Computing (Euro-Par'22).
- Phoebe: QoS-Aware Distributed Stream Processing through Anticipating Dynamic Workloads. 20th IEEE International Conference on Web Services (ICWS'22).
- Let's Wait Awhile: How Temporal Workload Shifting Can Reduce Carbon Emissions in the Cloud. 22nd ACM/IFIP International Middleware Conference (Middleware'21).
- Towards a Staging Environment for the Internet of Things. 19th IEEE International Conference on Pervasive Computing and Communications (PerCom'21).
- LEAF: Simulating Large Energy-Aware Fog Computing Environments. 5th IEEE International Conference on Fog and Edge Computing (ICFEC'21).
- Our simulator implementations (Python, Java) have been used in research and teaching at several universities (e.g. TU Berlin, UofG, Edinburgh Napier, PoliMi, UCLM, IIT Guwahati, IIT Patna, and SJTU).
Several of these works are featured in my talk in our Low-Carbon and Sustainable Computing seminar series; a recording of the talk is available online.
involved students:
- PhD students:
- L4/L5 project students:
- 2024/2025: Andres La Riva Perez, Danial Tariq, Isabella Gard, Magnus Reid, Matthew Waters
- 2023/2024: Andres La Riva Perez, James Nurdin, John Wilson, Karl Hartmann, Ricky Arthurs
- 2022/2023: James Sharma, Ricky Arthurs, Rishabh Mathur
- interns:
- 2023/2024: Frederik Glitzner, Hubert Dymarkowski
- 2022/2023: Domonkos Revesz, Niovi Lampiri
visiting researchers:
academic collaborations:
industry collaborations:
research infrastructure:
For our research, we have access to a CPU cluster, a GPU cluster, and a variety of individual devices (e.g. different server architectures, IoT systems, energy meters) in Glasgow. In addition, we are regularly able to make use of other compute infrastructure (e.g. public/private cloud services) through grants and collaborations.