Thesis to Tangible: How MPI-SWS fellowship Shaped by my Undergraduate Research

3 minute read

Published:

During the early fall of 2023, from July to October, I had the privilege of serving as a Research Fellow in Dr. Laurent Bindschaedler’s Data System research group at Max Planck Institute for Software Systems in Saarbrücken, Germany. It was an immersive and enlightening experience, involving work on cutting-edge research challenges alongside some of the most distinguished researchers and PhD students, and also making some amazing friends from across the globe. In this blog , I will revisit the most memorable aspects of this extraordinary journey.

this is pic of max planck institute

MPI was generous enough to pay for my flights as well as my accomodation there; along with my travel from Frankfurt to Saarbrucken by train (ICE travel is not cheap!); and then from the station to the university by Taxi .However, a drawback was the limited English speakers in Saarbruecken, as it isn’t exactly a ‘tourist’ city. In the photo you could see Campus E1-5 which is MPI-SWS. There is something really refreshing about the work culture, The use of glass-walled rooms and an ‘open-door’ policy encourage interaction amongst the Ph.D. students,Interns and Professors. People you pass will inevitably wish you everyday. Everyone go for lunch together, and talk about almost everything over lunch, and later coffee.

this is pic of max planck institute

In the second week of August, Max-Planck Institute for Software Systems (Germany), Cornell University (U.S.A), and University of Maryland (U.S.A) arranges a week-long pre-doctoral research school called CMMRS. This is a flag ship event by Max Planck Society, that specifically targets senior year undergraduates and Master students encouraging them to pursue Ph.D. and research career. During this summer school I learned about advanced concepts of algorithmic decision making with a focus on social questions like privacy and security from Yixin Zou. It was an absolute honor to be selected for such prestigious event and discuss my research interests with leaders from industry and academia.

this is pic of max planck institute

Glimpse of Data-workflow.

In this section,I will give a small walk-through of my on-going Satellite-Data-Augmentation project. This project is incubated by Data-system research group in MPI-SWS under the supervision of Dr. Laurent Bindschaedler.

this is pic of max planck institute

Above picture depicts an unstructed text data files, which includes tweets indexed according the dates. We scraped the tweets using third-party tool which takes a dates and keyword as a input.

this is pic of max planck institute this is pic of max planck institute this is pic of max planck institute

From the unstructured text files we have been successfull in extracting the information of interest like city-state-country,audio,video, images and geo-tags from the tweet’s body and its metadata. I personally had overseen ETL process, we employed OpenAPI spec and pydantic model to extract text data in correct format.

this only one example of data source, telegram channels is also used to extract data, you can play around with telegram code, repo is linked below.

this is pic of max planck institute

Satellite imagery of recent Wildfires in Hawaii: The blue dots 🔵 indicate the geolocation of tweets which is Maui and Lahaina. while the red dots 🔴 indicate thermal imagery and geographical presence of wildfire. The color codes are used to clearly differentiate human-settlement and natural calamity, for this experiment we have used data from sentinel-hub api.

System Performance.

As for the user interface part, we build a mobile and desktop responsive facebook’s messanger alike UI which can be navigated with greater ease on phones than other applications in public domain. The inference engine of the system which is build using in-house LLMA2 large language model, has shown remarkable capability to analyze data, even more it can perform machine learning tasks like regression analysis on data. Just like chatGPT or BardAI user can can prompt question, will get output explainable in human language with extra support features like graphs,maps and pictures.

this is pic of max planck institute
this is pic of max planck institute


this is pic of max planck institute

NOTE: This is an ongoing project and it will be opensourced in April 2024 meanwhile you can clone the initial github repo from here

MPI’s soccer team is always Bundesliga ready! 😉

this is pic of max planck institute