If you wish to propose your own topic in a related field, please feel free to do so and contact us!
Open theses topics for the 2023/2024 study year
List of Supervisors
Edge-Cloud Computing Continuum
Cloud Cost Visibility: A Single-Window Solution (M, B)
Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, Oracle Cloud, and IBM Cloud are among the major players in the cloud service provider market. They offer a wide spectrum of services to meet the diverse requirements and use cases of their clients. However, businesses often deploy their solutions in a multi-cloud environment, where some services are hosted on one cloud platform while others are deployed on another. In such scenarios, obtaining an overall picture of cost utilization becomes challenging from the client’s perspective.
This thesis focuses on building a single-window solution that provides comprehensive visibility into overall cloud cost utilization across multiple cloud providers. To achieve this, it is essential for the student to be familiar with cloud services, concepts, and terminologies. The student will investigate the cost models of each cloud provider, design a common cost model, and transform the cloud-specific cost models into a unified framework.
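As a rough illustration of what such a unified framework could look like, the sketch below normalises provider-specific cost records into a common schema and aggregates them by category. All service names, categories, and figures are illustrative assumptions, not the providers' actual billing APIs:

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class CostRecord:
    provider: str   # e.g. "aws", "azure"
    service: str    # provider-specific service name
    category: str   # unified category: "compute", "storage", ...
    usd: float      # cost normalised to USD

# Hypothetical mapping from provider-specific services to unified categories.
CATEGORY_MAP = {
    ("aws", "AmazonEC2"): "compute",
    ("aws", "AmazonS3"): "storage",
    ("azure", "Virtual Machines"): "compute",
    ("azure", "Blob Storage"): "storage",
}

def normalise(provider: str, service: str, usd: float) -> CostRecord:
    """Transform one provider-specific cost line into the common model."""
    category = CATEGORY_MAP.get((provider, service), "other")
    return CostRecord(provider, service, category, usd)

def cost_by_category(records):
    """Aggregate unified records into a single cross-cloud cost view."""
    totals = defaultdict(float)
    for r in records:
        totals[r.category] += r.usd
    return dict(totals)
```

For example, EC2 and Azure VM charges would both land in a single "compute" total, which is the kind of cross-provider roll-up the single-window view needs.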
[dehury@ut.ee]
Optimizing Cost of Cloud Service Deployment (M)
Business X aims to develop and deploy a small-scale web application on the cloud, consisting of various modules such as a Flask application, database, API gateway, or serverless functions. These modules can be deployed in different environments, including managed Kubernetes, serverless platforms, on-demand virtual machines (VMs), reserved VMs, or on-premise infrastructure. The cost of deployment varies depending on the chosen environment. This thesis will analyze the cost of each deployment option and propose a strategy to optimize the overall cost for deploying the entire web application.
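In its most naive form, the optimisation can be sketched as picking the cheapest environment per module. The cost figures below are invented for illustration, and a real strategy must also account for cross-environment constraints such as data transfer fees and module compatibility:

```python
# Hypothetical per-month cost (USD) of running each module in each environment.
COST = {
    "flask_app":   {"kubernetes": 40.0, "serverless": 25.0, "reserved_vm": 30.0},
    "database":    {"kubernetes": 60.0, "reserved_vm": 45.0},
    "api_gateway": {"serverless": 5.0, "kubernetes": 12.0},
}

def cheapest_plan(cost_table):
    """Independently pick the cheapest environment for each module."""
    plan = {}
    for module, options in cost_table.items():
        env = min(options, key=options.get)
        plan[module] = (env, options[env])
    return plan
```

Treating modules independently is exactly the simplification the thesis would go beyond, since the optimal joint placement may differ once inter-module traffic costs are priced in.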
[dehury@ut.ee]
From Data to Decisions: Knowledge Discoverability in Edge Infrastructure (M)
Knowledge/intelligence discoverability in edge computing infrastructure refers to the capacity of edge devices to autonomously discern and act upon patterns and insights derived from localized data processing. By situating computation closer to data sources, such as IoT devices, edge computing facilitates real-time analytics, mitigating the latency inherent in centralized cloud systems. This immediate processing empowers edge devices to make informed, autonomous decisions based on the knowledge they extract. For instance, a smart traffic management system at an intersection can leverage edge computing to analyze traffic patterns in real time and autonomously adjust signal timings to optimize flow, without relaying data to a central server. Over time, these devices not only accumulate domain-specific knowledge but can also collaboratively share insights with other edge nodes or central systems, enhancing the collective intelligence of the entire network. Thus, edge infrastructures transition from mere data processing nodes to pivotal hubs of dynamic knowledge and adaptability. This thesis will start by evaluating the current discoverability protocols and their applicability in edge infrastructure.
[dehury@ut.ee]
Deep Reinforcement Learning in Optimising Kubernetes Workload Controllers (M)
This thesis delves into the complex dynamics of cloud computing clusters leveraging container technologies, focusing in particular on Kubernetes and its underlying container engine (containerd/Docker). As modern cloud infrastructures gravitate towards container orchestration, Kubernetes and its components, including the cluster autoscaler and workload controllers, have gained paramount importance. While the existing system provides a foundational structure, this thesis focuses on the integration of Deep Reinforcement Learning (DRL) to enhance the adaptability and efficiency of Kubernetes workload controllers. By applying DRL, we aim to achieve an intelligent, self-adjusting system that can autonomously optimize container resource allocation and workload distribution, ensuring cost-effective scalability and heightened system performance.
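To illustrate the reinforcement learning loop such a controller builds on, here is a toy tabular Q-learning sketch, a deliberately simplified stand-in for the deep RL agent. The state space (load level, replica count), the reward trading off under-provisioning against replica cost, and all constants are invented for illustration:

```python
import random

ACTIONS = (-1, 0, 1)  # remove a replica, keep, add a replica

def reward(load: int, replicas: int) -> float:
    # Penalise both under-provisioning (latency) and over-provisioning (cost).
    return -abs(load - replicas) - 0.1 * replicas

def q_update(q, state, action, r, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning bootstrap update on a dict-backed Q table."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)

def train(episodes=3000, horizon=10, seed=0):
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        load = rng.randint(1, 5)      # load stays fixed within an episode
        replicas = rng.randint(1, 5)
        for _ in range(horizon):
            state = (load, replicas)
            action = rng.choice(ACTIONS)  # pure exploration, for brevity
            replicas = min(5, max(1, replicas + action))
            q_update(q, state, action, reward(load, replicas), (load, replicas))
    return q

def policy(q, state):
    """Greedy scaling decision learned for a given (load, replicas) state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), float("-inf")))
```

A DRL agent replaces the Q table with a neural network so it can handle the much larger, continuous state space of a real cluster (CPU, memory, request rates, pod counts), but the decision loop is the same.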
[dehury@ut.ee]
SecondMe: Your Digital Twin (M)
Imagine a situation where you’ve forgotten the details of your last trip to the US. You want to recall your experiences from that trip: the places you visited, the restaurants where you dined, and the difficulties you faced. You might want to remember exactly what you did on the same day last year or revisit the events you attended in a specific area two years ago. Perhaps you want to recall the people you met at a particular event a year back or stay informed about events happening at the same institutes. All this information might be scattered across emails, calendars, and memories.
In the modern age of information overload, it’s easy to forget past experiences or upcoming events of interest. Now, imagine having a digital version of yourself, let’s call it SecondMe, that assists you at every step of your life. SecondMe sees everything you do, tracks where you visit, logs whom you meet, and keeps a record of all your emails. Whenever you need information, you can simply ask SecondMe for detailed insights.
This isn’t just another social network or professional application. It’s an application that represents you, helping you remember and manage your life seamlessly.
In this thesis, you will implement the initial architecture, including the database, the mechanisms for capturing data (from email, photos, location history, etc.), and the choice of implementation language. Designing the front end is currently not a high priority and is left to the student's discretion.
[dehury@ut.ee]
More Topics can be found in this Google Doc: https://docs.google.com/document/d/14SPgOTF6f8nw8bpxl-3RMbDxwELranK1_bYBK2jzoy8/edit?usp=sharing
Contact: chinmaya.dehury@ut.ee, Delta building r3040
Internet of Things topics
Synthetic IoT data generator for large scale IoT Device simulation (M) (Pelle Jakovits)
The student should evaluate extending an open-source tool for generating realistic IoT data based on existing captured data traces and "playing it back" to simulate real data in an IoT network. The goal is to design and create a solution which processes existing data traces and can generate a similarly behaving data stream at high volume and frequency. It should also support customizing the generated data stream, including its volume and frequency, its structure (such as the ratios between different types of sub-streams), and randomization of certain record fields.
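The core "play back" idea can be sketched in a few lines. The parameter names below are illustrative; a real tool would also need scheduling against wall-clock time and pluggable per-field randomisers:

```python
import random

def replay(trace, rate=1.0, copies=1, jitter=0.0, seed=0):
    """Replay a recorded IoT trace as a synthetic stream.

    trace  -- list of (timestamp, value) pairs from a captured data set
    rate   -- time compression factor (2.0 emits events twice as fast)
    copies -- volume multiplier: number of simulated devices per record
    jitter -- relative random noise applied to each emitted value
    """
    rng = random.Random(seed)
    for ts, value in trace:
        for device in range(copies):
            noisy = value * (1 + rng.uniform(-jitter, jitter))
            yield (ts / rate, f"device-{device}", noisy)
```

With `copies` and `jitter` a single captured trace can be fanned out into many distinct, plausible device streams, which is the key to scaling a small capture into a large simulation.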
Reliability and performance of Open Source IoT platforms (B, M) (Pelle Jakovits)
The goal of this topic is to analyse the reliability and performance of open-source IoT platforms, focusing on solutions that support large use cases (e.g., not home automation, but rather Smart City deployments with devices spread over large geographical areas). The main aspects to focus on are performance, stability, and the scope of features (e.g., device integration, integration with external services, data processing and analytics, extensibility). The main questions to answer are:
- Which open-source solutions on the market are the most suitable for supporting large Smart City use cases?
- How easy is it to integrate new IoT devices?
- Are open-source IoT frameworks mature enough for production systems, and are they feasible alternatives to cloud-managed, subscription-based IoT platforms?
[Already Taken] Solution for sharing and publishing smart city open data for Tartu (B, M) (Pelle Jakovits)
More than a TB of data has been collected by the Tartu Smart City data repository in the Cumulocity IoT platform. A previous thesis developed tool(s) for collecting and publishing smart city data sets from the Cumulocity platform. The goal of this thesis is to further develop the approach and deploy a working solution that is able to handle the sharing and publishing of large amounts of Smart City data. The solution should be able to handle data that is continuously updated by sensors and other devices and systems deployed in the city, and give the data owner tools and means to control how the data is made available.
Predictive maintenance of IoT devices (M) (Pelle Jakovits)
The goal of this topic is to investigate the existing approaches and solutions for predictive maintenance. Study what data must be collected about the environment and failures. Design and test a solution for one IoT use case. For example, predicting the failures of the SD cards in IoT devices. You will set up IoT devices, generate synthetic failure workloads to collect data, and use machine learning techniques to predict future failures.
Root Cause Analysis of IoT device network failures (B, M) (Pelle Jakovits)
The goal is to find patterns in data that describe problematic IoT devices. The initial use case is water metering devices that send data packets and have a data transmission credit, which represents the quality of data transmission connectivity. The presence of a problem is indicated by a continuous decrease in the data transmission credit, or by the credit reaching zero. Some of the questions to investigate are: Based on all the data points, what is the pattern of a problematic device? What parameters define the problematic devices?
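The simplest possible detector for the credit signal might look like the sketch below, a heuristic starting point that the thesis would replace with patterns learned over all device parameters. The window size and labels are illustrative:

```python
def flag_problem_devices(credit_series, window=5):
    """Flag devices whose transmission credit hits zero or declines steadily.

    credit_series -- dict mapping device_id to a list of credit readings
                     ordered by time.
    """
    flagged = {}
    for device, readings in credit_series.items():
        recent = readings[-window:]
        hit_zero = any(c == 0 for c in recent)
        # Strictly decreasing credit over the whole recent window.
        decreasing = len(recent) >= 2 and all(
            a > b for a, b in zip(recent, recent[1:]))
        if hit_zero or decreasing:
            flagged[device] = "zero credit" if hit_zero else "declining credit"
    return flagged
```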
Automated meta-data generator and clustering for IoT data sources (B, M) (Pelle Jakovits)
Extending the work done in a previous thesis, "Knowledge Graphs for Cataloging and Making Sense of Smart City Data", the goal of this thesis is to design and test an approach that would automatically classify IoT data series into the correct types based on their values, and generate metadata describing each data stream from those values and additional textual information (e.g., IoT device names and descriptions, units, location, etc.).
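A value-based classifier could start from crude heuristics like the sketch below. The type labels and thresholds are invented for illustration; the thesis would derive them from the actual Smart City data and the knowledge graph:

```python
def classify_series(values):
    """Guess a series type from its values alone (illustrative heuristics)."""
    distinct = set(values)
    if distinct <= {0, 1}:
        return "binary_state"      # e.g. door open/closed, device on/off
    if (all(isinstance(v, int) for v in values)
            and all(a <= b for a, b in zip(values, values[1:]))):
        return "counter"           # monotonically increasing, e.g. pulse count
    if all(-40 <= v <= 60 for v in values):
        return "temperature_like"  # plausible outdoor temperature range
    return "unknown"
```

In practice such rules would be complemented (or replaced) by clustering series with similar statistical fingerprints and by matching the accompanying textual metadata.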
Automated time-series data quality monitoring and enhancement for Smart City data (M) (Pelle Jakovits)
Extending the work done in a previous thesis ("Anomaly Detection and Imputation for Tartu Traffic Sensors"), the goal of this thesis is to design, test, and deploy a software framework that automatically collects data, builds a data enhancement (cleaning & imputation) model, and applies the model to enhance the IoT data stream in real time (or in micro-batches).
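One micro-batch enhancement step could be sketched as follows: rolling-median outlier flagging followed by linear imputation. The window size and threshold are placeholder values, and the actual model choice is part of the thesis:

```python
from statistics import median

def clean_batch(values, threshold=3.0, window=5):
    """Flag outliers against a rolling median, then impute flagged/missing
    points (None) by interpolating between the nearest valid neighbours."""
    cleaned = list(values)
    # Pass 1: mark anomalies far from the median of the surrounding window.
    for i, v in enumerate(values):
        if v is None:
            continue
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbours = [x for j, x in enumerate(values[lo:hi], lo)
                      if x is not None and j != i]
        if neighbours and abs(v - median(neighbours)) > threshold:
            cleaned[i] = None
    # Pass 2: linear interpolation over the gaps.
    for i, v in enumerate(cleaned):
        if v is None:
            left = next((j for j in range(i - 1, -1, -1)
                         if cleaned[j] is not None), None)
            right = next((j for j in range(i + 1, len(cleaned))
                          if cleaned[j] is not None), None)
            if left is not None and right is not None:
                frac = (i - left) / (right - left)
                cleaned[i] = cleaned[left] + frac * (cleaned[right] - cleaned[left])
    return cleaned
```

The deployed framework would wrap a step like this in a continuous pipeline: fit thresholds per sensor from historical data, then apply them to each incoming micro-batch.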
Automating the configuration and deployment of data migration pipelines (B, M) (Pelle Jakovits)
The goal of this topic is to design, test, and validate a solution that can configure and deploy data migration pipelines for continuous data migration between different database systems, for example, from an IoT database to an SQL database or a data warehouse. While there exist many open-source solutions (e.g., NiFi, Dagster, Airflow) that can orchestrate data pipelines and support a large number of data sources, data format transformation steps, and other intermediate data manipulation actions, developing new pipelines still requires significant effort in development, configuration, and validation. The goal of this thesis is to investigate how to reduce the manual effort needed to develop, configure, and validate additional pipelines by automating as much of the process as possible.
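For example, a small declarative spec could be compiled into an executable pipeline, so that adding a new migration mostly means writing configuration rather than code. The spec format and operations below are invented for illustration; in practice, this would generate NiFi/Dagster/Airflow primitives instead:

```python
# Hypothetical declarative spec from which a migration pipeline is generated.
SPEC = {
    "source": {"type": "iot_db", "measurement": "temperature"},
    "sink": {"type": "sql", "table": "temperature_readings"},
    "transform": [
        {"op": "rename", "from": "t", "to": "celsius"},
        {"op": "scale", "field": "celsius", "factor": 0.1},
    ],
}

def build_pipeline(spec):
    """Compile the transform section of a spec into a record -> record function."""
    steps = []
    for t in spec.get("transform", []):
        if t["op"] == "rename":
            steps.append(lambda r, t=t: {
                **{k: v for k, v in r.items() if k != t["from"]},
                t["to"]: r[t["from"]],
            })
        elif t["op"] == "scale":
            steps.append(lambda r, t=t: {**r, t["field"]: r[t["field"]] * t["factor"]})
    def pipeline(record):
        for step in steps:
            record = step(record)
        return record
    return pipeline
```

The thesis would then focus on generating such specs (and their validation checks) automatically from source and sink schemas.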