If you wish to propose your own topic in a related field, please feel free to do so and contact us!

Open thesis topics for the 2025/2026 academic year


Synthetic IoT data generator for large-scale IoT device simulation (M) (Pelle Jakovits)

The student should evaluate extending an open-source tool that generates realistic IoT data from existing captured data traces and “plays it back” to simulate real data in an IoT network. The goal is to design and create a solution that processes existing data traces and generates a similarly behaving data stream at high volume and frequency. It should also support customizing the generated data stream, including its volume and frequency, its structure (such as the ratios between different types of sub-streams), and the randomization of certain record fields.
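As a rough illustration of the "play it back" idea, the sketch below replays a small captured trace faster than real time, multiplies it into several simulated devices, and randomizes one field. The record layout (`ts`, `device`, `value`) and all parameter names are assumptions made for this example and do not come from any specific tool.

```python
import random


def replay(trace, speedup=1.0, copies=1, jitter_field=None, jitter=0.0, seed=0):
    """Replay a captured trace as a synthetic stream (illustrative sketch).

    trace        -- list of dicts with hypothetical keys "ts" and "device"
    speedup      -- >1 compresses time, which raises the message frequency
    copies       -- number of simulated devices per original device (volume)
    jitter_field -- optional numeric field to randomize by +/- `jitter` (fraction)
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    t0 = trace[0]["ts"]
    out = []
    for c in range(copies):
        for rec in trace:
            new = dict(rec)
            # Time offset relative to the start of the trace, compressed by speedup.
            new["ts"] = (rec["ts"] - t0) / speedup
            # Give every copy its own device identity.
            new["device"] = f'{rec["device"]}-{c}'
            if jitter_field is not None:
                new[jitter_field] = rec[jitter_field] * (1 + rng.uniform(-jitter, jitter))
            out.append(new)
    # Emit records in time order, as a real stream would deliver them.
    out.sort(key=lambda r: r["ts"])
    return out


trace = [
    {"ts": 100, "device": "a", "value": 1.0},
    {"ts": 110, "device": "a", "value": 2.0},
]
stream = replay(trace, speedup=2, copies=3, jitter_field="value", jitter=0.1)
```

A real solution would also need to learn the statistical behaviour of the trace rather than only copy it, which is part of what the thesis should investigate.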

Reliability and performance of Open Source IoT platforms (B, M) (Pelle Jakovits)

The goal of this topic is to analyse the reliability and performance of open source IoT platforms, focusing on solutions that support large-scale use cases (e.g. not home automation but Smart City deployments with devices spread over large geographical areas). The main aspects to focus on are performance, stability, and the scope of features (e.g. device integration, integration with external services, data processing and analytics, extensibility). The main questions to answer are:

  • Which open source solutions on the market are the most suitable for supporting large Smart City use cases?
  • How easy is it to integrate new IoT devices?
  • Are open source IoT frameworks mature enough for production systems, and are they feasible alternatives to cloud-managed subscription-based IoT platforms?

Root Cause Analysis of IoT device network failures (B, M) (Pelle Jakovits)

The goal is to find patterns in data that describe problematic IoT devices. The initial use case is water metering devices that send data packets and have a data transmission credit, which represents the quality of the data transmission connectivity. A problem is indicated by the credit decreasing continuously or reaching zero. Some of the questions to investigate are: Based on all the data points, what pattern characterizes a problematic device? What parameters define the problematic devices?
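The problem indicator described above can be expressed as a simple baseline rule. The following is only a minimal sketch: the function name, the `min_run` threshold and the representation of the credit as a plain list of readings are assumptions for illustration, not part of the actual data set.

```python
def is_problematic(credits, min_run=3):
    """Flag a device whose credit falls `min_run` times in a row or reaches zero.

    credits -- chronological list of transmission credit readings (assumed format)
    min_run -- hypothetical threshold for what counts as a "continuous" decrease
    """
    run = 0  # length of the current streak of consecutive decreases
    for prev, cur in zip(credits, credits[1:]):
        run = run + 1 if cur < prev else 0
        if run >= min_run:
            return True
    # The credit reaching zero counts as a problem regardless of the trend.
    return any(c <= 0 for c in credits)


devices = {
    "meter-1": [10, 9, 8, 7],      # steady decline
    "meter-2": [10, 9, 10, 9, 10], # fluctuating, no sustained decline
    "meter-3": [5, 5, 0],          # credit exhausted
}
flagged = sorted(name for name, c in devices.items() if is_problematic(c))
```

Such a rule only labels devices; the root cause analysis itself would then look for parameters that distinguish the flagged devices from the healthy ones.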

Automated metadata enhancer for IoT data sources in Smart City (M) (Pelle Jakovits)

A common issue with Smart City data is that its metadata is often lacking, and there is no systematic metadata mapping or documentation. Investigate what methods exist for enhancing the metadata of already collected data to make it more usable, structured, and manageable.

Extending the work done in a previous thesis, “Knowledge Graphs for Cataloging and Making Sense of Smart City Data”, the goal of this thesis is to design and test an approach that automatically classifies IoT data series into the correct types based on the data values, and generates metadata describing each data stream from the data values and additional textual information (e.g. IoT device names and descriptions, units, location).

Synthetic data generator for large-scale cybersecurity training (M) (Pelle Jakovits)

Evaluate open-source tools that generate realistic IoT data from existing captured data traces and “play it back” to simulate real data in an IoT network. Design and create an automated solution that processes existing data traces and generates similarly behaving data streams.

Automated Grader for Cloud Computing course (B/M) (Pelle Jakovits)

A grading system based on automated testing or monitoring frameworks needs to be designed and implemented for use by students and course organizers in the Cloud Computing course.

Automated Grader for the “Development of Web Services and Distributed Systems” course (B/M) (Pelle Jakovits)

A grading system based on automated testing or monitoring frameworks needs to be designed and implemented for use by students and course organizers in the “Development of Web Services and Distributed Systems” course.

Cost optimization of Serverless applications (M) (Pelle Jakovits)

Serverless (or FaaS) applications enable an event-based execution model for Cloud services, where you do not pay for idle run time but instead pay for how long functions take to execute and how much memory is allocated to them. By optimizing memory consumption or runtime, it is possible to reduce the cost of Serverless applications. Investigate which characteristics affect the cost of Serverless applications in the Cloud the most, compare them to non-Serverless approaches, and find ways to reduce the cost of Serverless applications.
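The pricing model described above (execution time multiplied by allocated memory) can be written down as a small calculation. This is a simplified sketch: the per-request fee is an added assumption, and all prices in the example are placeholders rather than the price list of any real provider.

```python
def faas_cost(invocations, avg_duration_ms, memory_mb,
              price_per_gb_second, price_per_million_requests=0.0):
    """Estimate the cost of a function under a GB-second pricing model.

    All prices are hypothetical inputs. Real providers may also round
    durations, offer free tiers, or charge for other resources.
    """
    # Billed compute: total execution seconds times allocated memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    compute = gb_seconds * price_per_gb_second
    # Optional flat fee per request (an assumption, not stated in the topic).
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


# Placeholder prices, chosen only to make the arithmetic easy to follow.
baseline = faas_cost(1_000_000, 1000, 1024, 0.00002, 0.2)
half_memory = faas_cost(1_000_000, 1000, 512, 0.00002, 0.2)
```

In this model, halving the memory halves the compute part of the bill only if the runtime stays the same. In practice a smaller memory allocation may slow the function down, which is exactly the kind of trade-off the thesis should measure.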

Estimating the energy consumption of buildings in Smart Cities (M) (Pelle Jakovits)

The goal of this thesis is to use building metadata from different data sources, such as Estonian state registries, to estimate how much energy a building consumes, taking into account the type of construction, materials, age of the building, and the registered heating and cooling systems inside the building or its apartments. This topic can build upon an earlier thesis that designed a working solution for estimating the energy production of Smart City buildings. We also have data available from a number of Tartu City Government buildings and SmartEnCity project buildings.

Real-time visualisation of mobility data for Smart City data platform (B/M) (Pelle Jakovits)

The goal is to create a map-based visualizer of Smart City data that can be used for visualizing device trajectories and data changes. It should work together with an existing API-based Smart City platform, which is used to store large amounts of data sent by Smart City and IoT devices. Students should investigate existing solutions, collect and document requirements, design a working solution, and deploy it in the university Cloud.

Web interface for configuring Dagster data integration pipelines (B/M) (Pelle Jakovits)

The goal is to create a web frontend for defining data integrations between external APIs and an API-based Smart City database, which is used to store large amounts of data sent by Smart City and IoT devices. Students should investigate existing solutions, collect and document requirements, design a working solution, and deploy it in the university Cloud. This topic can build upon an earlier thesis that designed a working solution for implementing and automating Dagster-based data integration pipelines.

Edge Intelligence Observability: Lightweight and Hybrid Monitoring Approaches (M) (Vinayak Khavasi)

Observability, the ability to measure the internal state of a system by observing its external outputs, is a cornerstone of reliable distributed computing. In the context of edge intelligence, observability becomes uniquely challenging due to the constrained resources, intermittent connectivity, and real-time requirements of edge devices. Traditional observability and MLOps tools such as MLflow, EvidentlyAI, and Alibi Detect offer comprehensive functionality but are often too resource-heavy for microcontrollers or lightweight IoT devices. This thesis explores the design and evaluation of lightweight observability methods tailored for edge AI, focusing on hybrid architectures that combine minimal on-device monitoring with aggregation at an edge gateway or other local node. The goal is to combine existing open-source tools with custom statistical metrics (e.g., mean, variance, categorical counts, error rates) and streaming libraries such as River in order to propose an efficient and practical framework that ensures reliable performance monitoring, drift detection, and anomaly identification in resource-constrained environments.
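To give an idea of how lightweight such statistical metrics can be, the sketch below keeps a running mean and variance in constant memory using Welford's online algorithm. It is written in plain Python for illustration and does not use the API of River or any other library mentioned above; the class and method names are assumptions.

```python
class RunningStats:
    """Constant-memory running mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        # Each sample updates three numbers; no history is stored.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; defined as 0.0 before any sample arrives.
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(reading)
```

In a hybrid setup, a device could send only `n`, `mean` and `variance` to the gateway at intervals instead of the raw readings, leaving drift detection and aggregation to the more capable node.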

Industrial IoT Edge AI Observability: Predictive maintenance via observability of embedded AI models in manufacturing equipment (M) (Vinayak Khavasi)

The benefits of observability in AI models extend to enhanced decision-making capabilities, as it provides insights into system performance and potential areas for improvement, fostering a proactive maintenance culture. Additionally, observability facilitates the identification of anomalous patterns in equipment behavior, enabling timely interventions that can prevent costly breakdowns and extend asset lifespan. This proactive approach not only minimizes unplanned downtime but also optimizes resource allocation, ultimately leading to significant cost savings and improved operational efficiency in manufacturing environments. The integration of AI models into manufacturing processes not only enhances predictive maintenance but also supports the broader goals of Industry 4.0 by improving efficiency and connectivity across systems.

Observability in Edge AI vision models for autonomous vehicles (M) (Vinayak Khavasi)

Investigate observability in Edge AI vision models for autonomous vehicles, with a focus on ensuring safety and reliability through runtime monitoring. The work should aim to identify effective observability metrics (e.g., drift detection, uncertainty estimation, anomaly monitoring) and develop lightweight monitoring mechanisms suitable for resource-constrained edge devices. By exploring trade-offs between performance, latency, and safety, the study should propose an observability framework/architecture that can detect critical failures, adapt to real-world distribution shifts (e.g., weather, lighting, sensor degradation), and provide interpretable safety signals for both system operators and regulators.

Unified Observability Framework for Edge AI (M) (Vinayak Khavasi)

Unified observability for edge AI represents a critical convergence of distributed systems monitoring, machine learning operations (MLOps), and edge computing infrastructure. Synthesize current research and industry practices to provide a comprehensive framework for implementing observability solutions that span the edge-to-cloud continuum. Explain how a unified approach combines traditional observability pillars (metrics, logs, traces) with AI-specific monitoring requirements (model performance, data drift, inference quality) and edge-specific constraints (resource limitations, network intermittency, privacy requirements).

Log fingerprint recognition using computer vision methods (M) (Andres Namm)

Objective: to test and compare existing computer vision methods for detecting unique surface patterns (cross-sectional annual rings, cracks, bark marks) of logs. Public datasets such as Biomtrace (photos of cut edges of logs) and ATECH2024 (image processing and matching models) are used. Methods:

  • segmentation of cross-sectional surfaces,
  • feature extraction and comparison (e.g. SuperPoint, LightGlue),
  • testing on photos and data of both individual logs and log stacks,
  • evaluation of results in terms of accuracy and robustness.

Result: the work provides an overview of how well existing methods work in determining the individual identity of logs both individually and in stacks, and how suitable they may be in supply chain traceability solutions.

Building a web application for processing harvester data (TypeScript/Python) (B) (Andres Namm)

The goal of the thesis is to build an open-source web application that can:

  1. read and validate harvester files in the StanForD format (e.g. APT/PIN, PRI/PRD, STM, DRF),
  2. display their contents in the user interface (tables, timber assortments, measurements, geolocation, production logs),
  3. support basic processing: filtering, aggregation, export, simple quality control (e.g. checks of log diameter against length), and optional conversion to other formats (CSV/GeoJSON).

The application may be implemented in either the TypeScript (Node.js + React) or the Python (FastAPI + Streamlit) ecosystem.
