The student should evaluate extending an open-source tool for generating realistic IoT data based on existing captured data traces and “playing them back” to simulate real data in an IoT network. The goal is to design and create a solution that processes existing data traces and can generate a similarly behaving data stream at high volume and frequency. It should also support customizing the generated stream, including its volume and frequency, its structure (e.g., the ratios between different types of sub-streams), and the randomization of certain record fields.
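For illustration, a minimal replay sketch in Python, assuming traces are stored as one JSON record per line; the file name, field names, and jitter model are hypothetical:

```python
import json
import random
import time
from itertools import cycle

def replay_trace(path, rate_hz=10.0, randomize_fields=("value",), jitter=0.05):
    """Replay captured records in a loop at a configurable rate,
    adding noise to selected numeric fields."""
    with open(path) as f:
        records = [json.loads(line) for line in f]  # one JSON record per line
    for record in cycle(records):
        out = dict(record)
        for field in randomize_fields:
            if isinstance(out.get(field), (int, float)):
                out[field] *= 1 + random.uniform(-jitter, jitter)  # randomize field
        yield out
        time.sleep(1.0 / rate_hz)  # controls stream frequency

# Example: print a simulated 10 Hz stream based on a captured trace
# for msg in replay_trace("trace.jsonl", rate_hz=10.0):
#     print(msg)
```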
The goal of this topic is to analyse the reliability and performance of open-source IoT platforms, focusing on solutions that support large use cases (e.g., not home automation, but Smart City scenarios with devices deployed across large geographical areas). The main aspects to focus on are performance, stability, and the scope of features (e.g., device integration, integration with external services, data processing and analytics, extensibility). The main questions to answer are:
The goal is to find patterns in data that describe problematic IoT devices. The initial use case is water metering devices that send data packets and have a data transmission credit, which represents the quality of the data transmission connection. A problem is indicated by a continuous decrease in the credit or by the credit reaching zero. Some of the questions to investigate are: Based on all the data points, what pattern characterizes a problematic device? Which parameters define problematic devices?
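As one possible starting point, a pandas sketch for flagging such devices; the column names `device_id`, `timestamp`, and `credit` are assumptions about the data layout:

```python
import pandas as pd

def flag_problem_devices(df, window=5):
    """Flag devices whose transmission credit decreases monotonically
    over the last `window` readings or has already reached zero."""
    flagged = []
    for device_id, g in df.sort_values("timestamp").groupby("device_id"):
        recent = g["credit"].tail(window)
        # monotone non-increasing with at least one actual drop
        monotonic_drop = recent.is_monotonic_decreasing and recent.nunique() > 1
        if monotonic_drop or recent.iloc[-1] == 0:
            flagged.append(device_id)
    return flagged
```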
A common issue with Smart City data is that its metadata is often lacking and there is no systematic metadata mapping or documentation. Investigate what methods exist for enhancing the metadata of already collected data to make it more usable, structured, and manageable.
Extending the work done in a previous thesis, “Knowledge Graphs for Cataloging and Making Sense of Smart City Data”, the goal of this thesis is to design and test an approach that automatically classifies IoT data series into the correct types based on the data values and generates metadata describing the data stream from the data values and additional textual information (e.g., IoT device names and descriptions, units, location).
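One plausible baseline, sketched here on toy data, is to classify each series from simple summary statistics of its values; the feature set, labels, and generated series are purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def series_features(values):
    """Summary statistics used as features for stream-type classification."""
    v = np.asarray(values, dtype=float)
    return [v.mean(), v.std(), v.min(), v.max(),
            float(np.median(np.abs(np.diff(v))))]  # typical step size

# Toy illustration: temperature-like series vs. cumulative-counter-like series
rng = np.random.default_rng(0)
temps = [20 + 3 * rng.standard_normal(100) for _ in range(50)]
counters = [np.cumsum(rng.integers(0, 5, 100)) for _ in range(50)]

X = [series_features(s) for s in temps + counters]
y = ["temperature"] * 50 + ["counter"] * 50

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([series_features(np.cumsum(rng.integers(0, 5, 100)))]))
```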
A grading system based on automated testing or monitoring frameworks needs to be designed and implemented for use by students and course organizers in the Cloud Computing course.
A grading system based on automated testing or monitoring frameworks needs to be designed and implemented for use by students and course organizers in the Development of web services and distributed systems course.
Serverless (or FaaS) applications enable an event-based execution model for Cloud services, where you do not pay for idle run time but instead for how long each function executes and how much memory is allocated to it. By optimizing memory consumption or runtime, it is possible to reduce the cost of Serverless applications. Investigate which characteristics affect the cost of Serverless applications in the Cloud the most, compare them to non-Serverless approaches, and find ways to reduce the cost of Serverless applications.
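To make the cost model concrete, a small sketch of the usual FaaS billing formula (duration times allocated memory, plus a per-request fee); the prices are illustrative placeholders, so check the provider's current pricing:

```python
def faas_cost(invocations, duration_ms, memory_mb,
              gb_second_price=0.0000166667,        # illustrative placeholder
              request_price=0.20 / 1_000_000):     # illustrative placeholder
    """Estimate FaaS cost: billed duration x allocated memory + per-request fee."""
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * gb_second_price + invocations * request_price

# Halving execution time (e.g., by tuning memory) roughly halves the compute part:
print(faas_cost(10_000_000, 200, 512))  # baseline
print(faas_cost(10_000_000, 100, 512))  # same workload, faster function
```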
The goal of this thesis is to use building metadata from different data sources, such as Estonian state registries, to estimate how much energy a building consumes, taking into account the construction type, materials, age of the building, and the registered heating and cooling systems in the building or its apartments. This topic can build upon an earlier thesis that designed a working solution for estimating the energy production of Smart City buildings. We also have data available from a number of Tartu City Government buildings and SmartEnCity project buildings.
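A deliberately crude estimation sketch, using hypothetical specific-demand coefficients by construction era; real coefficients would have to come from registry data and the literature:

```python
# Hypothetical specific heating demand (kWh per m2 per year) by construction era;
# these numbers are placeholders, not measured or published values.
SPECIFIC_DEMAND = {"pre_1960": 250, "1960_1990": 200, "1990_2010": 130, "post_2010": 70}

def estimate_annual_heating_kwh(heated_area_m2, construction_era, efficiency=0.9):
    """Rough annual heating-energy estimate from building metadata."""
    demand = SPECIFIC_DEMAND[construction_era]
    return heated_area_m2 * demand / efficiency

print(estimate_annual_heating_kwh(1200, "1960_1990"))  # e.g., a mid-century building
```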
The goal is to create a map-based visualizer of Smart City data that could be used for visualizing device trajectories and data changes. It should work together with an existing API-based Smart City platform, which is used to store large amounts of data sent by Smart City and IoT devices. The student should investigate existing solutions, collect and document requirements, design a working solution, and deploy it in the university Cloud.
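As a minimal illustration of the visualization side, a sketch using the folium library to render a device trajectory; the coordinates are placeholders, not real device data:

```python
import folium

# Hypothetical trajectory: (lat, lon) points fetched from the Smart City API
trajectory = [(58.3776, 26.7290), (58.3781, 26.7312), (58.3790, 26.7335)]

m = folium.Map(location=trajectory[0], zoom_start=15)
folium.PolyLine(trajectory, weight=4).add_to(m)               # device path
folium.Marker(trajectory[-1], tooltip="last seen").add_to(m)  # latest position
m.save("trajectory.html")  # static HTML map, viewable in a browser
```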
The goal is to create a Web frontend for defining data integrations between external APIs and an API-based Smart City database, which is used to store large amounts of data sent by Smart City and IoT devices. The student should investigate existing solutions, collect and document requirements, design a working solution, and deploy it in the university Cloud. This topic can build upon an earlier thesis that designed a working solution for implementing and automating Dagster-based data integration pipelines.
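For context, a minimal sketch of the kind of Dagster job such a frontend might generate; the op names and payload are hypothetical:

```python
from dagster import job, op

@op
def fetch_external_data():
    """Pull records from an external API (placeholder)."""
    return [{"device": "sensor-1", "value": 21.5}]

@op
def store_records(records):
    """Write records to the Smart City database (placeholder)."""
    print(f"storing {len(records)} records")

@job
def integration_pipeline():
    # One integration defined in the frontend maps to one fetch -> store job
    store_records(fetch_external_data())
```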
Observability, the ability to measure the internal state of a system by observing external outputs, is a cornerstone of reliable distributed computing. In the context of edge intelligence, observability becomes uniquely challenging due to the constrained resources, intermittent connectivity, and real-time requirements of edge devices. Traditional observability and MLOps tools such as MLflow, EvidentlyAI, and Alibi Detect offer comprehensive functionality but are often too resource-heavy for microcontrollers or lightweight IoT devices. This thesis explores the design and evaluation of lightweight observability methods tailored for edge AI, focusing on hybrid architectures that combine on-device minimal monitoring with edge gateway/local aggregation strategies. By synthesizing existing open-source tools with custom statistical metrics (e.g., mean, variance, categorical counts, error rates) and streaming libraries such as River, the goal is to propose an efficient and practical framework that ensures reliable performance monitoring, drift detection, and anomaly identification in resource-constrained environments.
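A minimal on-device monitoring sketch along these lines, assuming a recent version of River; the shift in the simulated readings is illustrative:

```python
from river import drift, stats

mean = stats.Mean()    # running mean, O(1) memory
var = stats.Var()      # running variance, O(1) memory
adwin = drift.ADWIN()  # adaptive-window drift detector

def observe(x):
    """One monitoring step: update summaries, report drift events."""
    mean.update(x)
    var.update(x)
    adwin.update(x)
    if adwin.drift_detected:
        # In a hybrid setup this event would be forwarded to the edge gateway
        print(f"drift detected (mean={mean.get():.3f}, var={var.get():.3f})")

for value in [1.0] * 100 + [5.0] * 100:  # simulated shift in sensor readings
    observe(value)
```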
The benefits of observability in AI models extend to enhanced decision-making, as it provides insights into system performance and potential areas for improvement, fostering a proactive maintenance culture. Observability also helps identify anomalous patterns in equipment behavior, enabling timely interventions that prevent costly breakdowns and extend asset lifespan. This proactive approach minimizes unplanned downtime, optimizes resource allocation, and ultimately leads to significant cost savings and improved operational efficiency in manufacturing environments. Integrating AI models into manufacturing processes thus enhances predictive maintenance and supports the broader goals of Industry 4.0 by improving efficiency and connectivity across systems.
Investigate observability in Edge AI vision models for autonomous vehicles, with a focus on ensuring safety and reliability through runtime monitoring. The work should aim to identify effective observability metrics (e.g., drift detection, uncertainty estimation, anomaly monitoring) and develop lightweight monitoring mechanisms suitable for resource-constrained edge devices. By exploring trade-offs between performance, latency, and safety, the study should propose an observability framework/architecture that can detect critical failures, adapt to real-world distribution shifts (e.g., weather, lighting, sensor degradation), and provide interpretable safety signals for both system operators and regulators.
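As one example of a lightweight uncertainty metric, a sketch of predictive (softmax) entropy over a classifier's logits; the logit values below are illustrative, not from a real vision model:

```python
import numpy as np

def softmax_entropy(logits):
    """Predictive entropy of a classifier's output: a cheap per-frame
    uncertainty signal suitable for on-device monitoring."""
    z = logits - logits.max()            # numerical stability
    p = np.exp(z) / np.exp(z).sum()      # softmax probabilities
    return float(-(p * np.log(p + 1e-12)).sum())

confident = np.array([9.0, 0.5, 0.2])   # e.g., clear daylight frame
ambiguous = np.array([2.1, 2.0, 1.9])   # e.g., heavy rain / sensor glare
print(softmax_entropy(confident), softmax_entropy(ambiguous))
# High-entropy frames can be flagged, logged, or escalated to a fallback system.
```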
Unified observability for edge AI represents a critical convergence of distributed systems monitoring, machine learning operations (MLOps), and edge computing infrastructure. Synthesize current research and industry practices to provide a comprehensive framework for implementing observability solutions that span the edge-to-cloud continuum.
Explain how a unified approach combines traditional observability pillars (metrics, logs, traces) with AI-specific monitoring requirements (model performance, data drift, inference quality) and edge-specific constraints (resource limitations, network intermittency, privacy requirements).
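One way to make this concrete is a single record schema carrying all three pillars alongside AI-specific signals; the field names below are a hypothetical sketch, not an established standard:

```python
from dataclasses import dataclass, field
import time

@dataclass
class TelemetryRecord:
    """One record combining the three pillars with AI-specific signals,
    so edge devices, gateways, and the cloud share a single schema."""
    trace_id: str                                  # distributed-tracing correlation
    metrics: dict = field(default_factory=dict)    # e.g., latency_ms, cpu_pct
    log: str = ""                                  # structured log message
    model_version: str = ""                        # AI-specific context
    drift_score: float = 0.0                       # data-drift signal
    confidence: float = 1.0                        # inference-quality signal
    timestamp: float = field(default_factory=time.time)

rec = TelemetryRecord(trace_id="abc123",
                      metrics={"latency_ms": 42.0},
                      log="inference ok",
                      model_version="v1.3",
                      drift_score=0.07,
                      confidence=0.91)
```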
Objective: to test and compare existing computer vision methods for detecting unique surface patterns (cross-sectional annual rings, cracks, bark marks) of logs. Public datasets such as Biomtrace (photos of cut edges of logs) and ATECH2024 (image processing and matching models) are used. Methods:
Result: the work provides an overview of how well existing methods determine the individual identity of logs, both singly and in stacks, and how suitable they may be for supply chain traceability solutions.
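As a point of reference for such comparisons, a classical feature-matching baseline sketched with OpenCV ORB; the file paths and the match-distance threshold are hypothetical, and this is not one of the ATECH2024 models:

```python
import cv2

def match_log_faces(img_path_a, img_path_b, max_distance=40):
    """Count ORB keypoint matches between two log cross-section photos;
    a high count of good matches suggests the same individual log."""
    a = cv2.imread(img_path_a, cv2.IMREAD_GRAYSCALE)
    b = cv2.imread(img_path_b, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=2000)
    _, des_a = orb.detectAndCompute(a, None)
    _, des_b = orb.detectAndCompute(b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)
    return len([m for m in matches if m.distance < max_distance])
```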
The goal of the work is to build an open-source web application that can