Theses

There are several other research topics which are not advertised here. If you are generally interested in doing a thesis on Mobile Applications, Cloud Computing or Internet of Things, write an email to Prof. Satish Srirama and talk to him personally to choose a topic.

Open thesis topics for the 2019/2020 study year

All of the following topics can be laid out either as a BSc or as an MSc thesis (advisors are shown in brackets). B means Bachelor, M means Master. A number in front of one of these letters means that several theses are offered in this topic.

Cloud Computing

(Satish Srirama)

  • Blockchains in cloud computing
    • Distributed immutable ledger deployed in a decentralized network
    • Relies on cryptography to meet security constraints
    • Where will advances in blockchain assist cloud computing?
  • Edge Analytics (Satish Srirama)
    • Data processing on resource constrained devices
    • Edge analytics for real-time stream data processing
    • Fog topology management

Fog Computing for processing IoT Applications

(Mainak Adhikari, mainak@ut.ee)

1. Model-Based IoT Interoperability on Edge/Fog Computing

In the Internet of Things (IoT), there is a clear need for a high level of interoperability between independently developed systems, often from different applications. Here, we want to design an on-demand, low-latency strategy for IoT interoperability in a Fog/Edge environment.

2. Deep Reinforcement Learning for Offloading IoT applications in Edge Computing

Deep Reinforcement Learning (DRL) has opened a new era in the development of reinforcement learning (RL) mechanisms. It can be used to train offloading gateway devices with different QoS parameters, which improves performance and learning speed. In this work, we want to build an intelligent IoT gateway that finds an optimal computing node and also decides when to offload the application in a Fog/Edge environment.

3. Quality Testing for validating functional and non-functional requirements in Serverless Environment

Quality testing tools are used to validate non-functional requirements of components such as business logic encoded in microservices, serverless FaaS and data pipelines. One challenge is to automate the inference of representative workloads from given traces and historical data, accounting for advanced properties of the data.

4. Auto-scaling and Resource Provisioning of Data Pipeline in Serverless Environment

Auto-scaling measures the capacity of the cloud servers and scales resources out or in automatically based on the status of the requests. This topic addresses two research challenges: i) cost efficiency, by allocating only the required resources, and ii) time efficiency, by allocating the applications to the available resources with minimum deployment time.
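As a rough illustration of the scaling decision described above, the following minimal sketch computes how many replicas would be needed for a given request backlog. The function name, its parameters and the replica limits are hypothetical and chosen only for this example; they do not come from any particular serverless platform.

```python
import math


def desired_replicas(pending_requests, per_replica_capacity,
                     min_replicas=1, max_replicas=10):
    """Return the number of replicas needed to serve the pending requests.

    All parameters are illustrative assumptions: per_replica_capacity is the
    number of requests one replica is assumed to handle per scaling interval.
    """
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    # Round up so that no request is left without capacity.
    needed = math.ceil(pending_requests / per_replica_capacity)
    # Clamp to the allowed range to bound both cost and cold-start churn.
    return max(min_replicas, min(max_replicas, needed))


print(desired_replicas(250, 100))   # backlog of 250 requests, 100 per replica
print(desired_replicas(0, 100))     # idle system falls back to the minimum
```

A real auto-scaler would also have to account for deployment time and cost, which are exactly the two challenges named in this topic.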

Cloud Resource Management with AI

(Chinmaya Kumar Dehury, chinmaya.dehury@ut.ee)

1.  Agent AI behind Cloud Management.

In today’s digital world, Artificial Intelligence (AI) is everywhere, in fields such as Education, Healthcare, Agriculture, and Defense. Behind the scenes, the cloud provides resources to AI. But who manages the cloud? Is it just a software package, or a human? Can we combine the intelligence of both to handle the large pool of resources in the cloud?

  • Cloud resource management and Agent AI
  • Survey of AI tools in cloud resource management.
  • How far has Agent AI already penetrated cloud resource management?
  • What are the current challenges?
  • Making Agent AI more intelligent.

2.  AI based cloud resource failure prediction.

As we know, today’s businesses offer cloud-based services to their users, such as Office 365, Netflix, Spotify, Snapchat, Pokémon, etc. Cloud service providers such as Google, Amazon and Microsoft lose billions due to cloud outages. The goal of this topic is therefore to predict such failures using AI tools.

  • Understanding the Cloud resource failure.
  • Finding the reasons behind any failure.
  • Gather the dataset related to cloud resource failure.
  • Apply ML tools for failure prediction.

3.  Predicting Cloud service demands.

As we know, most frequently used apps, such as Instagram, Twitter, Spotify, etc., are deployed in cloud environments. Sometimes the usage of such applications is very high and sometimes it is very low. But can we predict how heavily an app will be used in the next few hours? In short, what will the future demand for a cloud-based service be? This is the question we will answer in this topic.

  • Find out how the cloud resources are allocated to an app/service.
  • Gather the dataset related to the resource usage of different cloud-based applications
  • Apply AI tools to predict and verify the result using the dataset.

4.  Understanding Cloud usage data.

In this topic, we will look into cloud server usage data, such as the number of VMs deployed, the percentage of server usage, the resource utilization of VMs and physical servers, etc. We will gather the data from different cloud service providers, such as Google, Delft University of Technology, etc.

  • Gathering the related dataset from 4-5 cloud service providers.
  • Understand the data and their limitations.
  • Apply ML/Scientific tools to understand how the cloud servers are performing.
  • Analyze the data to acquire hidden information

5.  Data pipeline in hybrid cloud

  • Learn how data are uploaded from the user’s devices to cloud infrastructure.
  • Understand the concept of the data pipeline and ETL.
  • Recent updates on data pipelines from commercial cloud service providers.
  • Recent literature survey on data pipeline frameworks/architectures.
  • Research challenges in the data pipeline.
  • Implement data pipeline architecture in private and public cloud.

6.  Reinforcement learning in cloud resource distribution

Reinforcement learning (RL) is one of the three basic ML paradigms, alongside supervised and unsupervised learning. Here, a software agent chooses actions based on its understanding of the environment and its past experience. Examples include finding a path from one location to another, solving a knight-prince problem, etc. There are several frameworks that address different kinds of problems. In this topic, we will study different RL frameworks and follow the RL approach to distribute cloud resources among different services/users.

  • Understanding the fundamental concept of Reinforcement Learning and Cloud resource distribution
  • Survey of different RL frameworks (such as OpenAI Gym, DeepMind Lab, Amazon SageMaker RL, Dopamine, etc.)
  • Apply the RL approach to distribute the cloud resources among services/users
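To make the idea of an agent learning from experience more concrete, here is a minimal sketch using tabular Q-learning, a basic RL algorithm, on a toy resource allocation problem. It does not use any of the frameworks listed above, and the two demand states, the reward function and all hyperparameters are assumptions made up for this illustration.

```python
import random

# Toy environment (all numbers are illustrative assumptions):
DEMAND = {0: 1, 1: 3}   # servers needed in the low (0) and high (1) demand state
ACTIONS = [1, 2, 3]     # number of servers the agent may allocate


def reward(state, servers):
    """Pay 1 per allocated server plus a penalty of 10 per missing server."""
    unmet = max(0, DEMAND[state] - servers)
    return -servers - 10 * unmet


def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    """Learn an allocation policy with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in DEMAND for a in ACTIONS}
    state = 0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        # In this toy model demand changes at random, independent of the action.
        next_state = rng.choice([0, 1])
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        target = reward(state, action) + gamma * best_next
        # Standard Q-learning update towards the bootstrapped target.
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
    # Greedy policy: the best known allocation for each demand state.
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in DEMAND}


policy = train()
print(policy)
```

With these toy numbers the learned policy should allocate one server at low demand and three at high demand. A thesis would replace the toy environment with real cloud workload data and would likely use one of the surveyed frameworks.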

Large Scale Data Processing

(Pelle Jakovits)

Synthetic IoT data generator for large scale IoT Device simulation (M) (Pelle Jakovits)

This topic is related to the Cyber defence simulation of Internet of Things and Mobile Networks in the Cyber Range project.
The student should evaluate extending an open source tool for generating realistic IoT data based on existing captured data traces and “playing it back” to simulate real data in an IoT network. The goal is to design and create a solution which processes existing data traces and can generate a similarly behaving data stream with high volume and frequency. It should also support customizing the generated data stream, including its volume and frequency, its structure (such as the ratios between the different types of sub-streams), and the randomization of certain fields of records.
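The playback idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the record layout (`ts`, `device`, `value`) and the parameters `speedup`, `copies` and `jitter` are hypothetical and are not taken from any existing tool.

```python
import random


def replay_trace(trace, speedup=1.0, copies=1, jitter=0.0, seed=0):
    """Generate a synthetic stream from a recorded trace.

    trace   -- list of dicts with the assumed keys "ts", "device" and "value"
    speedup -- values above 1 compress time and so raise the record frequency
    copies  -- number of simulated devices created per recorded device
    jitter  -- relative random perturbation applied to the "value" field
    """
    rng = random.Random(seed)
    start = trace[0]["ts"]
    stream = []
    for copy in range(copies):
        for record in trace:
            synthetic = dict(record)
            # Shift timestamps to start at zero and compress them.
            synthetic["ts"] = (record["ts"] - start) / speedup
            # Give each copy its own device identifier.
            synthetic["device"] = f'{record["device"]}-{copy}'
            # Randomize the measured value within +/- jitter.
            synthetic["value"] = record["value"] * (1 + rng.uniform(-jitter, jitter))
            stream.append(synthetic)
    # Emit the records in time order, as a live network would.
    return sorted(stream, key=lambda r: r["ts"])


recorded = [
    {"ts": 10.0, "device": "sensor", "value": 21.5},
    {"ts": 20.0, "device": "sensor", "value": 22.0},
]
stream = replay_trace(recorded, speedup=2.0, copies=3)
print(len(stream), [r["ts"] for r in stream])
```

Here `copies` controls the volume and `speedup` the frequency of the generated stream, which correspond to two of the customization options mentioned in the topic description.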

IoT data analytics for detecting anomalous devices and situations (M) (Pelle Jakovits)

This topic is related to the Cyber defence simulation of Internet of Things and Mobile Networks in the Cyber Range project. The goal is to design a solution for analyzing IoT data streams to detect outliers, potentially anomalous situations and suspicious devices.
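As one simple example of what outlier detection on sensor readings can look like, the sketch below flags values using the modified z-score based on the median absolute deviation (MAD). This is just one common baseline technique, not the method prescribed by the project; the function name, the sample readings and the threshold of 3.5 are assumptions for illustration.

```python
import statistics


def flag_outliers(readings, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds threshold."""
    median = statistics.median(readings)
    deviations = [abs(x - median) for x in readings]
    mad = statistics.median(deviations)
    if mad == 0:
        # At least half of the readings are identical, so the score is undefined.
        return []
    # 0.6745 scales the MAD so that the score is comparable to a z-score
    # for normally distributed data.
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]


temperatures = [20.1, 20.3, 19.9, 20.0, 35.0, 20.2]
print(flag_outliers(temperatures))
```

Median-based statistics are used here because a single extreme reading barely shifts them, whereas it can inflate the mean and standard deviation enough to hide itself.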

IoT data analytics for real-time visitor count estimation in the DELTA building (B, M) (Pelle Jakovits)

The Delta Building is a new building to house the Institute of Computer Science. Its construction is to be finished in 2020. There are plans for a number of different modern sensors to be placed in the building. The Computer Graphics and Virtual Reality lab’s students are working on a real-time visualization of the people and activities inside the building. For that purpose there is a desire to know how many people occupy each room (including the hallways) at any given moment. The goal of this topic is to study the state-of-the-art of sensor analytics or image processing (or fusion) and to create a usable approach for real-time visitor count estimation in lecture rooms.

From SQL queries to Structured Streaming applications (B) (Pelle Jakovits)

Structured Streaming is a new stream processing abstraction built on top of the Apache Spark SQL engine. The goal of this topic is to study this stream data processing approach and compare its usability, fault tolerance and performance to more classical streaming approaches. The thesis should give an overview of its advantages and disadvantages, demonstrate how to adapt typical stream processing applications to it, and investigate how easy it would be to take arbitrary Spark SQL, DataFrame or Hive SQL based applications and convert them into streaming applications using Spark Structured Streaming.

Optimizing the performance of Apache Spark Streaming applications (M) (Pelle Jakovits)

The goal of this topic is to investigate what characteristics have a significant effect on the performance of Spark Streaming applications and provide guidelines and best practices on how to create and configure Streaming applications in Apache Spark to achieve optimal performance in different scenarios.

(NB! Already Taken) Stream data processing on resource constrained devices (B/M) (Pelle Jakovits)

With the ever-increasing amount of data that needs to be collected from IoT data sources, it becomes more and more expensive to simply stream all the data to a cloud-side data processing platform. Depending on the specific scenario, it may be beneficial to (pre-)process the data as close to its source as possible. However, there are limits on how much computing power is available near the data sources. The goal of this thesis is to evaluate existing solutions for streaming data processing which allow performing part of the data processing nearer to the source, give an overview of their usability, advantages and disadvantages, and analyse their effectiveness in comparison to more classical stream data processing frameworks such as Apache Spark or Storm.

Distributed Serverless Data Processing in IoT networks (M) (Pelle Jakovits)

The goal of this topic is to study how efficiently Serverless technologies can be utilized to process data streams in multi-layer (Fog computing) IoT networks in a distributed manner, and to compare the efficiency, reliability and security of this approach to typical Cloud-centric data processing.

Service mesh based management of data streams in IoT networks (M) (Pelle Jakovits)

Service mesh solutions (such as Istio) have the potential to greatly reduce the complexity of managing distributed IoT applications and their data. The goal of this thesis is to investigate open source service mesh solutions and to design a proof-of-concept data flow management solution for controlling the data flows inside distributed IoT applications.

Applied IoT, System and Security topics

(Alo Peets, alo.peets@ut.ee)

  1. Create smart home, office, city demo use-cases that would be displayed in our new IoT lab in the DELTA building in 2020. Exact ideas, hardware and outcome should be negotiated and agreed upon with the supervisor. – BSc/MSc
  2. Bring your own topic related to IoT solutions, applied security, IT systems management (DevOps), personalized applied medicine, real world big data analysis. – BSc/MSc

Supervised by Alo Peets, alo.peets@ut.ee

Serverless and Fog Computing

(Shivananda Poojara, poojara@ut.ee, Delta R3033)

1. Intelligent Epilepsy seizure prediction using fog/edge computing environments.

This topic aims to design a system that predicts the occurrence of seizures in epilepsy patients by collecting various data from mobile phones and other behavioural parameters. Prediction can make use of existing machine learning algorithms. To detect seizures on the fly, whether at work, while travelling or at any other time, a fog environment can be used as the computation infrastructure to process the prediction tasks. Applications of this type are sensitive: they need results early and must respond immediately. Service latency and reliability are therefore important parameters to consider when scheduling the computation tasks in the fog environment. The topic also aims to design and develop an efficient scheduling algorithm for the fog environment and to use the power of serverless computing.

Other Topics

(Satish Srirama, Pelle Jakovits)

Developing mobile applications for different domains (B/M) (Satish Srirama) – Topic is dropped

Description: You will be developing mobile applications dealing with different problems. We also have topics from the Ecology department, dealing with developing multi-platform mobile applications for collecting and displaying data related to plants and animals.

Migrating an enterprise application from a relational back-end to NoSQL data store. (M) (Satish Srirama)

Description: We are interested in migrating enterprise applications to cloud-scale solutions, and we want to develop methodologies for migrating applications from relational to non-relational data stores. Non-relational data stores, also called NoSQL databases, are a better fit for the cloud because they have been designed for horizontal scaling. Generally, they abandon the relational model in favor of simpler key-value based data models. Examples of NoSQL databases include Riak, MongoDB, Cassandra and Neo4j.

Remote management of containers in IoT Devices (B) (Pelle Jakovits)

The goal of this topic is to investigate how to utilize cloud-based IoT platforms (such as Cumulocity) to manage a large number of IoT computing devices, such as Raspberry Pis. The student should create software for integrating any computer running Docker with the Cumulocity IoT platform. Such software should display information about the currently running containers and support deploying and configuring Docker containers, remote management of their life-cycle, and executing arbitrary commands inside the deployed containers.

Automatic integration of IoT devices using MQTT (M) (Pelle Jakovits)

The goal of this topic is to seamlessly integrate MQTT “speaking” devices into an IoT platform (such as Cumulocity) by using an intermediate Agent or Middleware which takes care of authentication, data delivery and synchronization, IoT platform configuration and other tasks and issues related to device integration. Using an intermediate agent also has the potential to augment the data and services that are provided by IoT devices, for example by injecting additional information about the current location, state and service quality of such devices.