
Open thesis topics for the 2019/2020 study year

There are several other research topics which are not advertised here. If you are generally interested in doing a thesis on Mobile Applications, Cloud Computing or Internet of Things, send an email to Prof. Satish Srirama and talk to him personally to choose a topic.

Available Topics to choose from

All of the following topics can be laid out either as a BSc or as an MSc thesis (advisors are shown in brackets). B means Bachelor, M means Master. A number in front of one of these letters means that several theses are offered in this topic.

 

Cloud Computing

(Satish Srirama)

  • Blockchains in cloud computing

–       Distributed immutable ledger deployed in a decentralized network

–       Relies on cryptography to meet security constraints

–       Where will advances in blockchain assist cloud computing?


  • Edge Analytics (Satish Srirama)

–       Data processing on resource constrained devices

–       Edge analytics for real-time stream data processing

–       Fog topology management

 

 

Fog Computing for processing IoT Applications

(Mainak Adhikari, mainak@ut.ee)

1. Topic Name: Model-Based IoT Interoperability on Edge/Fog Computing

In the Internet of Things (IoT), there is a clear need for a high level of interoperability between independently developed systems, often from different application domains. Here, we want to design an on-demand, low-latency interoperability strategy for the IoT in a Fog/Edge environment.


2. Topic Name: Deep Reinforcement Learning for Offloading IoT applications in Edge Computing

A new direction in the development of reinforcement learning (RL) mechanisms, called Deep Reinforcement Learning (DRL), can be used to train offloading gateway devices on different QoS parameters, which improves performance and learning speed. In this work, we want to build an intelligent IoT gateway that finds an optimal computing node and also decides when to offload the application in a Fog/Edge environment.

3. Topic Name: Quality Testing for validating functional and non-functional requirements in Serverless Environment

Quality testing tools are used to validate non-functional requirements such as business logic encoded in microservices, serverless FaaS and data pipelines. A challenge is to automate the inference of representative workloads from given traces and historical data, accounting for advanced properties of the data.

4. Topic Name: Auto-scaling and Resource Provisioning of Data Pipeline in Serverless Environment

Auto-scaling measures the capacity of the cloud servers and scales the resources out or down automatically based on the status of the requests. It addresses two research challenges: i) cost efficiency, by allocating only the required resources, and ii) time efficiency, by allocating the applications to the available resources with minimum deployment time.
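As a rough illustration of the scale-out/scale-down decision described above, the sketch below derives a replica count from an observed request rate. The function name, the per-replica capacity and the bounds are hypothetical and not part of the topic description; a real auto-scaler would also smooth the load signal and account for deployment time.

```python
import math


def desired_replicas(observed_load, capacity_per_replica,
                     min_replicas=1, max_replicas=10):
    """Return how many replicas are needed so that each one handles
    at most `capacity_per_replica` requests per second.

    All parameters are illustrative assumptions.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    # Round up so that the load never exceeds the total capacity.
    needed = math.ceil(observed_load / capacity_per_replica)
    # Clamp to the allowed range: the lower bound keeps the service
    # available, the upper bound caps the cost.
    return max(min_replicas, min(max_replicas, needed))
```

With a capacity of 100 requests per second per replica, a load of 450 requests per second yields five replicas, while an idle service is kept at the minimum of one.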

 

Cloud Resource Management with AI

(Chinmaya Kumar Dehury, chinmaya.dehury@ut.ee)


  • 1.  Agent AI behind Cloud Management.

In today's digital world, Artificial Intelligence (AI) is everywhere, in fields such as education, healthcare, agriculture, and defense. Behind the scenes, the cloud provides the resources for AI. But who manages the cloud? Is it just a software package or a human? Can we combine the intelligence of both to handle the large pool of resources in the cloud?

-      Cloud resource management and Agent AI

-      Survey of AI tools in cloud resource management.

-      How far has Agent AI already penetrated cloud resource management?

-      What are the current challenges?

-      Making Agent AI more intelligent.

 

  • 2.  AI based cloud resource failure prediction.

As we know, today's businesses offer cloud-based services to their users, such as Office 365, Netflix, Spotify, Snapchat, Pokémon, etc. Cloud service providers such as Google, Amazon and Microsoft are losing billions due to cloud outages. The goal of this topic is therefore to predict failures by using AI tools.

-      Understanding the Cloud resource failure.

-      Finding the reasons behind any failure.

-      Gather the dataset related to cloud resource failure.

-      Apply ML tools for failure prediction.

 

  • 3.  Predicting Cloud service demands.

As we know, most frequently used apps, such as Instagram, Twitter, Spotify, etc., are deployed in cloud environments. Sometimes the usage of such applications is very high and sometimes it is very low. But can we predict how heavily an app will be used in the next few hours? In short, what will be the future demand for a cloud-based service? This is the question we will answer in this topic.

-      Find out how the cloud resources are allocated to an app/service.

-      Gather the dataset related to the resource usage of different cloud-based applications

-      Apply AI tools to predict and verify the result using the dataset.
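The "predict and verify" step above can be sketched with a deliberately simple baseline: a moving-average forecast evaluated by walking forward through a usage series. The function names and the sample numbers are hypothetical; a thesis would replace the baseline with proper AI/ML models and a real dataset.

```python
def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


def walk_forward_mae(series, window=3):
    """Forecast each point from the points before it and
    return the mean absolute error of those forecasts."""
    errors = []
    for i in range(window, len(series)):
        predicted = moving_average_forecast(series[:i], window)
        errors.append(abs(series[i] - predicted))
    if not errors:
        raise ValueError("series is too short to evaluate")
    return sum(errors) / len(errors)


# Hypothetical hourly request counts for one service.
usage = [10, 20, 30, 40, 50]
print(moving_average_forecast(usage, window=2))  # forecast for the next hour
print(walk_forward_mae(usage, window=2))         # error on the known history
```

A baseline like this gives a reference error that any more elaborate model should beat on the same dataset.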

 

  • 4.  Understanding Cloud usage data.

In this topic, we will look into cloud server usage data, such as the number of VMs deployed, the percentage of server usage, the resource utilization of VMs and physical servers, etc. We will gather the data from different providers of cloud usage data, such as Google, Delft University of Technology, etc.

-      Gathering the related dataset from 4-5 cloud service providers.

-      Understand the data and their limitations.

-      Apply ML/Scientific tools to understand how the cloud servers are performing.

-      Analyze the data to acquire hidden information

 

  • 5.  Data pipeline in hybrid cloud

-      Learn how data are uploaded from the user's devices to cloud infrastructure.

-      Understand the concept of the data pipeline and ETL.

-      Recent updates on data pipeline in the commercial cloud service provider.

-      Recent literature survey on data pipeline frameworks/architectures.

-      Research challenges in the data pipeline.

-      Implement data pipeline architecture in private and public cloud.

 

  • 6.  Reinforcement learning in cloud resource distribution

Reinforcement learning (RL) is one of the three basic ML paradigms. Here, a software agent takes actions based on its understanding of the environment and its experience. Examples include finding a path from one location to another, solving a knight-prince problem, etc. There are several frameworks that address different kinds of problems. In this topic, we will study different RL frameworks and follow the RL approach to distribute cloud resources among different services/users.

-      Understanding the fundamental concept of Reinforcement Learning and Cloud resource distribution

-      Survey of different RL frameworks (such as OpenAI Gym, DeepMind Lab, Amazon SageMaker RL, Dopamine, etc.)

-      Apply the RL approach to distribute the cloud resources among services/users
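To make the RL approach above concrete, the following is a minimal sketch of tabular Q-learning on a toy allocation problem: the agent adds or removes one server at a time and is penalised by the gap between allocated servers and a fixed demand. The environment, reward and all parameter values are assumptions made for illustration; none of the frameworks listed above is used.

```python
import random

ACTIONS = (-1, 0, 1)  # remove one server, keep the allocation, add one server


def step(servers, action, demand, max_servers):
    """Apply an action and return (next_state, reward)."""
    nxt = min(max_servers, max(0, servers + action))
    # The reward is the negative gap between allocation and demand.
    return nxt, -abs(nxt - demand)


def train(demand=3, max_servers=4, episodes=2000, steps=20,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(max_servers + 1)]
    for _ in range(episodes):
        state = rng.randint(0, max_servers)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward = step(state, ACTIONS[a], demand, max_servers)
            # Standard Q-learning update towards reward + discounted best value.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_policy(q):
    """Return the best action for every state (number of allocated servers)."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


print(greedy_policy(train()))
```

The learned policy should add servers below the demand, hold at the demand and remove servers above it. A realistic study would use time-varying demand and a much larger state space, which is where the deep RL frameworks become relevant.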

 

Large Scale Data Processing

(Pelle Jakovits)

 

  • (NB! Already Taken) IoT data analytics for real-time visitor count estimation in the DELTA building (B, M) (Pelle Jakovits)

The Delta Building is a new building to house the Institute of Computer Science. Its construction is to be finished in 2020. There are plans for a number of different modern sensors to be placed in the building. The Computer Graphics and Virtual Reality lab’s students are working on a real-time visualization of the people and activities inside the building. For that purpose there is a desire to know how many people occupy each room (including the hallways) at any given moment. The goal of this topic is to study the state-of-the-art of sensor analytics or image processing (or fusion) and to create a usable approach for real-time visitor count estimation in lecture rooms.

 

  • (NB! Already Taken) From SQL queries to Structured Streaming applications (B) (Pelle Jakovits)

Structured Streaming is a new stream processing abstraction built on top of the Apache Spark SQL engine. The goal of this topic is to study this stream data processing approach and compare its usability, fault tolerance and performance to more classical streaming approaches. The thesis should give an overview of its advantages and disadvantages, demonstrate how to adapt typical stream processing applications to it, and investigate how easy it would be to take arbitrary Spark SQL, DataFrame or Hive SQL based applications and convert them into streaming applications using Spark Structured Streaming.

 

  • Optimizing the performance of Apache Spark Streaming applications (M) (Pelle Jakovits)

The goal of this topic is to investigate what characteristics have a significant effect on the performance of Spark Streaming applications and provide guidelines and best practices on how to create and configure Streaming applications in Apache Spark to achieve optimal performance in different scenarios.

 

  • Real time vs micro-batching in streaming data processing: performance and guidelines (B/M) (Pelle Jakovits)

Typically, stream processing frameworks buffer incoming data and process it in small batches. But newer stream processing frameworks (such as Apache Storm) allow processing each incoming data object in real time. The goal of the thesis is to compare the performance of real-time vs micro-batching stream data processing frameworks under different scenarios and to investigate which data-specific or application-specific characteristics must be considered when choosing between them for specific use cases.
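The difference between the two processing models can be sketched in a few lines. The function names are hypothetical, and the micro-batches here are cut by record count for simplicity, whereas real frameworks typically cut them by time interval.

```python
def process_per_record(stream, handle):
    """Real-time style: handle every record as soon as it arrives."""
    for record in stream:
        yield handle(record)


def micro_batches(stream, batch_size):
    """Buffer incoming records and emit them in lists of `batch_size`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the last, possibly smaller, batch.
        yield batch


def process_micro_batches(stream, batch_size, handle_batch):
    """Micro-batch style: handle one whole batch at a time."""
    for batch in micro_batches(stream, batch_size):
        yield handle_batch(batch)


print(list(process_per_record(range(4), lambda x: x * 2)))
print(list(process_micro_batches(range(7), 3, sum)))
```

Per-record processing returns a result for each record with no buffering delay, while micro-batching trades latency for fewer, larger processing calls; this is the trade-off the thesis would quantify.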

 

  • (NB! Already Taken) Stream data processing on resource constrained devices (B/M) (Pelle Jakovits)

With the ever-increasing amount of data that needs to be collected from IoT data sources, it becomes more and more expensive to simply stream all the data to a cloud-side data processing platform. Depending on the specific scenario, it may be beneficial to (pre-)process the data as close to its source as possible. However, the computing resources available near the data sources are limited in power. The goal of this thesis is to evaluate existing solutions for streaming data processing which allow performing part of the data processing nearer to the source, give an overview of their usability, advantages and disadvantages, and analyse their effectiveness in comparison to more classical stream data processing frameworks such as Apache Spark or Storm.

 

  • Distributed Serverless Data Processing in IoT networks (M)  (Pelle Jakovits)

The goal of this topic is to study how efficiently Serverless technologies can be utilized to process data streams in multi-layer (Fog computing) IoT networks in a distributed manner and to compare the efficiency, reliability and security of this approach with typical Cloud-centric data processing.

 

 

  • (NB! Already Taken) Service mesh based management of data streams in IoT networks (M) (Pelle Jakovits)

Service mesh solutions (such as Istio) have a potential to greatly reduce the complexity of managing distributed IoT applications and their data. The goal of this thesis is to investigate open source service mesh solutions and to design a proof-of-concept data flow management solution for controlling the data flows inside distributed IoT applications.

 

  • (NB! Already Taken) A Pandas plugin for integrating data from the Cumulocity Cloud IoT platform (B) (Jakob Mass / Alo Peets)

As part of the SmartEnCity project, Tartu city government is collecting various real-time data from the city, e.g. bus movement, street lights, apartment building heating & ventilation systems, electric bikes and electric cars, vehicle & pedestrian counters. All of this data is stored on the Cumulocity platform, which provides a REST API for data access. In this thesis, the student should produce a library for Python Pandas which can read data from Cumulocity and structure it reasonably given Cumulocity's data model. As a result, a data scientist who is used to working with Pandas and knows Cumulocity's data model (Events, Measurements, Alarms, etc.) should be able to start experimenting with the SmartEnCity data without having to spend too much time on the technical integration with the Cumulocity API.
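One possible starting point for such a library is a function that flattens measurement documents into tabular rows. The sketch below assumes that each measurement carries `time`, `type` and `source.id` fields plus one or more fragments of the form `{series: {"value": ..., "unit": ...}}`; this layout, the function name and the sample values are assumptions to be checked against the Cumulocity API documentation, and no HTTP calls are shown.

```python
# Top-level keys assumed to be metadata rather than measurement fragments.
META_KEYS = {"id", "time", "type", "source", "self"}


def flatten_measurements(measurements):
    """Turn a list of measurement dicts into one flat row per series value."""
    rows = []
    for m in measurements:
        source_id = m.get("source", {}).get("id")
        for fragment, series_map in m.items():
            if fragment in META_KEYS or not isinstance(series_map, dict):
                continue
            for series, reading in series_map.items():
                # Keep only entries that look like {"value": ..., "unit": ...}.
                if isinstance(reading, dict) and "value" in reading:
                    rows.append({
                        "time": m.get("time"),
                        "source": source_id,
                        "type": m.get("type"),
                        "fragment": fragment,
                        "series": series,
                        "value": reading["value"],
                        "unit": reading.get("unit"),
                    })
    return rows


# Hypothetical sample document, not real SmartEnCity data.
sample = [{
    "id": "1",
    "time": "2019-10-01T12:00:00Z",
    "type": "c8y_Temperature",
    "source": {"id": "42"},
    "c8y_Temperature": {"T": {"value": 21.5, "unit": "C"}},
}]
print(flatten_measurements(sample))
```

The resulting list of dicts can be passed directly to `pandas.DataFrame(rows)`, after which the `time` column can be parsed and used as an index.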

Applied IoT, System and Security topics

(Alo Peets, alo.peets@ut.ee, Ülikooli 17-323)

  1. Creation of a malicious Android (iOS) app that tests how much personal data different security settings allow applications to access (and process). More precisely, the idea is that after the user runs the app, it scans the phone and uploads the last picture, last SMS, last e-mail, last location, last app used, etc. to a demo server. Ideally, the app would be used in security teaching and security-related demos. – MSc
  2. Pycom has created the FiPy development board for easy IoT testing. It combines five networks on one small, MicroPython-enabled board with the same footprint as the WiPy, LoPy and SiPy: WiFi, Bluetooth, LoRa, Sigfox and dual LTE-M (CAT M1 and NB-IoT). According to Pycom, this gives access to all the world's LPWAN networks on one board. The student would have to get the board running and do real-world comparative testing between different LPWAN services. https://pycom.io/product/fipy/ - BSc/MSc
  3. Create a hardware and software solution that enables air quality monitoring (CO2, VOC, temperature, humidity) in rooms or in a whole building. Off-the-shelf hardware (ESP32, SCD30, CCS811) could be used to create an IoT product that pushes air quality data to a central server for easy review and monitoring. - BSc/MSc
  4. Create smart home, office and city demo use cases that will be displayed in our new IoT lab in the DELTA building in 2020. Exact ideas, hardware and outcome should be negotiated and agreed upon with the supervisor. - BSc/MSc
  5. Bring your own topic related to IoT solutions, applied security, IT systems management (DevOps), personalized applied medicine, or real-world big data analysis. - BSc/MSc

Supervised by Alo Peets, alo.peets@ut.ee

 

 

Serverless and Fog Computing

(Shivananda Poojara, poojara@ut.ee, Ülikooli 17-323)

1. Intelligent Epilepsy seizure prediction using fog/edge computing environments.

The aim is to design a system that predicts the occurrence of seizures in epilepsy patients by collecting data from mobile phones together with other behavioural parameters. Prediction can make use of existing machine learning algorithms. To detect seizures on the fly, whether the patient is at work, travelling, or elsewhere, a fog environment can serve as the computing infrastructure for the prediction tasks. Applications of this type are sensitive: they need early results and an immediate response. Service latency and reliability are therefore important parameters when scheduling the computation tasks in the fog environment. The topic also aims to design and develop an efficient scheduling algorithm for the fog environment and to use the power of serverless computing.

 

Other Topics

(Satish Srirama, Pelle Jakovits)

 

  • Developing mobile applications for different domains (B/M) (Satish Srirama) - Topic is dropped
Description: You will be developing mobile applications dealing with different problems. We also have topics from the Ecology department, dealing with developing multi-platform mobile applications for collecting and displaying data related to plants and animals.
  • Migrating an enterprise application from a relational back-end to NoSQL data store. (M) (Satish Srirama)

Description: We are interested in migrating enterprise applications to cloud-scale solutions, because we want to develop methodologies for migrating applications from relational to non-relational data stores. Non-relational data stores, also called NoSQL databases, are a better fit for the cloud, because they have been designed for horizontal scaling. Generally, they abandon the relational model in favor of simpler key-value based data models. Examples of NoSQL databases include Riak, MongoDB, Cassandra and Neo4j.

 

  • (NB! Already Taken) Remote management of containers in IoT Devices (B) (Pelle Jakovits)

 

The goal of this topic is to investigate how to utilize cloud-based IoT platforms (such as Cumulocity) to manage a large number of IoT computing devices, such as Raspberry Pis. The student should create software for integrating any computer running Docker with the Cumulocity IoT platform. Such software should display information about the currently running containers and support deploying and configuring Docker containers, remote management of their life-cycle and executing arbitrary commands inside the deployed containers.

 

  • Automatic integration of IoT devices using MQTT (M) (Pelle Jakovits)

The goal of this topic is to seamlessly integrate MQTT "speaking" devices into an IoT platform (such as Cumulocity) by using an intermediate Agent or Middleware which takes care of authentication, data delivery and synchronization, IoT platform configuration and other tasks and issues related to device integration. Using an intermediate agent also has the potential to augment the data and services that are provided by IoT devices -- for example by injecting additional information about the current location, state and service quality of such devices.
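The data augmentation idea can be sketched as a pure function that an intermediate agent might apply to each incoming message before forwarding it to the IoT platform. The topic layout `devices/<device id>/telemetry`, the registry structure and the field names are hypothetical, and the MQTT client itself is left out.

```python
import json


def enrich_message(topic, payload, registry, received_at):
    """Add device metadata to a JSON payload received over MQTT.

    topic       -- assumed to look like "devices/<device id>/telemetry"
    payload     -- raw message bytes containing a JSON object
    registry    -- dict mapping device ids to known metadata (hypothetical)
    received_at -- timestamp string supplied by the agent
    """
    parts = topic.split("/")
    if len(parts) < 2 or not parts[1]:
        raise ValueError("topic does not contain a device id")
    device_id = parts[1]
    data = json.loads(payload)
    meta = registry.get(device_id, {})
    # Inject information the device itself does not send.
    data["device_id"] = device_id
    data["received_at"] = received_at
    data["location"] = meta.get("location")
    return json.dumps(data, sort_keys=True).encode("utf-8")


registry = {"pi-7": {"location": "lab"}}
out = enrich_message("devices/pi-7/telemetry", b'{"temp": 21.5}',
                     registry, "2019-10-01T12:00:00Z")
print(out.decode("utf-8"))
```

Keeping the enrichment step free of network code makes it easy to test separately from authentication and delivery, which the agent would handle around it.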