
Open theses 2015/2016 study year

There are several other research topics which are not advertised here. If you are generally interested in doing a thesis on Mobile Applications, Cloud Computing or Internet of Things, send an e-mail to Assoc. Prof. Satish Srirama and talk to him personally to choose a topic.


Available Topics to choose from

All of the following topics can be laid out as either a BSc or an MSc thesis (advisors are shown in brackets).

B means Bachelor, M means Master. A number in front of one of these letters means that several theses are offered on this topic.



Mobile Cloud


  • Mining Sensor Data for the Recognition of User Activities (Master Thesis) (Satish Srirama)

Embedded technologies such as the sensors within smartphones enable mobile applications to adapt dynamically, in real time, to the user's context. The idea of this thesis is to analyze the information provided by the sensors (creating algorithms to find patterns) and collected with real-time databases (SQLite, Tokyo Cabinet, etc.), with the intention of inferring what the user is doing in order to share that status on a social network such as Twitter or Facebook.


Increasing advances in mobile technologies have enabled the handset to become a service provider rather than just a service consumer. Mobile Host is an implementation that allows services to be provided from the mobile phone. This thesis involves creating or extending a synchronization engine for peer-to-peer communication between mobile phones.



  • Mobile Cloud Middleware Quality of Service (Master Thesis) (Satish Srirama)

Mobile Cloud Middleware is a framework that enables data-intensive invocation of cloud services from the mobile phone. However, a queueing mechanism has to be implemented in the middleware (with a mathematical analysis/model) to provide QoS for mobile users.
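As an illustration of the kind of mathematical model meant here, the following is a minimal sketch that assumes the middleware can be approximated by an M/M/1 queue (Poisson arrivals, exponential service times, a single server). That assumption, the function name and the example rates are hypothetical and not part of the MCM framework; a thesis would have to justify or replace the model.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue.

    arrival_rate: mean number of requests arriving per second (lambda)
    service_rate: mean number of requests served per second (mu)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # Without spare capacity the queue grows without bound.
        raise ValueError("unstable queue: arrival_rate must be below service_rate")

    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        # Mean number of requests in the system (waiting + in service).
        "mean_requests_in_system": rho / (1 - rho),
        # Mean time a request spends in the system (waiting + service).
        "mean_response_time": 1 / (service_rate - arrival_rate),
        # Mean time a request waits before service starts.
        "mean_waiting_time": rho / (service_rate - arrival_rate),
    }


if __name__ == "__main__":
    # Hypothetical load: 8 requests/s arriving, 10 requests/s served.
    for name, value in mm1_metrics(8.0, 10.0).items():
        print(f"{name}: {value:.3f}")
```

Such a model gives a first estimate of how response time degrades as the load approaches the service capacity, which is the kind of relationship a QoS mechanism would need to control.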



  • Load Balancing on the Cloud for Analyzing Middleware Scalability (Master Thesis) (Satish Srirama)

Mobile Cloud Middleware (MCM) is a framework that enables data-intensive invocation of cloud services from the mobile phone. This thesis consists of a comparison of load-balancing techniques with multiple tools (HAProxy, etc.) in order to analyze the scalability of the MCM framework, using load-testing tools such as Tsung, JMeter, etc.


  • Remote Control of an Android Phone using a VNC Server (Bachelor Thesis)

The project consists of creating a desktop application that allows connecting to the Android device remotely from a personal computer. Once connected, the application has to show all the information on the device (installed applications, files, etc.). Since the principal aim of the remote connection is the management of resources, the application has to allow installing apk files remotely.


  • Data Forensics for the Mobile Phone on the Cloud (Master Thesis)

The idea consists of moving all the content from the physical device to an emulator, so that the emulator can be analyzed and several instances of the mobile can be executed on several machines running on the cloud.


  • Mobile Gallery Analysis based on Parallel Image Processing on the Cloud Using MapReduce (Master Thesis)

The project consists of creating several recognition algorithms for identifying objects, people, etc. from pictures taken by the mobile phone. The algorithms can be implemented using MapReduce.


  • From IPv4 to IPv6 discovery mechanisms for Mobiles (Master Thesis)

With the introduction of IPv6 at the end of 2012, the migration from IPv4 to IPv6 has to be as transparent as possible. This thesis consists of analyzing the transition from IPv4 to IPv6 and implementing a discovery mechanism that enables peer-to-peer communication between mobiles.



Internet of Things

  • Adaptive Energy-Efficient Mobile-Phone Sensing in the Federated Urban Sensory Networks (M/B)

Mobile-phone sensing, also referred to as crowd-sensing, is a cost-efficient way to implement urban sensing systems. Inhabitants of the city participate in the sensing tasks with their mobile devices during a given period of time. By using the built-in sensing components of the mobile devices, mobile users can provide various information about urban areas. In recent years, many related works have been proposed. However, most of them have not considered integrating the sensing network across different systems. The front-end devices participating in different urban sensing systems can use device-to-device communication technology to collaborate on saving energy by reducing wireless data transmission. In this project, we consider an inter-organisational device-to-device collaboration among the federated urban sensory networks. The project aims to propose, develop and validate an adaptive middleware framework to support energy conservation among the front-end devices.


  • Visualising, Virtualising and Exploiting the Things in Proximity: Towards the Internet of Physical Proximity (M/B)

This project focuses on developing a mobile cloud middleware framework that is capable of integrating the front-end physical Internet of Things (IoT) devices of different organisations with the back-end cloud-based Business Process Management System (BPMS). The organisations (managers of the front-end devices) provide software agents that enable the BPMS user to install the corresponding software in the cloud and communicate directly with the front-end IoT devices without going through third-party mediator servers.


  • Now machines are socialising: Cloud Infrastructural Opportunistic Machine-to-Machine Communication in the Internet of Things (M/B)

This project aims to develop an adaptive cloud-based framework that utilises cloud infrastructure to support the autonomous communication among the front-end IoT devices managed by different organisations and individuals. The smart devices can further collaboratively provide various services for IoT applications.


  • Would you like to be aware of your surroundings? Towards Cloud Infrastructural Opportunistic Mobile Social Network in Proximity (M/B)

Opportunistic Mobile Social Network in Proximity (OMSNP) represents an environment in which mobile users are capable of discovering and communicating with the physical entities (humans or machines) in their current vicinity. Considering the heterogeneity of the social network services (SNS) that are used by different individuals, discovery becomes a crucial challenge. This project aims to develop a framework that, with the assistance of cloud computing, can perform proactive social group discovery and formation across users of different mobile SNS who are located in physical proximity.



  • A Framework for Trustworthy Internet of Things (M) (Chii Chang)

Security is one of the major challenges in the Internet of Things (IoT). Although various security-related works have addressed IoT, existing works are based only on the classic network-security perspective. For instance, “Cryptography alone cannot solve protecting information in IoT problem as internally compromised nodes can generate bogus information and still authenticate it using valid cryptographic” (Lize, Jingpei, & Bin, 2014). Furthermore, centralized solutions are not feasible for IoT since, fundamentally, IoT is based on a distributed environment. Assuming that a central management party can govern the entire environment is not realistic. Hence, IoT requires a feasible distributed trust strategy to overcome the drawbacks of existing security models.



Scientific Computing on the Cloud (SciCloud)

(Pelle Jakovits)

Here are some of the Bachelor's (B)/Master's (M) thesis topics associated with the SciCloud project.


  • Parallel efficiency of R scripts in the Spark distributed computing framework (B) (Pelle Jakovits)

Spark is a large-scale data processing framework that can process huge amounts of data stored in a distributed file system (usually HDFS). It can also be used as an alternative computing engine to MapReduce in a Hadoop YARN cluster. Originally Spark supported only Java, Scala and Python, but recently it has also been extended to R, a language that is very widely used for statistical data analysis.

The goal of this thesis would be to evaluate the parallel efficiency of typical R scripts executed in a YARN cluster using Spark and to estimate when it is beneficial to start thinking about migrating existing R scripts to Spark.
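For reference, parallel efficiency is commonly defined as the speedup divided by the number of workers, where speedup is the serial run time divided by the parallel run time. The sketch below only illustrates this definition; the function name and the timings are hypothetical, and real measurements would come from running the R scripts themselves.

```python
from typing import Tuple


def parallel_efficiency(t_serial: float, t_parallel: float, workers: int) -> Tuple[float, float]:
    """Return (speedup, efficiency) for one measurement.

    t_serial:   run time of the script on a single worker, in seconds
    t_parallel: run time of the same script on `workers` workers, in seconds
    """
    if t_serial <= 0 or t_parallel <= 0 or workers < 1:
        raise ValueError("run times must be positive and workers >= 1")
    speedup = t_serial / t_parallel
    efficiency = speedup / workers  # 1.0 would mean ideal linear scaling
    return speedup, efficiency


if __name__ == "__main__":
    # Hypothetical timings: 120 s on one worker, 20 s on eight workers.
    speedup, efficiency = parallel_efficiency(120.0, 20.0, 8)
    print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.2f}")
```

An efficiency well below 1 for small inputs and closer to 1 for large inputs would suggest a data size threshold above which migration to Spark pays off.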


Spark is a large-scale data processing framework that can process huge amounts of data stored in a distributed file system (usually HDFS). It can also be used as an alternative computing engine to MapReduce in a Hadoop YARN cluster. Originally Spark supported only Java, Scala and Python, but recently it has also been extended to R, a language that is very widely used for statistical data analysis.

The goal of this thesis would be to investigate means for the automatic migration of existing R scripts to the mentioned platform. What modifications are needed to be able to execute existing R scripts in Spark? Would it be possible to create an R script launcher that automatically migrates the provided R scripts to a YARN cluster, where they can be executed using the Spark computation engine?


  • Automatic scaling of Scientific Workflows in the Cloud (B) (Satish Srirama, Jaagup Viil)

This topic consists of migrating scientific workflows to the cloud, executing them on a set of virtual machines to increase the performance and automatically scaling the computing resources on demand to optimize the cost.


  • Direct migration of scientific computing experiments to the cloud (M) (Pelle Jakovits)

We have created a Desktop to Cloud Migration (D2CM) tool that enables scientists to easily migrate their domain-specific computational experiments to the cloud. To make the tool accessible to a larger audience, it needs to be restructured into a Software as a Service application in the cloud and deployed, for example, on the Google App Engine platform.
The work would also include mapping the state of the art in scientific application migration to grids, supercomputers and clouds, studying the current solution and improving its usability.


  • Adapting MPI based legacy parallel scientific computing applications to BSP. (B/M) (Pelle Jakovits)

Description: Study the difficulties of adapting existing MPI-based parallel applications to the BSP model. Adapt different case study examples and analyze the results. The BSP implementation can be chosen freely.


Description: The goal is to study and document what affects the efficiency of applications written for the Apache Spark framework. Implement some data-heavy and processing-heavy Spark applications, then profile them and measure the effects of different improvements to the Spark framework and of different coding practices.


The goal of this thesis is to study the state of the art in using GPUs to accelerate data processing in different large-scale distributed computing frameworks such as MapReduce, Spark and Tez. These frameworks can greatly simplify creating parallel applications and can scale up the computations to hundreds of nodes, but the individual computing power of a single node is often left unoptimized.
What are the best practices for 'moving' data between the framework's computing engine and the GPUs, and how advanced are the Java GPU interfaces in comparison to those of other programming languages (C, C++, ...)? The practical part of the thesis would be to demonstrate the utilization of GPUs in these frameworks and to measure the parallel efficiency of the results.


  • Developing social networks and mash-up applications with cloud application services (Several Bachelor/Master theses) (Satish Srirama)

Description: This topic gives the student the opportunity to define and develop applications that can be deployed on public/private cloud platforms.

Example: Keeping track of a researcher’s calendar and location.

Suppose Alice is a researcher who attends conferences regularly and presents her results. She wants to keep track of her travels and to have a map of her current and future conferences. The idea is to show her travels on Google Maps, with her presentations and possibly some pictures stored in cloud storage such as Amazon S3. The application should also support storing all the relevant data regarding each publication, such as versions, reviews, etc. As an extension, the application should remind her by e-mail four weeks before a future conference.

Example 2: Can you extend the application to all of the employees of her department, or can you think of building a scientific community? Can you think of any social networks arising from this scenario? For example, a network of the researchers she met during her last conferences, with reminders when they appear again at other conferences.


Description: The goal is to implement security services on the cloud, which can be offered to mobile devices.

Example: Speech recognition service for identifying mobile device users. The mobile device owner's speech patterns are stored in the cloud and are used to identify the user of the device. The user has to read a certain sentence (provided by the service) aloud, and the recording of the voice is sent to the cloud for speech recognition. If the recognition fails, then some of the device's functionality, some applications or even the mobile device itself is disabled.


Description: The cloud has become a convenient source of computing resources, as it is advertised as giving virtually unlimited access on demand and in real time. Scientists working with scientific computing applications can definitely take advantage of cloud computing platforms, but migrating and deploying distributed scientific applications and running the involved experiments on cloud platforms is not a simple task, especially for non-computer scientists.
CloudML is a modelling language developed at the Norwegian research institute SINTEF, which aims to greatly simplify the deployment of service-oriented applications to the cloud. The goal of this thesis is to investigate its suitability for distributed scientific computing applications, design improvements and tools if needed, and demonstrate its usability with scientific use cases.


  • Pig vs Hive in large scale data processing - competitors or partners. (M) (Pelle Jakovits)

Description: Pig and Hive are data processing tools built on top of the Apache Hadoop MapReduce framework. While both can handle most data processing and analysing tasks, Pig is designed mainly for data preparation and transformation, while Hive is a tool for data warehousing, querying and presentation. We want to determine the feasibility of these solutions for different data analysis tasks and the goals of this thesis are to measure and compare their performance in processing large data in a distributed deployment and to determine for which kind of tasks it is best to use one or the other. Another aspect might be measuring the productivity gains when using these tools compared to using plain Hadoop MapReduce.


  • Migrating applications using relational databases to Hadoop by Translating SQL statements to HiveQL. (B) (Satish Srirama, Pelle Jakovits)

Description: The goal of this thesis is to try to (partly) automate the migration of data analysis applications from relational databases to Hadoop by transforming their SQL statements into the SQL-like HiveQL language.


  • InSAR application for studying land subsidence in Estonia. (M) (Pelle Jakovits, Kaupo Voormansik)

Description: Interferometric synthetic aperture radar (InSAR) is a radar technique used in geodesy and remote sensing. Its main advantage is that it is not affected by clouds and can achieve an unprecedented accuracy. The goal of this thesis is to create a parallel interferometry application, which would use phase information from pairs of co-registered SAR images to determine minor ground deformations that occur as after-effects of earthquakes, mining or oil drilling. Input data consists of already co-registered SAR satellite images of Estonia.
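To make "phase information from pairs of co-registered SAR images" concrete: the interferometric phase of a pixel is the argument of one complex pixel value multiplied by the complex conjugate of the other. The sketch below assumes the images are already available as equally sized nested lists of complex numbers; the function name and the sample values are hypothetical, and a real application would additionally need steps such as flat-earth removal, filtering and phase unwrapping.

```python
import cmath
from typing import List

ComplexImage = List[List[complex]]


def interferometric_phase(first: ComplexImage, second: ComplexImage) -> List[List[float]]:
    """Per-pixel wrapped phase difference (in radians) of two co-registered images."""
    if len(first) != len(second) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(first, second)
    ):
        raise ValueError("images must have identical dimensions")

    # arg(a * conj(b)) is the phase of a minus the phase of b, wrapped to [-pi, pi].
    return [
        [cmath.phase(a * b.conjugate()) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


if __name__ == "__main__":
    # Hypothetical 1x2 images: the second pixel differs by a quarter cycle.
    image_a = [[1 + 0j, 1j]]
    image_b = [[1 + 0j, 1 + 0j]]
    print(interferometric_phase(image_a, image_b))
```

Each output pixel depends only on the two corresponding input pixels, so this step can be split across image tiles and processed in parallel.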



Data Storage and Analysis on the Cloud

(Satish Srirama, Pelle Jakovits)


  • Amazon public data set analysis. (B/M) (Satish Srirama)

Description: Find an interesting data set among the AWS public datasets and apply text mining techniques to find interesting information (e.g. sentiment analysis). One possible example is the Common Crawl corpus. Common Crawl maintains an open repository of web crawl data. You can use distributed data processing tools such as MapReduce, Pig, Hive and Mahout.
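As a rough idea of how sentiment analysis maps onto the MapReduce model, the sketch below runs a map phase and a reduce phase in plain Python on in-memory data. The tiny word lists, the function names and the sample documents are all hypothetical; a real thesis would run on Hadoop with a proper sentiment lexicon or classifier.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical miniature sentiment lexicon.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


def map_phase(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Emit (doc_id, +1) or (doc_id, -1) for every lexicon word in the text."""
    for word in text.lower().split():
        token = word.strip(".,!?;:\"'()")
        if token in POSITIVE:
            yield doc_id, 1
        elif token in NEGATIVE:
            yield doc_id, -1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum the emitted values per document id."""
    scores: Dict[str, int] = defaultdict(int)
    for doc_id, value in pairs:
        scores[doc_id] += value
    return dict(scores)


def score_documents(docs: Dict[str, str]) -> Dict[str, int]:
    """Run the map phase over all documents, then reduce the results."""
    emitted = (pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text))
    return reduce_phase(emitted)


if __name__ == "__main__":
    sample = {"a": "Great product, really good!", "b": "Terrible service. Bad."}
    print(score_documents(sample))
```

Documents that contain no lexicon words emit nothing in the map phase and therefore do not appear in the result.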


  • Migrating an enterprise application from a relational back-end to NoSQL data store. (M) (Satish Srirama)

Description: We are interested in migrating enterprise applications to cloud-scale solutions, because we want to develop methodologies for migrating applications from relational to non-relational data stores. Non-relational data stores, also called NoSQL databases, are a better fit for the cloud because they have been designed for horizontal scaling. Generally they abandon the relational model in favor of simpler key-value based data models. Examples of NoSQL databases include Riak, MongoDB, Cassandra and Neo4J.


  • Natural language data processing using Pig and Hive (M) (Satish Srirama, Pelle Jakovits)

Description: Pig and Hive are tools built on top of the Apache Hadoop MapReduce framework. Pig is meant mainly for data processing and preparation, while Hive is a tool for data warehousing, querying and presentation. We are interested in seeing some natural language processing algorithms implemented using both of these tools. We want to determine the feasibility of these tools for different data analysis tasks and also to measure their performance on large data in a distributed deployment. We are also interested in measuring productivity gains when using these tools compared to using plain Hadoop MapReduce.


  • Large-scale distributed data mining with Hadoop Hive. (B) (Pelle Jakovits)

Description: The goal of this thesis is to research which data mining algorithms can be adapted to Hive. The thesis should demonstrate several such algorithms and compare them to existing non-MapReduce solutions.