NEWT

A fault tolerant BSP framework on Hadoop YARN

We looked for a distributed computing framework that could be utilized for running iterative scientific computing algorithms in the cloud, but most solutions proved lacking in some respect. This is why we decided to create our own with the following goals:

  • Provide automatic fault recovery.
  • Retain the program state after fault recovery.
  • Provide a convenient programming interface.
  • Support (iterative) scientific computing applications.

The following is an example of a linear system of equations solver, using the conjugate gradient method, implemented in NEWT:

  • One has to register the program definition with the application master.

public class CGTestMaster {
	public static void main(String[] args) {
		//simply initialize the default configuration
		Configuration conf = NEWTConfiguration.createDefault();
		//the test data will be generated by the worker nodes
		//so only use commandline arguments as input and 
		//specify the number of processes to request
		Input input = new CommandLineInput(conf, args, Integer.parseInt(args[0]));
		//create an application master, using the configuration,
		//the state class definition and the input
		FaultTolerantJobMaster am =
				new FaultTolerantJobMaster(conf, CGState.class, input);
		//register the application code with the master
		am.addStage("init", new CG.Init());
		am.addStage("beginLoop", new CG.BeginLoop());
		am.addStage("checkCondition", new CG.CheckCondition());
		am.addStage("doMatVec", new CG.DoMatVec());
		//register our custom message type
		am.registerMessageType(DoubleArrayWritable.class);
		//finally, run the application master
		am.run();
	}
}

The CGState.class contains the definitions of all state variables used by the algorithm and instructions to serialize them to a bitstream. The objects that are given as arguments to am.addStage contain the code of the program, an example of these stages is:


//verbose Java 1.6 "closure" definition
//can be made much more concise with Java 1.7
//or a Scala interface
public static class CheckCondition extends Stage {
	//the method takes the BSPComm communicatior,
	//the state object and the superstep number as arguments
	@Override
	public String execute(BSPComm bsp, CGState state, int superstep) {
		//we know that on the previous superstep each process
		//sent a message of 2 doubles, containing the partial
		//dot product and local maximum norm of the residual
		//vector (which is used as an error estimate)
		double[] recv = getFromAll(bsp, 2);
		state.u += recv[0];
		state.error = Math.max(state.error, recv[1]);
		//depending on some state variables, determine what should be done next
		if (state.error > TOLERANCE && state.it <= MAX_IT) {
			if (state.it == 0) {
				for (int i = 0; i < state.p.size(); i++) {
					state.p.getData()[i] = state.z.getData()[i];
				}
			} else {
				double v = state.u/state.ou;
				CGMath.scalevector(state.p.getData(), v);
				CGMath.addVector(state.p.getData(), state.z.getData());
			}
			//we need to synchronize the vector p for matrix vector multiplication
			sendToNeighbors(bsp, state, state.p.getData());
			//do matrix vector multiplication on the next superstep
			return "doMatVec";
		} else {
			//we're done, write result to disk and stop
			return Stage.STAGE_END;
		}
	}
}

Here the getFromAll and sendToNeighbors are implemented using the communicator’s send function to transmit parts of some of the state from the CGState object. Since these are common communication patterns, these might eventually be implemented on the framework level. Some mathematical operations that need to be performed on the state variables are implemented in a separate class CGMath.

  • To run the program, it has to be submitted to the YARN cluster:
public class TestJob {
	public static void main(String[] args) {
		NEWTClient client = new NEWTClient();
		client.submit(CGTestMaster.class, args, true);
		client.close();
	}
}

For some general information on NEWT, refer to this poster.

While the framework is not in the shape where it can be used for any serious projects yet, it’s open-source and available at: https://bitbucket.org/mobilecloudlab/newt

Aneka Cloud

Aneka plays the role of Application Platform as a Service for Cloud Computing. Aneka supports various programming models involving Task Programming, Thread Programming and MapReduce Programming and tools for rapid creation of applications and their seamless deployment on private or public Clouds to distribute applications. The goal of this study is to use the Aneka platform to establish hybrid clouds for solving scientific computing problems. We have set up a local Aneka Cloud in the University of Tartu and the idea is to also involve resources from public clouds such as Amazon EC2 when larger scientific computing experiments need to be executed.

People

REMICS

REuse and Migration of legacy applications to Interoperable Cloud Services (REMICS).

REMICS is an EU FP7 project. The goal of REMICS is to develop advanced model-driven methodology and tools for REuse and Migration of legacy applications to Interoperable Cloud Services. Combining the deployment models of the cloud paradigm with the SaaS (Software as a Service) model allows software companies to reuse and extend their legacy applications, compose them in new ways, and reach new markets. REMICS also takes an active role in the standardization of metamodels and languages in the OMG activities around Architecture-Driven Modernization, SOA and Cloud Computing, Model-Driven Interoperability, Models@Runtime, Model Checking and Model-based Testing. Within the project, we are dealing with migrating legacy OLAP/OLTP applications to the cloud frameworks. For this purpose, we are developing a desktop to cloud migration (D2CM) tool. The tool integrates several software libraries to facilitate virtual machine image migration and management of experiments running in the cloud. We are also studying and developing a framework for verifying the performance and scalability of applications running in the cloud.

Homepage

People

Publications

  • M. Vasar, S. N. Srirama, M. Dumas: Framework for Monitoring and Testing Web Application Scalability on the Cloud, Nordic Symposium on Cloud Computing & Internet Technologies (NORDICLOUD 2012), Co-located event at Joint 10th Working IEEE/IFIP Conference on Software Architecture & 6th European Conference on Software Architecture (WICSA/ECSA 2012), August 20-24, 2012. (Accepted for publication)
  • M. Mazzucco, M. Vassar, M. Dumas: Squeezing out the Cloud via Profit-Maximizing Resource Allocation Policies, In Proceedings of the 20th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Arlington, VA, USA, August 2012. IEEE Computer Society.
  • M. Mazzucco, I. Mitrani: Empirical Evaluation of Power Saving Policies for Data Centers, ACM Greenmetrics, London (UK), June 2012.

Theses

  • M. Vasar, A Framework for Verifying Scalability and Performance of Cloud Based Web Applications, Master’s thesis, University of Tartu, June, 2012. Supervisor: Satish Srirama

Pervasive Mobile Applications

Technological achievements in user context monitoring techniques enable automating certain computational tasks which are executed by anticipating the user’s intention. These kind of techniques are based on the processing of sensor information that is collected from multiple sources such as smartphones, environment, etc.

In the case of smartphones, sensor information is gathered by embedded micromechanical artifacts (e.g. accelerometer, gyroscope, etc) and processed locally in real-time (in the phone) for changing some usability aspects in the mobile applications. For instance, the accelerometer sensor can be used for rotating the screen of the device depending on how the user is holding the handset. Another example is the light sensor which enables augmenting and decreasing the brightness of the mobile screen according to the situation of the environment (e.g. indoor, outdoor, etc).

Similarly, environmental sensor information is provisioned as a service (raw data) to mobile users by locating multiple microelectromechanical appliances within the environment. For instance, a thermistor sensor can be located for perceiving the temperature in the context of the user so that a mobile application can use that information for a proactive reaction (e.g. displaying different background screens or triggering a vibration event in the case of mobile pervasive games).

Research staff

Projects

  • Tanel Tähepõld – Context-aware Mobile Games using Android, Arduino and HTML5 (2012)
  • Martti Marran – Generating Schematic Indoor Representation based on Context Monitoring Analysis of Mobile Users (2012)

Desktop to Cloud Migration (D2CM)

Scientific computing applications usually need huge amounts of computational power. The cloud provides interesting high-performance computing solutions, with its promise of virtually infinite resources.  However, migrating scientific computing problems to clouds and the re-creation of a software environment on the vendor-supplied OS and cloud instances is often a laborious task. It is also assumed that the scientist who is performing the experiments has significant knowledge of computer science, cloud computing and the migration procedure. Most often, this is not the case. Considering this obstacle, we have designed tools that help scientists to migrate their applications to the cloud. The idea is to migrate the complete desktop environment, with which the scientist is working daily, to the cloud directly. The developed desktop-to-cloud-migration (D2CM) tool supports transformation of virtual machine images, deployment description and life-cycle management for applications to be hosted on Amazon’s Elastic Cloud Computing (EC2) or compatible infrastructure such as Eucalyptus. We used an electrochemical case study which extensively used the tool in drawing domain specific results. From the analysis, it was observed that D2CM tool not only helps in the migration process but also helps in optimizing the experiments, clusters and thus the costs for conducting scientific computing experiments on the cloud.

This work is done in the context of the EU FP7 project REMICS – “Reuse and Migration of legacy applications to Interoperable Cloud Services”

Software and guides

Source code is available at: https://bitbucket.org/mobilecloudlab/d2cm

User guide: http://mc.cs.ut.ee/mcsite/projects/d2cm-user-guide-v1.0/at_download/file

People

  • Satish Srirama
  • Chris Willmore
  • Pelle Jakovits
  • David Rodas
  • Vladislav Ivaništšev, Center for Interface Science and Catalysis, University of Strathclyde, UK.
  • Enn Lust, Faculty of Science and Technology, Institute of Chemistry, University of Tartu.

Publications

  • S. N. Srirama, V. Ivanistsev, P. Jakovits, C. Willmore: Direct Migration of Scientific Computing Experiments to the Cloud, The 2013 (11th) International Conference on High Performance Computing & Simulation (HPCS 2013), July 01-05, 2013, pp. 27-34. IEEE. (Nominee for the Outstanding Paper Award)
  • S. N. Srirama, C. Willmore, V. Ivanistsev, P. Jakovits et al.: Desktop to Cloud Migration of Scientific Experiments, 2nd International Workshop on Cloud Computing and Scientific Applications (CCSA) @ 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), May 13-16, 2012. (* No proceedings) Slides

Stratus

The goal of the Stratus framework is to provide a distributed computing platform and tools for performing large scale scientific computing simulations and experiments. The main reason for designing a new cloud computing framework aimed for scientific simulations is that the existing frameworks that we have studied so far (Hadoop, Spark, Twister, etc.), do not provide full support for iterative distributed applications and especially iterative scientific simulations. Additionally, distributed computing frameworks are usually designed for static computer clusters and do not take into account characteristics that have made cloud computing a successful source for computing infrastructure.We find that exploiting cloud computing characteristics like elasticity, agility, scalability and the ability to provision finegrained computing resource on demand and nearly on realtime can greatly improve the applicability of such frameworks and would lower the costs of deploying applications in the public cloud.

People

Publications

  • P. Jakovits, S. N. Srirama, I. Kromonov: Stratus: A distributed computing framework for scientific simulations on the cloud, The Fifth International Symposium on Advances Of High Performance Computing And Networking (AHPCN-2012), In conjunction with The 14th IEEE International Conference on High Performance Computing and Communications (HPCC-2012), 25-27 June, 2012, pp. 1053-1059. IEEE.
  • P. Jakovits, S. N. Srirama: Stratus: Scientific simulations in the cloud, The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 13-16, 2012.

Posters

ROBO M.D.

In this project a home care robot is built, which can approach and initiate communication with the patient being monitored, thus validating the threat of the alerts detected by the sensors attached to the patient. Thus, this robot allows monitoring risk patients such as elderly people, without the need of a caregiver, and allowing them the opportunity to stay at home independently. Scenarios such as the fall detection of elderly patients were demonstrated as part of the project.

Homepage

A youtube video of the first public demonstration of the ROBO M.D. system by Dr. Satish Srirama

Excuse Me, Have You Fallen Down?”, a blog at UT Blog

Presentations

Partners

  1. JKU (UAT): Johannes Kepler University, Institute for Design and Control of Mechatronical Systems
  2. IEIIT (LOM): Italian National Research Council (CNR) Institute of Electronic, Information and  Communication Technologies (IEIIT) Milano Branch(MI)
  3. Fontys (NBR): Fontys University of Applied Sciences
  4. JU (SWB): University of South Bohemia in České Budějovice
  5. UT (Tartu): University of Tartu

Mobile Cloud Applications

Mobile cloud applications are considered as the next generation of mobile applications, due to their promises concerning the efficient utilization of mobile resources (e.g. offloading when it is needed), the rich functionality that can scale on demand and the reasonable levels of real-time interactivity for managing data-intensive tasks. However, adapting the cloud paradigm for mobile devices is still in its infancy and several issues are to be answered first. Some of the prominent questions are; how to decrease the effort and complexity of developing a mobile application that requires accessing distributed hybrid cloud architectures? How to handle a multi-cloud operation without overloading the mobile resources? How to keep the properties (e.g. memory use, application size etc.) of a mobile cloud application similar to that of a native one?

MCM and the resource intensive tasks can easily be envisioned in several scenarios. For instance, we have developed several mobile applications that benefit for going cloud-aware. Zompopo, consists of the provisioning of context-aware services for processing data collected by the accelerometer with the purpose of creating an intelligent calendar. CroudSTag, consists of the formation of a social group by recognizing people in a set of pictures stored in the cloud. Finally, Bakabs is an application that helps in managing the cloud resources themselves from the mobile, by applying linear programming techniques for determining optimized cluster configurations for the provisioning of Web-based applications.

Research staff

Assoc. Prof. Dr. Satish Narayana Srirama

Huber Flores

Publications

  • H. Flores, S. N. Srirama, C. Paniagua: Towards Mobile Cloud Applications: Offloading Resource-Intensive Tasks To Hybrid Clouds, International Journal of Pervasive Computing and Communications, ISSN: 1742-7371. Emerald Group Publishing Limited. (In Print)
  • S. N. Srirama, C. Paniagua, H. Flores: Social Group Formation with Mobile Cloud Services, Service Oriented Computing and Applications Journal, ISSN: 1863-2386. Springer. DOI: 10.1007/s11761-012-0111-5 (In Print)
  • C. Paniagua, S. N. Srirama, H. Flores: Bakabs: Managing Load of Cloud-based Web Applications from Mobiles, The 13th International Conference on Information Integration and Web-based Applications & Services (iiWAS-2011), December 5-7, 2011, pp. 489-495. ACM.

Master theses

Oleg Petshjonkin – Migration of Native Android Applications to HTML5 (2012)

Former staff

Carlos Paniagua

Mobile Cloud Middleware

Hybrid cloud and cloud interoperability are essential for mobile scenarios in order to foster the decoupling of the handset to a specific cloud vendor, to enrich the mobile application with the variety of cloud services provided on the Web and to create new business opportunities and alliances. However, developing a mobile cloud application involves adapting different Web APIs from different cloud vendors within a native mobile platform. To counter the problems with interoperability across multiple clouds, to perform data-intensive processing invocation from the handset and to introduce the platform independence feature for the mobile cloud applications, we have designed a Mobile Cloud Middleware (MCM).

MCM abstracts the Web API of different cloud levels (multiple vendors) using Clojure and provides a unique interface that responds (JSON-based) according to the cloud services requested (REST-based).

Research staff

Assoc. Prof. Dr. Satish Narayana Srirama

Kristjan Reinloo

Publications

ACM DL Author-ize service

H. Flores, S. N. Srirama, C. Paniagua: A Generic Middleware Framework for Handling Process Intensive Hybrid Cloud Services from Mobiles, The 9th International Conference on Advances in Mobile Computing & Multimedia (MoMM-2011), December 5-7, 2011, pp. 87-95. ACM.

Master theses

Huber Flores – Mobile Cloud Middleware (2011)

Former staff

Carlos Paniagua
Huber Flores

SciCloud

SciCloud project studies the establishment of private clouds, migration and execution of scientific computing applications on the cloud, and adapting scientific computing problem/algorithms to frameworks amicable to the cloud like the MapReduce.

Cloud computing has become quite popular in the distributed computing community and to study the cost of science on the clouds, Scientific Computing Cloud (SciCloud) project has been initiated at the University of Tartu. Its main goal is to study the scope of establishing private clouds at universities. With these, students and researchers can efficiently use the already existing resources of university computer networks, in solving computationally intensive scientific, mathematical, and academic problems. The project targets the development of a framework, including models and methods for establishment, proper selection, state management (managing running state and data), auto scaling and interoperability of private clouds.

SciCloud has been established on our high-performance computing (HPC) clusters using the Eucalyptus technology. Current research in the domain is focused at studying the cost of migrating scientific computing applications to the cloud. SciCloud has several customized machine images with support for several scientific computing tools and simulations like Python with NumPy and SciPy, Scilab tool, MPI. Detailed analysis with several benchmark applications like matrix-vector multiplications and NASA Advanced Supercomputing parallel benchmarks (NAS PB) are performed using the setup.

Moreover, to adapt resource-intensive scientific computing applications for the cloud, the applications must be reduced to frameworks that can successfully exploit the cloud resources. Generally, cloud infrastructure is based on commodity computers, which are cost effective, however are bound to fail regularly. This causes a serious problem as, the software has to adapt to failures and the best solution is to replicate the data and computation. One such framework that is built based on this idea is the MapReduce framework, which has gained popularity as a cloud computing framework on which one can perform automatically scalable distributed applications. SciCloud project studied reducing several scientific computing problems/algorithms to MapReduce and designed a classification, based on how the algorithms are adapted to the MapReduce framework.

SciCloud Infrastructure

SciCloud is also the name of the private cloud infrastructure running on the hardware of the University of Tartu. It is divided into separate smaller clouds built using Eucalyptus/OpenStack platforms. Both Eucalyptus and OpenStack are open source cloud platforms compatible on the API level with Amazon EC2 public cloud.

https://stratus.at.mt.ut.ee/horizon

https://katel40.hpc.ut.ee:8443

To work with the SciCloud infrastructure, please meet us in person in Liivi 2 – 311.

Profiling MapReduce

People

Publications

2013

  • Jakovits, Pelle; Srirama, Satish (2013). Adapting scientific applications to cloud by using distributed computing frameworks. In: Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on: Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on, Delft; 13-16 May 2013. IEEE, 2013, 164 – 167.
  • Jakovits, Pelle; Srirama, Satish (2013). Clustering on the cloud: reducing CLARA to MapReduce. In: NordiCloud ’13 Proceedings of the Second Nordic Symposium on Cloud Computing & Internet Technologies: Second Nordic Symposium on Cloud Computing & Internet Technologies, Oslo, 2-3 September 2013. ACM, 2013, 64 – 71.

2012

  • S. N. Srirama, C. Willmore, V. Ivanistsev, P. Jakovits et al.: Desktop to Cloud Migration of Scientific Experiments, 2nd International Workshop on Cloud Computing and Scientific Applications (CCSA) @ 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), May 13-16, 2012.
  • S. N. Srirama, P. Jakovits, E. Vainikko: Adapting Scientific Computing Problems to Clouds using MapReduce, Future Generation Computer Systems Journal, 28(1):184-192, 2012. Elsevier press. DOI 10.1016/j.future.2011.05.025

2011

  • S. N. Srirama, O. Batrashev, P. Jakovits, E. Vainikko: Scalability of Parallel Scientific Applications on the Cloud, Scientific Programming Journal, Special Issue on Science-driven Cloud Computing, 19(2-3):91-105, 2011. IOS Press. DOI 10.3233/SPR-2011-0320.
  • P. Jakovits, I. Kromonov, S. N. Srirama: Monte Carlo linear system solver using MapReduce, 4th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2011), December 5-7, 2011, pp.293-299. IEEE.
  • P. Jakovits, S. N. Srirama, E. Vainikko: MapReduce for Scientific Computing – Viability for non-embarrassingly parallel algorithms, The 14th International Parallel Computing conference (ParCo 2011), August 30-September 2, 2011. Published in: Advances in Parallel Computing book series, Volume 22: Applications, Tools and Techniques on the Road to Exascale Computing, Edited by: K. Bosschere, E. D’Hollander, G. Joubert, D. Padua, F. Peters, M. Sawyer, 2012, ISSN 0927-5452, pp. 117-124. IOS Press.
  • O. Batrashev, S. N. Srirama, E. Vainikko: Benchmarking DOUG on the Cloud, The 2011 International Conference on High Performance Computing & Simulation (HPCS 2011), July 4-8, 2011, pp. 677-685. IEEE.

2010

  • S. N. Srirama, P. Jakovits, E. Vainikko: Adapting Scientific Computing Problems to Clouds using MapReduce, International Conference on Utility and Cloud Computing (UCC 2010) in conjunction with the International Conference on Advanced Computing (ICoAC 2010), December 14-16, 2010. (Extended in FGCS journal)
  • S. N. Srirama, O. Batrashev, E. Vainikko: SciCloud: Scientific Computing on the Cloud, 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), May 17-20, 2010, pp. 579. IEEE Computer Society.