Incubator/GEMLCA

GEMLCA is an Incubator project that aims to create a production-level solution to "gridify" (or grid-enable) legacy codes in order to run them on the Grid.

This GlobDev project webpage contains information for project committers and contributors. Information for users of GEMLCA can be found here.

Incubator Project Overview

Overview of the project objectives:

  • To support the easy deployment of legacy code programs exposed as Grid services. To achieve this, the interfaces provided to the Grid client have to cover the full life-cycle of legacy code deployment, execution and administration.
  • To minimise the time and effort required to upgrade and migrate GEMLCA onto new platforms. Given the rapid evolution of Grid systems and middleware solutions, this requirement is particularly crucial (see the OGF Grid Interoperation Now (GIN) initiative).
  • To help Grid system administrators by providing a flexible architecture that can be easily deployed on several sites with minimum effort.

GEMLCA concept

GEMLCA enables legacy code programs written in any source language (C, Fortran, Java, etc.) to be deployed as Grid services without significant user effort. GEMLCA does not require any modification of, or even access to, the original source code. A user-level description of the necessary input and output parameters and environment values is all that is needed to port the legacy application binaries onto the Grid. The GEMLCA concept is built on three actors: resource managers, code owners and code users. Code owners have to register their legacy code applications to make them accessible to the Grid community.

Figure: GEMLCA concept

Code owners should ask resource managers to deploy the registered legacy applications on the Compute Servers as Grid services. Once a legacy code service has been registered by its code owner and deployed by a Compute Server resource manager, neither code owners nor code users have to do any further deployment. Code users who have access to the GEMLCA Services can invoke the pre-deployed legacy code services immediately.

The implementation of the GEMLCA concept incorporates the GEMLCA Client, the GEMLCA Server, the Grid Hosting Environment and the Compute Server.

The GEMLCA Client is built on the GEMLCA client API, and any kind of client can be built using this API. Currently there is a portal client implementation (see the P-GRADE portal) and a command-line client (GEMLCA CLI). A client can be installed on any machine from which a user would like to access the GEMLCA Services. Code owners and code users can use these clients to access the GEMLCA service.

The GEMLCA Server provides a set of GEMLCA Grid services, which expose legacy codes as Grid services. Among other things, the server:

  • exposes a new legacy code application as a Grid service
  • provides the list of available legacy applications and the list of legacy parameters with their default values, and allows the user to modify these
  • supports the submission of legacy jobs to a job manager
  • queries the status of previously submitted legacy jobs
  • retrieves results from legacy applications

On the Grid Hosting Environment, a service-oriented OGSA-based Grid middleware (GT4) is installed to connect the Compute Server to an OGSA-based Grid. The Grid middleware should allow functionality to be exposed as Grid/Web services, and should also support job submission to different job managers such as Condor, Fork and PBS. GEMLCA implementations are currently available for GT2, GT4 and gLite/WMS 3.0.

The Compute Server is a single- or multiple-processor computing system, including PC clusters, on which several legacy codes are already implemented and available. GEMLCA turns these legacy codes into Grid services that can be accessed by Grid users.
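To make the idea of a "user-level description" more concrete, the following is a minimal sketch of how a legacy binary could be described by its parameters and environment alone. It is not GEMLCA's actual description format: the class names, fields, paths and the example application are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Parameter:
    """One command-line parameter of a legacy executable (hypothetical model)."""
    name: str
    flag: str              # command-line switch, e.g. "-n"
    default: str           # default value registered by the code owner
    is_input: bool = True  # False marks an output parameter


@dataclass
class LegacyCodeDescription:
    """Describes a legacy binary without needing its source code."""
    name: str
    executable: str        # path of the binary on the Compute Server
    environment: Dict[str, str] = field(default_factory=dict)
    parameters: List[Parameter] = field(default_factory=list)

    def command_line(self, overrides: Optional[Dict[str, str]] = None) -> List[str]:
        """Build argv, letting a code user override the registered defaults."""
        overrides = overrides or {}
        unknown = set(overrides) - {p.name for p in self.parameters}
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        argv = [self.executable]
        for p in self.parameters:
            argv += [p.flag, overrides.get(p.name, p.default)]
        return argv


# Hypothetical example: a traffic simulator registered by a code owner.
traffic_sim = LegacyCodeDescription(
    name="traffic-sim",
    executable="/opt/legacy/bin/traffic_sim",
    environment={"SIM_HOME": "/opt/legacy"},
    parameters=[
        Parameter("network", "-n", "network.dat"),
        Parameter("output", "-o", "result.out", is_input=False),
    ],
)
```

In this sketch a code user only supplies the values that differ from the defaults, for example `traffic_sim.command_line({"network": "city.dat"})`.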

GEMLCA architecture

GEMLCA has a three-layer architecture. The front-end layer offers a WSRF service interface, built on the GT4 WSRF service core, that any authorized Grid user can use to retrieve the list of available legacy codes, to administer legacy codes and to execute legacy codes. The front-end layer hides the core layer, which deals with legacy codes as legacy code processes, together with their environments and parameters. The back-end layer connects GEMLCA to different Grid middleware. GEMLCA can currently submit legacy code jobs to EGEE and GT2 Grids, and invoke legacy code applications as Grid services in GT4-based Grids.
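The separation between the three layers can be illustrated with a small sketch. This is only a toy model of the layering described above, not the real WSRF interfaces: every class and method name here (`BackendPlugin`, `Core`, `FrontEnd`, `EchoBackend`) is an assumption made for the example, and the back-end merely records what it would submit.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BackendPlugin(ABC):
    """Back-end layer: one plug-in per Grid middleware (hypothetical interface)."""

    @abstractmethod
    def submit(self, argv: List[str], env: Dict[str, str]) -> str:
        """Submit a legacy code process and return a job identifier."""


class EchoBackend(BackendPlugin):
    """Stand-in back-end that only records what it would have submitted."""

    def __init__(self):
        self.submitted = []

    def submit(self, argv, env):
        self.submitted.append((argv, env))
        return f"job-{len(self.submitted)}"


class Core:
    """Core layer: handles legacy codes as processes with environments and parameters."""

    def __init__(self, backend: BackendPlugin):
        self._backend = backend
        self._codes = {}

    def register(self, name, executable, env):
        self._codes[name] = (executable, dict(env))

    def names(self):
        return sorted(self._codes)

    def run(self, name, args):
        executable, env = self._codes[name]
        return self._backend.submit([executable, *args], env)


class FrontEnd:
    """Front-end layer: the only interface a Grid user sees; it hides the core."""

    def __init__(self, core: Core, authorized_users):
        self._core = core
        self._authorized = set(authorized_users)

    def _check(self, user):
        if user not in self._authorized:
            raise PermissionError(f"{user} is not authorized")

    def list_codes(self, user):
        self._check(user)
        return self._core.names()

    def execute(self, user, name, args):
        self._check(user)
        return self._core.run(name, args)


backend = EchoBackend()
core = Core(backend)
core.register("traffic-sim", "/opt/legacy/bin/traffic_sim", {"SIM_HOME": "/opt/legacy"})
service = FrontEnd(core, authorized_users={"alice"})
job_id = service.execute("alice", "traffic-sim", ["-n", "city.dat"])
```

Because the core only talks to the `BackendPlugin` interface, a different middleware could be supported in this sketch by adding another subclass without touching the front-end.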

Figure: GEMLCA architecture

GEMLCA versions before 3.2 supported n->1 mapping, i.e. multiple GEMLCA Services were able to submit jobs to a single GT4 service. A further limitation was that the Grid middleware and the file transfer software had to be installed on the same node. The latest GEMLCA release (GEMLCA 3.2) supports n->m mapping, i.e. a single GEMLCA Service can submit jobs to multiple GT2 GRAMs and/or service requests to multiple GT4 WS-GRAMs using the GT2 and GT4 plug-ins, respectively. This release can also work with the file transfer software and the GT4 Grid middleware when they are installed on different nodes. GEMLCA incorporates an application repository with a job submitter to support the publishing and execution of legacy applications on the Grid. Applications can be exposed via a GEMLCA service and executed using a GEMLCA client. As soon as an application is published and deployed, GEMLCA is able to submit it through the back-end plug-ins.
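The idea of one service submitting to several back-end endpoints can be sketched as a small routing table. This is an illustrative model only: the `JobSubmitter` class, the endpoint names and the "first deployed site" selection rule are assumptions, and the endpoints are plain functions rather than real GT2 or GT4 plug-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A back-end endpoint is modelled as a callable that takes argv and returns a job id.
Submitter = Callable[[List[str]], str]


class JobSubmitter:
    """Routes each job to one of several registered back-end endpoints."""

    def __init__(self):
        self._endpoints: Dict[str, Submitter] = {}
        self._sites: Dict[str, List[str]] = {}  # application -> endpoints it is deployed on

    def add_endpoint(self, name: str, submit: Submitter) -> None:
        self._endpoints[name] = submit

    def deploy(self, application: str, endpoint: str) -> None:
        if endpoint not in self._endpoints:
            raise KeyError(f"unknown endpoint: {endpoint}")
        self._sites.setdefault(application, []).append(endpoint)

    def executor_sites(self, application: str) -> List[str]:
        return list(self._sites.get(application, []))

    def submit(self, application: str, argv: List[str],
               site: Optional[str] = None) -> Tuple[str, str]:
        """Submit to the requested site, or to the first site the code is deployed on."""
        sites = self.executor_sites(application)
        if not sites:
            raise LookupError(f"{application} is not deployed anywhere")
        chosen = site if site is not None else sites[0]
        if chosen not in sites:
            raise LookupError(f"{application} is not deployed on {chosen}")
        return chosen, self._endpoints[chosen](argv)


# Hypothetical endpoints standing in for a GT2 GRAM and a GT4 WS-GRAM.
submitter = JobSubmitter()
submitter.add_endpoint("gt2-site-a", lambda argv: "gt2:" + " ".join(argv))
submitter.add_endpoint("gt4-site-b", lambda argv: "gt4:" + " ".join(argv))
submitter.deploy("traffic-sim", "gt2-site-a")
submitter.deploy("traffic-sim", "gt4-site-b")
```

A real scheduler would likely choose a site by load or policy; the fixed "first site" rule is used here only to keep the sketch deterministic.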

Workflow interoperability and Database support

We published and deployed OGSA-DAI clients and workflow engines using GEMLCA to support data-level interoperation and workflow-level interoperability.

Database support

We developed three OGSA-DAI clients (a query client, an update client and a request document client), described them as GEMLCA legacy codes, and uploaded them to the GEMLCA repository. To access a database, the OGSA-DAI node connects to the GEMLCA repository and gets the OGSA-DAI client. GEMLCA then submits the OGSA-DAI client to the Grid on behalf of the OGSA-DAI node. The client connects to the specified OGSA-DAI service, invokes the request and later retrieves the result. More about this here.

Workflow support

Command-line workflow engines, just like any other legacy code applications, can be published via GEMLCA without any kind of re-engineering. Currently, the Kepler, Taverna and Triana workflow engines have been described as legacy code applications. These workflow engines are published through a GEMLCA service to make them available. If GEMLCA is integrated into a particular workflow engine, then the published workflow engines can be executed as non-native workflow nodes by this workflow engine. For example, inside a Taverna workflow, Kepler and Triana workflows (which are non-native workflows) are executed on their own workflow engines through GEMLCA. More about this here.
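The nesting described above can be sketched as follows. This is a toy model under stated assumptions: `fake_gemlca_run` is a hypothetical stand-in for a GEMLCA client call that runs a published legacy code by name, and the engine names and workflow file names are placeholders. No real workflow engine is invoked.

```python
from typing import Callable, List

# Hypothetical signature of a GEMLCA client call: (published code name, arguments) -> output.
RunLegacyCode = Callable[[str, List[str]], str]


def fake_gemlca_run(code_name: str, args: List[str]) -> str:
    """Stand-in that only formats what would have been executed."""
    return f"{code_name}({','.join(args)})"


def make_nonnative_node(run: RunLegacyCode, engine: str, workflow_file: str):
    """Wrap an embedded workflow as a node that runs on its own engine via GEMLCA."""
    def node(inputs: List[str]) -> str:
        return run(engine, [workflow_file, *inputs])
    return node


def run_workflow(nodes, inputs: List[str]) -> List[str]:
    """Minimal host workflow: feed each node's output into the next node."""
    data = list(inputs)
    for node in nodes:
        data = [node(data)]
    return data


# A host workflow with two non-native nodes, each delegated to a different engine.
host_workflow = [
    make_nonnative_node(fake_gemlca_run, "kepler", "a.xml"),
    make_nonnative_node(fake_gemlca_run, "triana", "b.xml"),
]
result = run_workflow(host_workflow, ["in.dat"])
```

The host engine never needs to understand the embedded workflow's language; in this sketch it only passes inputs to, and collects outputs from, the published engine.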

Figure: GEMLCA workflow and database integration

User applications and communities

The University of Westminster launched the Westminster Grid Application Support Service (W-GRASS) in May 2007 to migrate legacy code applications to the Grid and to support user communities who are interested in utilising GEMLCA. Since May 2007 the following applications have been ported to the Grid using GEMLCA (detailed information on the projects, applications and user communities can be found by following the links below):

Bio-science

  • CHARMM - macromolecular simulator for energy minimisation and molecular dynamics
  • GAMESS-UK - ab initio molecular electronic structure program
  • MultiBayes - program for analysing DNA sequences of genes
  • patient readmission - calculation of patient readmission rates in NHS hospitals using R
  • protein molecule modeling - using AutoDock & Amber to perform docking simulation

Engineering

  • DSP - calculation of a class of optimal periodic non-uniform sampling sequences
  • MadCity - urban traffic simulator

Incubator Project Metadata

Status

The status of GEMLCA is: Newly accepted Incubator Project 6/15/2007, as defined by the Incubator Process Guidelines found at http://dev.globus.org/wiki/Incubator/Incubator_Process .

Roadmap

Development roadmap for GEMLCA. Summary of the tasks in the roadmap:

  1. Milestone: development of back-end plug-ins.
    • Output: EGEE plug-in to support direct job submission and job submission through gLite/LCG broker
    • Release: 2.6
  2. Milestone: de-coupling GRAM and GridFTP.
    • Output: GRAM and GridFTP can be installed on different nodes
    • Release: 3.0 made on 13/12/2007
  3. Milestone: de-centralised GEMLCA.
    • Output: a single GEMLCA service is able to submit jobs to multiple WS-GRAMs (1->n mapping)
    • Release: 3.1
  4. Milestone: upgraded legacy code management
    • Output: upgraded admin client + admin service, GEMLCA with registry + repository, improved security solution
    • Release: 3.2 made on 17/11/2008
    • Features:
      • Supports n -> m mapping (i.e. a single GEMLCA Service can submit jobs to multiple GT2 GRAMs and/or service requests to multiple GT4 WS-GRAMs)
      • Service front-end consolidated
      • New GemlcaServiceFactory and GemlcaService interfaces
      • De-coupled back-ends
      • Support for multiple back-end plugins
      • GT2 backend plugin re-engineered. All plugins are written in Java and support notification & polling
      • GT2, GT4 and gLite UI WMS plugins available & in production with NGS & EGEE
      • Extended LCID with back-end support, list of executor sites
      • Monitoring of GEMLCA job execution statistics to MDS4
      • Used in production on the UK NGS
      • Tested with TG and OSG
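The feature list mentions that the plug-ins support both notification and polling. The difference between the two styles can be sketched generically as follows. This is not the plug-in API: the function and class names and the job state names (`PENDING`, `ACTIVE`, `DONE`, `FAILED`) are assumptions made for illustration.

```python
import time
from typing import Callable, List

TERMINAL = {"DONE", "FAILED"}  # hypothetical terminal job states


def poll(get_status: Callable[[], str], interval: float = 1.0,
         max_polls: int = 100, sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Polling: the client repeatedly asks for the status until it is terminal."""
    history: List[str] = []
    for _ in range(max_polls):
        status = get_status()
        if not history or status != history[-1]:
            history.append(status)  # record only state changes
        if status in TERMINAL:
            return history
        sleep(interval)
    raise TimeoutError("job did not finish within max_polls")


class NotifyingJob:
    """Notification: the job pushes every state change to its subscribers."""

    def __init__(self):
        self.status = "PENDING"
        self._subscribers = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def set_status(self, status: str) -> None:
        if status == self.status:
            return  # no change, nothing to announce
        self.status = status
        for callback in self._subscribers:
            callback(status)
```

Polling is simpler to deploy because the client needs no reachable callback endpoint, while notification avoids repeated status queries; supporting both lets a client choose whichever fits its network environment.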

Further Information

Demos

SC'08 - International Conference for High Performance Computing, Networking, Storage and Analysis

15-21 November 2008, Austin, Texas, USA

The W-GRASS and Centre for Parallel Computing team will present several demonstrations at this year's Supercomputing conference. The demonstrations focus on the interoperation of Grid compute and storage resources and on the interoperability of various Grid workflow solutions, including Taverna, Triana, Kepler and P-GRADE, and will also showcase applications ported to the Grid by W-GRASS. You can find more information about the demos here.

Presentations

A generic overview presentation on GEMLCA is available here.

Other GEMLCA-related presentations and training material are also available from the EGEE Training Material Library.

Publications

A list of GEMLCA-related publications can be found here.

Incubator Project Metadata

Committers

If you would like to become a committer, guidelines are [here].

  • Dr. Thierry Delaitre
  • Gabor Kecskemeti
  • Zsolt Lichtenberger
  • Tamas Kukla
  • Alexender Tudose

Mailing Lists

Developer discussion (gemlca-dev) archive/subscribe/unsubscribe
User discussion (gemlca-user) archive/subscribe/unsubscribe
Announcements (gemlca-announce) archive/subscribe/unsubscribe
Commit notifications (gemlca-commit) archive/subscribe/unsubscribe

How to subscribe
How to unsubscribe
Search the email archives

Download

You can get the binary of GEMLCA release 3.2 here:

Gemlca-r3.2-install.zip

You can get the manual of GEMLCA release 3.2 here:

Gemlca-r3.2-manual.pdf

You can get the source of GEMLCA release 3.2 through SVN:

svn co https://svn.globus.org/repos/GEMLCA/branches/3.2

You can get the binary of GEMLCA release 3.0 here:

Gemlca-r3.0-install.zip

You can get the manual and the source of GEMLCA release 3.0 through SVN:

svn co https://svn.globus.org/repos/GEMLCA/branches/3.0

Policies

In addition to the Globus Alliance Project Guidelines, GEMLCA adheres to the following policies:

Guidelines for committers

  • There are currently no policies beyond those required by dev.globus.

Guidelines for individual contributors

  • There are currently no policies beyond those required by dev.globus.

Contributors

The GEMLCA project gratefully acknowledges contributions from the following people:

  • Dr. Gabor Terstyanszky
  • Tamas Kiss