Globus Replication Client

The Globus Replication Client is a new tool available in the GT 4.2.0 Replica Update Source Bundle. The replication client will be packaged with GT 4.2.1+. This document describes the installation and usage of the tool.

Prerequisites

The source bundle comes with all of the packages for the Globus Replication Client. This includes client libraries for GridFTP and the Replica Location Service. The bundle includes the Replica Location Server. It does not include the GridFTP Server. The bundle is compatible with GT 4.2.0 installations. It shares the same general prerequisites as GT 4.2.0.

Installation

To install the tool, use the following commands. They assume that the GLOBUS_LOCATION environment variable is already set.

% tar -zxvf gt4.2.0-replica-update-1.0-src_bundle.tar.gz
% cd gt4.2.0-replica-update-1.0/
% ./configure --prefix=$GLOBUS_LOCATION
% make gt4-replication-client gt4-replication-client-test rls
% make install

Testing

The tests require a GridFTP Server listening on port 9001 of the local host. Note that the GridFTP Server normally runs as the root user. See GridFTP for more details. Once the GridFTP Server is running on port 9001 of the local host, run the following script.

% $GLOBUS_LOCATION/test/globus_replica_replication_test_unit/TEST.pl

The above test performs unit testing of the replication client Java API. If the command reports errors or failures, check the web page in the directory named by the script for more details.

Next, run the following script to unit test the globus-replication-client command-line client.

% $GLOBUS_LOCATION/test/globus_replica_replication_test_client/TEST.pl

Usage

The command-line client is installed at $GLOBUS_LOCATION/bin/globus-replication-client. For a complete list of options, use the --help option.

% $GLOBUS_LOCATION/bin/globus-replication-client --help

The client supports the common data operations put, get, copy, and delete, as well as register and replicate. The following examples assume that:

  • an RLS is running on the local host using the default port (39281);
  • the local RLS catalog is updating the local RLS index service (OPTIONAL); and
  • a GridFTP Server is running on the local host using port 9001.

Put

The put command takes a source file, either a local file (e.g., file://...) or a remote file (e.g., gsiftp://...), transfers it to a destination location, and registers it with a designated logical name.

% echo "FOO WAS HERE!" > foo.orig
% $GLOBUS_LOCATION/bin/globus-replication-client -r rls://localhost put \
 ./foo.orig foo gsiftp://`hostname`:9001/tmp/foo.put

Get

The get command looks up a replica using its logical name, randomly selects a replica source location, and retrieves the file to a local file location.

% $GLOBUS_LOCATION/bin/globus-replication-client -r rls://localhost get \
 foo ./foo.get
% cat ./foo.get
 FOO WAS HERE!

Copy

The copy command looks up a replica using its logical name, randomly selects a replica source location, and performs a third-party transfer to a remote location. It does not register the new copy in the RLS, so the new remote file is called a copy, not a replica.

% $GLOBUS_LOCATION/bin/globus-replication-client -r rls://localhost copy \
 foo gsiftp://`hostname`:9001/tmp/foo.copy

Replicate

The replicate command is nearly identical to the copy command. The difference is that after the file is transferred to its new destination, the file is registered in the RLS as another replica associated with the logical name.

Public Interfaces

Public interfaces for the replication client are available at Globus Replication Client API (pre-release).

Development

The Globus Replication Client also includes an API for developing custom clients or other services that use data replication. The API is written in Java and requires JDK 1.5+. It uses Java Generics to improve the type safety of the interfaces and to make them more self-documenting.

Import Packages

While the replication client is contained within the org.globus.replica.replication package, other Globus packages will be needed for use in parameters.

import java.io.File;
import org.globus.gsi.GlobusCredential;
import org.globus.gsi.gssapi.GlobusGSSCredentialImpl;
import org.globus.util.GlobusURL;
import org.globus.replica.replication.Replication;
import org.globus.replica.replication.ReplicationFactory;
import org.globus.replica.replication.TransferClientFactory;
import org.globus.replica.naming.Name;
import org.globus.replica.naming.Registry;
import org.globus.replica.naming.RegistryLocator;
import org.globus.replica.naming.provider.rls.GlobusRLSRegistryLocator;
import org.ietf.jgss.GSSCredential;

Instantiate a Credential

A GSSCredential is required. In this example, the client gets the default Globus credential and converts it into a GSSCredential.

GSSCredential credential = new GlobusGSSCredentialImpl(
		GlobusCredential.getDefaultCredential(),
		GlobusGSSCredentialImpl.INITIATE_AND_ACCEPT);

Locate a Replica Naming Registry

Replica names are managed in a replica naming registry. The default provider uses the RLS to manage replica location information.

GlobusURL registryURL = new GlobusURL("rls://hostname");
RegistryLocator<Name> locator = GlobusRLSRegistryLocator.getInstance();
Registry<Name> registry = locator.locateRegistry(registryURL, credential);

As seen in the example, the RLS-based registry uses Name objects as the targets of a mapping. Every mapping begins with a Name (the logical name) and binds it to an object. Because the RLS also uses a Name as the target of a binding, the mapping is from Name to Name. In Registry<Name>, the <Name> is the type argument that indicates the class of the target objects.
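To make the idea of binding a logical name to several targets concrete, here is a minimal in-memory sketch. It is not the Globus Registry<T> API: the class SketchRegistry and its bind and lookup methods are hypothetical names chosen for illustration, and plain strings stand in for Name objects. It uses Java 8 syntax for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a replica registry.
// This is NOT the Globus Registry<T> interface; every name here is invented.
// T is the type of the objects that a logical name is bound to.
final class SketchRegistry<T> {
    private final Map<String, List<T>> bindings = new HashMap<>();

    // Bind one more target object to a logical name.
    void bind(String logicalName, T target) {
        bindings.computeIfAbsent(logicalName, k -> new ArrayList<>()).add(target);
    }

    // Return every target bound to the logical name (empty if none).
    List<T> lookup(String logicalName) {
        List<T> targets = bindings.get(logicalName);
        if (targets == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(targets);
    }

    public static void main(String[] args) {
        // One logical name, two physical locations.
        SketchRegistry<String> registry = new SketchRegistry<>();
        registry.bind("foo", "gsiftp://hostname/tmp/foo.put");
        registry.bind("foo", "gsiftp://hostname/tmp/foo.replica");
        System.out.println(registry.lookup("foo"));
    }
}
```

The type parameter plays the same role as the <Name> in Registry<Name>: it fixes, at compile time, what kind of object a lookup returns.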

Create a Transfer Client Factory

The TransferClientFactory is an interface that allows the replication client to use different transfer tools and protocols. The user may create TransferClientFactory classes to support a preferred transfer protocol. The replication client package comes with an implementation of the TransferClientFactory interface that supports the GridFTP protocol.

TransferClientFactory transferClientFactory = 
	new org.globus.replica.replication.SimpleGridFTPClientFactory(credential);

The constructor used in the example sets only the credential parameter of the factory. Any connection created with this factory instance will use the given credential for authentication and authorization. The SimpleGridFTPClientFactory has constructors that accept most of the common GridFTP-related parameters, such as TCP buffer size, parallelism, and several others.

Get a Replication Factory

A replication factory creates replication instances. The ReplicationFactory is a generic class. Its type parameter indicates the type of objects used by the replica Registry<T> interface supplied to the factory. The SimpleReplicationFactory is an implementation of ReplicationFactory<Name> that supports registries of Name objects.

ReplicationFactory<Name> factory =
	org.globus.replica.replication.SimpleReplicationFactory.getInstance();

Create a New Replication

The Replication object is the primary interface for performing replication operations. Use the replication factory to create new replication objects. Aside from the replication registry and the transfer client factory objects discussed already, the factory also requires a replica selection algorithm. The algorithm must implement the org.globus.replica.replication.ReplicaSelection<T> interface. The type parameter T of the interface should match the type parameter of the replica Registry<T> implementation. In the following example, the random algorithm implementation is used.

Replication replication = factory.newReplication(
	registry,
	new org.globus.replica.replication.RandomReplicaSelection<Name>(),
	transferClientFactory);
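As an illustration of what a pluggable selection algorithm might look like, the following self-contained sketch picks one candidate uniformly at random. The interface Selection<T>, its select method, and the class RandomSelection are assumptions made for this example; the actual method signature of org.globus.replica.replication.ReplicaSelection<T> is not shown in this document and may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a replica selection strategy.
// The names below are invented and are NOT the Globus ReplicaSelection<T> API.
final class SelectionSketch {

    // A strategy that chooses one replica out of the candidates.
    interface Selection<T> {
        T select(List<T> candidates);
    }

    // Chooses a candidate uniformly at random.
    static final class RandomSelection<T> implements Selection<T> {
        private final Random random;

        RandomSelection(Random random) {
            this.random = random;
        }

        @Override
        public T select(List<T> candidates) {
            if (candidates.isEmpty()) {
                throw new IllegalArgumentException("no replicas to select from");
            }
            return candidates.get(random.nextInt(candidates.size()));
        }
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList(
                "gsiftp://hostname/tmp/foo.put",
                "gsiftp://hostname/tmp/foo.replica");
        Selection<String> selection = new RandomSelection<>(new Random());
        System.out.println("selected: " + selection.select(replicas));
    }
}
```

Keeping the strategy behind an interface means a different policy (for example, one that prefers a nearby host) could be substituted without changing the code that calls it.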

Put

The put method accepts a source file, a logical name, and a sink URL. An easy way to create a Name instance for use as a parameter with the replication client is to instantiate SimpleName. This class accepts a string in its constructor.

File source = new File("/tmp/foo.original");
GlobusURL sink = new GlobusURL("gsiftp://hostname/tmp/foo.put");
Name name = new org.globus.replica.naming.SimpleName("foo");

replication.put(source, name, sink);

Get

The get method accepts a logical name and a sink file.

File sink = new File("/tmp/foo.get");
replication.get(name, sink);

Replicate

The replicate method accepts a logical name and a sink URL. The operation resolves the logical name to the set of bound replicas, uses the selection algorithm to choose among the replicas, transfers the chosen replica to the new replica location, and then binds the new location to the logical name.

GlobusURL sink = new GlobusURL("gsiftp://hostname/tmp/foo.replica");
replication.replicate(name, sink);
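The sequence that replicate performs (resolve, select, transfer, bind) can be traced with a small in-memory model, which also shows how it differs from a plain copy. This is a sketch under stated assumptions only: ReplicateSketch and all of its methods are hypothetical, no data is actually transferred, and none of these names belong to the Globus API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical walk-through of the copy and replicate steps.
// Nothing here is a Globus API; transfers are only recorded, not executed.
final class ReplicateSketch {
    // Logical name -> replica locations (stands in for the registry).
    private final Map<String, List<String>> registry = new HashMap<>();
    // Log of simulated transfers in the form "source -> sink".
    final List<String> transfers = new ArrayList<>();
    private final Random random;

    ReplicateSketch(Random random) {
        this.random = random;
    }

    // Bind a location to a logical name.
    void register(String name, String location) {
        registry.computeIfAbsent(name, k -> new ArrayList<>()).add(location);
    }

    // Resolve a logical name to a snapshot of its bound locations.
    List<String> replicasOf(String name) {
        return new ArrayList<>(registry.getOrDefault(name, new ArrayList<>()));
    }

    // Copy: resolve, select at random, transfer. The sink is NOT registered.
    void copy(String name, String sink) {
        List<String> replicas = replicasOf(name);
        if (replicas.isEmpty()) {
            throw new IllegalStateException("no replica bound to " + name);
        }
        String source = replicas.get(random.nextInt(replicas.size()));
        transfers.add(source + " -> " + sink);
    }

    // Replicate: the same steps as copy, then bind the sink to the name.
    void replicate(String name, String sink) {
        copy(name, sink);
        register(name, sink);
    }

    public static void main(String[] args) {
        ReplicateSketch sketch = new ReplicateSketch(new Random());
        sketch.register("foo", "gsiftp://hostname/tmp/foo.put");
        sketch.copy("foo", "gsiftp://hostname/tmp/foo.copy");
        sketch.replicate("foo", "gsiftp://hostname/tmp/foo.replica");
        System.out.println("replicas:  " + sketch.replicasOf("foo"));
        System.out.println("transfers: " + sketch.transfers);
    }
}
```

After the run, the logical name resolves to the original location and the replicated one, while the copied file exists only in the transfer log.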