Contact:
Clovis Chapman
Department of Computer Science,
University College London,
Malet Place, WC1E 6BT, London, UK
tel: +44 (0) 207 679 7758
mail: c.chapman [ at ] cs.ucl.ac.uk
Recent Publications:
Clovis Chapman et al. (2010) Software Architecture Definition for On-demand Cloud Provisioning. HPDC 2010. [ link ]

Luis Vaquero et al. (2010) Principles, Methodology and Tools for Engineering Cloud Computing Systems. (IGI Global) [ link ]


What is the Grid?

Extract of PhD Thesis [ pdf ]: Chapter 2 background

Emergence of Grid Standards

The last few years have seen the grid community adopt open standards as a means of facilitating interoperability between independent systems. We are particularly concerned with two of them: Web Services, as a framework for the definition of loosely coupled higher-level services, and the Job Submission Description Language, as a means of specifying job requests independently of distributed resource managers (DRMs). These standards provide a context within which to develop and deploy new services, and also introduce capabilities that will be explored throughout this thesis.

Web Services

Service-oriented architectures, and Web Services in particular, provide a flexible means of specifying and developing services that enable the use of resources across heterogeneous, autonomous domains. Web Service technology has become an increasingly important building block in the design and development of a global grid infrastructure (Foster et al., 2002). The ability to decompose resources and the functionality they provide into a set of discoverable and loosely coupled services - capable of interaction in heterogeneous environments - addresses many of the interoperability issues that can be encountered in large-scale grid infrastructures.

Essentially a collection of XML-based protocols and standards, Web Services ensure that independently developed applications and tools can interact across a network. The reliance on the eXtensible Markup Language (XML) (Bray et al., 1998) as a common formatting language ensures a degree of platform, programming language and system independence. The XML-based protocols and standards define how services should be described (through the Web Service Description Language (WSDL) (Christensen et al., 2003)), how they can be accessed and how communications should be formatted (through the Simple Object Access Protocol (SOAP) (Mitra, 2003)), and finally how services may be discovered by client applications. A wide range of related standards has emerged to take further advantage of the capabilities of Web Services. In a grid context, the need to maintain state and provide asynchronous communication has led to the definition of the Web Service Resource Framework (Czajkowski et al., 2004) and Web Service Notification (Graham et al., 2004) families of standards.
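To make the role of SOAP as a common message format more concrete, the following minimal Python sketch wraps a call in a SOAP 1.2 envelope and unpacks it again on the receiving side. The service namespace, the submitJob operation and its parameters are hypothetical and chosen purely for illustration; a real service would publish its operations in a WSDL document.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
SVC_NS = "http://example.org/grid/jobservice"        # hypothetical service namespace


def build_request(operation, params):
    """Wrap a call to `operation` in a SOAP 1.2 envelope and serialise it."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message):
    """Recover the operation name and its parameters from a serialised envelope."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the call

    def local(tag):
        # Drop the "{namespace}" prefix that ElementTree puts on qualified names.
        return tag.split("}", 1)[1]

    return local(call.tag), {local(child.tag): child.text for child in call}


# Hypothetical operation and parameters, for illustration only.
message = build_request("submitJob", {"jobName": "demo", "cpuCount": 4})
print(read_request(message))
```

Because both sides agree only on the XML message, the sender and receiver could be written in different languages and run on different platforms, which is the independence the paragraph above describes.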

Loose coupling, together with the numerous development kits and compatible applications and tools available, makes Web Services particularly desirable to the grid community. In addition, Web Services enable session management over a single port, which considerably simplifies the administration of firewalls.

Job Submission Description Language

With respect to compute grid environments, the need for flexible means of defining and manipulating job description documents has led to the specification of the Job Submission Description Language (JSDL) (Anjomshoaa et al., 2005), an emerging standard recommended by the Global Grid Forum (Global Grid Forum (GGF), n.d.).

JSDL essentially defines an XML-based vocabulary and schema to describe the requirements and characteristics of computational jobs - such as input/output files, arguments, resource requirements, etc. - for execution on grid resources. Grid environments may rely on a wide range of job submission systems, as well as on intermediate services that need to manipulate job specifications, such as portals, meta-schedulers, accounting systems and security systems. JSDL aims to facilitate interoperability between these by decoupling job specifications from execution environments.
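The decoupling described above can be sketched as follows: one JSDL description is parsed once and then rendered for two different execution environments. This is only an illustration under stated assumptions. The job itself is invented, the Condor-style output uses a small subset of submit-description commands, and nothing here is taken from the framework discussed in the thesis.

```python
import shlex
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

# Hypothetical job description, for illustration only.
JOB = """\
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <posix:POSIXApplication>
        <posix:Executable>/usr/bin/simulate</posix:Executable>
        <posix:Argument>--steps</posix:Argument>
        <posix:Argument>1000</posix:Argument>
        <posix:Output>result.out</posix:Output>
      </posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""


def parse_job(xml_text):
    """Extract the environment-independent job parameters from a JSDL document."""
    app = ET.fromstring(xml_text).find(".//posix:POSIXApplication", NS)
    return {
        "executable": app.findtext("posix:Executable", namespaces=NS),
        "arguments": [arg.text for arg in app.findall("posix:Argument", NS)],
        "output": app.findtext("posix:Output", namespaces=NS),
    }


def to_condor_submit(job):
    """Render the job as a Condor-style submit description (illustrative subset)."""
    return "\n".join([
        f"executable = {job['executable']}",
        f"arguments = {' '.join(job['arguments'])}",
        f"output = {job['output']}",
        "queue",
    ])


def to_shell_command(job):
    """Render the same job as a plain shell command line."""
    command = shlex.join([job["executable"], *job["arguments"]])
    return f"{command} > {shlex.quote(job['output'])}"


job = parse_job(JOB)
print(to_condor_submit(job))
print(to_shell_command(job))
```

Only the two rendering functions know anything about a particular execution environment; portals or meta-schedulers that handle the document earlier need to understand JSDL alone.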

As with Web Services, the adoption of XML opens access to a wide range of technologies and standards with many applications, such as dynamic transformation of documents (e.g. Extensible Stylesheet Language Transformations (Zongaro et al., 2006)), code generation, and queries (e.g. XQuery (Florescu et al., 2006) and XPath (DeRose & Clark, 1999)).
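As a small illustration of querying a job description, the sketch below runs path expressions over a JSDL fragment using the limited XPath subset built into Python's standard xml.etree.ElementTree module. The job name and URIs are invented for the example; full XPath, XSLT or XQuery support would require a dedicated processor.

```python
import xml.etree.ElementTree as ET

NS = {"jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl"}

# Hypothetical document, for illustration only.
DOC = """\
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>align-sequences</jsdl:JobName>
    </jsdl:JobIdentification>
    <jsdl:DataStaging>
      <jsdl:FileName>genome.fa</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source><jsdl:URI>http://example.org/data/genome.fa</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>hits.txt</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Target><jsdl:URI>http://example.org/results/hits.txt</jsdl:URI></jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""

root = ET.fromstring(DOC)

# A child path from the document root down to the job name.
job_name = root.findtext(
    "jsdl:JobDescription/jsdl:JobIdentification/jsdl:JobName", namespaces=NS
)

# ".//" searches at any depth: URIs of files to stage in and to stage out.
stage_in_uris = [u.text for u in root.findall(".//jsdl:DataStaging/jsdl:Source/jsdl:URI", NS)]
stage_out_uris = [u.text for u in root.findall(".//jsdl:DataStaging/jsdl:Target/jsdl:URI", NS)]

print(job_name, stage_in_uris, stage_out_uris)
```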

JSDL elements can be categorized as follows:

  • Job parameters, which describe the various parameters of the job to execute: arguments, environment variables, standard streams, etc. JSDL also allows the specification of parallel jobs.
  • Resource requirements, which describe the characteristics of the target execution machines required, in terms of hardware requirements – memory, disk space, processor architecture, bandwidth, etc. – and platform requirements – operating system, file system, etc.
  • Data requirements, which describe the data files required for the computation, such as input files, output files, binaries, etc., and the data transfers to perform, including descriptions of the protocol to use, the sources and targets for the data files and handling operations (e.g. delete on termination).

The following listing represents a sample of a JSDL (v1.0) document, which shows job parameters and data requirements:

 

JSDL Document
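As a rough illustration of the three element categories, and not a reproduction of the listing from the thesis, the following Python sketch embeds a hypothetical JSDL 1.0 document and counts the elements of its job description that fall into each category. The job name, executable, file names and URIs are all invented; the element names and namespaces are assumed to follow the JSDL 1.0 schema and its POSIX application extension.

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"

# Hypothetical sample document, for illustration only.
SAMPLE = """\
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>render-frame</jsdl:JobName>
    </jsdl:JobIdentification>
    <!-- Job parameters -->
    <jsdl:Application>
      <posix:POSIXApplication>
        <posix:Executable>/usr/local/bin/render</posix:Executable>
        <posix:Argument>scene.xml</posix:Argument>
        <posix:Output>render.log</posix:Output>
        <posix:Environment name="THREADS">2</posix:Environment>
      </posix:POSIXApplication>
    </jsdl:Application>
    <!-- Resource requirements -->
    <jsdl:Resources>
      <jsdl:TotalCPUCount>
        <jsdl:Exact>2.0</jsdl:Exact>
      </jsdl:TotalCPUCount>
    </jsdl:Resources>
    <!-- Data requirements: stage in the input, stage out the result -->
    <jsdl:DataStaging>
      <jsdl:FileName>scene.xml</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source><jsdl:URI>http://example.org/in/scene.xml</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>frame.png</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Target><jsdl:URI>http://example.org/out/frame.png</jsdl:URI></jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""

# Map the children of JobDescription onto the three categories from the text.
CATEGORIES = {
    f"{{{JSDL}}}Application": "job parameters",
    f"{{{JSDL}}}Resources": "resource requirements",
    f"{{{JSDL}}}DataStaging": "data requirements",
}


def summarize(xml_text):
    """Count the JobDescription children that fall into each category."""
    description = ET.fromstring(xml_text).find(f"{{{JSDL}}}JobDescription")
    counts = {}
    for child in description:
        category = CATEGORIES.get(child.tag)
        if category:  # JobIdentification belongs to none of the three
            counts[category] = counts.get(category, 0) + 1
    return counts


print(summarize(SAMPLE))
```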

JSDL is intended to fit into a wider set of standards specifying the various aspects of submitting, managing and running jobs in grid environments. These include a resource description language, a job policy language and a scheduling requirement language. Unfortunately, these do not currently exist. Where they are required in the context of this thesis, our meta-scheduling framework will rely on resource-manager-specific interfaces until a suitable replacement can be used.

An important advantage of using this standard is the clear distinction between job specification and job lifetime. Because JSDL is not intended to cover the entire lifetime of a job, a specification can easily be distributed amongst different nodes before execution.

JSDL also separates data transfers from job parameters and requirements, which allows the specification document to be distributed independently of the data required for the computation. This ensures that the cost of transferring a job specification through various nodes is kept minimal, with larger data transfers only occurring before submission to the selected DRM.
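The cost argument above can be sketched in a few lines of Python. In this hypothetical setup, the specification passes through three intermediate nodes as plain text, and the referenced input file is fetched only once, at the final step before submission. The node names, the URI and the fetch callback are assumptions made for the example, not part of JSDL or of the thesis framework.

```python
import xml.etree.ElementTree as ET

NS = {"jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl"}

# Hypothetical specification, for illustration only.
SPEC = """\
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:JobDescription>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source><jsdl:URI>http://example.org/data/input.dat</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""


def forward(spec, hops):
    """Pass the specification through each hop; only the document itself moves."""
    bytes_moved = 0
    for _ in hops:
        bytes_moved += len(spec.encode("utf-8"))
    return bytes_moved


def stage_in(spec, fetch):
    """Perform the stage-in transfers once, just before submission to the chosen DRM."""
    staged = []
    for staging in ET.fromstring(spec).findall(".//jsdl:DataStaging", NS):
        uri = staging.findtext("jsdl:Source/jsdl:URI", namespaces=NS)
        if uri is not None:  # entries without a Source are stage-out only
            name = staging.findtext("jsdl:FileName", namespaces=NS)
            fetch(uri, name)
            staged.append(name)
    return staged


fetched = []  # records what the (hypothetical) transfer callback was asked to fetch
hops = ["portal", "meta-scheduler", "accounting"]
cost = forward(SPEC, hops)
staged = stage_in(SPEC, lambda uri, name: fetched.append((uri, name)))
print(cost, staged, fetched)
```

The cost of routing depends only on the size of the specification, however large input.dat may be; the data itself moves once.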

Additional information can be found in my thesis here: [ pdf ]