Grid Computing Comes Of Age

Date: Tuesday, September 30, 2003

GOOGLE FOR ‘GRID COMPUTING’, AND THE response comes back with 1.45 million hits in about 0.22 seconds. That is a large reference set of articles and annotations about the Grid, drawn from an online library of documents and made available to any user with access to the Google portal. The virtual resource being used is Google. The interface to Google is intuitively simple. Google runs its search on a huge cluster of Linux boxes, executing the search in parallel across this bank of computation nodes to respond correctly, quickly, and comprehensively to the request. This is a classic example of what virtualization is going to be all about. Grid computing is poised to dramatically change the economics of computation by drastically lowering costs and extending the availability of huge resources to those with modest budgets.

“Grid” has been synonymous with utilities. Electrical, gas, water, and telecommunication networks are Grids. It is not the utility network that is interesting; it is the service model delivered by very complex operations: the clients of these services are provided with an extremely simple interface and are unaware of the technologies delivering the Grid. A consumer of an electric utility is unaware of the source of the power being consumed. The appliances connected to the Grid draw power at a constant voltage and phase. It is the responsibility of the utility company to manage generation and transmission. Generators must be brought online and offline without affecting the voltage or the phase, and in step with the load on the Grid. When quality of service deteriorates, gridlock followed by an outage makes the news.

Until now, a computational architecture has never materialized as a utility Grid. Industry trends have made such a Computational Utility both possible and a functional necessity. The Global Grid Forum (GGF), the standards body for Grid specifications, is driving the evolution of specifications that applications can adopt to harness virtual resources. Blade servers provide huge computational resources at very low cost. High-bandwidth networks exist, and last-mile broadband rollouts are around the corner.

With exponential growth in the amount of information generated and consumed, storage demands are growing without bound. Data is no longer directly attached to the point where information is consumed or generated. Information never becomes obsolete; it simply morphs into new information. All these trends are driving the virtualization of entire computational infrastructures for Utility Computing. This will be the Holy Grail of computing: complete computational nirvana, in which enterprise-level computation is available as and when one needs it, on demand.

With true Grid Computing, the services provided will be at the level of a utility. Computation and information will be delivered as, when, and wherever the consumer requires them.

Grid Evolution
Grid Computing has many definitions and means many things. The definitions arise from the evolution of Grid Computing out of research organizations and into the enterprise domain. That evolution can be traced through several phases.

Scavenging Resources
Certain research tasks demand enormous numbers of compute cycles but are embarrassingly parallel. Most of these tasks can be executed in parallel and autonomously by spreading the work around. A good example of such a task is finding the next largest prime number, since it involves continually crunching a well-defined, compute-intensive algorithm over a list of values and reporting the results back to the requester. Such scavenging of resources has usually been the first attempt to harness the power of a large set of loosely coupled computers when they are not otherwise in use. SETI@home is another example of such Grid usage. There is no predictability of when results will be returned, no guarantee of correctness, and no guarantee of security against malicious software or users.
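
To make “embarrassingly parallel” concrete, here is a minimal Python sketch of how such a scavenging job might farm out independent work units across otherwise idle processors; the function names and the naive primality check are purely illustrative, not the algorithm any real project uses.

    # A minimal sketch of an embarrassingly parallel "scavenging" job: candidate
    # numbers are farmed out to worker processes, each worker runs the same
    # compute-intensive check independently, and the requester simply collects
    # the results. No coordination between workers is ever needed.
    from multiprocessing import Pool

    def is_prime(n):
        """Naive trial division; a stand-in for the real compute kernel."""
        if n < 2:
            return False
        if n % 2 == 0:
            return n == 2
        d = 3
        while d * d <= n:
            if n % d == 0:
                return False
            d += 2
        return True

    def scavenge(candidates):
        """Spread independent checks across whatever CPUs happen to be idle."""
        with Pool() as workers:                      # one process per available core
            verdicts = workers.map(is_prime, candidates)
        return [n for n, ok in zip(candidates, verdicts) if ok]

    if __name__ == "__main__":
        # Each candidate is an independent work unit, so the job scales with
        # however many loosely coupled machines (or cores) can be scavenged.
        print(scavenge(range(1_000_000, 1_000_200)))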

Sharing Resources
There are certain resources, for example an earthquake simulator, that cannot be replicated everywhere. Thus, there is a need for a more organized approach to using resources. Communities share such compute resources in a controlled manner. Information must be moved into and out of such resources. In addition, the resource has to be scheduled in an orderly manner, and the identity of each user has to be established to prevent misuse or impersonation. The organizations involved may be within a single company or span different administrative domains, as we have seen in collaborations across universities. Grids have been designed to let users and applications share both compute and data resources. Sharing resources requires secure, authenticated, and controlled access.
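
As a rough illustration of the three requirements above (establishing identity, orderly scheduling, and moving data in and out), the following hypothetical Python sketch models a shared resource behind an authorization check and a first-come, first-served queue; the user registry, the class names, and the stand-in computation are all assumptions made for this example.

    # Hypothetical sketch: orderly, authenticated access to a scarce shared
    # resource (say, the earthquake simulator above). Identity is checked before
    # a reservation is granted, reservations are served strictly in arrival
    # order, and data is staged in before the run and out after it.
    from collections import deque
    from dataclasses import dataclass

    AUTHORIZED_USERS = {"alice@uni-a.edu", "bob@uni-b.edu"}   # assumed identity registry

    @dataclass
    class Reservation:
        user: str
        input_data: bytes        # data moved into the resource
        result: bytes = b""      # data moved back out afterwards

    class SharedResource:
        def __init__(self):
            self.queue = deque()                      # reservations, first come first served

        def request(self, user, input_data):
            if user not in AUTHORIZED_USERS:          # establish identity first
                raise PermissionError(f"{user} is not authorized for this resource")
            reservation = Reservation(user, input_data)
            self.queue.append(reservation)            # scheduled in an orderly manner
            return reservation

        def run_next(self):
            reservation = self.queue.popleft()
            reservation.result = reservation.input_data.upper()   # stand-in for the real run
            return reservation

    simulator = SharedResource()
    simulator.request("alice@uni-a.edu", b"fault-model parameters")
    print(simulator.run_next().result)                # b'FAULT-MODEL PARAMETERS'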

Dedicating Resources
Large data centers pool massive computing hardware for multiple applications rather than buying separate hardware for each application. The resources are allocated to individual applications and controlled based on their usage characteristics. Such usage characteristics can be encoded as policies for software that provides batching, scheduling, and workflow automation.

A great deal of work has been done in universities, such as the Condor Project at the University of Wisconsin, and at companies like Platform Computing. The usage characteristics have to be analyzed a priori and programmed into the job-scheduling algorithms to be effective. Such virtualization of dedicated resources allows for controlled access and utilization, but it is largely static in nature. Moreover, it is primarily non-preemptive: once resources have been assigned, there is no clean way to migrate work off them to adjust dynamically to operating conditions that differ from the planned ones.
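
The following hypothetical sketch, in plain Python rather than any real scheduler’s interface, shows what it means for usage characteristics to be analyzed a priori and encoded as static policies, and why such an arrangement cannot adapt once work has been placed; the policy table, pool sizes, and function names are invented for illustration.

    # Hypothetical sketch of policy-driven batch scheduling over dedicated pools.
    # Usage characteristics are captured a priori as static policies; once a job
    # lands on a pool its slot stays taken, with no preemption or migration,
    # which is exactly the limitation noted above.
    POLICIES = {
        "nightly-regression": {"pool": "regression-farm", "max_hours": 12},
        "analytics":          {"pool": "big-memory",      "max_hours": 4},
    }

    POOLS = {"regression-farm": 400, "big-memory": 32}   # free slots per pool (assumed)

    def schedule(job_name, job_class):
        policy = POLICIES[job_class]                 # decided ahead of time, not at runtime
        pool = policy["pool"]
        if POOLS[pool] == 0:
            raise RuntimeError(f"no free slots in {pool}; job must wait")
        POOLS[pool] -= 1                             # the slot is held until the job finishes
        return f"{job_name} -> {pool} (limit {policy['max_hours']}h)"

    print(schedule("regress-suite-17", "nightly-regression"))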

Oracle uses such a Grid, a development Grid, to build its own database. Its regression server farm uses a modified version of Condor to schedule 3,500 compute hours of jobs across a massive pool of Sun Netras, crunching through more than 100,000 regression tests per night. This keeps every label consistent with the changes made each day, a phenomenal improvement in productivity for every developer. In addition, developers can submit any subset of these tests to be run on the farm to verify correctness before their changes are absorbed into the product.
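
Some back-of-envelope arithmetic puts those numbers in perspective; note that the length of the nightly window below is an assumption made for illustration, not a figure from Oracle.

    # Back-of-envelope arithmetic for the nightly run. The 3,500 compute hours
    # and 100,000+ tests come from the description above; the nightly window
    # length is assumed.
    compute_hours = 3_500
    tests_per_night = 100_000
    nightly_window_hours = 10                                       # assumed, not stated

    nodes_busy = compute_hours / nightly_window_hours               # ~350 nodes busy all night
    cpu_seconds_per_test = compute_hours * 3600 / tests_per_night   # ~126 s of CPU per test

    print(f"~{nodes_busy:.0f} nodes busy through the night, "
          f"~{cpu_seconds_per_test:.0f}s of CPU per test on average")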

Virtual Resources—Utility Computing
There has been a single invariant across the three phases described so far: underutilization of existing computational, network, and storage resources. The previous phases have all attempted to address this single invariant. There are massive islands of computational power.

No supercomputer is ever 100% utilized, which effectively increases its cost of ownership. Even most servers in a data center run at 30-40% utilization. Just as an electrical utility seeks to maximize power distribution at maximum efficiency, minimum power loss, and lowest generation cost, enterprises are pooling their resources to improve the utilization and availability of those resources. This is Grid Computing born again; it is Utility Computing. It is Distributed Consolidation with some key attributes:

Dynamic provisioning of resources—avoids fragmentation and automatically coalesces free pools to improve resource utilization. It is closely analogous to the memory management techniques of an operating system (a minimal sketch follows this list).
Strong Security—The dynamic nature of the Grid means that users, resources, and services are constantly changing, which poses unique security challenges. Every service and consumer on the Grid needs to be secure—authenticated and authorized. Grid resources need to be audited transparently for adherence to operational and legal policies.
Discovery—Applications must discover services dynamically. Applications will move and services will move, but they must discover each other seamlessly.
Just-in-Time Information—The next generation of computation is all about information. The infrastructure delivering the Utility model will be continuously adjusting to the information being generated and moved around. The key technical challenge will be providing the right information at the right time and the right place without gridlock.
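
As a minimal sketch of the dynamic-provisioning attribute and its memory-management analogy, the following Python fragment hands out capacity from a single shared pool and coalesces it back on release; the class and application names are illustrative only, not drawn from any particular Grid product.

    # Minimal sketch of dynamic provisioning: capacity is handed out on demand
    # and returned to a single shared free pool when released, so free slots
    # coalesce instead of fragmenting across per-application silos, much as an
    # operating system reclaims and recombines freed memory.
    class ResourcePool:
        def __init__(self, total_slots):
            self.free = total_slots
            self.allocations = {}

        def provision(self, app, slots):
            """Grant slots from the shared pool if available; otherwise refuse."""
            if slots > self.free:
                return False
            self.free -= slots
            self.allocations[app] = self.allocations.get(app, 0) + slots
            return True

        def release(self, app):
            """Return an application's slots, coalescing them back into the free pool."""
            self.free += self.allocations.pop(app, 0)

    pool = ResourcePool(total_slots=100)
    pool.provision("payroll", 30)
    pool.provision("reporting", 50)
    pool.release("payroll")               # 30 slots rejoin the shared free pool
    print(pool.free)                      # 50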

The evolution of Grid Computing parallels the evolution of the Web. The World Wide Web started with simple requirements for sharing documents for collaboration. The technology for presentation over the Internet had its roots in the research labs of CERN. It was infectious, and innovations morphed the Web in ways unfathomable at its creation. Enterprises found new ways to do business. Grid Computing is the evolution from “presentation over the Web” to “computation over the Web.” Its roots, too, are in CERN.

The Economist predicted in its June 21, 2001 issue: “The best thing about the Grid is that it is unstoppable.” Evolutions are truly unstoppable. The synergy across a spectrum of technologies to extract more from less adds fuel to the momentum. The exact path cannot be visualized, but quantum changes will be made in the way information is created, managed, delivered, and consumed.