There are many definitions of 'Grid'; the following are among the more commonly accepted views:
In their seminal paper, The Anatomy of the Grid, Foster et al (2001) defined the grid as:
"...coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations." Adding... "The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose."
This definition focuses on sharing between organizations, something that was important in the early days of grid computing, when the emphasis was largely on academic and scientific applications. Early exemplars included universities sharing supercomputers in order to run huge scientific codes: something that would either be impossible without grid software or would have to be programmed, by experts, in a completely bespoke manner. In fact, in an earlier paper, Computational Grids, Foster & Kesselman (1999) state that:
"A computational grid is a hardware and software infrastructure that provides dependable, pervasive, and inexpensive access to high-end computing capabilities."
The commercial world is much less likely to share resources across organizational boundaries, largely due to security concerns. As a result, most commercial grid computing platforms are intended to be used within organizations; for example, to support resource sharing across clusters owned by different sub-divisions of a company. This can yield substantial efficiency savings for a company. Later definitions from Foster and associates made it clear that this style of computing still fell under the Grid banner. In The Physiology of the Grid, Foster et al. (2002) argue that:
"The grid integrates services across distributed, heterogeneous, dynamic 'virtual organizations' formed from the disparate resources within a single enterprise and/or from external resource sharing and service provider relationships in both ebusiness and e-science."Ian Foster later unpacks this definition, in the article What Is The Grid (GRIDtoday, 2002), into a checklist that can be used for judging what is, and what is not, a Grid. This defines a Grid as a system that:
Interestingly, whilst this is general enough to encompass intra-organizational grids (where the grid combines resources that are under the control of different divisions), the second point, which requires standard, open protocols, rules out many commercial grids. The Global Grid Forum (now the Open Grid Forum) was set up, largely by academic groups, to define and standardize grid protocols. This is clearly of great benefit when building applications that share resources across organizational boundaries: if each institution supports the same protocols (e.g., for authentication, authorization, resource discovery, and resource access), it becomes much easier to build applications that cross institutional boundaries. However, in the commercial sector, where intra-institution grids are much more common, the pressure to adopt standard protocols is much weaker, and very few commercial grid providers have taken this approach.
Taking these differing definitions into account we would define Grid computing as:
"A form of distributed computing that provides a unified model of networked resources (processing, application, data, storage) within and/or across organizations. The resources may be under different management regimes, and may be geographically dispersed. The unified model presents a homogeneous view of collections of heterogeneous resources, and supports the dynamic coordination, sharing, selection and aggregation of these resources in order to meet users’ availability, capability, performance and cost requirements."
Analysts have also offered their definitions of grid computing. In its report on grids (GRIDtoday, 2002), IDC defines grids in technical markets as:
"...a set of independent computers combined into a unified system through systems software and networking technologies."
For IDC, major characteristics of the grid include:
Note that the last point is presumably intended to rule out the need for special-purpose hardware for connecting computers into Grids. Also, the emphasis in the second bullet is on agility: the ability to dynamically reconfigure the system, for example as user workloads and requirements change.
Gartner has also entered the debate:
"Grids are collections of computer resources, owned by multiple organizations, that are coordinated to solve a common problem. These resources can be computers, collectively run in extremely large parallel-processing programs — typically used to solve the large-scale problems found in scientific and engineering computing. The multiple organizations may be product or geographic divisions of one firm, or may be multiple companies. Another type of grid hosts large amounts of data by spreading it across many systems. Regardless of its type, the key aspects of a grid are multiple resource ownership and a single purpose. This distinguishes grids from previous distributed- or networked-computing models." (Gartner, 2006)
This definition is useful for drawing a line between grid computing and earlier distributed or networked computing models. It is interesting, however, that the focus is on parallel processing for scientific applications. Perhaps for this reason, Massimo Pezzini, vice president and distinguished analyst at Gartner, draws a distinction between traditional grid computing and commercial deployments that are focused on application platforms. In Grid Gets Transactional (GRIDtoday, 2006), Pezzini highlights the similarities and differences:
"In traditional Grid, the problem is allocating processing power and you don't really care about availability of the overall system, if one box is available you don't care. You need to have that server. In transactional apps, it is not that simple. With transactional applications, you have to deal with databases, you have to deal with maintaining the transactional integrity of the thing that you are doing. It's a bit more complicated."
The complication comes from the need to identify data that has been left in an inconsistent state due to failure, and to take the appropriate action necessary to rectify the problem before corrupted data spreads through the enterprise. Therefore, a key question for Grid providers is how to evolve their infrastructures to embrace transactional enterprise applications. New solutions are required to achieve this. Based on our experiences in managing more static forms of data sharing in enterprise applications over the past two decades, we believe that these new infrastructures must be able to:
Read Arjuna's White Paper for further detail.