Graupner, S. (2010):
Architecture for the Automated Management of Data Center IT Infrastructure
IT management in data centers faces fundamental challenges: the increasing scale and complexity of IT systems and services can no longer be addressed conventionally by relying on human operation. More automation is necessary in data centers, particularly automation of lower-level routine operational and resource management tasks. Today, automation solutions in data centers remain fragmented and isolated. They lack conceptual foundations and cannot adequately address recent trends in data centers such as virtualization, sharing, dynamic provisioning, transformation and construction of resources. Automated operational control is barely supported.
This thesis systematically develops an architecture for the automated management of data center IT infrastructure, specifically for the domains of resource management and automated operational control. Its architectural principles are rooted in two disciplines that have not previously been considered in combination: IT management and operating systems. Over the past decades, operating systems have developed proven principles and abstractions for the automated operation of hardware components and applications in a computing machine. The central question of the thesis is which of these proven principles can be adopted, and how, to systematically establish the architecture of a "Data Center Infrastructure Operating System" (DCI-OS).
The thesis first surveys the field of operating systems and categorizes essential concepts and abstractions. Then, concepts from IT management are surveyed and related to concepts in operating systems. The result, in combination with an analysis of requirements, provides the basis for systematically deriving the architecture for the DCI-OS as the main contribution of the thesis. In the remaining parts, a number of realizations and case studies are presented that validate the research, which was performed at Hewlett-Packard Laboratories from 2003 to 2008.
As a result of the research, a number of innovations were developed: the concept of a Resource Topology as a blueprint of a set of configured and connected operational resources providing the execution environment for complex enterprise applications; the modeling of temporal information about past, present and future states of resources; the linkage of planning and design stages to operational stages; support for modularity and reusability through automated resolution and construction of complex resources from topologies; the late binding of abstracted resource requests to the assignment and subsequent construction of resources; and the introduction of a controller concept with two discrete state sets of desired and observed management states as a generalized pattern for automated operational control.
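The controller concept named above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name `Controller`, the state labels ("running", "stopped", "absent"), and the `actuate` hook are hypothetical and chosen only for illustration. The sketch assumes that a controller holds one set of desired states and one set of observed states per resource, and that operational control consists of finding the mismatches between the two sets and acting on them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Controller:
    """Holds desired and observed management states, keyed by resource name."""
    desired: Dict[str, str] = field(default_factory=dict)
    observed: Dict[str, str] = field(default_factory=dict)

    def diff(self) -> List[Tuple[str, str, str]]:
        """Return (resource, observed_state, desired_state) for each mismatch."""
        mismatches = []
        for name, want in sorted(self.desired.items()):
            # A resource with no observation yet is treated as "absent"
            # (an assumption of this sketch).
            have = self.observed.get(name, "absent")
            if have != want:
                mismatches.append((name, have, want))
        return mismatches

    def reconcile(self, actuate: Callable[[str, str], str]) -> List[Tuple[str, str, str]]:
        """Run one control step over all mismatching resources.

        `actuate(name, want)` is a hypothetical hook that drives the resource
        toward `want` and returns the state it actually reached.
        """
        actions = self.diff()
        for name, _have, want in actions:
            self.observed[name] = actuate(name, want)
        return actions


def always_succeeds(name: str, want: str) -> str:
    """Stand-in actuator that pretends every transition succeeds."""
    return want


controller = Controller(
    desired={"web-vm": "running", "db-vm": "running", "old-vm": "stopped"},
    observed={"web-vm": "running", "db-vm": "stopped", "old-vm": "running"},
)
for name, have, want in controller.reconcile(always_succeeds):
    print(f"{name}: {have} -> {want}")
```

Keeping the two state sets separate means the desired set can be changed by planning or design stages without touching operational code, while the observed set is updated only by what the actuator reports back.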