
Observation, Monitoring and Metrics May 8, 2012

Posted by janettheresjohn in Service Administration.

System administration is a mixture of technology and sociology. It involves systems, networks, and users. Even though users cause most of the problems that a system administrator is trying to solve, we cannot eliminate them from the system environment. Users are constantly changing the conditions under which the system operates. Therefore, system administrators have to monitor the system continuously and observe the technology in use to identify problems and develop solutions for optimal performance.

According to Burgess, in technology the act of observation has two objective goals. The first goal is to gather information about a problem in order to motivate the design and construction of a technology which solves it, and the second goal is to determine whether or not the resulting technology fulfills its design goals. [1] The ITIL service lifecycle also begins with observation. In this first stage, system administration has to observe and identify trends in emerging technologies, security threats, customer needs, changes in the marketplace, and competition. After gathering information about business requirements, system administration has to align these developing trends with the business requirements and develop a strategic plan for implementation. This is the service strategy phase, in which system administration identifies requirements and agrees on changes. Based on the strategic plan, system administration implements services and looks for room for possible expansion. It also performs quality analysis to ensure that the infrastructure keeps operating under foreseeable extreme or abnormal circumstances.

The second objective goal of observation is to determine whether or not the resulting technology fulfills its design goals. The ITIL service lifecycle includes a continual service improvement phase. In this phase, system administration conducts periodic evaluations to improve service quality and identifies future improvement opportunities. To measure whether the resulting technology fulfills its goals, system administration performs a gap analysis between what is or can be measured today and what is ideally required, and identifies the risks that may result from that gap. System administration then analyzes the data to determine whether targets are being met and whether any corrective action is required.
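The gap analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric names, the target values, and the rule that a higher value is better are all hypothetical and do not come from ITIL or from Burgess.

```python
def find_gaps(measured, targets):
    """Compare measured metrics against targets.

    Returns a dict mapping each metric that misses its target to the
    shortfall (target - measured), or to None when the metric is not
    measured at all today. Assumes higher values are better.
    """
    gaps = {}
    for name, target in targets.items():
        value = measured.get(name)
        if value is None:
            gaps[name] = None          # required, but not measured today
        elif value < target:
            gaps[name] = target - value
    return gaps


# Hypothetical targets and measurements, for illustration only.
targets = {
    "availability_percent": 99.9,
    "backup_success_percent": 100.0,
    "patch_coverage_percent": 95.0,
}
measured = {
    "availability_percent": 99.5,
    "backup_success_percent": 100.0,
}

gaps = find_gaps(measured, targets)
for name, shortfall in sorted(gaps.items()):
    if shortfall is None:
        print(f"{name}: not measured yet")
    else:
        print(f"{name}: below target by {shortfall:.2f}")
```

Metrics that meet their target are left out of the result, so the output doubles as a short list of candidates for corrective action.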

System administrators use different monitoring applications to observe the pulse of their systems and evaluate their performance. The knowledge gained this way can be used to define repeatable benchmarks or criteria for different aspects of a problem. [1]

The system administrator is responsible for monitoring and evaluating the efficiency of software components. Administrators identify design and implementation faults and catalogue them for future reference to avoid making similar mistakes again. It is also important to identify legacy issues, since they may place demands on onward compatibility or restrict optimal design or performance. System administration is also responsible for defining policies and adjusting them to fit behavioral patterns. System administrators observe a policy in a real situation, judge its value, and identify how changes in the policy can move the system closer to a desirable solution.

To get quantitative support for or against a hypothesis about system behavior, system administrators use metrics that do not require extensive system auditing, since heavy auditing can itself degrade performance. System administration is concerned with maintaining resource availability over time in a secure and fair manner. Burgess discusses using operating system metrics to understand system behavior, and these metrics can be used in system administration as well. The metrics fall into two main classes: current values and average values, for stable and drifting variables respectively. Averaging over some time interval is often the best approach.
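As a rough illustration of averaging over a time interval, the sketch below computes a simple moving average over the most recent samples of a metric. The window size and the sample values are made up for the example, and a plain moving average is only one of several possible averaging procedures.

```python
from collections import deque


def moving_average(samples, window):
    """Yield the mean of the last `window` samples after each new sample."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque(maxlen=window)
    total = 0.0
    for value in samples:
        if len(recent) == window:
            total -= recent[0]     # oldest sample is about to drop out
        recent.append(value)
        total += value
        yield total / len(recent)


# Hypothetical load samples taken at regular intervals.
load_samples = [1.0, 3.0, 2.0, 6.0, 4.0]
averages = list(moving_average(load_samples, window=3))
print(averages)
```

The current value is simply the latest sample, while the averaged series smooths out short spikes, which makes slow drift easier to see.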

System administration identifies periodic and non-periodic metrics that characterize resource usage. System administrators can use these metrics to identify patterns of change, diagnose problems, define policies, identify resource requirements, fine-tune various parameters, and perform load balancing. Example metrics include disk usage, disk operations, and the number of privileged and non-privileged operations. The disk usage metric indicates the actual amount of data generated and downloaded by users or by the system. Evaluating it helps system administration design policies for disk usage, disk quotas, garbage collection, and so on, to improve the overall performance and availability of a shared resource. These metrics help system administrators identify patterns of change, including the social patterns of users and the systematic patterns caused by software systems.
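A disk usage metric of this kind can be collected with Python's standard library. The sketch below is a minimal example, not a monitoring tool: the 90 percent threshold is an arbitrary assumption chosen for illustration, and a real policy would set its own limits.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(percent_used, threshold=90.0):
    """Return True when usage has reached the (hypothetical) policy threshold."""
    return percent_used >= threshold


if __name__ == "__main__":
    percent = disk_usage_percent(".")
    status = "over" if over_threshold(percent) else "within"
    print(f"Disk usage: {percent:.1f}% ({status} the 90% threshold)")
```

Recording this value at regular intervals gives a time series that can reveal both periodic patterns and long-term growth.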

System administrators can use three scales to extract information from a complex system: microscopic, mesoscopic, and macroscopic. The microscopic scale deals with individual systems, the mesoscopic scale deals with clusters and patterns of systems, and the macroscopic scale views all the activities of all users and systems. System administration can use entropy to gauge the cumulative behavior of a system within a fixed number of states. Entropy can also be used to measure the amount of noise or random activity on the system. It is also important to distinguish random errors (which occur by accident), personal errors, and systematic errors (often a result of misunderstandings).
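To make the entropy idea concrete, the sketch below computes the Shannon entropy of a sequence of observed states. It assumes the Shannon definition is the one intended here, and the state labels in the example are invented. Normalizing by the logarithm of the fixed number of states gives a value between 0 (fully predictable) and 1 (uniformly random).

```python
import math
from collections import Counter


def shannon_entropy(observations):
    """Return the Shannon entropy, in bits, of the observed state frequencies."""
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def normalized_entropy(observations, num_states):
    """Scale the entropy to [0, 1] for a system with `num_states` possible states."""
    if num_states < 2:
        return 0.0
    return shannon_entropy(observations) / math.log2(num_states)


# Hypothetical process states sampled over time.
quiet = ["idle", "idle", "idle", "idle"]
noisy = ["idle", "io", "cpu", "net"]

print(normalized_entropy(quiet, num_states=4))
print(normalized_entropy(noisy, num_states=4))
```

A system that stays in one state scores near zero, while one that moves evenly among all states scores near one, so a sudden rise in the value can hint at increased noise or random activity.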

Together, these metrics, monitoring applications, and observation help system administrators detect risks and security threats early, identify the causes of problems accurately, plan and implement solutions, and evaluate and improve existing solutions. This ultimately saves the administrators' time and improves customer satisfaction.


  1. Mark Burgess, Principles of Network and System Administration, John Wiley & Sons Ltd, Second edition
  2. An introductory overview of ITIL V3; http://www.itsmfi.org/files/itSMF_ITILV3_Intro_Overview_0.pdf
  3. COBIT 4.1, IT Governance Institute, 2007

