Chapter 7. IT Staff Disasters

 

‘There are men in the world who derive exaltation from the proximity of disaster and ruin, as others from success.’

 
 --Winston Churchill (1874-1965)

Who do you mean by IT staff?

Most modern organizations have staff or departments (internal or outsourced) for maintaining and troubleshooting the IT infrastructure. These are usually called IT staff, tech support, technical assistance, etc. They usually have the specialized training and skills necessary for maintaining critical IT equipment. For example, you could have a specialized team just to manage backups and restorations of the various servers in your organization. Its members would be trained in using the backup software: how to back up, what to back up, how to restore, etc. Or there could be a dedicated team just to manage and operate your company’s e-mail systems.

What are the general precautions to prevent disasters relating to IT staff?

No modern organization can run its business without a properly qualified IT department. The IT staff of any organization are usually categorized as key or critical staff, because they handle the organization’s critical IT equipment. A people-related disaster such as the resignation, injury or death of one or more critical IT staff could easily paralyse an organization. Ironically, it is possible to replace an organization’s CEO overnight, but losing or replacing key IT staff suddenly is highly risky and often nearly impossible.

Some of the common precautions to prevent IT staff-related disasters are listed below.

  • Don’t have all your IT staff working in one location or building. If something happened to that building, all the knowledgeable staff would be affected, and it would cripple your ability to get the network up and running again.

  • Ensure every IT staff member is adequately trained in all or most support services. For example, it would be risky to have just one person who knows how to operate the backup software or who holds the admin passwords for all servers. If that person quits or meets with an accident, nobody else can take system backups or perform administration activities.

  • Pay industry-standard or better salaries to IT staff. Good, competitive salaries keep resignations and attrition low. Many organizations still believe IT staff are a dime a dozen, and many business managers think that, should there be a computer-related disaster, they can immediately hire or replace experienced technical staff from an outsourcing or external company. Believe me, they cannot. No new IT staff, however qualified, can walk into an organization and immediately start assisting in disaster recovery, business continuity or even plain day-to-day support. It usually takes several weeks or months for new IT staff to fully understand an organization’s IT environment and how it functions, its culture, its history, the operation of its legacy systems, etc. Even in small organizations this isn’t easy.

  • Have an adequate staff ratio: see below.

  • Don’t hire temporary staff just to reduce costs. Temporary staff usually have little commitment or loyalty to the organization, and are often on the lookout for better opportunities elsewhere. They may also leave at very short notice, causing serious IT service issues, and the handover between service providers is then likely to be rushed and inadequate.

  • If you are outsourcing IT staff, ensure that you demand a minimum level of IT qualifications and experience from the staff supplied by the outsourcing vendor.
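The precaution about not depending on a single person can be checked mechanically. The following is a minimal sketch in Python, assuming you keep a simple record of who is trained in what; the names, skills and the `single_points_of_failure` helper are hypothetical and used only for illustration.

```python
from collections import defaultdict


def single_points_of_failure(skills_by_person, minimum_cover=2):
    """Return the skills that fewer than `minimum_cover` people are trained in."""
    people_by_skill = defaultdict(set)
    for person, skills in skills_by_person.items():
        for skill in skills:
            people_by_skill[skill].add(person)
    return {
        skill: sorted(people)
        for skill, people in people_by_skill.items()
        if len(people) < minimum_cover
    }


# Hypothetical team and skills, for illustration only.
team = {
    "asha": {"backup software", "e-mail administration"},
    "ben": {"e-mail administration", "server admin passwords"},
    "carol": {"e-mail administration"},
}

at_risk = single_points_of_failure(team)
for skill, people in sorted(at_risk.items()):
    print(f"{skill}: only {', '.join(people)}")
```

Any skill reported here is one resignation or accident away from being lost, so it is a candidate for cross-training.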

What is an appropriate IT staff ratio?

In order to maintain a large IT infrastructure, it is necessary to have a sufficient number of IT staff to manage the various systems properly. Irrespective of the amount of automation, sufficient numbers of qualified staff are still needed to understand, control, manage and run the operations. However, many organizations fail to understand this and try to maintain a large IT infrastructure with the bare minimum number of staff, or a skeleton team. The usual reasons are cost saving, a business unable or unwilling to invest in more headcount, etc. However, if the IT department is too small, managing a large IT infrastructure puts enormous pressure, stress and overhead on its staff. Many IT teams today are struggling to meet service expectations that are too high for the current size of their department. Naturally, this results in frequent resignations, poor process compliance, delays in support, and other issues that will slowly engulf the organization. The revenue lost through an overloaded, understaffed IT service team can be several times the salary cost of doubling the number of IT staff.

It is not enough for organizations to say they have implemented IT best practices just because they have prepared a set of process documents, procedures, policies, etc; they also need the right number of staff to practice IT in the recommended way. Secondly, there is no point in committing to high levels of availability everywhere when there is a shortage of staff to maintain even basic services. This is where the staff ratio helps.

An appropriate IT staff ratio means having the right number of IT staff for a given number of end-users and amount of IT equipment. For example, a general rule of thumb is to have two IT staff to support 100 end-users using 100 computers and about three or four servers. However, it would be unreasonable to expect the same two IT staff to continue supporting the organization when it grows to 200 end-users. Naturally, the number of IT staff will have to increase in direct proportion to the end-user count. Many business managers argue that it is possible to simply implement a few fancy tools and not increase IT staff. However, such arguments usually do not work out in real-world, practical scenarios. Fancy tools that can replace a human being are usually very expensive, still need highly qualified staff to operate and maintain them, and bring ongoing costs. Hence it is absolutely necessary for businesses to ensure that they have the correct IT staff strength to maintain the expected levels of availability.
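The rule of thumb above is simple proportional arithmetic. The following Python sketch illustrates it, assuming the ratio of two IT staff per 100 end-users quoted in the text; the function name and the choice to round up to a whole person are assumptions, and a real figure should come from your own statistics.

```python
import math


def required_it_staff(end_users, staff_per_100_users=2):
    """Scale IT headcount in direct proportion to end-users, rounding up."""
    if end_users < 0:
        raise ValueError("end_users must not be negative")
    return math.ceil(end_users * staff_per_100_users / 100)


for users in (100, 130, 200):
    print(users, "end-users ->", required_it_staff(users), "IT staff")
```

Rounding up reflects the point that an understaffed team costs more than it saves; an organization with 130 end-users would be sized for three staff rather than two.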

Example. Inadequate staff ratio affecting business

As you may have observed, computers always mysteriously fail at the most inappropriate time. For example, a computer could fail during an important business presentation to a potential client. If there were enough IT staff, a techie could attend to the problem within minutes and provide an immediate work-around. This would give the client a good impression of your organization’s service standards. On the other hand, if there were a shortage of staff and the techie walked in after two hours (or did not turn up at all), it could lead to acute embarrassment, an abrupt end to the meeting, loss of the client, etc.

There is no magic number for the IT staff ratio. Organizations will have to gather the following statistics and arrive at an optimum number:

  • Average number of end-user calls per day

  • Average number of call backlogs per day

  • Call response time compared to committed time

  • User downtime

  • User downtime calculated in financial terms

  • Growth of end-user count.
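Most of the statistics listed above can be derived from a basic help desk log. The following Python sketch shows one way to summarize them, assuming a hypothetical daily record (`DayLog`) and an assumed average hourly cost per end-user; neither comes from any particular help desk product. Growth of the end-user count is left out because it comes from HR records rather than the call log.

```python
from dataclasses import dataclass


@dataclass
class DayLog:
    """One day of help desk figures (hypothetical record layout)."""
    calls_received: int
    calls_unresolved: int     # backlog left open at the end of the day
    calls_within_target: int  # calls answered within the committed time
    downtime_hours: float     # total end-user hours lost that day


def summarise(days, hourly_cost_per_user):
    """Summarize the staffing statistics over a list of DayLog records."""
    if not days:
        raise ValueError("at least one day of data is needed")
    total_calls = sum(d.calls_received for d in days)
    total_downtime = sum(d.downtime_hours for d in days)
    within_target = sum(d.calls_within_target for d in days)
    return {
        "avg_calls_per_day": total_calls / len(days),
        "avg_backlog_per_day": sum(d.calls_unresolved for d in days) / len(days),
        "response_compliance": within_target / total_calls if total_calls else 1.0,
        "downtime_hours": total_downtime,
        "downtime_cost": total_downtime * hourly_cost_per_user,
    }


# Illustrative figures only.
log = [DayLog(40, 5, 30, 6.0), DayLog(60, 15, 45, 10.0)]
summary = summarise(log, hourly_cost_per_user=25.0)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A rising backlog or a falling compliance figure over several weeks is the kind of evidence that supports a case for more staff.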

Companies that wish to compete based on properly fulfilling commitments made to their external and internal customers must invest in correct IT staff to end-user ratios to remain competitive. Otherwise, the company could slowly suffer from an internal decay that could cripple the business.

What are the usual reasons for IT disasters?

Many organizations have implemented computers, software, telecommunications, etc, for running their businesses. However, these implementations are often done without proper planning of any sort. The following example shows how many organizations typically introduce IT:

Example.

A small company’s business owner may buy a single computer, initially for general use. After seeing the benefits of using computers, he may immediately decide to buy 25 more for his staff. Within a short time his business will be computerized, and very soon IT support headaches will enter the business. Using a computer may be easy, but maintaining a computer system is a complicated task. Users may suddenly experience crippling virus attacks, equipment failures, software licensing issues, data corruption, data loss, upgrade issues, and so on, and they may not be in a position to support and maintain a computer network and its associated functions. Overnight, a smart purchasing assistant may undergo a crash course in computer maintenance, or buy a book called ‘Computer Maintenance for Dummies’, and will soon be given responsibility for technical support of the business along with their other responsibilities. This is how IT departments start in thousands of organizations. However, this sort of approach eventually leads to major and uncontrollable issues.

Some of the common hassles faced by many small and even large organizations are listed below:

  • Roles and responsibilities of staff are not clearly defined, or are non-existent.

  • A single IT staff member or a very small team is responsible for anything and everything related to IT.

  • Lack of clearly defined and simple processes. No service level agreements, vendor agreements, technical training, etc.

  • Business and technical staff not seeing eye-to-eye. Poor management buy-in, inadequate funding, culture issues, resistance to change, etc.

  • Businesses not understanding the essentials of using IT in their organizations, such as proper IT staffing, rapidly growing hardware and software budgets, ongoing costs, frequent and mandatory upgrades, etc.

  • Technical staff concentrating only on technical matters, and unable or unwilling to understand business needs.

  • No structured customer support mechanism. No help desk or service desk facilities.

  • No proactive methods for preventing IT trouble. Only reactive support: problems are solved after they occur, with no prevention mechanism in place.

  • IT staff using outdated tools and equipment for various reasons, leaving the IT department out of sync with modern business demands.

... and several more.

What are some of the best practices to be followed by IT staff?

Proper IT service is a very important aspect of any IT department. Many organizations do not have good processes in place to manage IT services. Different organizations follow their own proprietary methods to provide internal IT support, but there are industry-standard practices readily available that can be adopted by an organization of any size. One of the best known is ITIL, the IT Infrastructure Library. ITIL was prepared by the OGC (Office of Government Commerce, UK) and defines best practices in IT service management. Excellent books on the subject, written by the ITIL gurus and now in version 3, are available. Visit www.itgovernance.co.uk/itil.aspx for further information and details of these books.

What are the main benefits of using ITIL?

Many organizations believe they have already implemented excellent self-developed IT services and don’t need any change. That might be right, but closer examination is likely to show that they are missing out on various processes that could enhance their IT department. The benefits of using ITIL are enormous, for example:

  • Proven and tested processes that cover IT services end-to-end. No need for businesses to reinvent the wheel when implementing IT services in their organizations.

  • Improved quality of IT service for business functions.

  • Reduced downtime, reduced costs, improved customer and end-user satisfaction.

  • Measurable, controllable, recoverable.

  • Proactive rather than reactive. Clearly-defined roles, responsibilities and activities.

  • Greater understanding of IT and its limitations by the business.

  • Continuous improvement, stability, and problem prevention.

  • Improved business image. Businesses will also learn what to commit, and what not to commit, to their external customers.

How can change management prevent disasters?

A proper change management method for all IT implementations, upgrades, maintenance, etc, can prevent a number of foreseeable disasters. Today most modern organizations have implemented rigorous change management procedures for technical implementations. This means that any change (addition, deletion, modification, replacement, etc) to any part of the IT infrastructure must go through a series of approvals and sign-offs before it is actually implemented. Management should not allow any unauthorized technical changes to the infrastructure. A knowledgeable change management team needs to study the requested change and view it from several technical and non-technical angles before giving the go-ahead. Having a proper change management process can prevent several types of disaster, for example:

  • Preventing any IT or network changes during critical periods. For example, organizations that sell consumer goods over the Internet or through retail stores should not disturb their IT infrastructure (on which they depend for sales) in any way during the Christmas period. Suppose the IT staff install an untested software patch just before Christmas on an organization’s online sales web server. If the patch misbehaves and the server crashes due to a bug, customers cannot purchase the company’s products during such a critical time, causing loss of reputation and other negative impacts.

  • Businesses should not allow any maintenance activities (except emergency fixes) on production systems during business hours. This prevents unexpected disasters and business disruptions. For example, if your techies apply a software patch or an anti-virus upgrade to your office mail server during office hours and the patch misbehaves, your mail server can become inaccessible to all users. If instead the patch is applied after office hours, or is first tested on a test server, your IT staff can take the necessary precautions or get the breathing space to recover the server without creating chaos in the organization.

  • All IT changes must have a proper back-out plan. For example, if you are upgrading some software on an important server, an accurate snapshot or baseline must be taken before installing the upgrade. If the upgrade fails or causes some other unexpected problem, the system can be reverted to the previous baseline. Tools such as Norton Ghost or other disk-imaging software can help create accurate images or snapshots of the systems being upgraded or modified.
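The three rules above lend themselves to an automated pre-check before a change goes to the approval team. The following Python sketch is one possible illustration, not a description of any change management product. The `ChangeRequest` fields, the freeze dates and the 09:00 to 18:00 business hours are all assumptions made for the example, as is the choice to apply the freeze even to emergency fixes.

```python
from dataclasses import dataclass
from datetime import date, datetime, time


@dataclass
class ChangeRequest:
    """A requested change (hypothetical fields, for illustration)."""
    description: str
    scheduled: datetime
    has_backout_plan: bool
    tested_on_test_server: bool
    emergency: bool = False


# Assumed values: adjust to your own critical periods and office hours.
FREEZE_PERIODS = [(date(2024, 12, 1), date(2024, 12, 31))]
BUSINESS_HOURS = (time(9, 0), time(18, 0))


def review(change):
    """Return the reasons to reject a change; an empty list means none found."""
    problems = []
    if not change.has_backout_plan:
        problems.append("no back-out plan")
    if not change.tested_on_test_server:
        problems.append("not tested on a test server")
    day = change.scheduled.date()
    if any(start <= day <= end for start, end in FREEZE_PERIODS):
        problems.append("scheduled during a change freeze")
    opens, closes = BUSINESS_HOURS
    # Only emergency fixes are allowed during business hours.
    if not change.emergency and opens <= change.scheduled.time() < closes:
        problems.append("scheduled during business hours")
    return problems


patch = ChangeRequest("Patch online sales web server",
                      datetime(2024, 12, 20, 11, 0), False, False)
upgrade = ChangeRequest("Anti-virus upgrade on mail server",
                        datetime(2024, 11, 12, 22, 0), True, True)
print(review(patch))
print(review(upgrade))
```

A check like this does not replace the human review described above; it only filters out requests that break the basic rules before anyone spends time on them.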

What are the other risks relating to IT staff?

Any employee can be a source of risk. However, risks from IT staff can be more severe, as they are specialized employees who may have complete access to all critical equipment. They often have access to equipment and data that even the CEO does not. Of course, organizations cannot survive without some IT staff, but care can be taken to minimize the risks relating to them. Some of the common risks are listed below.

  • A disgruntled IT person can be an enormous threat to an organization. He or she can simply destroy data on critical equipment in revenge.

  • Dissatisfaction among IT staff resulting from lack of growth opportunities, inadequate salaries, overwork, etc, is a potential threat to an organization.

  • An inadequate IT staff ratio is also a potential risk and a disaster waiting to happen. Not having enough staff can gradually reduce an organization’s capability to be competitive. Problems will get fixed slowly, processes will not be followed, dangerous IT shortcuts will become commonplace, data backups may not be regular, etc. All these can lead to disaster sooner or later.

  • IT service is a serious business and should be handled by mature and responsible staff with at least several years of proven experience.

  • Organizations should also ensure, as far as possible, that IT staff do not have drinking problems:

    Example.

    A highly qualified techie was managing a certain large organization’s IT infrastructure. However, the techie had an alcohol problem, and would usually get completely drunk every evening. During a critical project implementation there were some technical problems late at night, so the project chaps called the techie to come over and solve them. Unfortunately, the techie was very drunk by then, but he somehow managed to crawl into the office. Unable to understand what was going on, he picked a fight with some project members and started banging on the keyboards and terminals. Luckily, no serious damage was done to any data or equipment. Finally, building security had to be called to handle the techie.

    As you can see from this example, IT staff who have drinking problems can be a great threat to any organization.

  • Resignation by critical IT staff: In this competitive world, qualified and experienced staff are always in high demand everywhere. Hence organizations should ensure that they retain their qualified IT staff as far as possible, and also have enough qualified IT staff to handle sudden resignations or employee ‘poaching’ by the competitors.
