CHAPTER 7: TIME TO REFOCUS

I couldn’t believe the numbers.

I checked them three times, but it always came out the same. Over the last 90 days, the number of incidents had remained almost constant. There had been a few swings between teams, but the total wasn’t statistically different than it had been before I started.

All that work, all the effort I had invested, was for nothing. I had done my best to get the incident management team to gather enough data so they could identify how to prevent repeat disruptions to the service. There always seemed to be another incident that got in the way of them spending the time they needed to get the answer and make it happen.

I could hear Ramesh now, and no matter how I envisioned it, the last words out of his mouth were always, “You’re fired.”

As if to reinforce my utter failure, my phone lit up with another SEV1 incident. Several clusters of stores in the Western Region had lost the ability to operate their cash drawers. That meant no customer transactions, payments, or cash receipts. Needless to say, sales were nearly apoplectic.

This was the fourth time in the last two months it had happened. I knew the symptoms by heart. The solution was simple and worked consistently: reboot the servers controlling the cash drawers in the stores. Unfortunately, by the time they were in this state, direct access to the physical server from a crash cart in the data centre was the only way to fully power cycle the platform and get the server working again.

I wanted to let my business contact know we were on top of it and that the store cash drawers would be back online within 30 minutes. But before I could message her, she dropped a text on my phone asking if we had rebooted the servers yet. She then added a note that during the flight to New York yesterday she had read an article about how Cloud Computing would prevent this type of thing from ever happening, and wondered why we hadn’t thought of doing something like that for them.

I couldn’t believe it. The incident situation was now so out of control that our business partners knew the symptoms and how to restore the service as well as IT did. They were even telling us how to redesign our services. I picked up one of the ITIL books and threw it against the wall of my cube.

As if on cue, Ramesh showed up a few minutes later looking for the results. I lied and told him they weren’t ready yet. I was too upset to have a discussion about why incidents had not decreased, despite all of the efforts we had put into them, and I didn’t think I could take getting fired right this minute. I needed more time to put an explanation together, along with a plan of action to fix things, so I would at least have a chance at keeping my job. If the job market had been any better, I probably would have resigned myself to the fact that the task was impossible and just quit. Better to go out on your own terms than be tossed out. Besides, not every task can be successfully completed. Some are just too unrealistic to achieve.

Ramesh picked up the ITIL book lying on the floor where I’d thrown it. He sat down opposite me and pushed the book across the desk toward me. “Looks like you dropped this.”

“Chris,” said Ramesh. “This is more than just a social call. I’ve been off-site with the rest of IT leadership working on our strategic and tactical plans for the future. We discussed your recommendation that we staff separate problem management and incident management leaders, by allowing you to hire a problem manager to supplement the incident manager.”

The new incident manager hadn’t been a hire. I’d been assigned Darren by Ramesh. This was Darren’s third role in 18 months. Unfortunately, I’d heard that he didn’t get moved because he was a superstar who crushed objectives and led like a hero. He was related by marriage to one of the members of the Board of Directors. When his own entrepreneurial business venture had gone under a couple of years ago, his employment by the company had been arranged. What he lacked in knowledge and experience, he made up for in enthusiasm. He did have one highly honed skill, aside from marrying well. He was very good at slapping together a chaotic jumble of work, then declaring victory and moving on. Usually, his manager would support and commend his work for the simple reason that it was the only way to have him go work for someone else.

For some reason, Darren and Sean seemed to connect like brothers from different mothers. Sean had taken him under his tutelage, and spent countless hours helping him actually master the role of incident manager. So I let Sean mentor him, despite Sean’s superhero tendencies. That freed me up to concentrate on problem management, and actually got me some credit as the person who had figured out the right niche for Darren in the company. Darren actually seemed to like the job, with all its chaotic adrenaline and constantly changing focus. No outage seemed to exceed his attention span.

It was good to see Darren grow, and maybe even find a place for himself. But there was no way he had the personality, or focus, to be the problem manager. He didn’t have that questioning attention to detail, and dogged pursuit of the answer a problem manager needs. Asking him to do both jobs was a recipe for disaster.

“I know I’m still kinda new here, Ramesh,” I said. “But shouldn’t I have had the chance to make my case to them directly?”

“It wouldn’t have been appropriate. In light of the tight economy and our financial goals for the future, leadership needs to be able to have a frank discussion full of give and take that involves both potential new hires and the capabilities of existing employees. If you were at the meeting, it might inhibit leaders from the open and frank discussions we needed to have in order to optimize our resources, to best meet the needs of the business. There is just too much personal and confidential information discussed. Besides, it wouldn’t be fair to the people being discussed. That is privileged information.”

Ramesh sat down on the other side of my desk. “But don’t worry; I made a vigorous presentation and defence of your recommendation as an IDLE best practice.”

“That’s ITIL, Ramesh. Information Technology Infrastructure Library®2. ITIL.”

Ramesh shook his head and waved his hands at me. “Don’t worry. It was for a group of leaders … people focused on the content, not on the peripheral aspects of the name.”

I resisted the urge to push it further, and pulled out the personnel requisition and job description for the problem manager. I pushed them across the desk to Ramesh.

“I appreciate your support and efforts. Congratulations on getting them to approve this. When can I get it posted? I really need this person on board, so we can make some headway.”

Ramesh fanned the documents and pushed them back across the table to me. “The IT leadership has decided to postpone the staffing you recommended. They decided the positions of incident manager and problem manager are so similar that one person should cover them both.”

“Didn’t you show them my analysis supporting the recommendation?” I pulled the slide deck from my files and pushed it across the table toward Ramesh.

Ramesh fanned through the slides. “Although the case we made was well-reasoned and appropriate, when compared to some of the other staffing needs, it seemed like redundant positions at a time when we do not have the resources to fill all the critical roles IT needs.”

He pushed the slide deck back at me. “In fact, a solid case was made that separating these roles would cause a loss of information about key elements during the handoff from incident manager to problem manager. There was strong consensus among the leadership that, based on their extensive experience, the incident team was best equipped to determine how to prevent the incidents from recurring, because they were closest to the facts. The idea of bringing another group in to make that determination made no sense and would be counter-productive.”

“Leadership has zero experience with problem management,” I protested. “Did you tell them that it is precisely because the incident team is so close to the event that you need a problem management team? The incident team is too close to the immediate events to see the patterns.”

“It was all discussed at length in a very open and frank discussion, before leadership reached its conclusions. Jessica had spent a great deal of her time soliciting input from a number of employees directly involved in the restoration of service after incidents.”

“So the answer is no?”

Ramesh nodded. “I am truly sorry. But out of respect for all that has been accomplished so far, leadership did agree that if the critical headcount needs get filled and additional resources become available, they will absolutely reconsider it without prejudice … provided we can demonstrate a way to prevent loss of information during the handoff from one team to the other. You should be proud that they gave you that strong endorsement. Till then, please integrate the two activities under the incident manager and get the number of service disruptions down. Everyone is counting on you.”

Ramesh smiled, stood up, and just before he walked away said, “I’m sure that in light of all of the conversations we have had in the past about influencers, you can understand leadership’s discovery and thought process.”

Ramesh then walked away without saying goodbye. As soon as he was gone, I picked up the ITIL book he’d returned to me and threw it against the wall again. How did leadership expect me to succeed if they wouldn’t support me? Oh sure, when I’d asked them, everyone had been supportive and willing to invest in the staffing needed, because it was essential to the success of the company. But when it came to investing in problem management, or their own pet projects, it was no contest. I guess I had been naive in my discussions. The reality was that no matter how much they told me they were supporting me, when things got tough, they’d toss me aside if it suited their purposes better. And was Ramesh’s allusion at the end to Sean true, or just another diversion? I wasn’t sure it mattered anymore.

Fortunately, I didn’t have time to wallow in frustration and self-pity. There was a problem management meeting in two days and I needed to find Darren and let him know about his promotion from incident manager to problem and incident manager. It was time for his education to begin.

The problem management meeting started on time, but Darren was missing the meeting … again. Over the last few weeks, since he’d become the joint incident and problem manager, he hadn’t missed a single incident restoration gathering, but his attendance at problem management meetings had been low, and his working with the technical teams on remediation plans had become almost non-existent.

It wasn’t that he didn’t know what to do. He’d set up the meetings and have plans and objectives for each meeting all laid out. I had trained him well, and I had to give him credit for executing. It may have been the most focused and organized thing he’d ever done in his entire life.

But every time there was an incident, he’d drop all of his problem work and scurry off to restore the service. The frustrating part was that Ramesh had been given a lot of very positive feedback from the business about Darren’s performance. It appeared that the business was really focused on the current moment, and as long as service disruptions were being addressed quickly, they felt it was an improvement.

It just made no sense to me. I was still confused as to why the business couldn’t see that by investing in problem management, we could reduce the number of incidents. And with our better response capabilities in incident management, we could really produce a much better service environment for them. But the biggest disappointment was in the IT department, where no one seemed very interested in reducing the number of incidents, as long as the business wasn’t complaining. They seemed to believe that incidents were the thing that provided tangible proof of the value of IT. Prevention made them uneasy because they couldn’t count the number of crises they’d prevented. Restoration gave them a sense of worth and contribution that could be measured. Perhaps Sean had been right after all. All that mattered here was quick treatment of the symptoms. No one wanted the cure.

I sent Darren a text and a phone message, but got no response. I started the meeting without him, because of all the SEV1s we’d had in the past week. We needed to assess which of them needed to go through the full root cause and remediation process, and which were better handled on an observational basis.

Darren arrived about 20 minutes late, an energy drink in his hand. “Sorry I’m late,” he said, and with a huge yawn threw himself down in a chair. “I overslept. We’ve been having too many late night incidents.”

Darren quickly slurped down the energy drink. “I don’t think I’ve had more than eight hours’ sleep in the last four days. So let’s get going. Maybe I’ll have time for a nap later,” he said with a laugh.

As if that were the cue, phones in the room began beeping as SEV1 alerts came piling in.

After checking his phone, Darren said, “Sorry folks. I’ve got an incident to deal with, and that takes priority over this. The business comes first, you know. We’ll reschedule this for later in the week.”

The other members of the problem management meeting nodded in agreement and began following him out.

“Don’t go,” I said. “Darren may be the problem manager, but I’ll fill in for him. We can still have the meeting and fix some of these problems. That will cut down on the alerts. This is very important work.”

Nicola stopped long enough to say, “Come to the War Room, Chris. We need your help there. We can’t ignore this SEV1. Once it is fixed we can go back to working on problems.”

And that was the issue. Problem management would always be a fill-in for whenever there wasn’t a SEV1. But the moment there was an alert, problem management would get kicked to the side until there was more time. The decision by IT leadership to merge the incident and problem management owners into a single person just reinforced that approach in the minds of everyone in IT and the business. Problem management could never become effective like that. It was the equal of incident management, and just as important. It deserved the same level of commitment.

I watched in frustration as the meeting attendees filed out behind Darren and headed for the war room. Mia was the last to go.

Just before she walked out the door, Panav, the Unix platform manager, turned and said to me, “I know this is very important work to you, Chris. When I have work conflicts, I always have to ask myself which work is more important to my customers and users. I’ve found that the best choice is always the one that brings me more in alignment with my users. I tell my team to listen to the wisdom of our users: the business. I tell my team that if they are not directly serving our users, or serving those employees that serve the users, then they should think long and hard about what they are working on, and why they are doing it. We do not have an excess of resources, and any time they are not serving one of those two groups, they are at risk of listening to what they want, rather than what our users want.”

As I watched her leave, I thought about what she had said, what Ramesh had done, and what Sean had said. It would be very easy to follow their lead and let problem management go; to make it an auxiliary subset of incident management; to make it an afterthought that was only engaged if there was spare time. There would never be enough spare time to mature it to the point where it would be useful.

The number of incidents wasn’t down, but the business was happier than it had been before and that seemed to be what mattered most to the company. But I knew we could do better. I decided that the only reason the business and IT leadership weren’t insisting on more was that they didn’t know any better. They had measured success this way for so long they couldn’t conceive of any other way.

My job was clear. I had to convince them there was a better way. The question was, could I do it without getting fired?

Tips that would have helped Chris

Sometimes the solution path you choose comes to a dead-end. It doesn’t work. Although you must not be afraid to work through difficult situations, you must also be objective enough to see when something will not work, and courageous enough to retool your solution plan.

You will probably not have a choice of who is part of the solution team. Everyone is motivated by different things. Get to know these people one on one. Understand where your common ground is, and what interests them about the solution. Part of success when leading via influence comes from creating excitement about the work. You are the coach and your role is to creatively blend the best of each team member for group success.

2 IT Infrastructure Library® is a registered trade mark of the Cabinet Office.
