
Archive for the ‘Incident Management’ Category

Work Instructions, please!

How run books or work instruction procedures make your IT team more efficient and can save face!


Process Ownership: Who and Where?

I’ve written in the past about the nicely crafted accountability model that exists within ITIL. (See http://www.itsmwatch.com/itil/article.php/3794216/The-IT-Accountability-Model.htm). One of the most prevalent questions we get is what person or area of the organization is best suited to play the role of process owner. Unfortunately, the standard consultant answer, “It depends…”, is the answer. But the following guidelines might help to pinpoint the best person or place for these roles for some of the more commonly implemented processes:

Process

Incident Management: One could make a good argument that this role should be owned within the Service Desk function. Whether or not it is the Service Desk Manager who assumes these responsibilities depends upon the size and nature of your service desk. Service Desks that handle more than 200 calls per day should consider this a dedicated role – particularly if this person is also responsible for oversight of a Major Incident process. This person should be reviewing incident metrics, documentation, opportunities to turn data into knowledge, and implementation of continuous improvement metrics. This person also has to rally staff outside of the Service Desk to participate within service level commitments when issues need to escalate to groups with deeper element-level skills. This is one of the easier processes to find a process owner home for.

Problem Management: Problem management has two missions: get to the root cause of the reactive issues (the incidents that have occurred) and proactively identify trends as a means of identifying problems. It is tougher to pinpoint a definitive owner here. In larger organizations, we have seen the emergence of a service delivery function where this role may logically reside. In smaller enterprises the oversight of this process may also reside in the Service Desk or within the realm of Event Management (monitoring – perhaps a NOC group), provided that it does not conflict with the incident management goal of rapid restoration. The individual responsible for Problem Management must be able to leverage resources from level 2 and 3 organizations outside of their direct functional responsibility to perform successful root cause resolution and to assist in the identification of trends.

Change (Configuration & Release) Management: Many think of Change Management as an operational function – likely due to its role in protecting the “production” services through prudent evaluation of risk versus benefit. It is, however, a governance role – a control point and oversight for two other tightly related processes: Service Asset & Configuration management (SACM) and Release & Deployment management. Some large organizations have Enterprise Risk Management functions in place where Change Management would find a logical home. Small organizations may assign the ownership of all three processes to one individual, as an approved Change drives asset and configuration repository updates and spawns the release of developed changes to production. All three represent a collective of governance over the risk and quality of service delivery. As with the Incident and Problem management processes, Change, SACM, and Release & Deployment necessitate the oversight and cooperation of cross-functional teams within the IT organization – always a consideration in determining “where” a role should reside organizationally.

Bottom Line

The typical first approach to this in most organizations is to not “upset the apple cart”, by slotting these roles into existing organizational buckets. This might be a good initial pass, but the cross-organizational nature of Process requires a longer term strategy for success. This may mean some restructuring within the organization that positions process owners with the empowerment necessary to gain cooperation and compliance to process from stakeholders throughout the organization. Considering a Service Delivery organization or a Governance organization outside of the standard IT organization silos, staffed by managers who have the seniority and expertise to drive cross-functional efforts, may be a key to a lasting strategy for Service Management effectiveness.

Valerie Arraj
valerie@cppit.com


What I Learned From Getting Hacked

In CPP’s June Podcast, we discussed a security breach that occurred a few years ago and the steps my team took to detect, respond to and remediate the incident. Here are the five things I learned from that breach.

1).  Planning your response to a disaster or security incident is just as important as the safeguards you put in place
You cannot protect against everything.  The following often delay or prohibit putting the necessary mitigation plans and preventative controls in place:
   -  Residual risk that remains based upon your organization’s tolerance or risk appetite
   -  The cost of mitigating risks and putting necessary controls in place to thwart threats & vulnerabilities
   -  Business strategies and priorities that conflict with your security program
   -  Zero day threats and vulnerabilities
If you agree with at least one of the bullets above, then it is of the utmost importance to have Incident Response Plans and Response Teams in place that you can trust.
2).  Select a team or teams you can trust
Tough times don’t last, tough people do.  Choosing people for your Emergency Response and Incident Response teams should be done on a selective basis.  Having the right people on call at the right time may save your organization from further loss.  Creative people that can think clearly in stressful situations can make all the difference between ending up in the headlines or heading the bad guys off at the pass.
3).  Store your Incident Plans in plain sight (and at multiple sites)
When an incident or disaster occurs you don’t want to leave your response to chance — even if you have selected a great team.  Know exactly where your Continuity, DR and Incident Response Plans are located.   This is achieved through constant awareness and possibly automation.  Both electronic and paper documents should exist in multiple locations.
4).  Monitor, Monitor, Monitor
Our security breach was discovered by a higher-than-normal CPU event that triggered an automated alert to our Service Desk.  Good processes and disciplines (automated and otherwise) must take over from there.  Monitoring for anomalies on your servers, network devices, databases and applications is an important first step in addition to the traditional security monitoring (IDS/IPS, Anti-virus, logging, etc.).
5).  Embed good processes and practices such as ITIL into your organization’s daily life
I brought ITIL into my previous employer’s organization in 1999.  Good Event, Incident and Problem Management disciplines were vital in detection, notification, “root cause” and escalation of the attack.  Change/Configuration and Release Management disciplines were significant in quickly correcting the incident, the underlying problem and putting the necessary corrective, compensatory and deterrent controls in place.
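
To make point 4 a little more concrete, here is a minimal sketch of the kind of check that could flag a higher-than-normal CPU reading against a recent baseline. It is not the monitoring tool we used; the function name, the window size, the threshold and the sample readings are all hypothetical, and a real deployment would hand the result to your alerting and Service Desk tooling rather than print it.

```python
from statistics import mean, stdev


def find_cpu_anomalies(samples, window=5, threshold=3.0):
    """Return the indexes of readings that sit more than `threshold`
    standard deviations above the mean of the preceding `window` readings.

    All parameters are illustrative assumptions, not recommended values.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # Use a floor of 1.0 so a perfectly flat baseline (sigma == 0)
        # does not flag every tiny fluctuation.
        limit = mu + threshold * max(sigma, 1.0)
        if samples[i] > limit:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilisation readings (percent), one per polling interval.
cpu_percent = [22, 25, 21, 24, 23, 26, 22, 95, 24, 23]

for index in find_cpu_anomalies(cpu_percent):
    print(f"ALERT: CPU at {cpu_percent[index]}% in interval {index}")
```

The check compares each reading only with the readings just before it, so a slow, legitimate rise in load moves the baseline along with it while a sudden spike stands out. What happens after the alert is still a matter of the Event and Incident Management disciplines described above.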

Comments are welcome.
Jay Martin
jay.martin@cppit.com
