Incident Manager

  • Zopa
  • London, United Kingdom
  • Nov 12, 2017
  • Full time
  • Information Technology

Job Description

At Zopa, we’re shaping the future of finance. We offer simple loans and smart investments that help people take control of their finances and do more with their money. In the 12 years we’ve been in business, we’ve helped more than 60,000 people lend over £2 billion to 246,000 UK consumers.

And our journey’s only just beginning. In November 2016 we announced our plans to build a next generation bank so that we can bring a greater range of smart, ethical finance products to even more people.

Role overview

As an Incident Manager at Zopa, you will be responsible for driving the speedy restoration of IT services experiencing outage, performance or stability issues. Incident Managers are responsible for ‘herding the cats’ across multiple technology streams to ensure downtime is minimised, whilst giving stakeholders clear, succinct and regular status updates. Following recovery, Incident Management will lead the post-mortem and root cause analysis.

Acting within Problem Management, you will oversee the implementation of any defined defect fixes.

Incident Managers at Zopa are the custodians of customer experience, empowered to drive improvements to software, infrastructure or processes. Incident Management will own the quality of Runbooks and key product support information, and act as ambassadors for Zopa IT operations with internal and external stakeholders.

 

Key responsibilities

    • End-to-end responsibility for the management, communication, escalation, investigation and resolution of incidents, ensuring Business / Customer updates are timely and of sufficient quality, and arranging discussions and updates as required.
    • Acting as Incident escalation focal point, identifying and resolving conflict and bottlenecks.
    • Lead the creation of Runbooks for product support and operations.
    • Create agreed action plans with named actions and deadlines, and be accountable for delivering those plans.
    • Document post-incident recovery steps in order to establish Root Cause, aid Process improvements, identify deviations and enable the creation of a Knowledge Base.
    • Driving, developing and managing the major incident process and associated procedures / systems.
    • Providing consolidated production incident metrics, along with resolution rates, to the Head of Technology Operations and the Operations Centre Manager.
    • Act as Problem Manager and cover Service Delivery in the absence of a dedicated resource.
    • Be an Evangelist for the Incident Management Process.
    • Working hours are 09:00 to 17:30 on weekdays, plus out-of-hours on-call cover to provide 24/7 support.

Requirements

    • Excellent written and verbal communication skills: succinct but detailed.
    • Keeps a cool head under pressure.
    • Excellent interpersonal and influencing skills; interacts appropriately with technical and business resources; driven but courteous.
    • Demonstrable experience in an Incident Management Role within IT.
    • Understanding of Enterprise Architecture, on-prem and cloud IT environments.
    • Strong analytical/fault finding/diagnostics/trouble-shooting skills.
    • Methodical approach to problem solving and attention to detail.
    • Exposure to Kepner-Tregoe or a similar methodology.
    • Flexible and ‘can-do’ attitude.
    • Effective time management skills, with the ability to work on multiple tasks simultaneously, prioritise tasks, and cope with shifting priorities, fluctuating workloads and deadline pressures.
    • Degree level education or equivalent experience.
    • ITIL Certified.
    • Strong technical competencies resulting from previous working experience at expert level within an IT operational or support environment.
    • Experience of Amazon Web Services and/or Google Cloud Platform.
    • Knowledge and experience of virtualisation, containers and Kubernetes desired.