Job Description:
We are looking for a candidate to own L3-level operations for Capacity Management within a highly regulated banking estate, ensuring availability, performance, security, audit readiness, and rapid recovery.
Key Responsibilities:
– Provide L3-level support for Capacity Management in a mission-critical banking environment.
– Ensure compliance with PCI DSS, SWIFT, and local banking regulator requirements.
– Operate under ITIL processes (Incident/Change/Problem/Knowledge).
– Maintain high availability, performance, and security; participate in DR/BCP drills.
– Architect the Capacity Management function, carry out capacity planning, and optimize it for large-scale banking workloads.
– Lead major incidents/war rooms; guide L1/L2; produce post-incident reports for auditors.
– Define standards, golden configurations, and automation (scripts/Infra-as-Code) to reduce toil.
– Own roadmap planning, version upgrades, performance baselining, and DR strategy.
Required Tools & Technologies:
– ITSM (ServiceNow/BMC), monitoring (Grafana/Prometheus), vendor admin tools
– ITIL Foundation certification
Experience Requirements:
– 12+ years of experience in architecture, performance engineering, and major incident leadership.
– Clear communication with operations, security, audit, and business stakeholders.
– Evidence-driven troubleshooting; strong documentation and runbook hygiene.
– Ownership mindset with 24×7 support readiness and on-call rotation participation.
Suitable profiles combine hands-on and solutioning/architecture experience, with 12 to 17 years of experience using tools such as ITSM (ServiceNow/BMC), monitoring/capacity (Grafana/Prometheus), SolarWinds (monitoring/capacity), and VMware Aria.