Role Overview
The Data Platforms Operations Lead is responsible for the stability, performance, security, and availability of enterprise data platforms. This role owns day-to-day operations, monitoring, incident management, and service delivery, and coordinates closely with vendors and managed service providers (MSPs) to meet service level agreements (SLAs) and operational standards.
Key Responsibilities
- Own day-to-day operations of enterprise data platforms, ensuring high availability and performance.
- Establish and manage monitoring, alerting, and operational dashboards across all platform components.
- Lead incident and problem management, including root cause analysis, ensuring timely resolution and continuous improvement.
- Define, track, and enforce SLAs, operational level agreements (OLAs), and key performance indicators (KPIs) with internal teams and external vendors/MSPs.
- Manage capacity planning and performance optimization to support current and future demand.
- Plan and oversee patching, upgrades, and maintenance activities, ensuring minimal service disruption.
- Coordinate with vendors and MSPs on support, escalations, and service improvements.
- Ensure operational compliance with security, data governance, and regulatory requirements.
- Maintain operational documentation, runbooks, and standard operating procedures (SOPs).
- Support release management in collaboration with platform, data, and security teams.
Key Skills & Experience
- Proven experience in data platform operations, IT operations, or platform reliability roles.
- Strong understanding of data platforms, analytics, and cloud or hybrid infrastructure.
- Hands-on experience with monitoring tools, incident management, and IT service management (ITSM) processes.
- Experience managing vendors and MSPs, including SLA governance.
- Strong troubleshooting, problem-solving, and operational leadership skills.
- Familiarity with ITIL, site reliability engineering (SRE), or DevOps operational practices is an advantage.