- 7+ years of experience with OpenShift in the banking domain, including ARO, AKS, and GitHub.
- Operate and maintain OpenShift clusters (installation, upgrades, patching, and capacity
management).
- Implement/verify readiness & liveness probes; ensure appropriate health
endpoints are exposed.
- Tune Java-based microservices for resource efficiency (memory/GC params);
optimize logging for troubleshooting and correlation.
- Configure autoscaling (HPA/cluster autoscaler) parameters aligned to
performance test results.
- Manage container artifacts: images, registries, image policies; implement RBAC
and security baselines.
- Leverage Operator Lifecycle Manager (OLM) for operator installation/upgrade;
enforce least-privilege RBAC.
- Manage deployments (Deployments/DeploymentConfigs), rollouts and rollbacks;
perform blue/green or canary deployments as required.
- Set up/maintain monitoring, alerting and dashboards (e.g., Prometheus/Grafana,
APM integrations).
- Maintain backup/restore and disaster recovery runbooks for clusters and critical
workloads.
- Monthly platform health report (capacity, performance, incidents, changes,
vulnerabilities).
- Hardened OCP baseline (RBAC, network policies, image policies) and
documentation.
- CI/CD pipeline catalog with ownership and support runbooks.
- Knowledge base articles for common runbooks (deployments, rollbacks, scaling,
disaster recovery).
- Incident RCA reports with corrective and preventive actions (CAPA).
- Configuration and change documentation stored in version control.