We are looking for an experienced MLOps Engineer (GCP) to design, operationalize, deploy, monitor, and scale production-grade AI/ML solutions on Google Cloud Platform (GCP). In this role, you will build reliable, secure, and automated end-to-end machine learning platforms and pipelines while enabling seamless collaboration among Data Scientists, AI Engineers, Platform Teams, and Operations Teams.
You will play a key role in ensuring machine learning models are consistently trained, versioned, deployed, monitored, and governed across their lifecycle using GCP-native technologies, particularly Vertex AI.
This role is based onsite in Jeddah, Saudi Arabia (KSA). Applicants must be willing and ready to relocate to Jeddah.
Key Responsibilities
- Design and implement scalable end-to-end MLOps architectures using GCP-native services.
- Build standardized frameworks for model training, deployment, monitoring, retraining, and governance.
- Deploy and manage ML models on Vertex AI, using Endpoints for online inference and batch prediction jobs for batch inference.
- Implement model versioning, rollout/rollback strategies, and traffic splitting for production deployments.
- Build and automate CI/CD pipelines for ML workflows and model deployment.
- Develop automated ML pipelines using Vertex AI Pipelines and ensure reproducibility across environments (development, testing, and production).
- Integrate source control, testing frameworks, and artifact repositories into ML workflows.
- Monitor model performance, model drift, data quality, and system reliability.
- Implement observability, logging, alerting mechanisms, and service-level objectives (SLOs) for ML systems.
- Define retraining triggers and support incident analysis and remediation for production ML services.
- Ensure scalability, security, compliance, and alignment with enterprise cloud architecture standards.
- Collaborate closely with Data Scientists, AI Engineers, Data Engineers, Platform Teams, and business stakeholders.
Requirements
Experience
- 5+ years of experience in ML Engineering, DevOps, MLOps, or related engineering roles.
- At least 3 years of recent hands-on experience with Google Cloud Platform (GCP) (mandatory).
- Strong production experience deploying and managing ML systems at scale.
Technical Skills
- Strong hands-on experience with Google Cloud Platform (GCP).
- Deep expertise in Vertex AI, including Pipelines, Endpoints, Model Registry, and Model Monitoring.
- Strong understanding of CI/CD practices, infrastructure automation, and ML lifecycle management.
- Experience with Docker and containerization/orchestration concepts.
- Strong Python programming skills for ML workflows and automation.
- Experience with ML monitoring, observability, reliability, and scalability practices.
- Knowledge of model versioning, deployment automation, and production operations.
Education & Certifications
- Bachelor's degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
- GCP certifications such as Professional Cloud DevOps Engineer or equivalent are a strong plus.
Preferred Candidate Profile
- Strong problem-solving mindset with a focus on automation and reliability.
- Experience working in cross-functional AI/ML environments.
- Ability to work in production-grade cloud environments and drive operational excellence for ML systems.
- Strong communication and stakeholder collaboration skills.
- Fluent English; Arabic is a plus.