Tarjama&

DevOps Engineer

  • Posted 8 hours ago

Job Description

The DevOps Engineer will play a mission-critical role, owning the deployment, scalability, security, and reliability of AI systems and digital platforms. The role focuses strongly on LLM deployments, AI workloads, and cloud-native infrastructure, ensuring that all AI and software systems operate with enterprise-grade availability, performance, and compliance.

Key Responsibilities

CI/CD & Automation Engineering

  • Design, build, and maintain CI/CD pipelines for AI models, LLM services, and software applications
  • Automate build, test, deployment, and environment configuration workflows to enable rapid and reliable releases

AI & LLM Deployment Operations

  • Deploy, operate, and scale AI systems, LLM APIs, inference workloads, and cloud-based AI services
  • Ensure high availability, horizontal scalability, and low-latency inference across all production environments

Infrastructure, Reliability & Cost Optimization

  • Monitor infrastructure performance, system health, and AI workloads using observability and monitoring tools
  • Optimize infrastructure for reliability, performance, and cloud cost efficiency

Security, Compliance & Governance

  • Implement and enforce security best practices, access controls, secrets management, and environment isolation
  • Ensure infrastructure and deployment processes align with national data governance, compliance, and cybersecurity standards

Cross-Functional Enablement

  • Collaborate closely with AI Engineers, Full-Stack Engineers, and Product teams to enable seamless, scalable deployments
  • Act as the primary technical owner for production reliability during mission-critical deployments

Documentation & Architecture Standards

  • Maintain comprehensive documentation for DevOps workflows, system architecture, environments, and deployment standards
  • Ensure operational readiness, auditability, and knowledge transfer across teams

Required Qualifications

  • Minimum 5 years of hands-on DevOps engineering experience in production environments
  • Mandatory: Proven experience deploying and operating AI systems and LLM-based workloads in production
  • Strong hands-on expertise with Docker, Kubernetes, CI/CD platforms, and cloud services
  • Experience with monitoring, observability, logging, and infrastructure-as-code tools (e.g., Terraform or similar)
  • Strong understanding of networking, security, and cloud-native architecture principles
  • Excellent troubleshooting and incident response capabilities in high-availability systems

Preferred Qualifications

  • Experience with MLOps platforms such as MLflow, SageMaker, Vertex AI, or similar
  • Proven experience scaling AI and LLM applications in high-traffic production environments
  • Exposure to AI model lifecycle management, retraining pipelines, and operational governance
  • Experience in government, regulated, or national-scale enterprise environments

KPIs & Deliverables

  • Uptime, reliability, and stability of AI platforms and production systems
  • Deployment speed, automation maturity, and release reliability
  • Infrastructure performance, scalability, and cost efficiency
  • Security posture and compliance readiness across all environments
  • Quality, completeness, and audit readiness of DevOps documentation and workflows

Job ID: 136403211
