Astek

Senior AI Platform Resident Engineer

This job is no longer accepting applications

  • Posted 2 months ago

Job Description

Our client is seeking a Senior AI Platform Resident Engineer for L2/L3 operations (VMs & OpenShift) to own the operation of AI platform components running on virtual machines and OpenShift.

Role Overview:

In this critical role, you'll lead L2/L3 operational practices, close knowledge gaps, and ensure stable, secure, and observable deployments across services including model serving, vector search, messaging, and runtime operations.

Key Responsibilities:

  • AI & Vector Systems: Operate and support vLLM, LLM inference, Qdrant, Kafka, and Rasa on VMs and OpenShift, focusing on observability, security hardening, and performance optimization.
  • Messaging & Caching: Manage Kafka and Redis, ensuring high availability, tuning, and effective backup/restore procedures.
  • Platform Operations: Deploy, manage, and enhance services across VM-based environments and OpenShift clusters while applying best practices for security and resource management.
  • Reliability & Observability: Establish metrics, logs, alerts, and monitoring dashboards, leading incident response and root cause analysis.
  • Knowledge Transfer: Identify L2 skill gaps and deliver structured training to ensure operational readiness.

Profile Requirements:

  • Advanced degree (MS/PhD) in Computer Science, AI, or a related field.
  • 5+ years operating distributed systems in production, including 2+ years with VM-based environments and OpenShift/Kubernetes.
  • Strong proficiency in Linux, networking, observability, and security hardening.
  • Hands-on experience with Kafka, Qdrant, Rasa, or LLM inference frameworks.
  • Familiarity with CI/CD practices and the ability to standardize release processes.

Core Competencies:

  • Excellent problem-solving and critical-thinking skills.
  • Strong communication and collaboration abilities.
  • Ability to lead and mentor junior team members effectively.

Job ID: 137381309