
Interact Technology Solutions

Senior AI Data Engineer

  • Posted 3 hours ago

Job Description

Job Summary

The Senior AI Data Engineer is responsible for designing, building, and optimizing enterprise-scale data and AI infrastructure to support machine learning models, generative AI applications, and real-time analytics. The role drives the development of end-to-end data pipelines, from ingestion to production-ready AI data products, ensuring scalability, performance, and compliance across multi-cloud environments.

Accountability & Responsibilities

  • Design, build, and maintain scalable ETL/ELT data pipelines using modern data engineering tools (e.g., Apache Spark, dbt).
  • Architect and implement Lakehouse data platforms (Delta Lake, Apache Iceberg, Apache Hudi) following Medallion architecture (Bronze/Silver/Gold).
  • Develop real-time streaming pipelines using Apache Kafka, Apache Flink, and Spark Structured Streaming.
  • Build and optimize AI/GenAI data pipelines for LLM training, fine-tuning, and inference (tokenization, dataset curation, prompt engineering datasets).
  • Design and implement Retrieval-Augmented Generation (RAG) pipelines, including embedding workflows and vector database integration.
  • Manage feature stores for real-time and batch machine learning use cases.
  • Integrate data pipelines with AI/ML platforms (Databricks MLflow, Azure ML, AWS SageMaker, Vertex AI, OpenAI/Azure OpenAI).
  • Implement data orchestration workflows using Apache Airflow or similar tools with CI/CD pipelines.
  • Ensure data quality, governance, and security using frameworks such as Great Expectations and data catalog tools.
  • Deploy and manage infrastructure using Infrastructure-as-Code tools (Terraform, Bicep, CDK).
  • Collaborate with Data Scientists, ML Engineers, and Solution Architects to deliver production-ready AI solutions.
  • Lead technical design decisions, mentor junior engineers, and contribute to data platform strategy.
  • Maintain documentation, data contracts, and operational runbooks for all pipelines.
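To make the Bronze/Silver/Gold flow mentioned above concrete, here is a minimal sketch of a Medallion-style refinement step. It uses plain Python for brevity; in the role itself this would be built with Spark and Delta Lake, and the record fields (`event_id`, `amount`) are illustrative assumptions, not part of the posting.

```python
# Minimal sketch of a Medallion-style (Bronze -> Silver -> Gold) flow.
# Real pipelines would use Spark + Delta Lake; plain Python is used here
# for brevity. Field names ("event_id", "amount") are hypothetical.

def to_silver(bronze_records):
    """Clean raw Bronze records: drop malformed rows, deduplicate, cast types."""
    seen = set()
    silver = []
    for rec in bronze_records:
        if rec.get("event_id") is None or rec.get("amount") is None:
            continue  # drop malformed rows
        if rec["event_id"] in seen:
            continue  # deduplicate on event_id
        seen.add(rec["event_id"])
        silver.append({"event_id": rec["event_id"], "amount": float(rec["amount"])})
    return silver

def to_gold(silver_records):
    """Aggregate cleaned Silver records into a business-level Gold metric."""
    return {
        "total_amount": sum(r["amount"] for r in silver_records),
        "record_count": len(silver_records),
    }

bronze = [
    {"event_id": 1, "amount": "10.5"},
    {"event_id": 1, "amount": "10.5"},  # duplicate row
    {"event_id": 2, "amount": None},    # malformed row
    {"event_id": 3, "amount": "4.5"},
]
gold = to_gold(to_silver(bronze))
```

The same shape (raw ingest, cleaned/conformed layer, aggregated business layer) carries over directly to Delta Lake tables, where each function becomes a Spark job writing its own table.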

Requirements

1 – Required Experience

  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
  • 4–5 years of experience in data engineering, with strong exposure to AI/ML data infrastructure.
  • Proven experience building scalable data pipelines and working with large-scale datasets.
  • Hands-on experience with AI/ML platforms and modern data architectures.
  • Experience in regulated industries (e.g., Banking, Telecom, Healthcare) is a plus.
  • Strong problem-solving, analytical thinking, and communication skills.
  • Experience working in cross-functional teams and agile environments.

2 – Technical Skills

  • Strong SQL and advanced data modeling techniques
  • Apache Spark (PySpark, Spark SQL, Streaming)
  • Python (pandas, PySpark, data processing libraries)
  • Data pipeline orchestration (Apache Airflow)
  • CI/CD for data pipelines (GitHub Actions / Azure DevOps)
  • Lakehouse architectures (Delta Lake / Iceberg / Hudi)
  • Streaming technologies (Kafka, Flink)
  • Cloud platforms (AWS / Azure / GCP)
  • Vector databases (Pinecone, Weaviate, pgvector, OpenSearch)
  • RAG pipeline design and LLM data processing
  • Infrastructure-as-Code (Terraform / Bicep / CDK)
  • Containers (Docker, Kubernetes)
  • Data quality & governance tools
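As a toy illustration of the RAG retrieval step listed above: rank stored document embeddings by cosine similarity to a query embedding. A production system would use a vector database (e.g. pgvector or Pinecone) and a real embedding model; the 3-dimensional vectors and document ids below are made up for the sketch.

```python
# Toy sketch of the retrieval step in a RAG pipeline: rank documents by
# cosine similarity between a query embedding and stored embeddings.
# Vectors and document ids are hypothetical; real embeddings have
# hundreds of dimensions and live in a vector database.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# In-memory stand-in for a vector index.
index = {
    "doc_invoices":  [0.9, 0.1, 0.0],
    "doc_streaming": [0.1, 0.8, 0.2],
    "doc_hr_policy": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the top-k document ids most similar to the query embedding."""
    ranked = sorted(index, key=lambda d: cosine(query_vec, index[d]), reverse=True)
    return ranked[:k]

top = retrieve([0.85, 0.15, 0.05])
```

The retrieved documents would then be injected into the LLM prompt as context, which is the "augmented generation" half of RAG.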

More Info


Job ID: 146195539