AI4ALL Department - Senior AI Data Engineer

Valeo

Egypt, Cairo

5-7 Years

Posted 3 days ago

Job Description

Mission

As a Senior AI Data Engineer, you will bridge the gap between raw enterprise data and our Generative AI ecosystem. You will architect the nervous system of our AI models, ensuring our LLMs have access to high-quality, real-time, and contextually relevant data to power the next generation of Valeo's Software-Defined Vehicle (SDV) tools and internal processes.

Key Responsibilities

Advanced RAG Architecture: Design and implement end-to-end pipelines for Retrieval-Augmented Generation (RAG), including advanced retrieval techniques, Graph RAG, Agentic RAG, and multimodal RAG.
Vector Database Management: Architect and optimize vector stores (e.g., Qdrant, Pinecone) to handle high-dimensional embeddings and ensure low-latency similarity searches.
Data Connectivity: Design, develop, and maintain secure data connectors to pull information from various external tools, SaaS platforms, and internal databases.
Data Pre-processing for Gen-AI: Develop cleaning and normalization workflows specifically for unstructured data (PDFs, HTML, Markdown) to ensure optimal LLM performance.
Orchestration: Use tools like LangChain, LlamaIndex, or Haystack to orchestrate complex data flows between storage, embedding models, and LLM endpoints.
Monitoring & Evaluation: Implement RAG-as-a-service monitoring to track retrieval and answer quality (relevancy, faithfulness) and data drift in production.

Candidate Profile

Education: B.Sc. or M.Sc. in Computer Science, Data Engineering, or a related field.
Experience: 5+ years in Data Engineering with a recent focus on AI/ML pipelines.
Technical Skills:

Languages: Expert proficiency in Python and SQL; knowledge of Java or Go is a plus.
Data Tools: Experience with Spark, Kafka, Airflow/Prefect, and dbt.
AI Frameworks: Hands-on experience with LangChain, LlamaIndex, OpenAI API, Hugging Face, MCP, A2A, and ADK.
Vector DBs: Proficiency with Qdrant, Pinecone, Chroma, Weaviate, or pgvector.
Cloud/DevOps: Knowledge of GCP/AWS, Docker, and Kubernetes.