Data Scientist (Azure AI Engineer)

Dautom

United Arab Emirates, Dubai

8-10 Years

Posted a day ago

Job Description

Full Stack Data Scientist (Azure AI Engineer)

Location: Dubai

Experience: 8+ years (Data Science / AI Engineering / Applied ML)

Job Type: Contract

Job Summary

We are looking for a highly capable Full Stack Data Scientist / Azure AI Engineer who can build end-to-end AI products: data + ML/DL/CV models + agentic workflows + APIs + UI + scalable deployment on Kubernetes (AKS). The role requires deep expertise in the Azure AI ecosystem (Azure Machine Learning, Azure AI Foundry, Azure AI Search) and strong hands-on experience building AI agents using LangChain, LangGraph, and/or Microsoft Agent Framework, with Langfuse for tracing, evaluation, and observability. The ideal candidate has shipped production systems with measurable business impact and can operate them reliably through strong MLOps/LLMOps practices.

Key Responsibilities

1) End-to-End AI Product Delivery

Own delivery from problem definition → architecture → development → deployment → monitoring → iterative improvements.
Translate business needs into robust AI solutions with clear KPIs, timelines, and measurable outcomes.
Build AI applications that are secure, scalable, maintainable, and production ready.

2) AI Agents & Agentic Workflows (Must-Have)

Design, implement, and orchestrate AI agents capable of planning, tool use, function calling, retrieval, and multi-step execution.
Build agent systems using:
LangChain for tool/function orchestration, retrieval, and integrations
LangGraph for stateful, multi-step, resilient agent workflows
Microsoft Agent Framework for enterprise-grade agent patterns and integrations
Implement agent patterns: routing, task decomposition, multi-agent collaboration, memory, verification, retries/fallbacks, and human-in-the-loop approvals.
Apply security & safety: prompt-injection defenses, tool permissioning, grounding/citations, policy checks, and audit logs.

3) LLMOps / Observability / Evaluation (Langfuse)

Implement Langfuse (or equivalent) for:
prompt and trace logging, latency/cost monitoring
dataset-based evaluation, regression testing, and quality gates
feedback loops and continuous improvement of prompts/agents
Establish evaluation frameworks for RAG/agents: retrieval metrics, answer quality, hallucination checks, and guardrail effectiveness.

4) Azure Machine Learning & MLOps (Must-Have)

Build/operate ML workflows using Azure Machine Learning:
training jobs, compute, environments, pipelines, MLflow tracking
model registry and promotion, managed online endpoints
Implement CI/CD for model + application releases and MLOps practices: versioning, reproducibility, automated testing, and retraining triggers.

5) Azure AI Foundry & Azure AI Search (Must-Have)

Build GenAI solutions using Azure AI Foundry (prompt flows/orchestration, deployment integration, evaluation workflows).
Implement RAG pipelines using Azure AI Search:
ingestion/indexing of structured & unstructured data
vector + hybrid search, semantic ranking (where applicable), filtering, and relevance tuning
citations, metadata-based access control, and indexing automation

6) ML/DL & Computer Vision (Strong Requirement)

Develop and deploy strong ML/DL solutions including Computer Vision:
classification, detection, segmentation, OCR/document understanding, anomaly/defect detection
Conduct experimentation, tuning, and optimization (performance, robustness, cost).
Productionize CV pipelines with monitoring and continuous improvement.

7) Backend/API Engineering (FastAPI + Node.js)

Build production APIs for models and agents using FastAPI (Python): async endpoints, OpenAPI/Swagger, auth, middleware, and validation.
Build service orchestration and integrations using Node.js where appropriate.
Implement secure API patterns: authentication/authorization (Azure AD/RBAC patterns), rate-limiting, caching, and error handling.

8) Frontend Engineering (React)

Build modern UIs in React for AI applications (agent chat UI, dashboards, workflow screens).
Support streaming responses, citations, session memory, feedback capture, and user analytics.

9) Kubernetes/AKS Deployment & Operations

Containerize services using Docker and deploy on Kubernetes (AKS preferred).
Implement scaling, rollouts, secrets/config management, ingress, and reliability patterns.
Set up monitoring/telemetry using Azure Monitor/App Insights (or equivalent), alerts, and runbooks.

Required Skills and Qualifications

Mandatory Certifications (Must)

AI-102 (Microsoft Certified: Azure AI Engineer Associate)
DP-100 (Microsoft Certified: Azure Data Scientist Associate)

Core Technical Skills

Agents/Frameworks: Strong hands-on experience with LangChain, LangGraph, and Microsoft Agent Framework.
LLMOps: Strong experience with Langfuse for tracing/evaluation/monitoring (or equivalent tooling, with Langfuse preferred).
Azure: Azure ML, Azure AI Foundry, Azure AI Search; plus Key Vault, Storage, App Insights/Monitor as needed.
Programming: Strong Python; API development with FastAPI; Node.js for services/integrations.
Frontend: React for production UI development.
ML/DL/CV: Proven hands-on depth in ML/DL and Computer Vision.
Deployment: Docker + Kubernetes/AKS.
Data: Strong SQL; experience with structured + unstructured data.

Proven Experience (Non-Negotiable)

Demonstrated end-to-end delivery of AI applications in production (build → deploy → operate), with measurable impact.

Preferred Qualifications

Experience in real estate / construction domain AI use cases (valuation, forecasting, risk, customer support automation).
Exposure to graph databases (e.g., Neo4j) and vector search/vector databases for AI applications.
Extra certifications (nice-to-have): Azure Fundamentals (AZ-900), Azure Developer (AZ-204), Kubernetes (CKA/CKAD), Databricks ML.

What Success Looks Like (Outcomes)

Delivered production-grade AI solutions end-to-end: data → model → agentic workflow → API → UI → AKS deployment → monitoring.
Established strong LLMOps with Langfuse: traceability, evaluation, cost controls, and reliability improvements.
Built reliable, secure, observable systems with measurable business impact (time saved, accuracy gains, automation rate, cost reduction).
Demonstrated strong ownership from POC to production and post-launch iteration.