Key Responsibilities
LLM Agent Development & Architecture
- Design and develop LLM-powered agents for natural-language-to-data-analysis workflows (CSV first, then database integrations).
- Build intelligent agent pipelines using LangChain, LangGraph, LlamaIndex, or similar frameworks.
- Implement multi-step reasoning strategies (ReAct, plan-and-execute) to handle complex analytical queries including joins, aggregations, subqueries, and trend analysis.
- Optimize prompt design, tool-calling logic, and conversation memory handling.
Data Processing & Text-to-SQL Systems
- Develop and maintain pandas-based data analysis tools for CSV/Excel processing.
- Implement secure text-to-SQL generation and execution using SQL toolkits (e.g., SQLDatabase toolkit, DuckDB).
- Ensure schema-aware query generation by injecting metadata, table descriptions, and contextual information.
- Support self-correction, clarification prompts, and iterative reasoning for improved accuracy.
Security & Safe Execution
- Implement safeguards for safe Python and SQL execution (query validation, injection prevention, restricted execution environments, timeouts).
- Apply validation checks before executing generated queries.
- Minimize hallucinations and ensure high execution reliability.
API & Interface Development
- Expose LLM agents via REST APIs using FastAPI.
- Develop simple demo or internal interfaces using Streamlit, Gradio, or Chainlit.
- Integrate connectors for relational databases and metadata retrieval.
Testing, Evaluation & Optimization
- Write clean, modular, and maintainable Python code with proper type hints.
- Develop basic unit tests using pytest.
- Contribute to evaluation datasets measuring accuracy, execution success rate, and hallucination reduction.
- Continuously optimize performance and response latency.
Collaboration & Documentation
- Collaborate with backend, frontend, and product teams to expand from CSV-only support to full database connectivity.
- Maintain clear documentation for agent workflows, architecture decisions, and integration processes.
- Participate in sprint planning, technical discussions, and continuous improvement initiatives.
Documentation & Reporting
- Maintain detailed documentation of architecture, modules, and integration workflows.
- Provide periodic progress reports, technical evaluations, and release notes to the management team.
Preferred / Strong Advantage
- Experience building natural-language-to-pandas or text-to-SQL agents (including personal prototypes).
- Hands-on experience with DuckDB, SQLAlchemy, psycopg2, or other database connectors.
- Experience building agent interfaces using Streamlit, Gradio, or Chainlit.
- Knowledge of secure execution environments, sandboxing, and query validation techniques.
- Experience implementing Retrieval-Augmented Generation (RAG) for schema or context injection.
Qualifications
Qualifications & Skills
- Bachelor's degree in Computer Science, AI, Data Science, or a related field.
- 3+ years of experience in Python development.
- Strong proficiency in Python (3.9+), modular architecture, type hints, and basic async programming.
- Advanced experience with pandas and NumPy for data manipulation and analysis.
- Hands-on experience integrating LLM APIs (OpenAI, Anthropic, Groq, Ollama, vLLM, etc.).
- Practical experience building LLM agents or chains using LangChain, LangGraph, or LlamaIndex (minimum 12 real projects).
- Strong understanding of SQL (SELECT, JOIN, GROUP BY, WHERE) and relational database concepts.
- Familiarity with prompt engineering, tool calling, ReAct patterns, and conversational memory.
- Experience using Git and writing basic tests with pytest.
Additional Information
- The work location is Yas Bay, Waterfront, Yas Island, Abu Dhabi.
- Please ensure you are willing and able to work within the Abu Dhabi area before applying.