Ledgers is building an AI-powered Financial Operating System for founders and business leaders.
We focus on clarity, control, and decision-making, bringing financial logic into a product founders can operate day-to-day.
We're looking for an LLM Engineer who specialises in developing, fine-tuning, and deploying applications built on Meta's Llama family of open-source models, and who can ship reliable LLM features into production (not demos).
Location: Egypt (Remote-first). We expect to transition to an Egypt office once Ledgers Egypt is set up (target: Jan 2027).
Type: Full-time (or a strong long-term contract)
What you'll do
- Design, prototype, and productionize Llama-powered workflows across product surfaces (insights, classification, retrieval, forecasting support, internal tools).
- Build and optimize RAG pipelines (chunking, embeddings, retrieval, reranking, caching).
- Fine-tune models when needed (LoRA/PEFT), manage inference tradeoffs, and ensure predictable outputs.
- Implement structured outputs (JSON schemas), tool/function calling, and reliability patterns.
- Define evaluation methodology (golden sets, offline tests, online metrics) and track quality over time.
- Optimize for latency + cost (model choice, quantization, batching, caching, truncation strategies).
- Implement guardrails: safe completion patterns, fallbacks, failure handling, and privacy/security practices.
- Collaborate with product + engineering to translate user needs into AI specs and acceptance criteria.
Requirements
- Proven experience building LLM applications (projects or production work).
- Strong Python skills and experience with LLM engineering stacks.
- Hands-on experience with Meta Llama models (Llama 2/3/3.x): adapting, fine-tuning, and deploying.
- Strong understanding of embeddings + retrieval (vector DBs, chunking strategies, reranking).
- Experience designing evaluations (datasets, scoring, regressions) and iterating based on results.
- Comfortable working with APIs, backend services, and data pipelines (not only notebooks).
Nice to have
- Experience deploying Llama via vLLM / TGI / Ollama / Triton (or similar).
- Experience with quantization (GGUF/GPTQ/AWQ) and inference optimization.
- Familiarity with Kubernetes/Docker, monitoring/observability, and production debugging.
- Exposure to fintech/accounting concepts (helpful, not required).