Xebia

Lead Data Engineer - Scala/Spark

  • Posted a day ago

Job Description

Job Title: Lead Data Engineer - Scala/Spark

Job Location: Bengaluru

Experience Range: 5-14 years

Notice Period: Immediate to 15 days

We are seeking a Senior Data Engineer with deep expertise in Scala-based Spark development and end-to-end deployment of data pipelines on Kubernetes clusters, orchestrated via Airflow. The ideal candidate has a strong software engineering foundation, an excellent understanding of distributed systems, proficiency in software design and modern project/code structuring, and a good understanding of CI/CD processes and their implementation, enabling them to deliver reliable, scalable, and robust data solutions. Candidates should have a minimum of 6-8 years of overall experience, including at least 5 years with Hadoop and Spark.

Key Responsibilities:

• Design & implement robust, scalable batch & real-time data engineering solutions using Apache Spark (Scala) & Spark Structured Streaming.

• Architect well-structured Scala projects with reusable, modular, and testable codebases aligned with SOLID and clean architecture principles & practices.

• Develop, deploy & manage Spark jobs on Kubernetes clusters, ensuring efficient resource utilization, fault tolerance, and scalability.

• Orchestrate data workflows using Apache Airflow: manage DAGs, task dependencies, retries, and SLA alerts.

• Write and maintain comprehensive unit and integration tests for the pipelines and utilities developed.

• Work on performance tuning, partitioning strategies, and data quality validation.

• Use and enforce version control best practices (branching, PRs, code review) and continuous integration and delivery (CI/CD) for automated testing and deployment.

• Write clear, maintainable documentation (README, inline docs, docstrings).

• Participate in design reviews and provide technical guidance to peers and junior engineers.

Technical Skills:

Primary:

• Languages: Scala, Java

• Big Data Orchestration: Airflow, Spark on Kubernetes, YARN, Oozie

• Big Data Processing: Hadoop, Kafka, Spark & Spark Structured Streaming.

• Experience with SOLID & DRY principles and with implementing good software architecture & design

• Advanced Scala experience (e.g. functional programming, case classes, complex data structures & algorithms)

• Proficient in developing automated frameworks for unit & integration testing.

• Experience with Docker and Helm and related container technologies.

• Proficient in deploying and managing Spark workloads on Kubernetes clusters.

• Experience in evaluating and implementing data validation & data quality measures

• DevOps experience with Jenkins, Maven, GitHub, GitHub Actions, CI/C


Job ID: 149207407