We are looking for a skilled Data Engineer with strong expertise in PySpark and data modeling to join our Data & Analytics team. The ideal candidate will build scalable data pipelines, optimize data workflows, and support advanced analytics initiatives.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using PySpark (a minimal sketch follows this list)
- Perform data modeling (conceptual, logical, and physical) for analytics and reporting
- Build and optimize ETL/ELT workflows for large-scale datasets
- Work with structured and unstructured data across multiple sources
- Ensure data quality and integrity, and compliance with data governance standards
- Collaborate with data analysts, data scientists, and business stakeholders
- Optimize performance of Spark jobs and data processing systems
- Support deployment and monitoring of data solutions in production
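
To make the day-to-day work concrete, here is a minimal PySpark pipeline sketch. Everything in it (paths, column names, the daily revenue aggregate) is a hypothetical illustration of the kind of pipeline described above, not a description of our actual systems:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_orders_pipeline").getOrCreate()

# Extract: read raw order events (hypothetical source path and columns).
orders = spark.read.option("header", True).csv("s3://raw-bucket/orders/")

# Transform: type the columns, drop bad rows, aggregate to daily revenue.
daily_revenue = (
    orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .dropna(subset=["order_id", "order_ts", "amount"])
    .dropDuplicates(["order_id"])
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"),
         F.count("order_id").alias("order_count"))
)

# Load: write partitioned Parquet for downstream analytics consumers.
(daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://curated-bucket/daily_revenue/"))
```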
Required Skills & Qualifications
- Strong experience with PySpark and the Apache Spark ecosystem
- Hands-on experience in data modeling (star schema, snowflake schema, etc.; see the example after this list)
- Proficiency in SQL and database technologies
- Experience with data warehousing concepts
- Knowledge of ETL/ELT tools and frameworks
- Familiarity with cloud platforms (AWS, Azure, or GCP) is a plus
- Understanding of big data technologies (Hadoop, Hive, Kafka, etc.)
- Strong problem-solving and analytical skills
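
As a rough illustration of the star-schema modeling listed above, the sketch below joins a hypothetical fact table to its dimensions in PySpark. The table names, keys, and columns (fact_sales, dim_date, dim_product, date_key, product_key, sales_amount) are assumptions for the example, and the tables are assumed to be registered in the Spark catalog:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("star_schema_example").getOrCreate()

# Hypothetical star schema: one fact table plus two dimension tables.
fact_sales  = spark.table("fact_sales")    # grain: one row per sale line
dim_date    = spark.table("dim_date")      # surrogate key: date_key
dim_product = spark.table("dim_product")   # surrogate key: product_key

# Join the fact to its dimensions on surrogate keys, then roll up.
# dim_date is assumed to carry year/month; dim_product carries category.
monthly_by_category = (
    fact_sales
    .join(dim_date, "date_key")
    .join(dim_product, "product_key")
    .groupBy("year", "month", "category")
    .agg(F.sum("sales_amount").alias("total_sales"))
)

monthly_by_category.show()
```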
Preferred Qualifications
- Experience in the banking/financial services domain
- Exposure to data governance and data quality frameworks
- Knowledge of CI/CD pipelines and DevOps practices