  • Posted 17 days ago

Job Description

Hiring: Data Engineer (PySpark | Cloudera CDP | Informatica BDM)

Location:

Onsite: Dubai (up to 23K AED)

Offshore: Bangalore / Chennai (up to 37 LPA)

Experience: 5+ Years

Notice Period: Immediate / Serving / Max 30 Days

We are hiring an experienced Data Engineer with strong hands-on expertise in PySpark, Cloudera Data Platform (CDP), and Informatica Big Data Management (BDM) to build and support enterprise-scale big data solutions.

Key Responsibilities

Design, develop, and maintain optimized ETL pipelines using PySpark on CDP

Implement data ingestion from multiple sources (databases, APIs, filesystems)

Perform Spark and CDP performance tuning for large-scale workloads

Build and enforce data quality checks, validation, and monitoring

Automate workflows using Oozie / Airflow

Develop and maintain Informatica BDM mappings and workflows

Ensure security, compliance, and stability of data pipelines

Collaborate with cross-functional teams to support data-driven initiatives

Technical Skills

Advanced PySpark (RDDs, DataFrames, optimization techniques)

Strong experience with Cloudera Data Platform (CDP): Hive, Impala, HDFS, HBase

Hands-on experience in Informatica Big Data Management (BDM)

Strong knowledge of Oozie scheduling, HQL, data partitioning

Experience with SQL & NoSQL databases

Exposure to Hadoop, Kafka and distributed systems

Strong Linux shell scripting skills

Understanding of security, compliance, and data governance frameworks

Preferred Experience

Enterprise Banking / Financial / Fintech environments

Agile methodology and CI/CD tools (Git, Jenkins, etc.)

Experience working in production-grade distributed data ecosystems

Job Description for Informatica BDM:

Education

  • Degree or postgraduate qualification in Computer Science or a related field (or equivalent industry experience)

Experience

Minimum 4 years of development and design experience in Informatica Big Data Management

Extensive knowledge of Oozie scheduling, HQL, Hive, HDFS (including usage of storage controllers), and data partitioning

Technical Skills

Extensive experience working with SQL and NoSQL databases

Linux OS configuration and use, including shell scripting

Good hands-on experience with design patterns and their implementation

Well versed in Agile, DevOps, and CI/CD principles (GitHub, Jenkins, etc.), and actively involved in troubleshooting and resolving issues in a distributed services ecosystem

Familiar with distributed services resiliency and monitoring in a production environment

Experience in designing, building, testing, and implementing security systems, including identifying security design gaps in existing and proposed architectures and recommending changes or enhancements

Responsible for adhering to established policies, following best practices, developing an in-depth understanding of exploits and vulnerabilities, and resolving issues by taking appropriate corrective action

Knowledge of designing security controls for source and data transfers, including cron jobs, ETLs, and JDBC/ODBC scripts

Understanding of networking basics, including DNS, proxies, ACLs, policies, and troubleshooting

High-level knowledge of compliance and regulatory requirements for data, including but not limited to encryption, anonymization, data integrity, and policy control features in large-scale infrastructures

Understanding of data sensitivity in logging, events, and in-memory data storage, for example, no card numbers or personally identifiable data in logs

Ability to implement wrapper solutions for new or existing components that have minimal or no security controls, to ensure compliance with bank standards

More Info

Job ID: 137862741
