
Hiring: Data Engineer (PySpark | Cloudera CDP | Informatica BDM)
Location:
Onsite: Dubai (up to 23K AED)
Offshore: Bangalore / Chennai (up to 37 LPA)
Experience: 5+ Years
Notice Period: Immediate / Serving / Max 30 Days
We are hiring an experienced Data Engineer with strong hands-on expertise in PySpark, Cloudera Data Platform (CDP), and Informatica Big Data Management (BDM) to build and support enterprise-scale big data solutions.
Key Responsibilities
Design, develop, and maintain optimized ETL pipelines using PySpark on CDP
Implement data ingestion from multiple sources (databases, APIs, file systems)
Perform Spark and CDP performance tuning for large-scale workloads
Build and enforce data quality checks, validation, and monitoring
Automate workflows using Oozie / Airflow
Develop and maintain Informatica BDM mappings and workflows
Ensure security, compliance, and stability of data pipelines
Collaborate with cross-functional teams to support data-driven initiatives
Technical Skills
Advanced PySpark (RDDs, DataFrames, optimization techniques)
Strong experience with Cloudera Data Platform (CDP): Hive, Impala, HDFS, HBase
Hands-on experience in Informatica Big Data Management (BDM)
Strong knowledge of Oozie scheduling, HQL, and data partitioning
Experience with SQL & NoSQL databases
Exposure to Hadoop, Kafka and distributed systems
Strong Linux shell scripting skills
Understanding of security, compliance, and data governance frameworks
Preferred Experience
Enterprise Banking / Financial / Fintech environments
Agile methodology and CI/CD tools (Git, Jenkins, etc.)
Experience working in production-grade distributed data ecosystems
Job Description for Informatica BDM:
Education
Experience
Minimum 4 years of development and design experience in Informatica Big Data Management
Extensive knowledge of Oozie scheduling, HQL, Hive, HDFS (including usage of storage controllers), and data partitioning
Technical Skills
Extensive experience working with SQL and NoSQL databases
Linux OS configuration and use, including shell scripting
Good hands-on experience with design patterns and their implementation
Well versed in Agile, DevOps, and CI/CD principles (GitHub, Jenkins, etc.), and actively involved in troubleshooting and resolving issues in a distributed services ecosystem
Familiar with distributed services resiliency and monitoring in a production environment
Experience in designing, building, testing, and implementing security systems, including identifying security design gaps in existing and proposed architectures and recommending changes or enhancements
Responsible for adhering to established policies, following best practices, maintaining an in-depth understanding of exploits and vulnerabilities, and resolving issues through appropriate corrective action
Knowledge of designing security controls for source systems and data transfers, including cron jobs, ETLs, and JDBC/ODBC scripts
Understanding of networking basics, including DNS, proxies, ACLs, policies, and troubleshooting
High-level knowledge of compliance and regulatory requirements for data, including but not limited to encryption, anonymization, data integrity, and policy control features in large-scale infrastructures
Understanding of data sensitivity in logging, events, and in-memory data storage (for example, no card numbers or personally identifiable data in logs)
Implement wrapper solutions for new or existing components that have minimal or no security controls, to ensure compliance with bank standards
Job ID: 137862741