Master Works is hiring an experienced Data Engineer (5+ years) in Riyadh to design and optimize large-scale real-time and batch data pipelines within the Telecom domain.
- Design, develop, and maintain real-time and batch data pipelines leveraging Kafka, Spark, and Hadoop components
- Demonstrate a strong understanding of Teradata CLDM, including the ability to create new data models or modify existing ones based on business requirements
- Collaborate with business analysts and data architects to translate business requirements into robust data models and ETL frameworks
- Apply Relational and Dimensional modeling techniques to design databases and ensure data is organized effectively for both operational and analytical purposes
- Write, debug, and optimize SQL and Stored Procedures to ensure efficient data processing
- Work closely with BI, Data Science, and Campaign teams to ensure seamless data availability for analytics
- Work closely with Data Architects, Analysts, and Business Stakeholders to translate business requirements into database solutions
- Ensure all database design and code is well-documented and follows best practices for performance and maintainability
- Design fact and dimension tables for reporting and analytics, typically in a star or snowflake schema
- Develop and maintain technical documentation (data flow diagrams, source-to-target mappings, architecture documents)
- Ensure the Data Dictionary is always up to date, capturing all changes to the database schema, including newly created or modified tables, columns, and views
- Perform data quality checks, validation, and ensure end-to-end data accuracy and lineage
- Support and troubleshoot real-time streaming jobs and ensure high availability of data pipelines
Requirements
Must-Have
- Strong expertise in real-time data integration using Kafka, Spark Streaming, or Dataflow
- Hands-on experience with Hadoop ecosystem components (HDFS, Hive, Sqoop, Spark, etc.)
- Strong Data Modeling concepts including FSLDM, CLDM, and Dimensional / Data Vault modeling
- Deep understanding of Telecom domain (BSS/OSS, CDR, usage, revenue, and campaign data)
- Experience building and optimizing ETL pipelines and data ingestion frameworks for structured and unstructured data
- Proficiency in SQL and distributed data processing using Hive, Spark SQL, or PySpark
- Good understanding of data governance, data quality, and lineage frameworks
- Strong analytical and problem-solving skills
- Excellent communication and collaboration skills with cross-functional teams
- Experience working with data virtualization tools such as Tibco and Trino
Good-to-Have
- Familiarity with Data Catalogs, Metadata Management, and NDMO data governance standards
- Experience with Data Catalogue tools
- Familiarity with CI/CD pipelines, Git
- Knowledge of ETL orchestration tools like Airflow, NiFi