Master Works is hiring an experienced Data Engineer (5+ years) in Riyadh to design and optimize large-scale real-time and batch data pipelines within the Telecom domain.
- Design, develop, and maintain real-time and batch data pipelines leveraging Kafka, Spark, and Hadoop components
- Demonstrate a strong understanding of Teradata CLDM, including the ability to create new data models or modify existing ones based on business requirements
- Collaborate with business analysts and data architects to translate business requirements into robust data models and ETL frameworks
- Apply Relational and Dimensional modeling techniques to design databases and ensure data is organized effectively for both operational and analytical purposes
- Write, debug, and optimize SQL and Stored Procedures to ensure efficient data processing
- Work closely with BI, Data Science, and Campaign teams to ensure seamless data availability for analytics
- Work closely with Data Architects, Analysts, and Business Stakeholders to translate business requirements into database solutions
- Ensure all database design and code is well-documented and follows best practices for performance and maintainability
- Design fact and dimension tables for reporting and analytics, typically in a star or snowflake schema
- Develop and maintain technical documentation (data flow diagrams, source-to-target mappings, architecture documents)
- Ensure the Data Dictionary is always up to date, capturing all changes to the database schema, including newly created or modified tables, columns, and views
- Perform data quality checks, validation, and ensure end-to-end data accuracy and lineage
- Support and troubleshoot real-time streaming jobs and ensure high availability of data pipelines
Requirements
Must-Have
- Strong expertise in real-time data integration using Kafka, Spark Streaming, or Dataflow
- Hands-on experience with Hadoop ecosystem components (HDFS, Hive, Sqoop, Spark, etc.)
- Strong Data Modeling concepts including FSLDM, CLDM, and Dimensional / Data Vault modeling
- Deep understanding of Telecom domain (BSS/OSS, CDR, usage, revenue, and campaign data)
- Experience building and optimizing ETL pipelines and data ingestion frameworks for structured and unstructured data
- Proficiency in SQL and distributed data processing using Hive, Spark SQL, or PySpark
- Good understanding of data governance, data quality, and lineage frameworks
- Strong analytical and problem-solving skills
- Excellent communication and collaboration skills with cross-functional teams
- Experience working with data virtualization tools such as Tibco and Trino
Good-to-Have
- Familiarity with Data Catalogs, Metadata Management, and NDMO data governance standards
- Experience with Data Catalogue tools
- Familiarity with CI/CD pipelines, Git
- Knowledge of ETL orchestration tools like Airflow, NiFi