Principal Data Platform Engineer

9-15 Years
20 - 22 LPA
  • Posted a month ago
Job Description

Principal Data Platform Engineer:

Experience: 9+ Years

Work Mode: Onsite

Location: Bangalore

Architecture: Lakehouse (Medallion: Bronze/Silver/Gold)

Compute: Apache Spark (Expert level)

Storage/Table Format: Delta Lake (Required), Iceberg (Strong Plus)

Transformation: dbt (Expert level)

Orchestration: Airflow, Cosmos

Infrastructure: Cloud-native (GCP preferred) + Databricks/Commercial tooling

Patterns: Microservices, Event-driven, CI/CD, IaC (Terraform)

Core Technical Requirements

1. Data Engineering & Spark Internals

Deep Spark: You must understand RDDs, DataFrames, Spark SQL, and internals (Shuffle, Partitioning, Memory Management, Catalyst Optimizer).

Pipeline Mastery: Building idempotent, self-healing ELT/ETL pipelines. Experience with Schema Evolution and handling late-arriving data.

Lakehouse ACID: Expert knowledge of transaction logs, time travel, and file compaction in Delta/Iceberg.

2. Software Architecture & Design

Engineering First: This isn't just SQL and scripts. You apply SOLID principles, design patterns, and write production-grade Python/Scala/Java.

Integration: Experience building and consuming Microservices. Knowledge of API design (REST/gRPC) and message brokers (Kafka/PubSub).

System Design: Experience building a platform from scratch. You know how to design for 99.9% availability and horizontal scalability.

3. Data Modeling & dbt

Modeling: Expert in dimensional modeling (Kimball), Data Vault 2.0, or OBT (One Big Table) for high-performance analytics.

dbt Power User: Advanced dbt usage (Macros, Packages, Custom Tests, dbt Mesh). You treat dbt projects like software repositories (version control, PR reviews, CI).

4. Cloud & Platform

Cloud Native: Deep understanding of IAM, VPCs, Object Storage, and serverless compute.

Migrations: Proven track record of moving petabyte-scale data from legacy systems (On-prem, Redshift, Snowflake) to a Lakehouse without data loss.

Key Deliverables (First 6-12 Months)

Platform Zero: Evaluate, select, and deploy the foundational Lakehouse infrastructure.

Core Frameworks: Build the reusable libraries/templates that the rest of the engineering team will use to build pipelines.

Legacy Decommission: Design the technical map to migrate all high-priority finance/business data to the new stack.

Performance Baseline: Optimize Spark/Cloud costs by at least 20% through better resource management.

The Plus List

MLOps: Building feature stores and model deployment triggers.

GCP Specialization: BigQuery (as a Lakehouse layer), Dataproc, and Cloud Composer.

Observability: Implementing Data Quality monitoring (Great Expectations, Monte Carlo) and OpenTelemetry.

More Info

Job ID: 146585763
