Job Description

We are looking for a skilled Data Engineer to design, build, and operate scalable data pipelines that power real-time processing and analytics. You will work on high-throughput data systems, ensuring reliability, performance, and maintainability across the data lifecycle, from ingestion through storage and search.

This role requires strong experience in distributed systems, stream processing, and cloud-native data infrastructure.

Responsibilities

  • Design and implement real-time and batch data pipelines
  • Build and maintain scalable streaming systems
  • Develop and optimize stream processing jobs
  • Ensure reliable ingestion from multiple internal and external data sources
  • Design event schemas and data contracts
  • Implement data validation, transformation, and enrichment logic
  • Optimize storage layouts and lifecycle management strategies
  • Improve system observability (metrics, logging, alerting)
  • Troubleshoot and resolve performance bottlenecks in distributed systems
  • Implement retry, dead-letter, and replay mechanisms
  • Ensure data quality, consistency, and governance
  • Collaborate with Backend, DevOps, and Security teams
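One responsibility above calls for retry, dead-letter, and replay mechanisms. A minimal sketch of that pattern is shown below; the names (`consume`, `process`, `dead_letter`, `MAX_RETRIES`) are hypothetical and not tied to any particular messaging system.

```python
import time

# Illustrative retry + dead-letter sketch: retry a failing record a few
# times, then park it in a dead-letter queue so it can be replayed later.
MAX_RETRIES = 3

def consume(record, process, dead_letter, retries=MAX_RETRIES, backoff=0.0):
    """Try to process a record; on repeated failure, route it to the DLQ."""
    for attempt in range(1, retries + 1):
        try:
            return process(record)
        except Exception as exc:
            if attempt == retries:
                # Exhausted retries: keep the record and the error for replay.
                dead_letter.append({"record": record, "error": str(exc)})
                return None
            time.sleep(backoff * attempt)  # simple linear backoff between attempts
```

Replay then amounts to draining the dead-letter list back through `consume` once the downstream issue is fixed.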

Required Qualifications

  • Proficiency in Java, Python, and SQL; strong software engineering fundamentals
  • Experience with distributed messaging systems (e.g., Apache Kafka) and stream processing frameworks (e.g., Apache Flink)
  • Knowledge of event-time processing, windowing, state management, and handling out-of-order events
  • Experience with cloud storage/data lakes, relational databases, and search/indexing engines (e.g., OpenSearch / Elasticsearch)
  • Familiarity with cloud platforms (AWS preferred), IaC (Terraform), CI/CD, and containerization (Docker)
  • Strong problem-solving skills and ability to design scalable, fault-tolerant data pipelines
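The qualifications above mention event-time processing, windowing, and handling out-of-order events. The sketch below illustrates those concepts with tumbling windows and a watermark-style lateness allowance; it is framework-independent, and the window size, lateness value, and event format are illustrative assumptions.

```python
from collections import defaultdict

# Event-time tumbling windows with an allowed-lateness cutoff. Events are
# (event_time_seconds, payload) tuples; the watermark trails the maximum
# event time seen so far, and anything older than it is dropped as late.
WINDOW_SIZE = 60       # one-minute tumbling windows
ALLOWED_LATENESS = 10  # watermark lags the max event time by 10 seconds

def assign_window(event_time, size=WINDOW_SIZE):
    """Map an event timestamp to the start of its tumbling window."""
    return (event_time // size) * size

def process_stream(events, size=WINDOW_SIZE, lateness=ALLOWED_LATENESS):
    """Count events per window; out-of-order events within the lateness
    allowance are still counted, older ones are collected as dropped."""
    windows = defaultdict(int)
    max_event_time = 0
    dropped = []
    for ts, payload in events:
        max_event_time = max(max_event_time, ts)
        watermark = max_event_time - lateness
        if ts < watermark:
            dropped.append((ts, payload))  # too late even with the allowance
            continue
        windows[assign_window(ts, size)] += 1
    return dict(windows), dropped
```

Frameworks such as Apache Flink implement the same ideas (watermarks, window assignment, allowed lateness) with managed state and exactly-once guarantees.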

Nice to Have

  • Experience with workflow orchestration platforms
  • Experience in security, log processing, or observability domains
  • Experience with schema management tools (Avro, Protobuf, Schema Registry)
  • Experience with data lake table formats (Iceberg, Hudi, Delta Lake)
  • Experience with distributed query engines (Athena, Trino, Presto)
  • Experience with multi-tenant system design
  • Experience with cost optimization in large-scale cloud environments
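Schema management and data contracts (mentioned above and in the responsibilities) boil down to checking records against an agreed shape before they enter the pipeline. A minimal sketch, assuming a hypothetical contract of field names to types rather than a real Avro/Protobuf schema:

```python
# Hypothetical data contract: required fields and their expected types.
# A real deployment would pull this from a schema registry instead.
CONTRACT = {
    "event_id": str,
    "source": str,
    "bytes_in": int,
}

def validate(record, contract=CONTRACT):
    """Return a list of contract violations; an empty list means the record conforms."""
    errors = []
    for field, expected in contract.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors
```

Non-conforming records would typically be routed to a dead-letter path rather than silently dropped.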

Soft Skills

  • Strong problem-solving and debugging skills in distributed systems
  • Ownership mindset with attention to reliability and quality
  • Clear communication and documentation skills
  • Ability to work cross-functionally
  • Comfort working in fast-paced environments

What We're Looking For

  • 7+ years of experience in data engineering or distributed systems
  • Strong fundamentals in system design and scalability
  • Proven experience operating production-grade data platforms
  • Ability to balance performance, cost, and reliability

Job ID: 144191689
