JPMorgan Chase & Co.

Lead Software Engineer - SRE (Grafana, Dynatrace, SLO/SLI)

  • Posted 3 hours ago

Job Description

Assume a critical role in defining the future of a globally recognized firm and make a direct, significant impact in an environment designed for top performers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within the AI/ML Data Platforms team, you will be instrumental in building scalable, resilient and market-leading data solutions. You will engage in root cause analysis, production changes, budgetary considerations, and staffing challenges. Your experience will be vital in managing and mentoring team members to drive strategic change, both within your team and in partnership with colleagues across JPMorgan Chase & Co.'s global network of innovators.

Job Responsibilities

  • Apply expertise in application development and support across multiple technologies such as Databricks, Snowflake, AWS, and Kubernetes.
  • Coordinate incident management coverage to ensure effective resolution of application issues.
  • Collaborate with cross-functional teams to perform root cause analysis and implement production changes.
  • Mentor and guide team members to foster innovation and strategic change.
  • Develop and support AI/ML solutions for troubleshooting and incident resolution.
  • Drive team adoption of enterprise-authorized AI-assisted engineering practices within the work environment to improve code quality, delivery speed, and operational outcomes (e.g., AI-assisted code review/refactoring, test strategy acceleration, incident/root-cause analysis support), while establishing consistent validation standards (secure coding, peer review, automated testing) and promoting reuse of effective patterns across the team.
  • Apply knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation.

Required Qualifications, Capabilities And Skills

  • Formal training or certification on software engineering concepts and 5+ years applied experience
  • Proficient in site reliability culture and principles, with expertise in running production incident calls and managing incident resolution.
  • Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
  • Strong understanding of SLI/SLO/SLA and Error Budgets
  • Proficiency in Python or PySpark for AI/ML modeling.
  • Must be able to reduce toil by building new tools to automate repeated tasks.
  • Hands-on experience in system design, resiliency, testing, operational stability, and disaster recovery
  • Understanding of network topologies, load balancing, and content delivery networks.
  • Awareness of risk controls and compliance with departmental and company-wide standards.
  • Demonstrated experience leading effective use of approved AI-assisted software development tools (e.g., for coding, code review, test acceleration, troubleshooting) with the ability to set team expectations for validating AI outputs for correctness, performance, and security.
  • Strong understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; experience coaching engineers on safe, compliant adoption within delivery practices

Preferred Qualifications, Capabilities And Skills

  • Hands-on experience in an SRE or production support role with AWS Cloud, Databricks, Snowflake, or similar technologies.
  • AWS, Snowflake or Databricks certifications.
  • Familiarity with implementing site reliability practices within an application or platform

ABOUT US

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.

About The Team

Our professionals in our Corporate Functions cover a diverse range of areas from finance and risk to human resources and marketing. Our corporate teams are an essential part of our company, ensuring that we're setting our businesses, clients, customers and employees up for success.


Job ID: 149259843
