SRE Engineer

Fresher

Job Description

Key Responsibilities

Monitor, maintain, and improve reliability, availability, and performance of enterprise applications and infrastructure.
Implement ITSM processes such as incident, problem, and change management to ensure operational excellence.
Identify and eliminate bottlenecks by developing automation and proactive monitoring solutions.
Collaborate with development and infrastructure teams to ensure smooth deployment and reliable operation of applications.
Participate in on-call rotations and shift operations, ensuring critical incident response and timely resolution.
Conduct root cause analysis (RCA) for high-impact incidents and drive permanent fixes.
Develop and maintain runbooks, standard operating procedures (SOPs), and service documentation.
Gather metrics, generate performance reports, and support continuous improvement initiatives.

Required Skills And Competencies

Strong understanding of ITSM frameworks (preferably ITIL) and service operations for enterprise-scale environments.
Experience with application monitoring, alerting, and observability tools (e.g., Prometheus, Grafana, Splunk, AppDynamics, or Dynatrace).
Familiarity with cloud infrastructure (AWS, Azure, or GCP) and key DevOps/SRE practices.
Proficiency in incident response, system troubleshooting, and performance optimization.
Basic scripting or automation skills (Python, Shell, or PowerShell) for operational efficiency.
Excellent collaboration and communication skills with a proactive problem-solving mindset.

Willingness to work in rotational shifts and support 24/7 production environments.