Job Summary
The Senior DevOps Engineer will be responsible for overseeing the design, development, and implementation of scalable and efficient infrastructure solutions that support the needs of the organization. The primary objective is to ensure smooth operation and continuous improvement of the deployment pipeline, systems performance, and operational tasks through automation, collaboration, and proactive monitoring. As a key member of the team, the Senior DevOps Engineer will drive best practices in cloud infrastructure, CI/CD, incident management, and security, while collaborating with cross-functional teams to meet business and technical requirements.
Key Accountabilities
- Lead the design, implementation, and management of scalable, resilient infrastructure solutions.
- Ensure high availability, performance, and reliability of both production and non-production environments across cloud and on-premises systems.
- Manage the infrastructure lifecycle, including planning, provisioning, and decommissioning resources efficiently.
- Oversee the development, maintenance, and optimization of CI/CD pipelines to automate application deployment and infrastructure provisioning.
- Ensure efficient integration of automated testing, continuous integration, and continuous deployment to facilitate rapid, reliable, and safe releases.
- Implement, maintain, and refine infrastructure-as-code practices for consistency and scalability.
- Build the DevOps team and recruit new hires as needed.
- Lead the implementation and maintenance of monitoring, alerting, and logging systems to track application and infrastructure performance.
- Respond to high-severity incidents, troubleshoot issues, conduct root cause analysis, and implement preventive measures to reduce recurrence.
- Continuously improve incident management processes to minimize downtime and ensure swift recovery.
- Automate routine operational tasks and infrastructure provisioning using scripting languages (e.g., Python, Bash, PowerShell).
- Develop and maintain custom automation scripts for tasks related to deployment, scaling, and monitoring.
- Collaborate closely with development teams to integrate new features, services, and tools into the infrastructure.
- Work with security teams to establish and enforce best practices for infrastructure and application security.
- Communicate effectively with stakeholders and other departments, providing regular updates on system health, performance, and incident resolution.
- Lead efforts to continuously monitor, analyze, and optimize system performance, identifying and resolving inefficiencies or bottlenecks.
- Implement load testing, performance tuning, and system scaling strategies to ensure applications can meet user demand and business needs.
- Create and maintain comprehensive documentation for infrastructure, deployment processes, operational procedures, and disaster recovery plans.
- Ensure that troubleshooting guides, best practices, and technical solutions are accessible, clear, and kept up to date.
- Lead initiatives to implement and maintain security best practices for infrastructure, deployment processes, and services.
- Ensure compliance with security policies, regulatory requirements, and internal standards.
- Perform regular security audits and vulnerability assessments to identify and mitigate potential risks.
Education: Bachelor's degree in computer science or computer engineering.
Skills
- 3-5 years of experience.
- Strong command of the English language, both written and verbal.
- Proven experience with Linux administration (including server management and performance tuning).
- In-depth knowledge of database administration for technologies like Elasticsearch, MongoDB, and PostgreSQL.
- Extensive experience in scripting with Python, Bash, and/or PowerShell.
- Solid experience with cloud platforms (e.g., AWS, Azure, GCP) and infrastructure management tools (e.g., Terraform, Ansible).
- Experience in leading or mentoring a DevOps team, managing multiple simultaneous projects, and delegating tasks based on project criticality.
- Expertise in continuous integration, delivery pipelines, and version control (Git, GitHub, GitLab).
- Experience with Python web frameworks such as Django is a plus.
- Familiarity with containerization (Docker, Kubernetes) and microservices architecture.
- Ability to work cross-functionally with diverse teams and drive collaboration between development, operations, and security teams.
- Strong communication and presentation skills, with the ability to explain complex technical concepts clearly to non-technical stakeholders.
- Attention to detail and a proactive approach to problem-solving.