Ref: 180120212TO_1610977987

Site Reliability Engineer

England, Greater Manchester

Job description


Are you ready for your next step to a senior role?

Do you want to work for an innovative company on greenfield projects?

My client is an internet service provider looking to build a DevOps team. You will be an experienced DevOps engineer, using your knowledge and skills to transform our software deployment process into a fully automated solution. You will be joining a new DevOps team and will have the opportunity to shape its direction and strategy going forward.

Working within the DevOps team, you will be responsible for our application monitoring across the technical estate. Building on our existing monitoring framework, you will use your infrastructure and scripting experience to add bespoke and robust monitoring checks that continuously assess the health of our services and applications.


Benefits:

* 30 days annual leave (option to buy an additional 5 days)
* Private medical insurance
* Life assurance
* Group pension plan (company matching up to 7%)

Roles & Responsibilities:

* Take ownership of the monitoring of applications, services and infrastructure.
* Write and maintain software and scripts that capture detailed heuristics about the health of applications and alert accordingly.
* Design and implement monitoring checks for new services prior to launch.
* Ensure consistent and thorough monitoring across all environments (development, beta, production, etc.).
* Capture improvements to the logging platform, including integration with Logstash.
* Expand the existing monitoring within Zabbix and investigate and prototype monitoring checks using alternative frameworks.
* Integrate with third-party APIs and services to export application log data for auditing purposes.
* Work with DevOps Engineers, Sys Admins and Software Developers during software releases.
* Write automated monitoring tests and integrate them into the CI/CD framework.
* Be an ambassador for DevOps across the business, influencing others to embrace automation and DevOps principles.
* Work with the Release Manager to ensure successful and streamlined production deployments.

Must-have experience:

* Strong background in software engineering (using languages such as Java, Python, etc.).
* Deep knowledge of JMX and Java-based application monitoring.
* Extensive experience with Linux.
* Extensive system design knowledge.
* Experience monitoring Kubernetes clusters and pods.
* Knowledge of Zabbix and Logstash/ELK highly desirable.
* Confident monitoring the health of servers (cloud-based and on-prem), including CPU, memory and storage.
* Confident with Ansible, Terraform and Git.
* Substantial experience with AWS.
* Network and security knowledge.
* Experience deploying, managing and troubleshooting software applications (including Web Apps and B2B).
* Happy working with Agile practices and Jira.

If you would like to know more please submit your CV to