Ref: SRE_1669984196

Site Reliability Engineer

England, London

Job description


SRE | Inside IR35 | Remote | 6 months

Your Role and Responsibilities:

You will work with a team of SREs on an exciting project. Candidates should be able to demonstrate an ability to cope under pressure, operational experience of running applications and platforms in production and providing on-call support, and an interest in a long-term, challenging opportunity.

4 core skills:
* Solid cloud operations skills with a focus on reliability
* Google Cloud/AWS/Azure, troubleshooting and networking skills
* Systems design and the ability to pinpoint bottlenecks
* Ability to cope under pressure and prioritise incidents.

SREs for Services:
* Create, manage and integrate software to automate and secure public cloud environments;
* Test and examine code written by others and analyse the results;
* Develop and own solutions that support large capacity and scale reliably in a 24/7 environment;
* Monitor the system and respond to incidents to maintain system SLOs/SLAs, and review and follow up on production incidents;
* Share on-call responsibility and troubleshoot problems across a wide array of services and functional areas.

Required Technical and Professional Expertise:
* 3+ years' experience working with Google Cloud/Azure/AWS
* Familiarity with system operations in Linux, Kubernetes and networking
* Experience programming in at least one of the following languages: Python, Perl, Go, or C/C++
* Experience with CI/CD, Kubernetes, databases or setting up big data pipelines

Preferred Technical and Professional Expertise:
Experience with the following:
* Managing infrastructure services, with responsibilities including but not limited to deployment, operation and troubleshooting;
* Maintaining services to meet service-level agreements (SLAs) or service-level objectives (SLOs) by measuring and monitoring availability, performance and overall system health;
* Providing user support, incident response and post-mortems;
* Working with one or more of the following types of systems in their latest versions:
* Kubernetes and Docker
* Spinnaker & Jenkins
* Redis
* Kafka
* Dynatrace / Google Cloud Operations
* Istio Service Mesh
* Helm
* Familiarity with Unix/Linux operating systems
* Experience in debugging and automating routine tasks
* Strong skills in problem solving and communication
* Strong skills in cloud networking
* Experience of working in highly regulated environments such as financial services

If you have the required skills and wish to know more, please apply with your CV in the first instance.