In this role as a Site Reliability Engineer in the Public Cloud group, you will work on complex technical problems involving scale, performance, and availability. The ideal candidate has experience gained in a software development environment and a deep appreciation of best practices for designing and deploying fault-tolerant solutions on cloud platforms.
Role & Responsibilities
* Engage with systems engineering and application development teams at all stages of the technology life-cycle.
* Devise innovative ideas for solving difficult technical problems involving distributed systems, scale, and security, and translate these ideas into designs and implementations.
* Implement best practices for availability, scalability, operational excellence, and efficiency, using data analysis techniques where appropriate.
Preferred Skills & Qualifications
* Hands-on experience developing and engineering software using technologies such as Java, Python, C++, or Ruby
* Experience with modern SDLC tools and the ability to develop and enforce CI/CD practices
* Experience working with Kubernetes and developing containerized applications
* Expertise with monitoring and observability technologies like Prometheus and Grafana
Benefits
* Health, Dental, and Vision Benefits - Start on your first day
* 401(k) Match up to 6%
* Discretionary Merit and Bonus Increases
* PTO and Holiday Pay
* Hybrid Work Schedule