Ref: SRE_1651523093

Site Reliability Engineer (Fully Remote)

USA, Colorado

  • 130,000 to 170,000 USD
  • DevOps Role
  • Skills: AWS, Ansible, Terraform, Kubernetes, SaaS
  • Level: Mid-level

Job description


Role & Responsibilities

* Maintain a 24x7 production environment with a high level of service availability. Perform quality reviews and manage operational issues.
* Create and monitor dashboards and alerts for key infrastructure metrics and business KPIs that relate to site reliability. Configure alerts to fire on symptoms rather than waiting for outages.
* Ensure services are designed with 24/7 availability and operational readiness and rigor
* Develop processes, tools, automation, and software changes to address operational issues
* Automate infrastructure management and maintenance with the aim of empowering the team and ensuring site reliability
* Implement automation and orchestration for the manual processes required to operate and deploy cloud services. Be at the heart of turning new ideas into internal OPS/SRE tools by working closely with advanced technology.
* Document every action so your findings turn into repeatable actions, and then into automation.
* Define non-functional requirements as part of the product lifecycle to influence new designs, standards, and methods for scalable, highly available distributed systems
* Resolve product/service defects and issues arising from design, infrastructure, or operational changes
* Identify, evaluate, and execute preventive measures to minimize or avoid impact to the customer's experience, acting proactively rather than waiting for customer escalations

Skills & Qualifications

* A self-starter who's comfortable working independently without a ton of supervision
* A software engineer with a curiosity for operations, or an operations engineer who wants to work closely with software engineers to help improve response times, scalability, and availability.
* You're obsessive compulsive, in a good way. Your systems and scripts are clean, well-documented and comprehensible.
* You hate doing the same thing twice; you'd rather spend the time to automate a problem away than spend time on it again.
* You are collaborative and are excited to empower the engineering team to work better and faster
* Fluency with at least one current-generation scripting language used by DevOps professionals (Python, Perl, PHP, Ruby), plus Java development
* You have a passion for learning when it comes to working with new technologies or languages
* You live and breathe scalable web architectures.
* You're cool in a crisis and can align with others to ensure complex problems meet a timely and effective resolution.
* You've worked with Linux, containers/namespaces, and system automation tools for Unix and/or cloud platforms.


Benefits

* Medical
* Dental
* Vision
* Flexible time off
* 401k
* Birthday off