The Senior Cloud Engineer will help design and manage a cloud-based infrastructure that supports customer-facing and internal applications. He/she will have broad technical experience supporting SaaS products and related infrastructure, as well as CI/CD pipelines, using a variety of tools and platforms.
* Architect and migrate existing enterprise systems to the cloud, automate application environments and support internal users on our platforms and environments.
* Develop and implement systems featuring high availability, horizontal scalability and self-healing capabilities.
* Implement proactive monitoring, alerting, trend analysis and self-healing systems.
* Implement, operate and optimize CI/CD pipelines for effective delivery of cloud resources and software.
* Develop and implement effective reference architecture solutions for the delivery of platform services.
* Perform systems engineering and automation activities to solve complex problems associated with running large-scale, multi-tenant production environments.
* Build, migrate, operate and improve the security posture and operational capabilities of Wash's cloud infrastructure.
* Participate in incident resolution processes driving restoration and repair of service-impacting issues.
* Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.
* Support services before they go live through activities such as system design consulting, development of automation tools and frameworks, capacity planning, and operational and security reviews prior to launch.
* Be a technical subject matter expert on all cloud deployments.
* Identify and drive opportunities to improve operational workflows.
* Bachelor's degree in Computer Science required.
* 5+ years of experience building, deploying, administering and monitoring SaaS applications and related infrastructure on both Windows (.NET) and Linux platforms.
* Experience with cloud platforms (AWS, Azure).
* Understanding of high availability and scaling solutions such as caching, CDNs, load balancing and clustering.
* Demonstrable experience scripting with languages like Python, PowerShell, Bash, etc.
* Demonstrable experience with infrastructure tools such as Terraform and CloudFormation.
* Experience with configuration management tools such as Puppet, Chef, Ansible, PowerShell DSC or others.
* Knowledge of both RDBMS and NoSQL database solutions like MSSQL, MySQL, Cassandra, MongoDB, Elasticsearch, etc. from an Ops perspective.
* Experience with build systems such as Jenkins, Bamboo, TFS, etc.
* Experience with source control systems such as Git, Bitbucket, TFS, etc.
* Hands-on experience with monitoring and APM tools such as Nagios, Icinga, SCOM, SolarWinds, New Relic, AppDynamics, etc.
* Experience with log aggregation tools such as the ELK stack, Splunk, Sumo Logic, etc.
* Experience with system hardening and implementing security controls is a plus.
* Positive attitude and ability to work in a fast-paced environment.
* Prior successful experience as a Systems, DevOps or Site Reliability Engineer.
* Experience with the Atlassian tools (Jira, Confluence, Bitbucket).
* Networking: knowledge and understanding of network concepts and technology such as TCP/IP, UDP, MAC addresses, IP packets, DNS, OSI layers, ACLs, routing tables, VPN and load balancing.
* Ability to thrive in an environment of continuous change.