Senior Site Reliability Engineer - Cloud
The AKQA Managed Services team delivers cloud platform management and application support solutions, DevOps engineering, and consulting services that support our clients' ecommerce platforms and associated business operations.
Our Site Reliability Engineers are responsible for the overall monitoring, performance, reliability and quality of our clients’ application-hosted cloud environments. As a Site Reliability Engineer, you will use your extensive experience with web technologies and tools to diagnose problems and to troubleshoot and optimise solutions, driving measurable Service Level Agreement outcomes for our clients. You will mentor junior team members and champion the principles of site reliability.
At AKQA Melbourne, you will work in a meritocratic culture, surrounded by some of the brightest minds in their fields. You will have the opportunity to learn and grow within a creative and technically advanced team and have access to ongoing personal and professional development. At AKQA, we are committed to your career growth, as well as to your work/life balance.
RESPONSIBILITIES:
- Responsible for the overall monitoring, performance, reliability and quality of our clients’ application-hosted cloud environments.
- Work with Incident Management and project development teams to triage, diagnose and resolve issues in customers' ecommerce platforms and cloud production environments.
- Design and implement automated, repeatable solutions to reduce toil and maintenance effort.
- Champion knowledge transfer and quality documentation.
- Translate technical concepts into language that a wide range of stakeholders can easily understand.
- Mentor and train junior team members.
- Champion continuous improvement of cloud infrastructure, application hosting performance and support tooling, and apply industry best practices.
QUALIFICATIONS AND CHARACTERISTICS:
- Background in a Cloud Support Engineer or Site Reliability Engineer role.
- Proven experience with AWS and CloudFormation, Azure and ARM templates, and/or Azure DevOps and SecOps.
- Experience with supporting CI/CD pipelines (Octopus Deploy, TeamCity, Git).
- Knowledge of automation/scripting languages: PowerShell, Python, Bash.
- Comprehensive understanding of the web: protocols, web architectures, infrastructure, web servers (IIS), proxies, load balancing, high availability, etc.
- Strong working knowledge of internet backbone technologies: TLS, DNS, TCP/IP, WAF/CDNs, networks and subnetting.
- Solid practical knowledge of complementary cloud technologies and architectures including databases (MySQL, MSSQL), storage and backup technologies.
- Windows/Linux System Administration experience.
- 2+ years working with the AWS or Azure cloud platforms (IaaS, PaaS & SaaS).
- Experience leading teams.
- Highly organised and process driven.
- Focused on customer outcomes and service excellence.
- Excellent problem-solving and production environment diagnosis skills.
AKQA is an Equal Opportunities Employer. We believe that diversity is vital to AKQA’s ability to provide our clients with the best recommendations, and we are committed to fostering a varied and inclusive work environment. Your race, colour, ancestry, religion, gender, gender identity, national origin, sexual orientation, age, marital status, disability or veteran status have no bearing on our hiring decisions. If you have a disability or special need that requires accommodation, please let us know.