Site Reliability Engineer

Total Experience: 5-10 Years

Mandatory Skills: Python, Cloudflare, AWS WAF, CloudFront

Job Description:

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Create sustainable systems and services through automation and uplifts to remove operational toil and manual processes
  • Participate in systems design reviews and partner with teams on action items from Root Cause Analysis sessions
  • Build well-defined service level objectives (SLOs), metrics, monitors, and logs as required
  • Collaborate closely with team members as well as cross-functional teams such as DevOps, Cloud, and development teams within Petco
  • When required, respond to incidents and concerns related to the production environments
  • Debug, troubleshoot, and solve for concerns with a proactive approach to problem solving
  • 5-10+ years as a Software Engineer, DevOps Engineer, or Site Reliability Engineer
  • Coding experience with at least 1 high-level language such as Python, Go, or Java
  • Experience with supporting critical services in production in the cloud (AWS) and on-premises
  • Infrastructure as Code (IaC) tools such as Terraform
  • Monitoring tools such as New Relic, SumoLogic, DataDog, SevOne, Sentry
  • Proactive approach to spotting problems, areas for improvement, removing manual process and toil using code, and fixing performance concerns using code
  • Shift hours 4PM–1AM IST
  • 10+ years as a Software Engineer, DevOps Engineer, or Site Reliability Engineer
  • Backend software development experience using Python
  • Expertise with CDN and WAF technologies such as Cloudflare, AWS WAF, CloudFront
  • Experience with adding telemetry, distributed tracing, and performance debugging, and with building solutions in code to fix the problems found
  • Experience with building SLIs, SLOs, and error budgets


