This job has expired

Site Reliability Engineer

Employer: Kelly Science, Engineering, Technology & Telecom
Location: Eden Prairie
Salary: Competitive

Job Title: Site Reliability Engineer II & III

Location: 100% Remote

Type: Contract to hire

Pay Rate: Open for discussion

Site Reliability Engineer II - 2+ years

Site Reliability Engineer III - 5+ years

Note: This company doesn't provide sponsorship, so if you will require sponsorship now or in the future, please do not apply.

We can only submit US citizens without dual citizenship; no exceptions.

Responsibilities:

  • Provide and operate SRE functions within a Kubernetes / EKS environment in AWS GovCloud.
  • Serve as an SRE with an emphasis on Operations to reactively respond to, triage, and remediate reported issues categorized by severity.
  • Serve as an SRE to proactively establish the means (through tooling) to effectively monitor, analyze, report, and observe the health and upkeep of the systems and/or environments.
  • Establish key practices to ensure that availability, stability, scalability, performance, monitoring, and incident response are handled appropriately through automation.
  • Participate in an on-call rotation to field and support issues as they arise.
  • Collaborate with specific SMEs from various teams to investigate, troubleshoot, and resolve issues.
  • Implement automation to mitigate risks and faults based on reactive and proactive measures.
  • Construct and maintain an incident response playbook with documented corrective actions.
  • Adhere to an established and well-defined escalation process to handle reported incidents.
  • Function as an engineering team member in an agile environment, which includes but is not limited to story writing workshops, backlog refinement, planning, and standups, all maintained through Jira.
  • Investigate and break down technical issues thoroughly, and support troubleshooting, identifying, and addressing root causes.
  • Establish proactive solutions to prevent faults within the system and underlying infrastructure.
  • Build automation practices across applicable aspects that improve the overall efficiency and scalability of our applications and infrastructure.
  • Document work consistently for knowledge sharing and redundancy as part of the definition of done.
  • Engage with key stakeholders, internal and external, to help foster and strengthen working relationships.
  • Apply analytical, logical, and rational thinking to build enterprise-level, scalable, highly available, and performant systems.
  • Demonstrate proficiency in creating reusable tools through scripting or development languages such as Python, PowerShell, Perl, Java, Bash, shell, or other languages.
  • Automate pipelines used for SRE functions in a continuous integration and continuous delivery/deployment (CI/CD) model.
  • Analyze all platform-level changes and monitor for resulting issues to effectively formulate technical solutions.
  • Work with cross-functional internal teams in North America and Europe.


Qualifications:

  • Bachelor's degree in computer science or a related field, or equivalent work experience.
  • Experience with Agile development methodologies, SRE, and/or DevOps principles.
  • Solid understanding of cloud computing design and security principles.
  • Practical knowledge of system architectures and networking fundamentals.
  • Experience managing large-scale environments.
  • Excellent problem-solving and critical-thinking skills.
  • Hands-on working knowledge of, or familiarity with, observability services such as ELK stack, CloudWatch, Jaeger, Kiali, Grafana, and Prometheus.
  • 2+ years of experience working with Cloud providers: AWS, Microsoft, Google.
  • 2+ years of experience working with Deployment Automation such as: Ansible, Helm, Chef, Puppet, Vagrant.
  • 2+ years of experience working with infrastructure as code (IaC) tools such as Terraform, CDK, CloudFormation.
  • 2+ years of experience working with source control tools such as BitBucket, Git, SVN.
  • 2+ years of experience working with CI/CD tools such as Jenkins, Bamboo, TeamCity, GitLab.
  • 2+ years of experience working as an SRE or DevOps Engineer.
  • Hands-on working knowledge of, or familiarity with, service mesh architectures (specifically Istio) is a plus.


If this position is of interest to you, please email me back at (with your most up-to-date resume in Word format) and advise the best time and number at which you can be reached.

