Senior Site Reliability Engineer
Oracle, Singapore. Updated: May 9, 2023
Job Description

Overview 
Oracle is leading the digital revolution. We are empowering nearly half a million businesses to thrive in the age of skyrocketing connections. Join us and play an instrumental role in masterminding the software that will have a truly global impact.

Description

What You’ll Do

  • Engage in and improve the whole lifecycle of Java Management Service (JMS) application deployment and operation
  • Improve the existing continuous deployment pipeline for a wide range of functionalities across geographically separated zones
  • Improve the JMS observability platform, security, and incident management to meet the SLAs and SLOs defined for all Oracle cloud services
  • Architect highly available and scalable services
  • Troubleshoot issues and trace symptoms back to the root cause
  • Document and present methodologies to operations, engineering, and executive teams 
  • Educate the wider engineering organization on design and operational best practices for distributed computing 
  • Help meet the SLAs/SLOs for internal and external services and continually improve operational processes (weekly ops meetings, metrics, etc.)
  • Build tools and automation to improve system observability, availability, reliability, performance/latency, monitoring, and emergency response
  • Perform on-call duties

Required Skills/Experience

What You’ll Bring

  • Strong track record of implementing services on OCI/AWS/GCP/Azure in a variety of distributed computing environments, with a good understanding of Docker and Kubernetes
  • Understanding of the CNI/CNCF landscape is a plus
  • Strong knowledge of storage, RDBMS, and NoSQL database runtimes
  • Experience implementing multi-cloud networking and deployment architectures
  • Good understanding of the L3/4/7 network layers (including SDN) 
  • Hands-on design and coding experience in at least one of Python, Shell, Go, or Java
  • Strong debugging/troubleshooting skills
  • Experience implementing observability platforms using product suites such as Datadog, New Relic, ELK, or Prometheus, preferably with Grafana
  • Strong experience with infrastructure automation and monitoring tools (Terraform, Helm, Ansible, Puppet, Chef, etc.)
  • Experience with modern cloud development practices (microservices architectures, REST interfaces, etc.) 
  • Deep working knowledge of Linux servers and networking, preferably Oracle Linux

Create the future with us. Apply now.

Design, develop, troubleshoot and debug software programs for databases, applications, tools, networks etc.

As a member of the software engineering division, you will assist in defining and developing software for tasks associated with the developing, debugging or designing of software applications or operating systems. Provide technical leadership to other software developers. Specify, design and implement modest changes to existing software architecture to meet changing needs.

Duties and tasks are varied and complex, requiring independent judgment. Fully competent in own area of expertise. May have a project lead role and/or supervise lower-level personnel. BS or MS degree or equivalent experience relevant to functional area. 4 years of software engineering or related experience.