Site Reliability Engineer
Oracle | North Ryde, Australia | Updated: April 1, 2020
Job Description

A unique opportunity to join a rapidly growing, world-class organization: the Site Reliability Engineering (SRE) team of Oracle's NetSuite Global Business Unit (NSGBU), which plays a key role in product availability, stability, performance and security. You will collaborate with many other engineering teams (system engineering, network, infrastructure engineering, security, maintenance) on support, design and implementation, as well as on tooling and automation platforms. The NSGBU SRE team also supports NSGBU application operations in Oracle Cloud Infrastructure, one of the fastest-growing cloud services.

 

What will you do:

            Be a member of a world-class Site Reliability team staffed with top-notch engineers with expertise in monitoring, backups, infrastructure and systems architecture, focused on reducing the mean time to resolution of any and all disruptions and on gaining next-level runtime insight into, and debuggability of, products and services.

            Resolve site incidents daily at various levels of the infrastructure, from hardware, network and OS to application issues

            Work in the Linux terminal

            Work with monitoring and analytics tools such as Kibana and Icinga to resolve incidents and identify problems

            Cooperate with multiple teams to build systems and services that improve operational efficiency, drive reliability, scalability, resilience, security, and performance across Oracle NSGBU products.

 

What should you know:

            Linux systems internals, monitoring, networking and core cloud concepts

            Standard internet services, such as DNS, TCP/IP, NFS and Global Load Balancing

            Performance troubleshooting/tuning experience

            Understanding of web technologies, Apache, HTTPS/SSL, Web sessions

            Understanding of the software development lifecycle

            Experience with database environments and the requirements of high-availability environments

            You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn

            Solid analytical skills for troubleshooting problems

            Excellent communication skills in English

            Architectural patterns for distributed systems, the composition of reliable services from unreliable components as well as cloud computing at scale.

 

 

Additional Qualifications:

 

            Network monitoring, networking protocols, SNMP, syslog, network telemetry, REST APIs

            Scripting languages such as Bash, Perl, Python

            Experience with orchestration and configuration management tools like SaltStack, Terraform, Kubernetes, Ansible, etc.

            Exposure to GlusterFS, ZooKeeper, Kafka, Elasticsearch or other distributed platforms

            Basic knowledge of ITIL concepts

Define, design, and implement network communications and solutions within a fast-paced, leading-edge database/applications company.

Perform performance trend analysis and manage server and network capacity. Propose client configurations and implement technical solutions to enhance and/or troubleshoot the system. Work with others to define and coordinate vendor purchase needs. Responsible for support documentation as well.

Job duties are varied and complex, requiring independent judgment. May have a project lead role. 5 years of related experience in a medium to large distributed network and computing environment. BS in Computer Science or related field.
