Site Reliability Developer 5
Oracle | US-WA, Washington-Bellevue | Update time: March 6, 2020
Job Description

About Oracle SaaS Cloud SRE

Oracle SaaS Cloud SRE plays a critical role in delivering and supporting best-of-breed cloud solutions to Oracle customers.

Oracle Cloud is the industry's broadest and most integrated public cloud. It offers best-in-class services across software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS), and even lets you put Oracle Cloud in your own data center. Oracle Cloud helps organizations drive innovation and business transformation by increasing business agility, lowering costs, and reducing IT complexity.

 

The Oracle Cloud has shown strong adoption, supporting 70 million users and more than 30 billion transactions each day. It runs in 19 data centers around the world.

Our team delivers cross-team visibility and execution on the most challenging reliability issues impacting Oracle's SaaS customers. We engage closely with service owners and stakeholders to understand and resolve the critical issues that impair service experience.

 

About the Job

This is a unique opportunity to join a rapidly growing, world-class team and improve the cutting-edge technologies and infrastructure that make up the Oracle Cloud solutions. As part of the SRE team, you will be continually challenged and will have the opportunity to contribute to Oracle Cloud's success every day, working closely with our development partners.

As a Site Reliability Engineer, you will solve exciting technical challenges by analyzing, troubleshooting, and designing vital Oracle Cloud services, platforms, and infrastructure while always thinking about reliability, scalability, resilience, security, and performance.

 

What You'll Do

  • Service Accountability – You will be part of the SRE team, whose mission, shared with our Development partners, is the full-stack reliability of a collection of services and technology areas.
  • Ownership Scope – As an SRE, you will understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the production services you collaborate with. In partnership with your Development colleagues, you will be responsible for ensuring that services are designed and delivered to be mission-critical, with a focus on security, resiliency, scale, and performance.
  • Operations Engineering – You will understand and be able to communicate the scale, capacity, security, performance attributes, and requirements of the services you own. We are subject matter experts, able to understand and communicate every characteristic of our service stack, such as:
    • degradation and behavior under load of the services and their dependencies
    • end-to-end tuning needs, optimizing resource utilization, as load patterns fluctuate
    • instrumentation and metrics that clearly describe the service behaviors
    • scaling requirements and patterns
    • resiliency and recoverability, ensuring that backup/restore and disaster recovery capabilities are implemented, tested and maintained
  • Automation – You will have a clear understanding of automation and orchestration principles, and will be eager to help automate, wherever and whenever the possibility arises, while simultaneously eliminating technical debt. Automation must be part of your DNA.
  • Technical Experts – You will have the deep understanding of service topology and dependencies required to troubleshoot issues and define mitigations. You will bring this expertise to bear in driving reliability improvements in the services you engage with.
  • Database knowledge – Databases are foundational to the Oracle SaaS Cloud services, so you will bring a deep understanding of troubleshooting and tuning for Oracle RDBMS systems.
  • Broad Interests – SREs are a rare mix of sysadmins and Development Engineers, and as such can understand and explain the effect of product architecture decisions on the ability of services to run as distributed systems. They are driven by professional curiosity and a desire to develop a deep understanding of their services and their dependencies.
  • Cross-team collaboration – You will engage with and present to a wide variety of audiences, ranging from individual contributors and teams to executive leadership.

 

What You Need to Have

A BS or MS in Computer Science, or equivalent

 

Knowledge of:

  • Server hardware configuration
  • Linux internals
  • Networking and TCP/IP
  • Standard Internet services, such as DNS, HTTP, etc.
  • Database performance metrics and fluency to understand reports
  • Oracle FMW database administration
  • Exadata architecture, design, best practices
  • Scripting languages, such as Python, Ruby, Bash, etc.
  • Cloud computing patterns
  • IT Security and compliance
  • 5 years of experience running large-scale customer-facing web services
  • Most importantly, the aptitude to be a good team player and the willingness to learn and implement new Cloud technologies as needed
  • Methodical approach to troubleshooting complex problems

 

What the Perfect Candidate Will Have

Understanding of:

  • Oracle SOA and BPEL
  • Oracle Fusion Middleware
  • FMW Administration, to include WebLogic and SOA
  • Oracle Enterprise Manager
  • Defining and documenting technical architecture of complex and highly scalable products

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Utilize the deep understanding of service topology and dependencies required to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.

A BS or MS in Computer Science, or equivalent. Provides strategic and comprehensive complex business solutions. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 12 years of experience running large-scale customer-facing web services.


Oracle is an Affirmative Action-Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veterans status, age, or any other characteristic protected by law.
