Site Reliability Developer
Oracle
Mexico - Guadalajara/Zapopan
Update time: January 23, 2023
Job Description
We're looking for a Site Reliability Engineer (SRE) to join our team and develop automated software solutions for the operational aspects of an organization.
- Incorporate SRE and DevOps practices to develop and implement services that improve the work of the IT support team.
- Monitor computer systems and build alerts for various operational issues that systems can experience.
- Automate manual tasks by writing Bash scripts
- Build tools and system capabilities for effective application support delivery
- Build an APEX portal with self-service capabilities.
- Champion best practices around system reliability, operability, and support
- Own the execution of incident and problem management processes, further improving reliability and ensuring a positive customer experience (CX).
Preferred qualifications:
- An understanding of development and coding is critical, since you will be automating processes and dealing with production issues.
- Good working knowledge of server operating systems
- Experience working in a support/operations role
- Experience working with monitoring tools such as Grafana or Prometheus
- Strong analytical and investigative mindset, with drive and persistence in working on complex, challenging technical problems
- Proven track record of automating processes, plus scripting experience
- Effective, well-developed communication skills
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Utilize a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Experience running large-scale, customer-facing web services. Understanding of load balancing technologies, and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.
Oracle is an Affirmative Action-Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veteran status, age, or any other characteristic protected by law.
