Explore exciting Reliability Engineer job opportunities. These roles are crucial for organizations aiming to optimize their operational efficiency and minimize downtime. Reliability Engineers analyze data, identify potential failure points, and implement strategies to improve system reliability and maintainability. They work across various industries, including manufacturing, aerospace, and energy, applying their expertise to enhance product and process performance.

Key responsibilities often involve conducting risk assessments, developing maintenance programs, and collaborating with cross-functional teams to address reliability issues. Professionals in this field utilize tools like statistical analysis, root cause analysis, and predictive maintenance techniques to proactively prevent failures and improve overall system performance. The demand for skilled Reliability Engineers is growing as companies increasingly prioritize operational excellence and cost-effectiveness.

Job seekers can find Reliability Engineer positions with varying levels of experience, from entry-level to senior management roles. These positions offer competitive salaries and opportunities for career advancement. Search for Reliability Engineer jobs and take the next step in your engineering career.

What People Ask

What qualifications are needed to become a Reliability Engineer?

A bachelor's degree in engineering, such as mechanical, electrical, or industrial engineering, is typically required. Employers often seek candidates with experience in reliability engineering principles, statistical analysis, and maintenance planning. Certifications like Certified Reliability Engineer (CRE) can enhance job prospects.

What are the main responsibilities of a Reliability Engineer?

Reliability Engineers analyze equipment and process performance, identify failure modes, and implement strategies to improve reliability. They develop maintenance plans, conduct root cause analysis, and work with other engineers to improve designs. They also monitor and analyze data to predict potential failures and recommend preventative measures.

What is the expected salary range for Reliability Engineers in the UAE?

The salary range for Reliability Engineers in the UAE typically falls between AED 120,000 and AED 360,000 per year, depending on experience, qualifications, and the specific employer. Senior-level positions and those requiring specialized skills may command higher salaries. Benefits packages can significantly affect overall compensation.

What skills are important for a Reliability Engineer?

Important skills include a strong understanding of engineering principles, statistical analysis, and reliability methodologies. Problem-solving, communication, and teamwork skills also matter, as does the ability to analyze data and make informed decisions. Experience with software tools for reliability analysis and maintenance management is beneficial.

Who are the top employers for Reliability Engineers in the UAE?

Top employers include ADNOC, Emirates Global Aluminium (EGA), and DEWA (Dubai Electricity and Water Authority). These companies often have large-scale operations and require skilled reliability engineers to maintain their infrastructure. They offer opportunities for professional growth and development.

45 Reliability Engineer jobs in the United Arab Emirates

Site Reliability Engineer

Abu Dhabi, Abu Dhabi D4 Insight

Posted today

Job Description

Overview

We’re Hiring: Site Reliability Engineer

Join us as a Site Reliability Engineer and help build scalable, secure, and high-performance infrastructure for cutting-edge fintech platforms in wealth management, digital wallets, trading, and blockchain.

Responsibilities

Contribute to designing, deploying, and maintaining reliable cloud infrastructure (AWS/Azure).
Manage databases, integrations, and DevOps automation to streamline operations.
Support cybersecurity and compliance frameworks to ensure secure, compliant services.
Collaborate with cross-functional teams to deliver resilient services for fintech platforms.

Qualifications

Proven experience in cloud infrastructure (AWS/Azure).
Strong in DB management, integrations & DevOps automation.
Familiar with cybersecurity & compliance frameworks.
Bonus: Knowledge of fintech trends & emerging tech.

About the Team

We craft, deploy, and manage bespoke services in CRM, data and AI, cybersecurity and consulting.

Risk, Safety & Reliability Engineer

Abu Dhabi, Abu Dhabi RINA Piraeus Team

Posted today

Job Description

Location: Abu Dhabi, Abu Dhabi, AE

Contract Type: Fixed-Term

Division: Plant Engineering

Level of experience: Senior

RINA is currently recruiting for a Risk, Safety & Reliability Engineer to join its office in Abu Dhabi within the Plant Engineering Division.

Mission

This role of Risk, Safety & Reliability Engineer plays a critical part in ensuring the safety, reliability, and efficiency of chemical and petrochemical plants, energy production systems, and alternative fuel applications, ultimately contributing to the protection of personnel, the environment, and assets.

Key Accountabilities

Risk Analysis:
- Identify potential hazards.
- Assess the likelihood and consequences of these hazards.
- Develop risk matrices and prioritize risks based on severity.
Risk Evaluation:
- Evaluate risks associated with the presence of flammable and toxic materials.
- Conduct pros and cons evaluation to assess the overall risk profile of the plant.
- Recommend measures to mitigate identified risks, considering regulatory standards and industry best practices.
Safety Measures:
- Define and implement preventive or mitigative measures to limit the identified risks.
- Develop safety protocols, procedures, and guidelines for plant operation.
- Conduct training sessions to ensure staff are aware of and adhere to safety protocols.
Reliability Evaluation:
- Utilize RAM methodology to evaluate the reliability, availability, and maintainability of plant systems.
- Identify areas for improvement to enhance system reliability and reduce downtime.
- Analyze failure data and develop strategies for proactive maintenance and equipment replacement.
Design Optimization:
- Collaborate with design teams to optimize plant layout and infrastructure.
- Ensure that safety and reliability considerations are integrated into the design process.
- Recommend design modifications to improve safety, efficiency, and reliability.
Documentation and Reporting:
- Prepare comprehensive reports documenting risk assessments, reliability evaluations, and design recommendations.
- Communicate findings and recommendations to clients and stakeholders.
- Maintain accurate records of all analyses and evaluations conducted.

Education

Bachelor’s Degree in Chemical Engineering or Mechanical Engineering

Qualifications

- At least 3-5 years of experience in risk assessment and management within the chemical and petrochemical industry.
- Proficiency in using various risk analysis tools and methodologies.
- Strong understanding of safety regulations and industry standards.
- Knowledge of process safety management (PSM) principles.
- Familiarity with reliability engineering techniques, including RAM analysis.

Competencies

Address the way - See the big picture across different situations and reinterpret it in perspective
Build network - Forge trust relationships across departments and outside the organization
Client intimacy - Embrace internal and external client needs, expectations, and requirements to ensure maximum satisfaction
Earn trust - Take everyone's opinion into account and remain open to diversity
Make effective decisions - Structure activities according to priorities, actions, resources, and constraints
Manage emotions - Recognize one's own and others' emotions, and express and regulate one's reactions
Pioneer change - Actively embrace change and benefit from the new circumstances
Promote sustainable development - Promote commitment by keeping promises as a role model
Think forward - Capitalize on experiences and translate them into action plans for the future

RINA is a multinational company providing a wide range of services in the energy, marine, certification, infrastructure & mobility, industry, research & development sectors. Our business model covers the full process of project development, from concept to completion.

At RINA, we endeavor to create a work environment where every single person is valued and encouraged to develop new ideas. We provide equal employment opportunities and are committed to creating a workplace where everyone feels respected and safe from discrimination or harassment of any kind. We are also compliant with Italian Law n. 68/99.

Senior Site Reliability Engineer

Vng Solutions

Posted today

Job Description

VSOL is a digital enabler with a mission to help public and private organizations evolve their businesses through data and technology. We provide an end-to-end service from consulting to execution that drives the growth and innovation of our clients. As VSOL is in a phase of rapid expansion, we offer a dynamic, creative environment that accelerates your personal and professional development. We are looking for talented individuals eager to develop in international markets while contributing to the company’s future in a constructive and supportive manner.

Responsibilities:

Lead deployment and management of web applications, ensuring stability, scalability and reliability.
Design and manage hybrid environment reliability solutions (cloud and on-premises), optimizing for availability and performance.
Orchestrate and administer containerized applications using Kubernetes, focusing on efficient deployment and runtime management.
Administer systems, including Geographic Information System (GIS) platforms and databases (SQL Server), maintaining data integrity and high performance.
Analyze and mitigate service disruptions, developing strategic preventative measures to minimize downtime.
Apply an understanding of network engineering principles.
Participate in evaluation and integration of new technologies, enhancing service reliability and operational capabilities.
Develop and automate critical system health metrics, using tools like ELK stack.
Manage major incident response efforts, ensuring effective resolution to maintain system stability.
Coordinate with cross-functional teams to align SRE practices with business objectives and IT standards.
Create and review technical documentation for system architecture and operational procedures.
Assure regulatory compliance and security assessments, implementing best practices to protect system integrity.
Participate in pager-duty rotations, resolving critical incidents.

Note: The position may require international travel for continuous periods of 6 months. Candidates will be required to accept this requirement as part of the position.

Requirements

Over 4 years of experience with cloud environments and containerization technologies, including designing and implementing scalable, resilient infrastructure solutions using GCP (and other cloud platforms) and Kubernetes.
Experience with monitoring and logging tools such as ELK Stack.
Demonstrated excellence in network management, advanced troubleshooting, and system optimization, with a focus on enhancing efficiency and reducing downtime.
Experience in IT, with advanced expertise in network engineering and system administration.
Awareness of, or experience in, site reliability practices; any experience with GIS platforms is a plus.
Strong skills in scripting and automation; Python and Bash are a big plus.
Good knowledge of GitOps tools (e.g., Argo CD, FluxCD).
Knowledge of security frameworks and compliance standards.

Qualifications:

Bachelor’s degree in Computer Science, Information Technology, or a related field.
Cisco Certified Network Associate (CCNA) is a plus.
Certified Kubernetes Administrator (CKA) is a plus.
Written and spoken English communication skills at CEFR B1 level or above.

Why you’ll love working here:

Work in an English-speaking start-up environment, with the opportunity to be part of the innovation team and global projects
Onsite opportunities in UAE (United Arab Emirates) and KSA (Kingdom of Saudi Arabia)
13th-month salary bonus
Premium Health insurance for employees and family members (depending on level), Annual Health Check, Government Insurance in probation
14+ days of Annual leave and 5 days of Outing leave
Lunch allowance and free parking
Taxi & phone allowance (depending on level)

Senior Site Reliability Engineer

Abu Dhabi, Abu Dhabi Cerebras

Posted today

Job Description

About the Role:

Orbitworks is revolutionizing access to space by building reliable, shareable satellites that drastically reduce the time and complexity traditionally required to get to orbit. We operate satellites, fly customer payloads, and handle entire missions from end-to-end. Orbitworks is a joint venture between Marlan Space (UAE-based) and Loft Orbital.

As a Senior Site Reliability Engineer on our Infrastructure Team, you’ll play a pivotal role in maintaining and scaling our ground segment infrastructure. You’ll collaborate across development, operations, and IT to ensure the integration, delivery, and reliability of services that support our test infrastructure and our space operations on Earth and in orbit.

This is an exciting opportunity to work on cutting-edge technology and help build modern automated space infrastructure. This is not your typical SRE role: we apply DevOps principles even to spacecraft control.

Responsibilities

Collaborate with developers, test engineers and satellite operators to foster a strong SatDevOps culture.
Design and roll out cloud solutions for our testing and operations infrastructure. Find the best trade-offs between existing and additional cloud resources to scale and help Orbitworks achieve its mission.
Design, implement, and maintain scalable, reliable, and secure infrastructure in a hybrid cloud environment.
Improve our developer and test engineers experience by building better tools, workflows, and environment to streamline
Lead efforts to automate and optimize systems, including CI/CD pipelines, infrastructure provisioning (IaC), and deployment workflows for test on the ground and operations in space.
Own and evolve our observability stack (metrics, tracing, logs) to improve usability and performance. Grafana-centric ecosystems are a plus.
Implement and advocate for best practices in software reliability, fault tolerance, and performance tuning.
Proactively identify, investigate, and resolve system reliability issues, performing root cause analyses and implementing long-term fixes.
Partner with teams to design and operate Software Defined Network (SDN) solutions.
Contribute to a collaborative and inclusive team culture where respectful debate and continuous learning are celebrated.
Initially, handle and manage the link between cloud and network/software/hardware infrastructure. Assume I&T (Information technology) responsibilities as much as necessary to start with.

Must Haves:

Strong experience with public cloud infrastructure, ideally GCP.
Deep expertise in Kubernetes: architecture, deployment, ops, and resource optimization.
Demonstrated ability to design and build scalable, highly available systems.
Familiarity with Software Defined Networking (SDN) concepts and tools.
Experience implementing and maintaining observability stacks (Grafana, Prometheus, Loki, etc.).
Proficiency in at least one backend language: Go, Python, Rust, C/C++, or Java.
Deep understanding and hands-on experience with DevOps practices: CI/CD, infrastructure as code (IaC), and automation.
Proven track record of working in fast-paced, high-growth technical environments.
Strong networking knowledge (TCP/IP, DNS, routing, switching, firewalls, VPNs, secure networks).
Deep experience in Systems Administration.
Excellent problem-solving skills and ability to operate independently with a proactive, results-driven mindset.
Strong communication skills; thrives in a multicultural, cross-functional team.

Nice to Have:

Hands-on experience with GitOps frameworks (ArgoCD, FluxCD).
Interest or experience in FinOps and cost-optimized architectures.
Understanding of orchestration in resource-constrained environments, like space systems.
Knowledge of infrastructure as code frameworks (Terraform, Ansible or similar).
Knowledge of systems engineering tools and SDLC governance.
Cybersecurity awareness.
Familiarity with security practices: vulnerability scanning, threat detection, risk mitigation.

Orbitworks' mission is to make space simple for organizations that want to deploy physical and virtual missions to space. Building on Loft Orbital's heritage, Orbitworks will be the first commercial firm in the United Arab Emirates to mass-manufacture satellites. Orbitworks aims to manufacture tens of satellites annually and operates out of a 50,000-square-foot facility in Abu Dhabi.

Senior DevOps / Site Reliability Engineer (SRE)

Abu Dhabi, Abu Dhabi Stellar Technologies

Posted today

Job Description

We are looking for a Senior DevOps / Site Reliability Engineer (SRE) with strong expertise in infrastructure automation, CI/CD optimization, observability, and cloud reliability engineering. The role involves building scalable DevOps solutions on Azure, ensuring high availability, resilience, and security of mission-critical systems.

The ideal candidate will have hands-on experience in Infrastructure as Code (IaC), container orchestration, monitoring frameworks, and incident response, while driving DevSecOps alignment across development, QA, and architecture teams.

Key Responsibilities

CI/CD Pipeline Development – Build and manage scalable CI/CD pipelines for web and mobile applications, enabling automated build, test, and deployment workflows.
Pull Request Validation Workflows – Implement automated PR pipelines with linting, static analysis, unit testing, and integration checks to enforce code quality.
Security & Code Quality Automation – Integrate SonarQube, SCA (Software Composition Analysis), and vulnerability scanning tools to enforce compliance and security.
Environment-Specific Deployments – Configure deployment strategies with approval gates, rollback mechanisms, and environment-specific variables.
Infrastructure as Code (IaC) – Automate infrastructure provisioning using Terraform and Helm charts.
Azure Cloud Management – Ensure availability, scalability, and resilience of applications hosted on Azure (AKS, App Services, VMs, Functions, App Gateway, VNets, Key Vault).
Observability & Monitoring – Implement monitoring with Azure Monitor, Grafana, Prometheus, and Application Insights, and set up custom alerts and dashboards.
Secrets Management – Manage and secure secrets via Azure Key Vault and integrate them with CI/CD pipelines.
Incident Response & SRE Practices – Establish on-call rotations, conduct postmortems, and apply reliability engineering practices for system stability.
Collaboration – Work closely with development, QA, and architecture teams to align with DevSecOps best practices.
Capacity & Reliability Planning – Contribute to scalability, cost optimization, and long-term infrastructure planning.

Must-Have Skills

Strong expertise in Azure DevOps (Pipelines, Repos, Artifacts).
Deep knowledge of Terraform and Helm for IaC and Kubernetes management.
Hands-on experience with Azure Kubernetes Service (AKS) and related Azure services (Functions, App Gateway, VNets, Key Vault).
Proficiency in observability tools: Azure Monitor, Application Insights, Prometheus, Grafana.
Solid understanding of Linux, Docker, Kubernetes, and CI/CD workflows.

DevOps Tech Stack

Category: Tools / Technologies

CI/CD Pipelines: Azure DevOps, GitHub Actions, GitLab CI, Jenkins, Bitrise
Version Control: Git, GitHub, GitLab, Bitbucket
Infrastructure as Code: Terraform, Ansible, Helm, Bicep
Containerization & Orchestration: Docker, Kubernetes, AKS/EKS/GKE, Dapr
Code Quality & Security: SonarQube, Snyk, Trivy, Checkmarx, ESLint, Prettier
Monitoring & Logging: Prometheus, Grafana, ELK Stack, Azure Monitor, App Insights
Artifact Management: JFrog Artifactory, Nexus, GitHub Packages
Mobile Build Automation: Fastlane, Bitrise, App Center, Firebase App Distribution
Release Management: Azure DevOps Releases, GitHub Environments, Argo CD
Secrets Management: Azure Key Vault, HashiCorp Vault, AWS Secrets Manager

Good to Have

Experience with GitOps, ArgoCD, and Service Mesh (Istio/Linkerd).
Knowledge of security tools: Snyk, AquaSec, Trivy.
Familiarity with FinOps practices for cloud cost monitoring and optimization.

Soft Skills & Competencies

Strong problem-solving and analytical abilities.
Ability to manage complex projects and multiple environments.
Excellent communication and collaboration skills.
Passion for automation, reliability, and continuous improvement.

Work Environment

This is an on-site role in Abu Dhabi, UAE, within a fast-paced enterprise digital transformation environment. The candidate will be at the center of mission-critical projects, collaborating with cross-functional teams to deliver secure, resilient, and scalable DevOps and SRE solutions.

Site Reliability Engineer (SRE) 5-10 Yrs

Dubai, Dubai GSSTech Group

Posted today

Job Description

We are hiring a Site Reliability Engineer (SRE) with 5–10 years of experience and a strong background in the banking domain. The role involves ensuring high availability, scalability, and performance of critical banking applications and infrastructure. You will work closely with development and operations teams to automate processes, monitor systems, and resolve incidents proactively. This position is based onsite in Dubai.

5–10 years of experience as a Site Reliability Engineer or similar role.
Proven banking domain experience is mandatory.
Strong skills in Linux/Unix administration and shell scripting.
Experience with cloud platforms (AWS, Azure, or GCP).
Proficiency in automation tools (Ansible, Terraform, Puppet, or Chef).
Hands-on experience with CI/CD pipelines and DevOps practices.
Knowledge of containerization and orchestration (Docker, Kubernetes).
Strong expertise in monitoring tools (Prometheus, Grafana, ELK Stack, etc.).
Experience in incident management, root cause analysis, and disaster recovery planning.
Excellent problem-solving, communication, and collaboration skills.

Senior Site Reliability Engineer (SRE) - 10+ Yrs

Dubai, Dubai GSSTech Group

Posted today

Job Description

We are seeking a highly experienced Senior Site Reliability Engineer (SRE) with 10–15 years of experience and a proven background in the banking domain. The role involves ensuring system reliability, scalability, and performance across critical banking applications. You will be responsible for automation, incident management, performance optimization, and cloud infrastructure management. Strong problem-solving skills and the ability to work in high-pressure environments are essential.

10–15 years of experience in Site Reliability Engineering or related fields.
Mandatory banking domain experience with deep understanding of financial systems.
Expertise in monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Splunk).
Strong knowledge of cloud platforms (AWS, Azure, or GCP) and container orchestration (Kubernetes, Docker).
Proficiency in scripting/programming languages (Python, Shell, Go, or similar).
Experience in CI/CD pipelines, Infrastructure as Code (Terraform, Ansible).
Strong background in incident management, root cause analysis, and problem resolution.
Understanding of security best practices and compliance in banking environments.
Excellent troubleshooting skills for large-scale distributed systems.
Strong communication, collaboration, and stakeholder management skills.

Senior Site Reliability Engineer (SRE) - 10+ Yrs

Dubai, Dubai GSS Group

Posted today

Job Description

10–15 years of experience in Site Reliability Engineering or related fields.
Mandatory banking domain experience with deep understanding of financial systems.
Expertise in monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Splunk).
Strong knowledge of cloud platforms (AWS, Azure, or GCP) and container orchestration (Kubernetes, Docker).
Proficiency in scripting/programming languages (Python, Shell, Go, or similar).
Experience in CI/CD pipelines, Infrastructure as Code (Terraform, Ansible).
Strong background in incident management, root cause analysis, and problem resolution.
Understanding of security best practices and compliance in banking environments.
Excellent troubleshooting skills for large-scale distributed systems.
Strong communication, collaboration, and stakeholder management skills.

Engineer (Reliability)

KBR

Posted 8 days ago

Job Description

Title:
Engineer (Reliability)
- Mechanical (Rotating & Static) / Instrumentation (I&C, F&G, Telecom) / Electrical / CMMS (SAP)

Division:
KBR is looking for Engineers to support the maintenance of critical equipment for the crude flexibility project at ADNOC Refining, Ruwais, Abu Dhabi.

Summary:
A professional who applies scientific and mathematical principles to develop solutions to technical problems. The role involves designing, developing, testing, and maintaining systems, structures, or products, ensuring functionality, safety, and efficiency. This category may include engineers in Mechanical, Electrical, Control, Corrosion, and Metallurgy engineering.

Experience:
Minimum 5 years of relevant experience in engineering tasks related to their field.

Qualifications:
- Bachelor's or master's degree in Mechanical, Electrical, Control, Corrosion, or Metallurgy Engineering. Other disciplines are subject to the specific call-off requirement, as applicable, and to Company approval.
- CRE - Certified Reliability Engineer
- CMRP - Certified Maintenance and Reliability Professional

Decarbonization - Energy Transition - Sustainability
Belong. Connect. Grow. with KBR!

Site Reliability Engineer II - Real-Time and Big Data

Dubai, Dubai Esri

Posted today

Job Description

Join us to work collaboratively with our talented team of dynamic and passionate engineers to deliver capabilities that enable our customers to make a difference. You'll deploy and operate ArcGIS Velocity and ArcGIS Workflow Manager SaaS solutions. You will also have the opportunity to design, deploy, and operate next-generation real-time and big data GIS software-as-a-service (SaaS) capabilities for thousands of cloud users worldwide.

Our teams have a broad mix of experience levels and tenures that support an environment that promotes professional development. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.

Our team also puts a high value on work-life balance, and we understand that striking a healthy balance between your personal and professional life is crucial to your happiness and success here. We offer a flexible hybrid schedule so you can have a more productive and well-balanced life both in and outside of work.

Responsibilities

Collaborate with a team of SRE engineers to operate SaaS capabilities across multiple regions on the cloud platform
Design, implement, configure, and utilize monitoring systems to monitor the health of SaaS products
Manage infrastructure used for ArcGIS Velocity and ArcGIS Workflow Manager, respond to alerts, and troubleshoot problems to resolution
Develop, implement, and maintain automation solutions for repetitive operational tasks, such as deployment pipelines, incident resolution, and scaling processes
Design and implement the deployment and upgrade of containerized micro-service components that, when combined, power Esri’s SaaS offerings
Create and automate Git workflows to simplify code integration, testing, and infrastructure deployments
Participate in technical spike efforts, bringing new innovative ideas to future versions of our software
Troubleshoot system incidents and provide root cause analysis reports
Provide rotational on-call technical support

Requirements

5+ years of experience managing Kubernetes (EKS), logging and monitoring (ELK, Prometheus), and container technologies (Docker)
Proficient in using Terraform for automating infrastructure provisioning and management
Ability to design and automate Git workflows for streamlined code integration, testing, and infrastructure deployment
Ability to write scripts to deploy infrastructure and/or applications (Bash, Python, Terraform)
Expert level understanding and experience with cloud computing platforms (AWS or Microsoft Azure)
Strong knowledge of Linux Operating system administration, including troubleshooting, performance tuning, and shell scripting
Proficient in cloud networking, including VPCs, subnets, security groups, and VPNs in platforms like AWS or Azure
Skilled in identifying and resolving system and application issues through effective troubleshooting and root cause analysis
Working knowledge of a source control and issue management system
Bachelor’s in computer science, computer engineering, GIS, or information systems

Recommended Qualifications

Experience designing, administering, and/or maintaining cloud environments, such as AWS or Azure, supporting 24×7 high-availability production environments
Interest in working with GitOps principles to automate the deployment of applications on Kubernetes clusters
Certifications: AWS Certified Solution Architect Associate, CKA/CKAD or similar
Experience managing OpenSearch (datastore or logstore), and Kafka for managing distributed data streams and ensuring high availability in large-scale systems
Ability to work with continuous integration and delivery best practices
Knowledge of operating resilient, highly available, scalable, and performant SaaS capabilities
Knowledge of Esri ArcGIS or other web mapping technologies
Working knowledge of GitHub

About Esri

At Esri, diversity is more than just a word on a map. When employees of different experiences, perspectives, backgrounds, and cultures come together, we are more innovative and ultimately a better place to work. We believe in having a diverse workforce that is unified under our mission of creating positive global change. We understand that diversity, equity, and inclusion is not a destination but an ongoing process. We are committed to the continuation of learning, growing, and changing our workplace so every employee can contribute to their life’s best work. Our commitment to these principles extends to the global communities we serve by creating positive change with GIS technology. For more information on Esri’s Racial Equity and Social Justice initiatives, please visit our website.

If you don’t meet all of the preferred qualifications for this position, we encourage you to still apply!

Esri is an equal opportunity employer (EOE) and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law. If you need reasonable accommodation for any part of the employment process, please email and let us know the nature of your request and your contact information. Please note that only those inquiries concerning a request for reasonable accommodation will be responded to from this e-mail address.

Esri Privacy

Esri takes our responsibility to protect your privacy seriously. We are committed to respecting your privacy by providing transparency in how we acquire and use your information, giving you control of your information and preferences, and holding ourselves to the highest national and international standards, including CCPA and GDPR compliance.
