https://bayt.page.link/iK7KAKVswYnbLfvQ8

Senior Data Site Reliability Engineer II

- Careem
- Pakistan

14 days ago 2024/09/15

Work From Home

Job Description

Careem is building the Everything App for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million Captains, simplified the lives of over 50 million customers, and built a platform for the region’s best talent to thrive and for entrepreneurs to scale their businesses. Careem operates in over 70 cities across 10 countries, from Morocco to Pakistan.

About the team:

We are looking for engineers who will work within the Cloud Engineering team. The team develops and maintains cloud-native technology for the Careem Service teams:

- Highly scalable Kubernetes clusters
- Cloud access management automation and integration with Kubernetes

About the role:

As a Data Platform Site Reliability Engineer, you will manage infrastructure and applications on cloud computing platforms to deliver data processing, governance, and storage. Our platform teams work with exabytes of data, terabytes of memory, and hundreds of thousands of jobs to provide predictable, performant data analytics that power features across Careem verticals.

As an SRE, you’ll need to solve problems that arise using empirical data, teamwork, and your own unique expertise.

The Data Platform SRE will work directly with our data platform and engineering teams in an embedded SRE model, operating in unison with the developers to deliver seamless experiences for our customers. 

We run a mix of open-source, vendor-licensed, and internally developed tools, which you will use and have opportunities to improve. The cross-functional team collaborates to ensure we apply a consistent incident management process across all data platform services and provide user-journey-based SLOs derived from exhaustive observability metrics, high-availability architecture, and automation for deployments. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

Key responsibilities include:

- Make an impact from the design phase through the development and operation of the Data Platform on Kubernetes clusters and their ecosystem on AWS
- Build core services and tooling, and create technical processes that simplify and enable the work of engineers across multiple services
- Identify, automate, and scale system configurations without compromising security or reliability
- Participate in on-call rotations and help improve incident response

Education and Experience:

BS/MS in Computer Science or equivalent (7+ years of software development or production operations experience in a large-scale environment)

Qualifications:

- Strong sense of ownership and integrity demonstrated through clear communication and collaboration
- Experience in architecting, developing, operating, and troubleshooting Kubernetes clusters and/or other highly available systems at scale
- Proficiency with the architecture, deployment, performance tuning, and troubleshooting of open-source data analytics technologies, especially Apache Spark, Trino, and related software in a large-scale environment
- Ability to design, author, and release code in languages like Go, Python, or Java
- Acute drive to automate manual operations and to improve them through repeated iteration
- Understanding of the Linux operating system, standard networking protocols, and components
- Experience with cloud-native services on AWS/GCP
- Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Terraform, CloudFormation, ArgoCD, and Flux)
- Experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks
- Excellent troubleshooting and problem-solving skills
- Experience with scale testing, disaster recovery, and capacity planning
- Effective communication and collaboration skills, including the ability to drive and promote technical partnerships across teams
- Incident response and/or incident management experience

What we’ll provide you

We offer colleagues the opportunity to drive impact in the region while they learn and grow. As a full time Careem colleague, you will be able to:

- Work and learn from great minds by joining a community of inspiring colleagues.
- Put your passion to work in a purposeful organisation dedicated to creating impact in a region with a lot of untapped potential.
- Explore new opportunities to learn and grow every day.
- Work 4 days a week in the office and 1 day from home, work remotely from any country in the world for 30 days a year, and take unlimited vacation days. (If you are in an individual contributor role in tech, you will have 2 office days a week and 3 days working from home.)
- Access healthcare benefits and fitness reimbursements for health activities including gym, health club, and training classes.

Job Details

Job Location: Pakistan
Company Industry: Other Business Support Services
Company Type: Unspecified
Employment Type: Unspecified
Monthly Salary Range: Unspecified
Number of Vacancies: Unspecified
