Senior DevOps Engineer
In the time it takes you to read this job description, RapidSOS will have handled ~1,380 emergencies.
At RapidSOS, we are committed to using technology to build a safer, stronger future and working together to save lives. We’re in an exciting phase of growth, welcoming new members from across the globe to our mission-driven, ambitious, and inclusive team. Our work is founded on our values of trust and safety, pioneering, urgency, and purpose over pride, all of which support a company culture where people can innovate, collaborate, grow, and, above all, make an impact. If that sounds like an exciting opportunity, we want to hear from you!
At RapidSOS, we are empowering safer, stronger communities with faster, data-driven emergency response. In partnership with public safety, RapidSOS created the world’s first intelligent safety platform that securely links life-saving data from 500+ million connected devices, apps, and sensors and 90+ technology partners directly to RapidSOS Safety Agents, 911, and first responders globally. The platform is used by more than 15,000 first responder agencies and supports 165+ million emergencies each year. When people need help during an unsafe moment or an emergency, their RapidSOS Ready connected device, home, or building delivers essential data to the right place, when it matters most.
What this role is about:
As a Senior DevOps Engineer, your mission is to help drive our team of innovators and technologists towards creating next-level solutions that improve the way our mission-critical business is run. Your hands-on knowledge of system design, application development, testing, and operational stability will help our team deliver high-quality products. Your drive to embrace leading-edge technologies and methodologies will inspire your team to follow suit. You will excel in this role if you are innately curious about hard problems, have a bias for strong execution, and are a mission-driven self-starter.
What you’ll do:
- Own critical parts of our Software Development Lifecycle (SDLC), such as building and improving workflows that automate releasing, testing, and deploying RapidSOS's AWS-based Software-as-a-Service (SaaS) products
- Create and maintain cloud infrastructure and resources using Terraform
- Manage shared services running on Kubernetes, such as the ELK stack, RabbitMQ, and other core platform resources
- Research and build monitoring and analysis tools to optimize how we build and deploy our code base and to manage distributed systems and application resources
- Build workflows and tools for delivering our software to a variety of platforms including AWS and Azure using technologies like Kubernetes, Helm, and ArgoCD to automate deployments and scale our products
- Improve the observability of our applications and infrastructure by implementing and promoting best practices around metrics, tracing, and logging with Datadog and ELK
- Contribute to important architectural and operational decisions like microservices vs. monoliths, deployment techniques, technologies, policies, etc.
- Improve infrastructure capabilities, optimizing for cost, simplicity, maintainability, and scalability
- Assist the engineering team with deployments while ensuring reliability and availability
What we’re looking for in our ideal candidate:
- 5+ years of professional DevOps/Site Reliability Engineering experience
- 3+ years of hands-on experience with AWS
- At least one year of development experience with Kubernetes in a large-scale production environment
- Experience with Infrastructure as Code (IaC), preferably using Terraform
- Strong experience with Python and building/deploying Python-based applications
- Strong skills in network services such as DNS, TLS/SSL, and HTTP
- Experience implementing secure and highly available distributed systems/microservices
- Strong knowledge of Linux/Unix systems
- Able to lead and fully own medium to large projects with a focus on execution and quality
Nice-to-have experience (but not required!):
- A software development background using Python
- Hands-on experience with Azure
- Experience with Git and with developing continuous/rapid release engineering workflows (CI/CD)
- Experience with data pipelines (BigQuery, Kinesis, Airflow, Spark)
- Experience creating geo-redundant/HA systems
- Experience with building custom application monitoring (Datadog, OpenTelemetry)
- Strong knowledge of PostgreSQL
- Experience with DevOps/Automation platforms like Jenkins, ArgoCD, etc.
What we offer:
- The chance to work with a passionate team on solving one of the largest challenges globally
- Competitive salary, benefits, and equity participation
- A dynamic, flexible and fun start-up work environment with a highly talented team
If you're curious to learn more about RapidSOS, you can check out https://rapidsos.com/blog/
Starting pay for a successful applicant will depend on a variety of job-related factors, which may include experience, relevant skills, training, education, location, business needs, or market demands. The salary range for this role is $139,000 - $165,000. This role will also be eligible to receive equity options.
If you are based in California, we encourage you to read this important information for California residents linked here: https://rapidsos.com/privacy/california/ #LI-Remote
RapidSOS is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, or Veteran status.
Interested in the role but you don’t meet 100% of the requirements? We’d love to hear from you! We encourage you to apply; we’d be excited to see if your unique skill set and experience could be a match.