[Remote] Site Reliability Engineering Tech Lead

Remote-first Full-time Now hiring

Note: This is a remote role open to candidates in the USA. DataHub is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple and Netflix. The company is seeking an experienced Site Reliability Engineering (SRE) Tech Lead to drive the reliability, scalability, and operational excellence of its platform offerings, with a focus on technical leadership and architecture, enterprise platform development, and platform reliability operations.

Responsibilities

  • Design and implement robust, scalable infrastructure solutions for DataHub Cloud and enterprise deployments
  • Lead the technical vision for multi-cloud deployment strategies and distributed system integrations
  • Architect monitoring, observability, and alerting systems across diverse environments
  • Drive best practices for infrastructure as code, configuration management, and deployment automation
  • Partner with product and engineering teams to influence the development of advanced deployment capabilities
  • Collaborate with cross-functional teams to help build systems for seamless installation, upgrade, and rollback processes across various environments
  • Influence the design and help implement comprehensive monitoring and health check systems for distributed deployments
  • Partner with engineering teams to help develop self-healing and automated remediation capabilities
  • Establish and maintain SLAs/SLOs for both cloud and enterprise offerings
  • Lead incident response and post-mortem processes to drive continuous improvement
  • Implement chaos engineering practices to proactively identify system weaknesses
  • Optimize system performance, capacity planning, and cost efficiency
  • Mentor and guide a team of SREs and collaborate with platform engineering teams
  • Work closely with product, engineering, and customer success teams to ensure reliable product delivery
  • Improve on-call practices, runbooks, and knowledge sharing processes
  • Drive cross-functional initiatives to improve overall system reliability

Skills

  • 8+ years of experience in Site Reliability Engineering, Platform Engineering, or DevOps roles
  • 3+ years of technical leadership experience managing engineering teams
  • Strong expertise with cloud platforms (AWS, GCP, Azure) and infrastructure automation tools
  • Proficiency in containerization technologies (Docker, Kubernetes) and orchestration
  • Experience with infrastructure as code tools (Terraform, CloudFormation, Pulumi)
  • Strong programming skills in Python, Java, or similar languages
  • Deep understanding of monitoring and observability tools (Prometheus, Grafana, Datadog, etc.)
  • Experience with CI/CD pipelines and deployment automation
  • Strong knowledge of networking, security, and database operations in cloud environments
  • Experience building and operating multi-tenant SaaS platforms
  • Background in developing customer-facing deployment and management tools
  • Knowledge of data infrastructure and metadata management systems
  • Experience with service mesh technologies and microservices architectures
  • Previous experience in a customer-facing technical role or working with enterprise clients
  • Experience with data governance or data catalog platforms

Benefits

  • Competitive compensation
  • Equity for everyone
  • Remote Work
  • Location flexibility: You’ll receive a monthly coworking stipend to use whenever you need a change of pace or in-person collaboration time.
  • Comprehensive health coverage: We cover 99% of medical, dental, and vision premiums for employees and 65% for dependents.
  • Flexible savings accounts: We offer FSAs to help cover planned or unexpected healthcare costs. You can also opt into a Dependent Care FSA to support family needs.
  • Support for every path to parenthood: Through Carrot Fertility, we provide inclusive fertility benefits and family-forming support. All U.S. employees have access, regardless of age, gender identity, or family structure.
  • Time off that works for you: Our unlimited PTO and sick leave policy is designed for flexibility, rest, and real life.

Company Overview

  • DataHub is an open-source metadata platform that unifies data discovery, observability, and governance for AI and data ecosystems. It was founded in 2021 and is headquartered in Palo Alto, California, USA, with a workforce of 51-200 employees. Its website is https://datahub.com.

Company H1B Sponsorship

  • DataHub has a track record of offering H1B sponsorships, with 3 in 2025, 1 in 2024, and 2 in 2021. Please note that this does not guarantee sponsorship for this specific role.