Site Reliability Engineering Manager

Daxko

Birmingham, AL
Full Time
Paid
  • Responsibilities

    Job Description

    As a Site Reliability Engineering Manager, you will manage all production assets for each product. Your responsibilities include patching, upgrading, and deploying new servers; organizing the team’s workload; supporting engineering efforts; and overseeing compliance, uptime, and performance monitoring. You’ll be responsible for prioritizing, organizing, and leading your team’s execution of all work. You’ll assess operational capabilities and performance to ensure the on-time delivery of quality products and services to all customers, both internal and external.

    As a leader, you will:

    • Set and help the team understand performance targets and goals
    • Evaluate and provide real-time feedback on performance
    • Train and/or ensure that the team is properly trained for their specific roles
    • Coordinate on-call rotation
    • Coordinate training for staff
    • Assist in resolving emergencies, such as infrastructure or software outages
    • Manage headcount and make staffing decisions related to new hires and terminations

    In your day-to-day, you will:

    • Oversee progress in achieving operational/production goals and objectives, especially with respect to quality, cost, and customer service.
    • Take responsibility for uptime, data accuracy, and integrity.
    • Interact with Engineering Leads to ensure alignment between teams.
    • Maintain business continuity for all production assets.
    • Ensure proper planning and prioritization using agile practices.
    • Ensure operations are in full compliance with all company and regulatory requirements.
    • Be a technical escalation point for your team.
    • Provide weekly reports on system availability, response, and capacity.
    • Manage on-call rotation among team members.
    • Have budget responsibilities, including ensuring fiscal responsibility for hosting and software licensing.
  • Qualifications

    • Bachelor’s degree (technical discipline preferred) or equivalent experience
    • Three (3) to five (5) years of experience managing globally distributed team members
    • Three (3) to five (5) years of experience in a site reliability engineering capacity
    • Solid foundation in the following technologies:
      • Linux
      • Web Servers (NGINX / PHP / Traefik / F5)
      • Virtualization Technologies (VMware)
      • Cloud Platforms (AWS, Azure)
      • Containerization Systems (Docker, Kubernetes, Dynos)
      • Caching and messaging technologies (Redis / RabbitMQ)
    • Strong security mindset and experience implementing security controls
    • Excellent organizational skills and attention to detail.
    • Excellent time management skills with a proven ability to meet deadlines.
    • Strong analytical and problem-solving skills.
    • Strong supervisory and leadership skills.
    • Ability to prioritize tasks and to delegate them when appropriate.

    Bonus points for:

    • Strong observability experience with monitoring technologies (OpenTelemetry, Instana, LogicMonitor, PagerDuty, OpsGenie), including creating custom checks and managing alert profiles and escalation policies
    • Experience with tooling (GitLab CI, Jenkins, Chef, Terraform, Elasticsearch, Kubernetes, Rancher)
    • Scripting experience with the following languages: Ruby, Python, Bash
    • Experience with SOC, PCI, and GDPR standards and regulations
    • Experience working tickets and managing priorities within issue tracking systems (Atlassian Suite, etc.)
    • Experience developing or supporting Java, PHP, or Node.js applications
    • Experience automating repetitive tasks

    Additional Information

    The salary range for this role is $163,000 - $211,000 per year. Where you fall within the compensation range is based on how you demonstrate the attributes and competencies required for the role. We mostly reserve the upper half of our compensation bands for internal growth. In addition to base salary, we offer a comprehensive benefits package, performance-based incentives, and opportunities for growth.

    Daxko is dedicated to pursuing and hiring a diverse workforce. We are committed to diversity in the broadest sense, including thought and perspective, age, ability, nationality, ethnicity, orientation, and gender. The skills, perspectives, ideas, and experiences of all of our team members contribute to the vitality and success of our purpose and values.

    We truly care for our team members, and this is reflected in our offices, benefits, and great perks. These perks are available only to our full-time team members. Some of our favorites include:

    • Flexible paid time off
    • Affordable health, dental, and vision insurance options
    • Monthly fitness reimbursement
    • 401(k) matching
    • New-parent paid leave
    • Casual work environments
    • Remote work

    All your information will be kept confidential according to EEO guidelines.