HPC SRE Engineer IV

Omega Enterprise Solutions, LLC

Annapolis Junction, MD
Full Time
Paid
  • Responsibilities

    Omega Enterprise Solutions is a Maryland-based, Service-Disabled Veteran-Owned Small Business (SDVOSB) with a special focus on U.S. Department of Defense (DoD) and Intelligence Community (IC) mission and enabling technologies. We are building a team with shared values and a passion for our Vision and Mission. Along with providing professional growth opportunities that are innovatively challenging and mission critical to our clients, we emphasize a life-focused everyday approach. We are committed to life-long learning and provide a life-coaching method of management to empower and inspire you to become the best version of yourself in pursuit of a happy, healthy, and fulfilling life.

    Description

    The HPC SRE Engineer IV designs, develops, maintains, enhances, and documents software systems. They work with the operations team to identify pain points and develop software tools to resolve them.

    In this role, candidates will create and maintain site reliability engineering (SRE) operations on multi-user High Performance Computing (HPC) systems using a variety of configuration management, IT monitoring, and automation tools within a Linux environment (RedHat, CentOS). Candidates will create a new Nagios Alerting Database and a new SRE Database, and will develop effective, consistent SRE automation protocols.

    Responsibilities

    Candidates will have experience with and/or exposure to automation tools including Puppet, Salt, Ansible, and Chef.

    Candidates shall also have experience with scripting in Bash, Python and/or Perl.

    Additionally, candidates will have experience with or exposure to:

    XFS/ZFS file systems and NFS/block storage file system sharing;

    SSH, TMUX, PDSH, CLUSH system access;

    VI, EMACS, AWK/SED, CRON system editing;

    Nagios and SNMP information technology monitoring systems.

    Qualifications

    This position requires an appropriate active security clearance and polygraph.

    The HPC SRE Engineer IV shall have a Bachelor’s degree in Computer Science or a related field and ten years of demonstrable experience in system administration and support of a large client-server based IT enterprise. Alternatively, five years of full-time computer science work may be substituted for the Bachelor’s degree; the ten years of demonstrable experience in system administration and support of a large client-server based IT enterprise are still required. An industry-recognized professional certification may substitute for one year of experience.

    Omega Enterprise Solutions is an equal opportunity employer. All qualified applicants for employment are considered for all positions without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, age, disability, veteran status, or any other protected class.