Big Data Architect / Cloudera and Databricks

AHU Technologies Inc

Washington, DC
Full Time
Paid
  • Responsibilities

    Job Description:

    Short Description:

    The Client seeks an experienced IT Consultant to support the design, development, implementation and maintenance of an enterprise Big Data solution as part of the Client Data Modernization Effort.

    Complete Description:

    This role will provide expertise to support the development of a Big Data / Data Lake system architecture that supports enterprise data operations for the District of Columbia government, including the Internet of Things (IoT) / Smart City projects, enterprise data warehouse, the open data portal, and data science applications. This is an exciting opportunity to work as part of a collaborative senior data team supporting DC's Chief Data Officer. This architecture includes Databricks, Microsoft Azure platform tools (including Data Lake and Synapse), and data pipeline/ETL development tools (including StreamSets and Azure Data Factory). The platform will be designed for District-wide use and integration with other Client enterprise data tools such as Esri, Tableau, MicroStrategy, API gateways, and Oracle databases and integration tools.

    CONTRACT JOB DESCRIPTION

    Responsibilities:

    Coordinates IT project management, engineering, maintenance, QA, and risk management.

    Plans, coordinates, and monitors project activities.

    Develops technical applications to support users.

    Develops, implements, maintains and enforces documented standards and procedures for the design, development, installation, modification, and documentation of assigned systems.

    Provides training for system products and procedures.

    Performs application upgrades.

    Performs monitoring, maintenance, or reporting on real-time databases, real-time network and serial data communications, and real-time graphics and logic applications.

    Troubleshoots problems.

    Ensures the project life cycle complies with District standards and procedures.

    Skills:

    Experience implementing modern Big Data storage and analytics platforms such as Databricks and Data Lakes. Required 5 Years

    Knowledge of modern Big Data and Data Architecture and Implementation best practices. Required 5 Years

    Knowledge of architecture and implementation of networking, security and storage on cloud platforms such as Microsoft Azure. Required 5 Years

    Experience with deployment of data tools and storage on cloud platforms such as Microsoft Azure. Required 5 Years

    Knowledge of Data-centric systems for the analysis and visualization of data, such as Tableau, MicroStrategy, ArcGIS, Kibana, Oracle. Required 5 Years

    Experience querying structured and unstructured data sources including SQL and NoSQL databases. Required 5 Years

    Experience modeling and ingesting data into and between various data systems through the use of Data Pipelines. Required 5 Years

    Experience with API / Web Services (REST/SOAP). Required 3 Years

    Experience with complex event processing and real-time streaming data. Required 3 Years

    Experience with deployment and management of data science tools and modules such as JupyterHub. Required 3 Years

    Experience with ETL, data processing, analytics using languages such as Python, Java or R. Required 3 Years

    Databricks Certified Data Engineer Professional. Highly desired

    Experience with Cloudera Data Platform. Highly desired 3 Years

    16+ years planning, coordinating, and monitoring project activities. Required 16 Years

    16+ years leading projects, ensuring they comply with established standards/procedures. Required 16 Years

    Bachelor’s degree in IT or related field or equivalent experience. Required