Manifest Solutions is currently seeking a Data Engineer - Cloud/Gen AI for an onsite position in Westerville, OH.
Responsibilities:
- Design, build, and maintain scalable and robust data pipelines and architectures.
- Perform data extraction, cleansing, standardization, and transformation.
- Ensure data is accurate, consistent, and accessible.
- Collaborate with data scientists, analysts, and product teams to deliver data solutions that meet user needs and expectations.
- Troubleshoot and resolve data issues and defects.
- Research new data technologies and tools to enhance skills and improve data quality.
Requirements:
- Bachelor’s Degree in Computer Science, Engineering, Mathematics, or a related field.
- At least 5 years of experience in data engineering, data pipeline development, and ETL processes.
- Proficient in one or more data processing and storage technologies, such as Apache Spark/PySpark, Hadoop, or Amazon Redshift.
- Experience with Python for data core modernization and data ingestion.
- Experience with data pipeline frameworks and tools, such as Apache Airflow, Luigi, or AWS Glue.
- Experience with data engineering, data warehousing, and ETL processes.
- Strong proficiency in SQL and database design principles.
- Excellent communication, creativity, and problem-solving skills.
Preferred Requirements:
- Experience with cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and/or Google Cloud Platform (GCP).
- Understanding of emerging technologies (such as Generative AI) and advanced data architectures (e.g., Multimodal Data Management & Model Design, Data Mesh, Data Fabric, and Data Products).
- Understanding of the benefits of data warehousing, data architecture, and data quality processes, including data warehouse design and implementation, table structure, fact and dimension tables, logical and physical database design, data modeling, reporting process metadata, and ETL processes.
- Experience with testing frameworks and tools, such as pytest, unittest, or Selenium.
- Experience designing and implementing reporting and visualization for unstructured and structured data sets.
- Experience with batch processing and real-time streaming ingestion.
- Certifications or degrees in data engineering, computer science, or related fields.