Position Summary: As a Data Engineer for our client, you'll be an integral part of an innovative team dedicated to leveraging data to drive cutting-edge solutions. Your role will involve designing, building, and maintaining the data infrastructure and pipelines that power our data-centric products and insights. Using tools such as Snowflake, Spark, Kafka, Airflow, and cloud data platforms, you'll design robust data architectures for scalable and efficient data collection, storage, processing, and analysis.
Your responsibilities will also include ensuring data quality, security, and governance through the implementation of validation processes, access controls, and monitoring systems. Collaborating closely with analysts, data scientists, and engineers, you'll translate their requirements into trusted data products.
We're seeking individuals with substantial experience in distributed systems, data modeling, and pipeline orchestration, and proficiency in programming languages such as Python and Scala. Strong problem-solving abilities and excellent communication skills are essential. If you're passionate about building scalable data architectures and transforming raw data into actionable insights, we invite you to join our team and contribute to our data-driven products and strategy.
Responsibilities:
- Design, implement, and support cloud data platforms like Snowflake and Databricks, optimizing data performance and scalability.
- Architect and manage data lakes and cloud data warehouses to ensure secure, reliable, and flexible data storage and access.
- Develop and maintain scalable data pipelines using cloud services and tools such as AWS, Azure, Apache Airflow, and Prefect.
- Use Python, Scala, and Spark to optimize data processing workflows across data warehouse and data lake solutions.
- Apply version control and collaboration tools like Git, GitHub, and Azure DevOps, and automate infrastructure using Terraform and Infrastructure as Code (IaC) principles.
- Advocate for CI/CD pipelines to streamline development and deployment processes.
- Ensure data integrity and adherence to best practices in SQL and NoSQL database systems, and troubleshoot data quality, security, and privacy issues.
- Explore new technologies to enhance data reliability, efficiency, and quality.
- Collaborate with stakeholders to understand requirements and deliver trusted data products.
- Maintain data documentation, metadata, and data dictionaries for accessibility and usability.
- Perform data testing and validation to ensure accuracy and consistency.
- Provide support and guidance to junior data team members.
- Stay updated with trends in data engineering and related fields.
- Apply best practices for data governance, security, and quality, ensuring compliance with data policies and regulations.
- Design and implement data APIs and services to enable data consumption and integration.
- Monitor and improve data pipeline performance, efficiency, and reliability, troubleshooting issues as they arise.
- Conduct data analysis and provide insights to support decision-making.
- Integrate machine learning models into production systems using Amazon SageMaker, MLflow, and Jupyter Notebooks.
- Mentor other team members on data engineering best practices.
Qualifications:
Education & Certificates:
- Bachelor's degree in Computer Science, Engineering, or related field.
- Relevant certifications in AWS, Azure, and modern data technologies are highly desirable.
Professional Experience:
- 5+ years of experience in data engineering, analysis, and pipeline development.
- Proficiency in AWS data services such as S3, Glue, Redshift, EMR, Athena, and Kinesis.
Competencies & Attributes:
- Proficiency in Snowflake, Databricks, and Azure data services.
- Expertise in SQL, Python, Scala, and Spark.
- Experience with Git, GitHub, Azure DevOps, Terraform, and CI/CD practices.
- Knowledge of data warehouse, data lake, and data mart concepts.
- Familiarity with pipeline orchestration tools like Apache Airflow and Prefect.
- Strong communication and collaboration skills.
Note: Only candidates selected for an interview will be contacted. Our client is committed to fostering diversity and inclusion within their teams.