Data Modeling and Transformation: Design, build, and maintain robust and scalable data models using dbt and SQL. Create reusable transformations and ensure data is structured for efficient analysis.
Pipeline Development: Develop and optimize ELT pipelines, leveraging dbt for transformations and Databricks for compute and orchestration when needed.
Quality Assurance: Implement and manage data quality tests within dbt to ensure data integrity, accuracy, and consistency.
Documentation and Governance: Maintain comprehensive documentation for dbt models, data lineage, and dependencies.
CI/CD and Version Control: Apply software engineering best practices using Git and set up CI/CD pipelines for automated dbt project deployments.
Collaboration: Partner with data analysts, data scientists, and business stakeholders to deliver clean, reliable datasets for BI and ML use cases.
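As a sketch of the quality-assurance responsibility above, a dbt `schema.yml` can declare column-level tests alongside documentation (the model and column names here are hypothetical):

```yaml
version: 2

models:
  - name: dim_customers        # hypothetical model name
    description: "One row per customer, built from staging."
    columns:
      - name: customer_id
        description: "Surrogate key for the customer."
        tests:
          - not_null
          - unique
      - name: email
        tests:
          - not_null
```

Running `dbt test` executes each declared test as a query and fails the run if any test returns offending rows.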
Essential Skills
Technical Skills:
dbt (Primary): Hands-on expertise with dbt Core and dbt Cloud, including writing models, tests, macros, and using the CLI.
SQL: Expert-level proficiency in writing complex, efficient SQL queries for data transformation.
Databricks (Secondary): Experience with Azure Databricks for data processing and orchestration. Familiarity with Spark is a plus.
Azure Data Platform: Knowledge of Azure Synapse Analytics, Azure Data Lake Storage, and Azure Data Factory.
Programming: Proficiency in Python for scripting and pipeline automation. Familiarity with Jinja for dbt templating.
Data Warehousing: Strong understanding of dimensional modeling, star/snowflake schemas, and data warehousing concepts.
Version Control & DevOps: Experience with Git and CI/CD tools (Azure DevOps or similar).
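To ground the dbt, SQL, and Jinja skills listed above, a minimal dbt model might look like the following (`ref` and `config` are dbt built-ins; the file path, table, and column names are hypothetical):

```sql
-- models/marts/dim_customers.sql (hypothetical path)
{{ config(materialized='table') }}

with customers as (

    select * from {{ ref('stg_customers') }}

),

orders as (

    select
        customer_id,
        count(*) as order_count
    from {{ ref('stg_orders') }}
    group by customer_id

)

-- Dimension table: one row per customer, enriched with an order count
select
    c.customer_id,
    c.email,
    coalesce(o.order_count, 0) as order_count
from customers c
left join orders o
    on c.customer_id = o.customer_id
```

At compile time dbt resolves each `{{ ref(...) }}` to the upstream relation and uses those references to build the dependency graph (lineage) that ordering, documentation, and CI selection all rely on.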