vikneshwara-r-b/spotify_azure_de_project — reverse-engineered prompt

Reverse engineered prompt

Build me a complete Azure data engineering demo around Spotify-style data that I can run and show as a portfolio project. I want a daily batch pipeline that pulls incremental data from an Azure SQL source using watermarks, lands the raw files in a data lake, then transforms them through bronze, silver, and gold layers. Use Azure Data Factory to handle ingestion and to trigger Databricks for the processing. In Databricks, make the silver layer do change detection so that only changed records are updated, and make the gold layer use SCD Type 2 tables so that history is preserved for users, artists, tracks, and stream events.
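To make the change detection and SCD Type 2 requirements concrete, here is a minimal sketch in plain Python rather than Databricks code. It assumes hash-based change detection, which is one common approach and not necessarily what the repo does; the column names (`user_id`, `country`, `valid_from`, `valid_to`, `is_current`) and the helper names `row_hash` and `apply_scd2` are hypothetical.

```python
import hashlib


def row_hash(row, tracked):
    """Hash the tracked attributes so two versions of a record can be compared."""
    payload = "|".join(str(row[col]) for col in tracked)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_scd2(dim, incoming, key, tracked, as_of):
    """Return a new dimension list with SCD Type 2 applied (input is not mutated).

    Unchanged records are skipped, changed records close the current version
    and append a new one, and unseen keys are inserted as current rows.
    """
    result = [dict(r) for r in dim]
    current = {r[key]: r for r in result if r["is_current"]}
    for row in incoming:
        existing = current.get(row[key])
        if existing is not None:
            if row_hash(existing, tracked) == row_hash(row, tracked):
                continue  # change detection: nothing to update
            existing["valid_to"] = as_of  # close the old version
            existing["is_current"] = False
        new_row = {key: row[key], **{c: row[c] for c in tracked}}
        new_row.update(valid_from=as_of, valid_to=None, is_current=True)
        result.append(new_row)
        current[row[key]] = new_row
    return result


# Hypothetical sample data: user 1 changes country, user 2 is new.
dim_users = [
    {"user_id": 1, "country": "US", "valid_from": "2024-01-01",
     "valid_to": None, "is_current": True},
]
incoming = [{"user_id": 1, "country": "CA"}, {"user_id": 2, "country": "DE"}]
updated = apply_scd2(dim_users, incoming, "user_id", ["country"], "2024-01-02")
```

Running the same `incoming` batch a second time adds no rows, because every hash matches the current version. On Databricks the same logic would more likely be expressed as a Delta Lake merge, which this sketch does not attempt to reproduce.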

Please include the cloud setup as code, secret handling, metadata tracking for watermarks, and governance for the final tables. Use the sample Spotify dataset in the repo, or generate synthetic data if needed. I also want the project to feel production-ready, with clear docs, runnable notebooks or workflows, and enough comments that I can understand what is happening. Look up the current Azure and Databricks docs online if you need to.
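The watermark metadata tracking mentioned above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the Azure SQL source, and the `watermark` table, the `users` table, the `updated_at` column, and the `extract_incremental` helper are all hypothetical names, not taken from the repo.

```python
import sqlite3


def extract_incremental(conn, table, watermark_col):
    """Return rows newer than the stored watermark, then advance the watermark."""
    (last_value,) = conn.execute(
        "SELECT last_value FROM watermark WHERE table_name = ?", (table,)
    ).fetchone()
    # Table and column names are assumed to come from trusted pipeline config.
    cursor = conn.execute(
        f"SELECT * FROM {table} WHERE {watermark_col} > ? ORDER BY {watermark_col}",
        (last_value,),
    )
    columns = [d[0] for d in cursor.description]
    rows = [dict(zip(columns, r)) for r in cursor.fetchall()]
    if rows:
        # Rows are ordered by the watermark column, so the last one is the newest.
        conn.execute(
            "UPDATE watermark SET last_value = ? WHERE table_name = ?",
            (rows[-1][watermark_col], table),
        )
        conn.commit()
    return rows


# Stand-in source database with a metadata table holding one watermark per table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watermark (table_name TEXT PRIMARY KEY, last_value TEXT)")
conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT, updated_at TEXT)")
conn.execute("INSERT INTO watermark VALUES ('users', '1900-01-01T00:00:00')")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ana", "2024-01-01T10:00:00"), (2, "ben", "2024-01-02T10:00:00")],
)

first_run = extract_incremental(conn, "users", "updated_at")   # both rows
conn.execute("INSERT INTO users VALUES (3, 'cy', '2024-01-03T10:00:00')")
second_run = extract_incremental(conn, "users", "updated_at")  # only the new row
```

Storing timestamps as ISO 8601 strings keeps the comparison correct under plain string ordering. In a real pipeline the watermark would usually be advanced only after the raw files have landed successfully, so that a failed run can be retried without losing rows.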
