An Analytical Data Platform Metadata driven and scalable
Mehmet migrated an on-premises data solution to Azure SQL, running manual SQL scripts.
automated and created a scalable Analytic Data Platform. This platform, which uses Azure Data Factory, DevOps CI/CD Pipelines, Power Automate and Power Bl, collects and processes CRM, user, sales, customer and distributor data. The migration also included coaching and knowledge sharing, so the Akzo Nobel SESA organization now has a robust data platform and more experienced data engineers.
Assignment:
Mehmet was tasked with migrating the existing data solution, an on-premises environment, to Azure SQL. This environment collected sales, customer and distributor-related data through various channels (SharePoint Excel files, email submissions and Windows Network drives). The environment was primarily updated manually with various SQL scripts. The desire was to automate the manual SQL scripts and migrate to Azure SQL. There was also a need for coaching and knowledge sharing on Data Engineering and Modeling.
Approach:
1. Feasibility Study:
Mehmet started with a feasibility study to investigate the automation of SQL scripts. Copying 1-to-1 from on-premises to Azure proved easy (simple data pipeline that copied all tables to Azure SQL). The conclusion was that automation of existing scripts was limited and would take a lot of time, because there was no set sequence and the scripts were updated daily to keep working. In addition, files were loaded manually outside the scripts.
2. Demo and Proposal:
After sharing this conclusion, it was decided that Mehmet would give a demo of a previously built solution for AkzoNobel Global to managers and developers, and how it could be adapted for SESA. After the demo, it was decided to go with the new approach. The solution consisted of two Azure SQL databases (for the metadata a general purpose and for the data a hyperscale), two Azure Blob Storage Containers (external delivery and internal files), Power Automate (for automatically copying files from email deliveries to Blob Containers) and Power BI as dashboard tooling.
3. Design and Implementation:
Mehmet laid down a high-quality design for approval and documentation. Azure Data Factory (ADF) was used for loading and transforming datasets, building generic data pipelines. In DevOps Pipelines, a CI/CD release pipeline was proposed for deployment to different environments. Two Git Repositories were proposed: one for Microsoft Visual Studio Solution/projects for database data and metadata, and one for Azure Data Factory. Azure Key Vault was proposed for storing secrets.
4. Migration planning:
A migration plan was created, examining which dashboards used which data (transformations). Prioritization from the business was important to determine which dashboards were most important and
were most commonly used. After mapping this information, a rough schedule could be issued.
5. Coaching and Knowledge Sharing:
A training plan was proposed for the developers, consisting of basic knowledge (online videos/courses) on data modeling, data engineering, Azure in general and specifically the components used in the solution.
Results:
1. Analytic Data Platform:
The proposed solution was implemented and CI/CD release pipelines were set up. All known sources were included in the metadata definitions. In ADF, all datasets were updated daily with adequate logging. Data transformations were performed fully automatically without having to modify ADF for each new transformation. Testing a data pipeline was simplified to calling the “test” data pipeline with one parameter. Documentation was automatically generated and updated in the Git Repository.
2. Migration of Existing Data:
CRM, user Sales, customer and distributor datasets were successfully migrated to Azure SQL Database, implemented and validated by the developers and supervised by Mehmet.
3. Automation of E-mail Submissions:
With Power Automate, flows were built to automatically copy data files from emails to Azure Blob Storage. Files were timestamped for proper processing order by ADF.
4. Concrete Value:
The Akzo Nobel SESA organization now has a scalable Analytic Data Platform. The more experienced developers (now data engineers) have more knowledge of data engineering and modeling, and are no longer loading data, allowing time for expansion of new data sets and extracting more insight from the data.
