“Bronson was excited to work with the Bank on this challenge. The challenge of data cleaning is one shared by many organizations and we were confident we could help not only with the technology but through our approach to data manipulation of large datasets.”
– Phil Cormier, CA, CPA, Senior Consultant, Bronson Consulting Group
Bronson set out to review, clean and align the spaghetti of financial security tombstone data provided to the Bank by its data providers. This data is fundamental to many of the Bank's data analytics challenges.
Led by Phil Cormier and supported by Bronson’s President Martin McGarry, Bronson began by reviewing the data provided by the Bank’s Financial Markets Department (FMD). This data involved a specific use case that required matching organizational names across three different datasets. Using the Alteryx Fuzzy matching tool as the matching engine, workflows were created through Alteryx Designer to analyze, clean and standardize the data, which made linking the common fields across datasets easier.
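As an illustration of the kind of cleaning and standardization that makes name matching easier, here is a minimal Python sketch. It is not the Alteryx workflow Bronson built. The function name `normalize_name`, the `LEGAL_SUFFIXES` list and the specific rules (upper-casing, stripping accents and punctuation, dropping trailing legal suffixes) are assumptions chosen for the example.

```python
import re
import unicodedata

# Hypothetical list of trailing legal-form tokens to drop; a real project
# would tailor this to the naming conventions found in its own datasets.
LEGAL_SUFFIXES = {
    "INC", "INCORPORATED", "CORP", "CORPORATION",
    "LTD", "LIMITED", "CO", "COMPANY", "LLC", "PLC",
}


def normalize_name(raw: str) -> str:
    """Return a standardized form of an organization name for matching."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Upper-case and spell out ampersands so "&" and "AND" compare equal.
    text = text.upper().replace("&", " AND ")
    # Replace anything that is not a letter, digit or space with a space.
    text = re.sub(r"[^A-Z0-9 ]+", " ", text)
    tokens = text.split()
    # Drop trailing legal suffixes, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)
```

With these assumed rules, "Acme Holdings, Inc." and "ACME HOLDINGS INCORPORATED" both reduce to the same key, so an exact or fuzzy comparison downstream starts from cleaner input.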
After Bank staff and Bronson Consulting reviewed and discussed the initial results, we focused on generating outputs that could be used to enhance the accuracy of record matching. A second review showed the potential for a scalable, robust solution that could automate our data cleaning methods.
The Bank partnered with Bronson to find a solution that would:
- reduce duplication of information
- automate the cleaning of the data
- improve the accuracy of results currently achieved in-house
Our Solution and Outcome
Using Alteryx, Bronson reviewed all the securities data and realised that the task was not merely a matter of cleaning and matching it. It was necessary to analyse each dataset individually and engineer a data schema so the challenge could be worked through in a stepwise fashion. Because the source datasets themselves contained irregularities, a data assay was needed to identify clear anomalies that would otherwise render downstream processing moot.
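To give a sense of what a simple data assay can look like, the Python sketch below profiles a dataset for two common anomalies: blank fields and duplicated keys. It is an assumption-laden illustration, not the assay Bronson ran. The function name `assay`, the report layout and the field names in the usage note are all hypothetical.

```python
from collections import Counter


def assay(records, key_field):
    """Profile a list of dict rows for blank values and duplicate keys."""
    blank_counts = Counter()  # field name -> number of blank values
    key_counts = Counter()    # key value -> number of rows carrying it
    for row in records:
        for field, value in row.items():
            if value is None or str(value).strip() == "":
                blank_counts[field] += 1
        key = row.get(key_field)
        if key is not None and str(key).strip():
            key_counts[str(key).strip()] += 1
    # Keys that appear on more than one row are reported as duplicates.
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return {
        "rows": len(records),
        "blank_counts": dict(blank_counts),
        "duplicate_keys": duplicates,
    }
```

Calling `assay(rows, "security_id")` on rows loaded from a source file (for example via `csv.DictReader`, with `security_id` standing in for whatever identifier the dataset actually uses) yields a small report that can be reviewed before any matching is attempted.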
Once the irregularities in the source data were identified, Bronson set out to build a series of Alteryx workflows, all of which provide consistent, automated and reproducible results. Using the Alteryx Fuzzy matching tool, which has the Jaro-Winkler test for similarity embedded within it, Bronson was able to compare datasets from different sources with different naming conventions.
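For readers unfamiliar with the Jaro-Winkler similarity mentioned above, the following is a minimal pure-Python sketch of the textbook definition. It is not Alteryx's implementation, whose internals are not described here. The parameter names `prefix_scale` and `max_prefix` are this example's own, with the commonly used defaults of 0.1 and 4 assumed.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # number of transpositions
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)
```

On the classic pair "MARTHA" and "MARHTA" this gives a score of about 0.961. In a matching workflow, pairs of names scoring above a chosen threshold would be treated as candidate matches for review.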
Bronson was able to prove the simplicity and value of Alteryx and the Fuzzy matching toolsets embedded within it. Bronson left a roadmap for the Bank to both clean its securities tombstone data and continuously automate that cleaning. Bronson looks to continue its work with the Bank and create a strategic solution above and beyond the PIVOT challenge.