BRI: Anti-Money Laundering (AML)

Overview

BRI's Anti-Money Laundering (AML) program required reliable data pipelines that could pull from many source systems, stage and reconcile the data, then deliver clean data to AML detection workloads. The pipelines needed to be repeatable, observable, and aligned with the regulatory data model.

Approach

Data modeling

Designed staging tables in Hive to hold raw source data prior to processing.
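
As an illustration of what a staging-table definition of this kind can look like, the sketch below generates a Hive CREATE TABLE statement from a column mapping. The table name, the column names, the load_date partition column, and the Parquet storage format are all assumptions made for the example; they are not taken from the project.

```python
from typing import Dict, Sequence


def staging_ddl(table: str, columns: Dict[str, str],
                partition_by: Sequence[str] = ("load_date",)) -> str:
    """Build a Hive CREATE TABLE statement for a raw staging table.

    All names passed in are hypothetical; the partition column and the
    storage format are assumptions for illustration only.
    """
    # Hive does not allow a partition column to also appear in the regular
    # column list, so partition columns are filtered out here.
    cols = ",\n  ".join(
        f"`{name}` {dtype}"
        for name, dtype in columns.items()
        if name not in partition_by
    )
    # Partition columns are declared as STRING to keep the raw layer permissive.
    parts = ", ".join(f"`{name}` STRING" for name in partition_by)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({parts})\n"
        "STORED AS PARQUET"
    )


if __name__ == "__main__":
    # Hypothetical source extract with two columns.
    print(staging_ddl("stg.transactions_raw",
                      {"txn_id": "STRING", "amount": "DECIMAL(18,2)"}))
```

Generating the DDL from a mapping keeps every staging table on the same partitioning and storage convention, which makes later loads easier to reason about.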

Pipelines

Wrote Python scripts using Spark to integrate and transform data inside Hive.
Built automated ETL pipelines moving data from staging into AML target tables.
Tested and debugged each pipeline path end-to-end.
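
A staging-to-target load of the kind listed above could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the table names, column names, cleaning rules, and the load_date partition are hypothetical, and the spark argument is assumed to be a pyspark.sql.SparkSession created with Hive support enabled.

```python
from datetime import date


def build_load_sql(staging: str, target: str, load_date: str) -> str:
    """Return a statement that loads one day's partition from staging to target.

    Column names and cleaning rules below are hypothetical examples.
    """
    # Parse and re-serialize the date so only a canonical YYYY-MM-DD value
    # is ever interpolated into the SQL text; bad input raises ValueError.
    day = date.fromisoformat(load_date).isoformat()
    # INSERT OVERWRITE replaces the whole partition, so rerunning the same
    # day yields the same result instead of appending duplicates.
    return (
        f"INSERT OVERWRITE TABLE {target} PARTITION (load_date = '{day}')\n"
        "SELECT txn_id,\n"
        "       TRIM(account_id) AS account_id,\n"
        "       CAST(amount AS DECIMAL(18,2)) AS amount\n"
        f"FROM {staging}\n"
        f"WHERE load_date = '{day}' AND txn_id IS NOT NULL"
    )


def run_load(spark, staging: str, target: str, load_date: str) -> None:
    """Execute the load through a SparkSession (assumed Hive-enabled)."""
    spark.sql(build_load_sql(staging, target, load_date))


if __name__ == "__main__":
    # Printing the statement needs no cluster; run_load needs a live session.
    print(build_load_sql("stg.transactions_raw", "aml.transactions", "2024-01-31"))
```

Keeping the SQL construction separate from execution means the statement can be unit-tested without a cluster, while the end-to-end path is exercised against a real session.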

Performance & observability

Tuned Spark configurations and ETL stages for production throughput.
Implemented logging and monitoring around ETL activity so issues surfaced quickly.
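
One common way to get this kind of visibility is to wrap each ETL step in a context manager that logs its start, outcome, and duration. The sketch below uses only the Python standard library; the logger name, the step name, and the key=value message format are assumptions for illustration, not the project's actual conventions.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; a real deployment would attach its own handlers.
logger = logging.getLogger("aml_etl")


@contextmanager
def etl_step(name: str):
    """Log the start, outcome, and elapsed time of one ETL step."""
    start = time.perf_counter()
    logger.info("step=%s status=started", name)
    try:
        yield
    except Exception:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("step=%s status=failed elapsed_s=%.2f",
                         name, time.perf_counter() - start)
        raise  # re-raise so the scheduler still sees the failure
    else:
        logger.info("step=%s status=succeeded elapsed_s=%.2f",
                    name, time.perf_counter() - start)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with etl_step("load_transactions"):  # hypothetical step name
        time.sleep(0.1)  # stand-in for the real work
```

Structured key=value messages are easy to filter and alert on in most log aggregation tools, and re-raising the exception keeps failures visible to whatever orchestrates the job.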

Collaboration

Worked alongside the AML team and stakeholders to make sure data semantics matched regulatory requirements.
Produced technical documentation covering processes, architecture, and configuration.

Outcome

The result was a set of repeatable, monitored pipelines that AML analysts could trust and that operations could troubleshoot without paging the original author.