RL Environment Reference: Investment Banking
An organization-wide system that indexes every data attribute across code, databases, manual entries, Confluence, and third-party systems. Built for investment banks where senior regulatory, technology, and compliance leadership need unified visibility into data ownership, lineage, and PII — and where that data spans wealth management, trading, fund services, prime brokerage, securities lending, and customer onboarding.
The Problem
In a large investment bank, data attributes live in hundreds of places, owned by dozens of teams, with no single source of truth. Regulators demand answers the bank cannot provide without weeks of manual archaeology.
Attributes defined in code, databases, Confluence pages, Excel sheets, emails, manual processes. No index.
Who owns "client risk score"? Nobody knows. Original owner left three years ago. Data still flowing.
"Trade date" defined seven ways across trading, settlement, accounting, and reporting systems.
Regulators ask "where is customer PII processed?" — answer takes 3 months and is incomplete.
Can't trace how a regulatory report field was calculated. Source systems, transformations, and rules are opaque.
Confluence says one thing, code does another. Documentation drifted months ago; nobody updates it.
New engineer asks "does this attribute already exist?" — no way to find out. Duplicate work proliferates.
Publishing reference data to clients, counterparties, or regulators requires manual extraction each time.
Users
Regulatory · Technology · Compliance
Need a clear view of data ownership, lineage, and risk across the organization. Make decisions on audits, remediation priorities, and regulatory responses.
Platform Engineers · Developers · Data Engineers
Build systems that produce or consume data. Need to publish attributes to the glossary as part of the development lifecycle, not as a separate manual task.
Business SMEs · Product Owners · Ops Leads
Own the business meaning of data for their function (e.g., Portfolio Accounting SOPs, fund NAV definitions). Update manually-sourced data, approve changes, transition ownership.
Architecture
The platform is a layered architecture: ingestion channels feed a core metadata engine; processing modules enrich and validate; consumption interfaces surface insights to different users.
Platform Modules
Data attributes are auto-discovered from source code, database schemas, and running systems. Engineers don't separately "catalog" their data — the platform catalogs itself as code is written and deployed.
Engineers annotate fields directly in their code. The scanner extracts these during the build process.
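A minimal sketch of how in-code annotation and build-time extraction could fit together. The source does not say which language the annotations target, so this assumes a Python analog built on dataclass field metadata. The helper `data_attribute`, the `glossary` metadata key, and the `Trade` class are hypothetical names chosen for illustration, not part of the platform.

```python
from dataclasses import dataclass, field, fields


def data_attribute(description, owner, pii=False):
    """Hypothetical helper: attach glossary metadata to a dataclass field."""
    return field(metadata={"glossary": {
        "description": description,
        "owner": owner,
        "pii": pii,
    }})


@dataclass
class Trade:
    # Annotated fields carry their glossary entry alongside the code.
    trade_id: str = data_attribute("Unique trade identifier", owner="trading-platform")
    client_name: str = data_attribute("Legal name of the client", owner="onboarding", pii=True)
    notes: str = ""  # not annotated, so the scanner skips it


def scan(cls):
    """Return one glossary record per annotated field of a dataclass."""
    records = []
    for f in fields(cls):
        meta = f.metadata.get("glossary")
        if meta is not None:
            records.append({"attribute": f"{cls.__name__}.{f.name}", **meta})
    return records
```

A build step could call `scan(Trade)` for each annotated class and push the resulting records to the glossary, so publishing happens as a side effect of normal development.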
@DataAttribute annotations
Not all data is in code. Fund NAV definitions, trading rules, counterparty reference data, and operational SOPs often live in Excel or in people's heads. Data owners need a simple, non-technical UI to maintain this data.
People leave, teams reorganize, business responsibilities shift. Without an ownership transition workflow, data becomes orphaned and governance falls apart.
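One way to picture the ownership transition workflow is as two steps: detect attributes whose owner is no longer active, then reassign them with a recorded approval. The sketch below assumes a simple in-memory model. `Attribute`, `find_orphans`, `transition`, and the audit record fields are hypothetical names, and a real workflow would likely involve multi-step approval chains.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str
    owner: str


def find_orphans(attributes, active_staff):
    """Return attributes whose recorded owner is not in the active staff set."""
    return [a for a in attributes if a.owner not in active_staff]


def transition(attribute, new_owner, approved_by, audit_log):
    """Reassign ownership and record who approved the change."""
    audit_log.append({
        "attribute": attribute.name,
        "from": attribute.owner,
        "to": new_owner,
        "approved_by": approved_by,
    })
    attribute.owner = new_owner
```

Running `find_orphans` on a schedule against the HR roster would surface cases like the ownerless "client risk score" before they become audit findings.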
The most important module — without automated hygiene, the glossary rots within a year. This is what makes the platform maintain its value over time.
Every attribute gets a quality score from 0 to 100.
Interactive graph visualization of how data flows through the organization. PII is visually highlighted at every node so compliance can quickly assess exposure.
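The PII highlighting described here can be thought of as reachability over the lineage graph: any node downstream of a PII source is exposed. The following is a sketch under that assumption, using breadth-first search over an adjacency map. The node names and the `LINEAGE` and `PII_SOURCES` structures are made up for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each node maps to the nodes it feeds.
LINEAGE = {
    "onboarding.client_name": ["crm.client_name"],
    "crm.client_name": ["reporting.client_statement"],
    "trading.exec_px": ["reporting.client_statement"],
    "reporting.client_statement": [],
}
PII_SOURCES = {"onboarding.client_name"}


def pii_exposure(lineage, pii_sources):
    """Return every node that PII reaches, including the sources themselves."""
    exposed = set(pii_sources)
    queue = deque(pii_sources)
    while queue:
        node = queue.popleft()
        for downstream in lineage.get(node, []):
            if downstream not in exposed:
                exposed.add(downstream)
                queue.append(downstream)
    return exposed
```

A visualization layer could then color every node in the returned set, which lets compliance see at a glance that the client statement carries PII while the execution price feed does not.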
The differentiator. The platform ships with built-in ontologies for investment banking domains. When scanning code or databases, it automatically recognizes attributes based on domain patterns — not just generic metadata extraction.
Client lifecycle, portfolio management, financial planning, advisory
Order management, execution, allocation, reporting
Options, futures, swaps, structured products
NAV calculation, transfer agency, fund accounting, distribution
Margin, financing, securities lending, consolidated reporting
Loans, collateral, recalls, corporate actions on loaned securities
KYC, AML, CIP, documentation, approvals
Clearing, settlement, reconciliation, regulatory reporting
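Domain-aware recognition could be approximated, in its simplest form, by matching column names against per-domain naming patterns. The sketch below assumes a small regex-based ontology. The `ONTOLOGY` contents, the concept names, and the `classify` function are illustrative guesses, and a production system would presumably combine such rules with the classifiers mentioned elsewhere in this document.

```python
import re

# Hypothetical ontology: domain -> business concept -> naming pattern.
ONTOLOGY = {
    "trading": {
        "execution_price": re.compile(r"^(exec|execution)_?(px|price)$"),
        "trade_date": re.compile(r"^(trd|trade)_?(dt|date)$"),
    },
    "fund_services": {
        "net_asset_value": re.compile(r"^(nav|net_asset_value)$"),
    },
}


def classify(column_name):
    """Return (domain, concept) for a recognized column name, else None."""
    name = column_name.strip().lower()
    for domain, concepts in ONTOLOGY.items():
        for concept, pattern in concepts.items():
            if pattern.match(name):
                return domain, concept
    return None
```

Under these assumed patterns, a scanner that meets a column such as `exec_px` would tag it as a trading execution price rather than storing it as an uninterpreted string.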
exec_px in trading database
What senior executives actually see. The platform synthesizes the technical metadata into business-meaningful views for regulatory, compliance, and technology leadership.
Users search in plain English; an LLM interprets each query against the metadata graph.
Reference data, regulatory reports, and client feeds all need to be published to external parties. The platform manages publishing as a governed, auditable function.
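A governed, auditable publish step might look like the sketch below: refuse to publish without a named approver, and record what was sent, to whom, and a content hash for later verification. The `publish` function and the audit record fields are assumptions for illustration. The source does not specify the platform's actual publishing interface.

```python
import hashlib
import json
from datetime import datetime, timezone


def publish(dataset_name, rows, recipient, approved_by, audit_log):
    """Serialize rows for an external party and append an audit record."""
    if not approved_by:
        raise PermissionError(f"{dataset_name} has no approver; refusing to publish")
    # Sorting keys makes the payload, and therefore its hash, deterministic.
    payload = json.dumps(rows, sort_keys=True)
    audit_log.append({
        "dataset": dataset_name,
        "recipient": recipient,
        "approved_by": approved_by,
        "published_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    })
    return payload
```

Storing the hash alongside the recipient makes it possible to show a regulator later exactly which version of a dataset was sent.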
RL Environment Value
Building this platform is exactly the kind of multi-year, multi-team enterprise engineering effort that no synthetic RL environment can authentically reproduce. It spans every layer of the stack and every role in the organization.
Scanners in multiple languages, graph DB, ML classifiers, workflow engines, dashboards, publishing APIs. Every layer has its own complexity.
Requires deep understanding of wealth, trading, derivatives, fund services, prime brokerage — each with its own vocabulary, regulations, and edge cases.
Ownership transitions, approval chains, exception handling, orphan remediation. Real organizational friction captured authentically.
Duplicate detection across domains, semantic similarity, PII boundary cases, lineage reconciliation. Requires judgment, not just rule-following.