Bank consortium led by SocGen seeks to cure post-trade data ills

Project led by Societe Generale that uses privacy-enhancing technologies to solve data management issues hopes to sign up five banks and launch as a legal entity.

Banks maintain reference data at considerable cost and to no competitive advantage in order to perform functions such as know-your-customer (KYC) checks. As this information overlaps with that of their peers, ideally banks would be able to cross-check the data with each other. Does company X have a politically exposed person (Pep) on its board, for instance? Or does the legal entity identifier (LEI) we associate with company Y match the one you have on record?

In reality, however, this data is too sensitive to share with competitors. It is also sometimes governed by data privacy laws such as the General Data Protection Regulation.

In recent years, privacy-enhancing technologies (PET) have offered an enticing promise: to enable financial institutions to compare notes on encrypted data, and perform computations and reconciliations without breaching client confidentiality or allowing their peers to peek into their secrets.

A consortium of banks led by Societe Generale says it has successfully proved this concept and produced applications enabling users to reconcile data using PET. The consortium, known as Danie, now hopes to become formalized as a legal entity later this year.

Currently, Danie is a loose association of nine banks that operates under a memorandum of understanding. The banks have signed up to the same set of terms and conditions that allow them to share and reconcile the data, says Mark Davies, a partner at consultancy Element22 in London.

“But the goal is to have this legal entity of which the banks will be part owners and governors. So it will be an industry-owned and governed consortium that then uses the technology to build out the applications, the roadmap, and everything else,” Davies says.

Element22 is the operator of the Danie consortium. A start-up called Secretarium, founded by two former SocGen technologists and grown in SocGen’s incubator, is supplying the tech.

Danie began life as a blockchain project. In 2018, SocGen, via its global business service unit, joined a consortium called Massive Anonymous Distributed Reconciliation (Madrec), alongside UBS, Credit Suisse, Barclays and Belgian bank KBC. Madrec was intended to benchmark data quality by comparing it among the consortium’s members, and with data supplied by vendors including Six and what was then Thomson Reuters.

Madrec piloted a solution on the Ethereum blockchain that could reconcile its members’ encrypted and anonymized LEI records in real time. The idea was that members could help each other ensure that their records were up to date without giving away sensitive information such as client data or relationships, and without having to pay an intermediary vendor.

SocGen, however, didn’t agree with how Madrec should be structured once it became a legal entity, and broke away in late 2019 to pursue its own solution, which became Danie. Danie’s initial use case was also to benchmark reference data quality on third parties, using LEIs as a matching key and allowing reconciliations to be performed; each member sent its list of anonymized and encrypted LEIs to Danie for reconciliation within minutes. Like Madrec, it was based on tech that provided both robust cryptography and secure hardware for a neutral platform for collaboration between parties that do not trust one another.  

Strictly in confidence

However, Secretarium is not a blockchain company. Its platform is built on a PET called confidential computing, in which data is processed inside a hardware-protected execution environment known as a “trusted enclave”. In Secretarium’s case, the hardware is Intel’s Software Guard Extensions (SGX), a set of instructions built into a CPU that allows users to partition off protected regions of memory (the enclaves) as a kind of black box, so that data can be decrypted within a secure environment but remain invisible to the operating system.

Confidential computing is not the only PET offering the opportunity to perform computations on data without having to do so in the clear. Homomorphic encryption and secure multiparty computation are also being developed for use cases in finance and beyond. But the Intel tech, which is relatively new, is more scalable and suitable for a use case like Danie, says Bertrand Foing, one of Secretarium’s two founders.

“Because all the operations that Secretarium does are happening in an encrypted environment but on decrypted data, we have much more flexibility in terms of operations than we would with pure cryptographic protocols. It’s also orders of magnitude faster,” Foing says.

Performance is key to Danie because its potential members want to be able to submit millions of records into the platform asymmetrically, and have outputs provided almost instantly, he adds.  

Danie has run successful proofs of concept on Secretarium, the first in March 2020 with 22,750 LEIs and 15 data fields, and the second in June 2020 with 200,000 LEIs and around 30 data fields. According to SocGen, banks were able to identify between 5% and 10% of errors within seconds.

Since then, Secretarium has built two apps—Datalign and Semaphore—both available from the same web application user interface, which is intended to be as user-friendly as possible and require no coding skills. Participants can drop their file in the web app, the data format of each field is verified, and finally it’s uploaded to the Danie reconciliation service.
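As an illustration of what a field-format check might involve, the sketch below validates the structure of an LEI. It relies only on the public LEI standard (ISO 17442): 20 characters, the last two of which are check digits computed with the ISO 7064 MOD 97-10 scheme. The function names and the sample prefix are made up for this example, and nothing here is taken from Danie's or Secretarium's actual code.

```python
import string

# Map 0-9 and A-Z to the integers 0..35, as ISO 7064 MOD 97-10 requires.
_ALPHABET = string.digits + string.ascii_uppercase


def _to_number(text: str) -> int:
    """Replace each character by its numeric value and read the result as one integer."""
    return int("".join(str(_ALPHABET.index(ch)) for ch in text))


def lei_check_digits(prefix18: str) -> str:
    """Compute the two check digits for an 18-character LEI prefix."""
    return f"{98 - _to_number(prefix18 + '00') % 97:02d}"


def is_valid_lei(lei: str) -> bool:
    """True if `lei` has 20 valid characters, ends in two digits, and passes the checksum."""
    if len(lei) != 20 or any(ch not in _ALPHABET for ch in lei):
        return False
    if not lei[18:].isdigit():
        return False
    return _to_number(lei) % 97 == 1


# Hypothetical 18-character prefix, used only to build a well-formed test value.
sample_prefix = "TESTPREFIX00000001"
sample_lei = sample_prefix + lei_check_digits(sample_prefix)
print(sample_lei, is_valid_lei(sample_lei))
```

A check like this catches mistyped identifiers before upload, but it says nothing about whether the LEI is actually registered or belongs to the right entity. That is the kind of question the reconciliation step is meant to answer.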

“Participations are kept anonymized, and data remains encrypted at all times. Depending on how large your dataset is, it takes seconds to a few minutes to get a report showing if you are in full consensus with your peers, and where you have anomalies,” says Anthony Ta, project director and innovation and watch leader in SocGen’s corporate and investment banking unit.

“The major difference when comparing Secretarium to other venues is that no one knows whose data is being processed or what data is being processed, and there is no way our data can be monetized,” Ta says.

The Datalign app uses LEIs as matching keys, allowing participants to benchmark their datasets linked to LEIs (names, addresses, and codes such as Bank Identifier Codes, or BICs).

“To reconcile the data, we use one matching key, the LEI. And thanks to this LEI, we can measure all the data linked to it: regulatory data, personal data, companies data,” Ta says. “For example, if I want to measure the BIC code of a bank based on its LEI, I just have to fill in the BIC code provided by the client, then send this LEI to the platform. Then I can see if this data is in consensus with my peers or potentially in error.”

That’s just one data field, Ta adds. “In the proof of concept we did with the other banks, it’s up to 30 data fields, and we managed to get, in less than a minute, a report showing all the discrepancies we have, data field by data field, LEI by LEI.”
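The reconciliation Ta describes can be pictured with a small sketch. It assumes each participant submits a set of records keyed by LEI and that "consensus" means a strict majority of the values submitted for a given field. That rule, the participant names, the placeholder LEIs and the field values are all invented for illustration; the article does not say how Danie defines consensus. In the real service this logic would run on data decrypted only inside the enclave, whereas the sketch works in the clear.

```python
from collections import Counter

# Made-up submissions: participant -> {LEI -> {field -> value}}.
SUBMISSIONS = {
    "bank_a": {"LEI-0001": {"bic": "AAAAFRPP", "country": "FR"},
               "LEI-0002": {"bic": "BBBBGB2L", "country": "GB"}},
    "bank_b": {"LEI-0001": {"bic": "AAAAFRPP", "country": "FR"},
               "LEI-0002": {"bic": "BBBBGB2L", "country": "UK"}},
    "bank_c": {"LEI-0001": {"bic": "AAAAFRPX", "country": "FR"},
               "LEI-0002": {"bic": "BBBBGB2L", "country": "GB"}},
}


def consensus(submissions):
    """Return {(lei, field): majority value} wherever a strict majority exists."""
    votes = {}
    for records in submissions.values():
        for lei, fields in records.items():
            for field, value in fields.items():
                votes.setdefault((lei, field), Counter())[value] += 1
    result = {}
    for key, counter in votes.items():
        value, count = counter.most_common(1)[0]
        if count * 2 > sum(counter.values()):
            result[key] = value
    return result


def discrepancies(participant, submissions):
    """List (lei, field, own value, consensus value) where the participant disagrees."""
    agreed = consensus(submissions)
    report = []
    for lei, fields in submissions[participant].items():
        for field, value in fields.items():
            expected = agreed.get((lei, field))
            if expected is not None and expected != value:
                report.append((lei, field, value, expected))
    return sorted(report)


for bank in SUBMISSIONS:
    print(bank, discrepancies(bank, SUBMISSIONS))
```

With this toy data, bank_a is in full consensus, bank_b is flagged on the country of LEI-0002, and bank_c is flagged on the BIC of LEI-0001. Each report lists only the participant's own value and the agreed value, not which peer submitted what.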

Semaphore uses the same idea but for KYC data, such as risk flags or information on Peps. If one bank notices it has not recorded a Pep associated with a particular company, for instance, that becomes the impetus to check that its records are up to date.

Davies says Semaphore is not a replacement for a bank’s KYC processes, but rather a way to efficiently find outliers in the data and fix them. “You can check your homework against others, anonymously and securely,” he says.

Many firms refresh their KYC data on a cycle of one to three years, he adds: data on high-risk counterparties is checked annually, medium-risk data once every two years, and low-risk data can go three years or more with no one checking that it’s up to date.

“That might mean that you don’t look at a company from a KYC or anti-money-laundering perspective at all for three years. If during that three-year period other members of the community identify it as having a Pep associated with it—maybe there’s a new director or new CEO who’s politically exposed—Semaphore reconciliation can tell you that you’re an outlier.”

Critical mass

Possibly the biggest challenge Danie faces is reaching the critical mass its use cases need to be worthwhile. While nine banks have gone through Danie’s approvals process, loaded data into Secretarium and actively reconciled data, only SocGen has gone public with its involvement. The consortium is hoping to get five banks signed up in order to establish the legal entity, which in itself is expected to draw in more banks once it becomes official.

Joining any consortium with competitors involves performing intensive legal and information security checks internally, not to mention the parallel work the consortium must do to onboard members. And for a project like Danie, which involves the sharing of sensitive and often highly regulated data, the compliance checks are multiplied.

Internally, banks must obtain validation from the business, their compliance and legal departments, as well as their cybersecurity and IT teams, Ta says. “So it depends on how big your company is, for instance. And that’s why one of the challenges in collaborating with other banks is the level of maturity regarding technology: some banks might not have experts in this technology,” he says.

Another issue is that Danie has attracted interest from banks that are headquartered in Europe, the US, and South-east Asia, jurisdictions with very different attitudes towards and legislation around data sharing. In a bid to stay as neutral as possible, the consortium has chosen to deploy most of its services in Switzerland, where the country’s main infrastructure provider is more than half-owned by the government.

Ta says SocGen itself is still exploring the technology with its compliance and anti-money laundering department, as well as with French regulator the Autorité de Contrôle Prudentiel et de Résolution, to make sure it can share data with other banks, especially on client transactions.

Ta and Foing say five is the optimal number of banks required to launch Danie. With any fewer, the quality of the data would be diminished, as there would be less chance that members share clients or have overlaps in their LEI data, for example, and that would make pooling data in this way somewhat pointless.

Also, the fewer banks there are, the easier it is for other members to make inferences about whose data is whose, Foing says.

“Some of the fields that these banks are reconciling are very low entropy,” he says. In cybersecurity, low-entropy data is data that can take only a small number of likely values, which makes it easier to guess what has been encrypted.
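A short sketch shows why low entropy matters. It uses an unsalted SHA-256 hash as a stand-in for any deterministic way of obscuring a field; this is an assumption made for illustration, not a description of how Danie protects data. The three-value risk rating is also invented. Because the field has so few possible values, an observer who sees only the digest can recover the original by hashing every candidate.

```python
import hashlib


def digest(value: str) -> str:
    """Unsalted SHA-256 of a field value, hex-encoded."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def guess(target_digest: str, candidates):
    """Return the candidate whose digest matches the target, or None."""
    for candidate in candidates:
        if digest(candidate) == target_digest:
            return candidate
    return None


# Hypothetical low-entropy field: only three possible values.
RISK_RATINGS = ["low", "medium", "high"]

observed = digest("high")          # all that the observer sees
recovered = guess(observed, RISK_RATINGS)
print(recovered)
```

The same attack is impractical against a high-entropy value such as a long random key, because the candidate list would be too large to enumerate.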

“And some of the legal entities they are willing to process are located in geographic areas that provide too much information to competitors. That is a privacy problem that encryption itself cannot solve, so the big challenge we have with a low number of banks is that we need to apply a big layer of differential privacy in the processing to ensure that every output the banks get out of this data-pooling service has sufficient entropy and doesn’t reveal any sensitive information.”
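Foing does not say which differential privacy technique is applied. One textbook option is the Laplace mechanism, sketched below for a simple count, such as how many participants hold a record for a given entity. The epsilon value, the function names and the assumption that one participant changes the count by at most one are all illustrative.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise, assuming the count has sensitivity 1."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only to make the example repeatable
# Hypothetical query: 3 of the participants hold a record for some entity.
print(noisy_count(3, epsilon=0.5, rng=rng))
```

A smaller epsilon means more noise and stronger privacy, at the cost of less precise outputs. With only a handful of participants, the noise needed to hide any one bank's contribution is large relative to the counts themselves, which is consistent with Foing's point that the problem eases as membership grows.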

Foing says this is a “teething problem” that will only persist while Danie remains small.

“It’s this first step—having them sign a legal agreement and start creating a legal entity—that is the most difficult. As soon as this milestone is reached, then we’ll see more banks want to join,” he says.  

Prospective members can see the benefit of the tech, he says, and are struggling with KYC processes. “But these collaboration tools that we provide only have value when a good number of banks are participating. Unless we have at least five banks using the platform, it’s going to be hard to convince new banks to be the first to join and to do the hard work that would benefit others. But as soon as we get those five banks, then we get straight to 10 or even 15, because so many of them are in the position of wanting to join.”

He adds that SocGen’s buy-in has been very helpful in driving interest in the project.

If Danie can set up the legal entity, there are many possible use cases that haven’t been explored yet, Ta says. Danie could, for example, be used to benchmark corporate actions data from data vendors. “We could also use this kind of technology to provide confidential legal data with anonymization or encryption to a third party but not in the clear,” he adds.
