Continuous evolution: Researchers work to specialize NLP for finance

From academics to data teams at investment banks, those in and adjacent to the capital markets are looking to specialize natural language processing models to understand and break down financial data.

Researchers are building on pre-trained natural language processing (NLP) models, introducing them to the idiosyncratic terminology of the finance world to find new uses for automation.

One of those researchers is Petter Kolm, a professor of mathematics at New York University’s Courant Institute of Mathematical Sciences and a former quant at Goldman Sachs Asset Management. His work has included examining NLP models to understand how they can better make use of textual information.

“What researchers such as myself and others do is to take these pre-trained models, freeze the parameters of the neural networks—perhaps with the exception of the last few layers—and then train those last layers to specialize the model for a particular task,” Kolm says.
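The recipe Kolm describes, freezing the pre-trained parameters and training only the last few layers, can be sketched in plain Python. The layer names, the tiny two-weight layers, and the update rule below are invented purely for illustration; a real model would use a deep-learning framework.

```python
# Toy illustration of fine-tuning: freeze the pre-trained layers
# and update only the final ones for the new task.

class Layer:
    def __init__(self, name, weights):
        self.name = name
        self.weights = list(weights)   # this layer's parameters
        self.trainable = True          # frozen layers are never updated

def freeze_all_but_last(layers, n_trainable=2):
    """Freeze every layer except the final n_trainable layers."""
    for layer in layers[:-n_trainable]:
        layer.trainable = False
    return layers

def training_step(layers, gradients, lr=0.01):
    """Apply one gradient step, skipping frozen layers."""
    for layer, grad in zip(layers, gradients):
        if layer.trainable:
            layer.weights = [w - lr * g for w, g in zip(layer.weights, grad)]

# Four "pre-trained" encoder layers plus a task-specific head.
model = [Layer(f"encoder_{i}", [1.0, 1.0]) for i in range(4)]
model += [Layer("task_head", [0.5, 0.5])]
freeze_all_but_last(model, n_trainable=2)

training_step(model, gradients=[[1.0, 1.0]] * 5)
print([layer.name for layer in model if layer.trainable])
# Only the last two layers' weights change; the frozen encoder is untouched.
```

Because most of the network's parameters stay fixed, specializing the model this way needs far less data and compute than training it from scratch.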

Over the past five years, pre-trained models like Google’s Bidirectional Encoder Representations from Transformers (Bert) and OpenAI’s GPT-3 have changed the game for enterprise-grade NLP. Open-source models have revolutionized NLP in finance, an industry that generates the enormous amounts of data required to properly train deep-learning models. These open-source tools allow banks and asset managers to experiment more efficiently with this form of machine learning.

When Google published Bert, it supplied the code and downloadable versions of the model already pre-trained on Wikipedia and a dataset called BookCorpus, about 3.3 billion words in total. Anyone could download and run the model without having to duplicate the costly and energy-intensive process of training it, so companies that offer NLP products and services have since been able to update their offerings to transformer models for increased efficiency and speed. GPT-3 is even more gargantuan: the model has 175 billion parameters and was trained on a correspondingly larger dataset.

For use in specific domains, however, these models must be further trained, Kolm says. “Bert, for example, is trained more as a general language model, and will therefore not have a good understanding of financial concepts and terminology,” he says.

As more text sources have become available to train NLP models, researchers can run more experiments. Different flavors of Bert have since emerged for domain-specific uses: For example, SciBert performs NLP tasks in science; FinBert, a model trained on Reuters news stories and a corpus called the Financial PhraseBank, performs sentiment analysis on stock market headlines.

Both Bert and GPT-3 are based on transformers, which include an attention mechanism that weighs, for each word in a sentence, how much the surrounding words contribute to its meaning in context.
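The attention idea can be sketched with toy numbers: each word's vector is re-weighted by its similarity to every other word in the sentence, so the output for a word blends in information from its neighbors. The 2-d vectors below are made up purely for illustration; real models use learned, high-dimensional representations.

```python
# Minimal sketch of (scaled dot-product) attention: blend the value
# vectors, weighted by how similar each key is to the query.
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Return a weighted blend of values, scored by query-key similarity."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# "current" attends to itself and to "account", so its contextual vector
# mixes in information from the neighboring word.
vectors = {"current": [1.0, 0.2], "account": [0.1, 1.0]}
keys = values = list(vectors.values())
contextual = attention(vectors["current"], keys, values)
print(contextual)  # a blend of the two input vectors
```

This is why a transformer can tell a "current account" at a bank apart from the "current" in an electrical circuit: the representation of each word depends on the words around it.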

Context is key in NLP, Kolm says. Before Bert and GPT-3, models weren’t great at understanding context. For instance, if you used the banking-specific term “current account,” the model would not understand you. It would read “current” and “account” as two separate words, and thus fail to grasp the full term’s true meaning. And even Bert must be trained on topically appropriate data to make sense of context in specific industries.

To train a model, a variety of sentences, either incomplete or full, are fed into a neural network. Words in a sentence are masked out, and it’s the task of the network to figure out what the missing word is, based on sentences it has already seen. For example, the full sentence may have been “this is a yellow car,” with “yellow” masked out and the network fed data about New York City cabs. If the network does not guess “yellow,” the parameters of the neural network are tweaked so it gets closer to the answer.
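The masked-word objective can be mimicked with a toy model. Real networks learn weights by gradient descent; the sketch below just counts which words fill the gap in matching sentences from a tiny made-up corpus, which is enough to show what the training task asks of the model.

```python
# Toy version of the masked-word task: predict the word behind [MASK]
# from sentences the "model" has already seen.
from collections import Counter

corpus = [
    "this is a yellow car",
    "this is a yellow cab",
    "this is a red truck",
]

def predict_mask(masked_sentence, corpus):
    """Guess the [MASK] token from matching contexts in the corpus."""
    target = masked_sentence.split()
    candidates = Counter()
    for sentence in corpus:
        words = sentence.split()
        if len(words) != len(target):
            continue
        # A sentence matches if every non-masked position agrees.
        if all(t in ("[MASK]", w) for t, w in zip(target, words)):
            i = target.index("[MASK]")
            candidates[words[i]] += 1
    return candidates.most_common(1)[0][0]

print(predict_mask("this is a [MASK] car", corpus))  # "yellow"
```

A neural network generalizes far beyond exact matches like these, but the self-supervised setup is the same: the training signal comes from the text itself, with no human labeling required.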

Following the release of Bert, a sentence-based version of the model called Sentence-Bert was developed by researchers Nils Reimers and Iryna Gurevych. The model looks at whole sentences and is trained with larger bodies of text to guess what sentence would come first.
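Working at the sentence level means whole sentences get mapped to vectors that can be compared directly. The sketch below fakes the encoding step with simple word-count vectors (the real model uses a trained transformer), but the comparison step, cosine similarity between sentence vectors, is the same idea; the example sentences are invented.

```python
# Compare sentences by cosine similarity between their vectors.
import math
from collections import Counter

def embed(sentence):
    """Stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

s1 = embed("the bank raised interest rates")
s2 = embed("the bank raised its rates")
s3 = embed("the river bank flooded overnight")
print(cosine(s1, s2) > cosine(s1, s3))  # True: s1 is closer to s2
```

A trained sentence encoder would also place the first two sentences close together even if they shared no words at all, which word-count vectors cannot do.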

“So for example, if I have a stereo with an equalizer, we can tune or fine tune the equalizer based on the music we are listening to,” Kolm says.

Last year, Kolm and a group of researchers from NYU, the Barcelona Supercomputing Center, and the University of Edinburgh published a study that proposed a new model specifically for use in finance called Financial Embedding Analysis of Sentiment (FinEAS).

To build the proposed model, the researchers took standard, or vanilla, Bert and fine-tuned it with Sentence-Bert to perform financial sentiment analysis. The researchers used samples from a dataset on US news provided by RavenPack, a data provider that has analytics on about 300,000 companies, or most of the investable market.

The results were that FinEAS outperformed not only the two baseline models—the vanilla Bert and another kind of machine learning model, called a long short-term memory (LSTM) network—but also the more specialized FinBert. Vanilla Bert performed worse than the LSTM. The researchers concluded that vanilla Bert doesn’t perform well for domain-specific tasks like sentiment analysis out of the box.

A constant evolution

Kolm and his team of researchers aren’t the only ones looking at applying NLP to understand the world of financial information; the UBS Evidence Lab has had its eye on it as well.

Like many others, the Evidence Lab, which operates as the investment bank’s alternative data platform, ran into an early challenge: generalized NLP libraries and simple approaches that were considered cutting edge still didn’t meet the needs of financial markets.

Correlations in alternative data for finance can be difficult to understand, says Jason DeRise, head of data strategy at UBS Evidence Lab. The unit aims to serve as the translation layer for financial professionals.

“We’re looking at proxies; we’re looking at novel approaches to understand human behavior or corporate decision-making; [and we’re looking at] reverse engineering pricing decisions to understand demand trends,” he says.

DeRise continues: “Pre 2019, for Evidence Lab, we were using natural language processing and applying it to earnings calls, and we were able to get value out of it—getting sentiment scores for the companies in terms of management vs. what the analysts were saying. We were able to do some basic topic extraction and it moved us along in the right direction.”

But the team wanted to dig deeper. In 2020, it rolled out the Deep Theme Explorer, a text analytics tool for sentiment analysis. At the end of that year, UBS’s US equity strategy team used Deep Theme Explorer to look at Covid-19 vaccines, tracking press mentions of the vaccines and the associated sentiment to understand how the market would move in response.

UBS declined to provide specifics on the training processes and libraries it uses for its models.

“It uses much more advanced libraries and capabilities, but we have to fine-tune it for financial markets,” DeRise says. While some may see the fine-tuning and training as tedious or soul-crushing, the lab embraces that necessary step.

There are still barriers to be overcome. “Things lag in finance and for good reason, which is that it’s often a high-stakes game [and] it’s very regulated,” says Mikey Shulman, former head of machine learning at Kensho (which was acquired by S&P Global in 2018) and lecturer at the MIT Sloan School of Management. Additionally, while there is understandable hype around what models like GPT-3 can do, the finance world doesn’t have as many applications that fit.

Shulman points to the world’s understanding of what the word “coronavirus” means in 2022 vs. the utter confusion it would have brought in 2019 as an example of the need to continuously update knowledge. “We’re responsible for building these things—we’re going to have to do that and [figure out] exactly how and how often, and that stuff is hard. Somebody has the annoying problem of figuring out what text I should be exposing this to.”

And as language continues to evolve, machines will need to keep up as well. “Humans struggle to assess if a tweet is sarcastic or what the evolving meaning of slang words actually mean. Humans still need to accurately train these machines to handle those nuances without losing the accuracy of the base libraries as well,” says DeRise.
