First there was ChatGPT, an artificial intelligence model with a seemingly uncanny ability to mimic human language. Now there is the Bloomberg-created BloombergGPT, the first large language model built specifically for the finance industry.
Like ChatGPT and other recently introduced popular language models, this new AI system can write human-quality text, answer questions, and complete a range of tasks, enabling it to support a diverse set of natural language processing tasks unique to the finance industry.
Mark Dredze, an associate professor of computer science at Johns Hopkins University's Whiting School of Engineering and visiting researcher at Bloomberg, was part of the team that created it. Dredze is also the inaugural director of research (Foundations of AI) in the new AI-X Foundry at Johns Hopkins.
The Hub spoke with Dredze about BloombergGPT and its broader implications for AI research at Johns Hopkins.
What were the goals of the BloombergGPT project?
Many people have seen ChatGPT and other large language models, which are impressive new artificial intelligence technologies with tremendous capabilities for processing language and responding to people's requests. The potential for these models to transform society is clear. To date, most models are focused on general-purpose use cases. However, we also need domain-specific models that understand the complexities and nuances of a particular domain. While ChatGPT is impressive for many uses, we need specialized models for medicine, science, and many other domains. It's not clear what the best strategy is for building these models.
In collaboration with Bloomberg, we explored this question by building an English language model for the financial domain. We took a novel approach: we built a massive dataset of finance-related text and combined it with an equally large dataset of general-purpose text. The resulting dataset was about 700 billion tokens, which is about 30 times the size of all the text in Wikipedia.
We trained a new model on this combined dataset and tested it across a range of language tasks on finance documents. We found that BloombergGPT outperforms—by large margins!—existing models of a similar size on financial tasks. Surprisingly, the model still performed on par on general-purpose benchmarks, even though we had aimed to build a domain-specific model.
Why does finance need its own language model?
While recent advances in AI models have demonstrated exciting new applications for many domains, the complexity and unique terminology of the financial domain warrant a domain-specific model. It's not unlike other specialized domains, like medicine, which contain vocabulary you don't see in general-purpose text. A finance-specific model will be able to improve existing financial NLP tasks, such as sentiment analysis, named entity recognition, news classification, and question answering, among others. However, we also expect that domain-specific models will unlock new opportunities.
For example, we envision BloombergGPT transforming natural language queries from financial professionals into valid Bloomberg Query Language, or BQL, an incredibly powerful tool that enables financial professionals to quickly pinpoint and interact with data about different classes of securities. So if the user asks: "Get me the last price and market cap for Apple," the system will return get(px_last,cur_mkt_cap) for(['AAPL US Equity']). This string of code will enable them to import the resulting data quickly and easily into data science and portfolio management tools.
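One way to picture the natural-language-to-BQL idea described above is as a few-shot prompt, in which the model sees example request/query pairs and completes a new one. The sketch below is illustrative only and is not how BloombergGPT is actually prompted or deployed. The function names `build_prompt` and `looks_like_bql`, the "Request:/BQL:" prompt layout, and the regular expression are all assumptions made for this example, and the shape check is inferred solely from the single query quoted in the interview, so it does not reflect the real BQL grammar.

```python
import re

# The only demonstration pair here is the one quoted in the interview.
# A realistic prompt would need many more, which this sketch does not invent.
EXAMPLES = [
    (
        "Get me the last price and market cap for Apple",
        "get(px_last,cur_mkt_cap) for(['AAPL US Equity'])",
    ),
]


def build_prompt(question, examples=EXAMPLES):
    """Lay out example request/query pairs, then the new request to complete.

    The "Request:" / "BQL:" labels are a hypothetical prompt format.
    """
    lines = []
    for request, query in examples:
        lines.append(f"Request: {request}")
        lines.append(f"BQL: {query}")
        lines.append("")  # blank line between demonstrations
    lines.append(f"Request: {question}")
    lines.append("BQL:")  # the language model would continue from here
    return "\n".join(lines)


# Hypothetical sanity check: get(<fields>) for([<securities>]).
# Inferred from one example only; real BQL allows far more than this.
BQL_SHAPE = re.compile(r"^get\(([^()]+)\) for\(\[(.+)\]\)$")


def looks_like_bql(text):
    """Return True if text matches the simple get(...) for([...]) shape."""
    return BQL_SHAPE.match(text.strip()) is not None


if __name__ == "__main__":
    print(build_prompt("Get me the last price for Microsoft"))
    print(looks_like_bql(EXAMPLES[0][1]))
```

A check like `looks_like_bql` would only catch output that is obviously malformed; whether a generated query is actually valid would have to be decided by the BQL system itself.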
What did you learn while building the new model?
Building these models isn't easy, and there are a tremendous number of details you need to get right to make them work. We learned a lot from reading papers from other research groups that built language models. To contribute back to the community, we wrote a paper of more than 70 pages detailing how we built our dataset, the choices that went into the model architecture, how we trained the model, and an extensive evaluation of the resulting model. We also released detailed "training chronicles" that contain a narrative description of the model-training process. Our goal is to be as open as possible about how we built the model to support other research groups who may be seeking to build their own models.
What was your role?
This work was a collaboration between Bloomberg's AI Engineering team and the ML Product and Research group in the company's chief technology office, where I am a visiting researcher. This was an intensive effort, during which we regularly discussed data and model decisions, and conducted detailed evaluations of the model. Together we read all the papers we could find on this topic to gain insights from other groups, and we made frequent decisions together.
The experience of watching the model train over weeks was intense, as we examined multiple metrics to understand whether the training was working. Assembling the extensive evaluation and the paper itself was a massive team effort. I feel privileged to have been part of this fantastic group.
Was Johns Hopkins connected to this effort in other ways?
The team has strong ties to Johns Hopkins. One of the lead engineers on this project is Shijie Wu, who received his doctorate from Johns Hopkins in 2021. Additionally, Gideon Mann, who received his PhD from Johns Hopkins in 2006, was the team leader. I think this shows the tremendous value of a Johns Hopkins education, where our graduates continue to push the scientific field forward long after graduation.
How will Johns Hopkins benefit from this work?
There is a large demand from our students to learn about how large language models work and how they can contribute to building them. In the past year alone, the Whiting School of Engineering's Department of Computer Science has introduced three new courses that cover large language models to some degree.
The latest advances in this area have been coming from industry. Through my role on this industrial team, I have gained key insights into how these models are built and evaluated. I bring these insights into my research and the classroom, giving my students a front-row seat to study these exciting models. I think it speaks volumes about Johns Hopkins' AI leadership that our faculty are involved in these efforts.
How does this work connect to your role as a director of research in the new AI-X Foundry?
The goal of the AI-X Foundry is to transform how Johns Hopkins conducts research through AI. Johns Hopkins researchers are among the world's leaders in leveraging artificial intelligence to understand and improve the human condition. We recognize that a critical part of this goal is a strong collaboration between our faculty and industry leaders in AI, like Bloomberg. Building these relationships with the AI-X Foundry will ensure researchers have the ability to conduct truly transformative and cross-cutting AI research, while providing our students with the best possible AI education.