In a new study, researchers have identified the root cause of a type of bias in LLMs, paving the way for more accurate and reliable AI systems.
Research has shown that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while neglecting the middle.
This “position bias” means that, if a lawyer is using an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is much more likely to find the right text if it appears on the first or last pages.
MIT researchers have uncovered the mechanism behind this phenomenon.
They developed a theoretical framework to study how information flows through the machine-learning architecture that forms the backbone of LLMs. They found that certain design choices which control how the model processes input data can cause position bias.
Their experiments revealed that model architectures, particularly those affecting how information is spread across the input words within the model, can give rise to or intensify position bias, and that training data also contribute to the problem.
In addition to pinpointing the origins of position bias, their framework can be used to diagnose and correct it in future model designs.
This could lead to more reliable chatbots that stay on topic during long conversations, medical AI systems that reason more fairly when handling a trove of patient data, and code assistants that pay closer attention to all parts of a program.
“These models are black boxes, so as an LLM user, you probably don’t know that position bias can cause your model to be inconsistent. You just feed it your documents in whatever order you want and expect it to work. But by understanding the underlying mechanism of these black-box models better, we can improve them by addressing these limitations,” says Xinyi Wu, a graduate student in the MIT Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems (LIDS), and first author of a paper on this research.
Her co-authors include Yifei Wang, an MIT postdoc; and senior authors Stefanie Jegelka, an associate professor of electrical engineering and computer science (EECS) and a member of IDSS and the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Ali Jadbabaie, professor and head of the Department of Civil and Environmental Engineering, a core faculty member of IDSS, and a principal investigator in LIDS. The research will be presented at the International Conference on Machine Learning.
Analyzing attention
LLMs like Claude, Llama, and GPT-4 are powered by a type of neural network architecture known as a transformer.
Transformers are designed to process sequential data, encoding a sentence into chunks called tokens and then learning the relationships between tokens to predict what word comes next.
These models have become very good at this thanks to the attention mechanism, which uses interconnected layers of data-processing nodes to make sense of context by allowing tokens to selectively focus on, or attend to, related tokens.
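At its core, attention computes a weighted blend of information from other tokens. The sketch below is a minimal, illustrative version of scaled dot-product attention written in NumPy; it is not the researchers’ code, and a real LLM runs many such attention heads across many layers.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each token gathers information from
    the tokens it attends to, weighted by how similar they are."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # token-to-token similarity
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted blend of value vectors
```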
But if every token can attend to every other token in a 30-page document, that quickly becomes computationally intractable.
So, when engineers build transformer models, they often employ attention-masking techniques that limit the words a token can attend to.
For instance, a causal mask only allows a word to attend to the words that came before it.
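In code, a causal mask can be implemented by blocking the “future” positions before the softmax, as in this sketch (again only illustrative; it reuses the softmax helper from the attention example above):

```python
import numpy as np

def causal_attention(Q, K, V):
    """Attention with a causal mask: token i may only attend to tokens 0..i."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)  # True above the diagonal
    scores = np.where(future, -np.inf, scores)          # block attention to later tokens
    weights = softmax(scores, axis=-1)
    return weights @ V
```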
Engineers also use positional encodings to help the model understand the location of each word in a sentence, improving performance.
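One widely used example is the sinusoidal positional encoding from the original transformer paper, sketched below for illustration; many modern LLMs instead use variants such as rotary or relative encodings.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Return a (seq_len, d_model) array that is added to token embeddings
    so the model can tell where each token sits in the sequence.
    Assumes d_model is even."""
    pos = np.arange(seq_len)[:, None]             # positions 0, 1, 2, ...
    i = np.arange(d_model // 2)[None, :]          # encoding dimension index
    freq = 1.0 / (10000 ** (2 * i / d_model))     # one frequency per pair of dimensions
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(pos * freq)
    enc[:, 1::2] = np.cos(pos * freq)
    return enc
```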
The MIT researchers built a graph-based theoretical framework to explore how these modeling choices, attention masks and positional encodings, could affect position bias.
“Everything is coupled and tangled within the attention mechanism, so it’s very difficult to examine. Graphs are a flexible language to demonstrate the dependent relationship among words within the attention mechanism and trace them throughout multiple layers,” Wu says.
Their theoretical analysis suggested that causal masking gives the model an inherent bias toward the beginning of an input, even when that bias doesn’t exist in the data.
If the earlier words are relatively unimportant to a sentence’s meaning, causal masking can cause the transformer to pay more attention to its beginning anyway.
“While it is often true that earlier words and later words in a sentence are more important, if an LLM is used on a task that is not natural language generation, like ranking or information retrieval, these biases can be extremely harmful,” Wu says.
As a model grows, with additional layers of the attention mechanism, this bias is amplified because earlier parts of the input are used more frequently in the model’s reasoning process.
They also found that using positional encodings to link words more strongly to nearby words can mitigate position bias. The technique refocuses the model’s attention in the right place, but its effect can be diluted in models with more attention layers.
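As one illustration of this idea, relative-position schemes such as ALiBi add a penalty to the attention scores that grows with the distance between tokens, pulling attention toward nearby words. The sketch below is a simplified, hypothetical version of that pattern, not the specific encoding analyzed in the paper, and the slope value is arbitrary (it reuses the softmax helper defined earlier):

```python
import numpy as np

def local_bias_attention(Q, K, V, slope=0.5):
    """Attention with a distance-based penalty that favors nearby tokens."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    distance = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    scores = scores - slope * distance   # larger penalty for tokens that are farther apart
    weights = softmax(scores, axis=-1)
    return weights @ V
```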
And these design choices are only one cause of position bias; some can come from the training data a model uses to learn how to prioritize words in a sequence.
“If you know your data are biased in a certain way, then you should also finetune your model on top of adjusting your modeling choices,” Wu says.
Lost in the middle
After they’d established a theoretical framework, the researchers ran experiments in which they systematically varied the position of the correct answer in text sequences for an information retrieval task.
The experiments showed a “lost-in-the-middle” phenomenon, where retrieval accuracy followed a U-shaped pattern. Models performed best if the correct answer was located at the beginning of the sequence. Performance declined the closer the answer got to the middle before rebounding somewhat if it was near the end.
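The sketch below shows the general shape of such a retrieval probe: a fact is inserted at different depths in a long context, and correctness is recorded per position. The query_model function is a hypothetical stand-in for whatever LLM is being evaluated; this is not the authors’ experimental code.

```python
def lost_in_the_middle_probe(query_model, needle, question, expected, fillers, positions):
    """Return {relative position: correct?} for one needle/question pair.

    needle:    the sentence containing the answer
    fillers:   distractor passages used to pad the context
    positions: relative depths to test, e.g. [0.0, 0.25, 0.5, 0.75, 1.0]
    """
    results = {}
    for pos in positions:
        docs = list(fillers)
        docs.insert(int(pos * len(docs)), needle)      # place the answer at this depth
        prompt = "\n\n".join(docs) + "\n\nQuestion: " + question
        results[pos] = expected.lower() in query_model(prompt).lower()
    return results
```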
Ultimately, their work suggests that using a different masking technique, removing extra layers from the attention mechanism, or strategically employing positional encodings could reduce position bias and improve a model’s accuracy.
“By doing a combination of theory and experiments, we were able to look at the consequences of model design choices that weren’t clear at the time. If you want to use a model in high-stakes applications, you must know when it will work, when it won’t, and why,” Jadbabaie says.
In the future, the researchers want to further explore the effects of positional encodings and study how position bias could be strategically exploited in certain applications.
“These researchers offer a rare theoretical lens into the attention mechanism at the heart of the transformer model. They provide a compelling analysis that clarifies longstanding quirks in transformer behavior, showing that attention mechanisms, especially with causal masks, inherently bias models toward the beginning of sequences. The paper achieves the best of both worlds: mathematical clarity paired with insights that reach into the guts of real-world systems,” says Amin Saberi, professor and director of the Stanford University Center for Computational Market Design, who was not involved with this work.
This research is supported, in part, by the U.S. Office of Naval Research, the National Science Foundation, and an Alexander von Humboldt Professorship.