Like human brains, large language models reason about diverse data in a standard way

By Tarun Khanna
March 21, 2025
in Machine Learning
Reading Time: 5 mins read

Photo Credit: https://news.mit.edu/


A new study shows that LLMs represent different data types based on their underlying meaning and reason about data in their dominant language.

While early language models could only process text, contemporary large language models now perform highly diverse tasks on many types of data. For example, LLMs can understand many languages, generate computer code, solve math problems, or answer questions about images and audio.

MIT researchers probed the inner workings of LLMs to better understand how they process such assorted data, and found evidence that they share some similarities with the human brain.


Neuroscientists believe the human brain has a “semantic hub” in the anterior temporal lobe that integrates semantic information from various modalities, such as visual data and tactile inputs. This semantic hub is connected to modality-specific “spokes” that route information to the hub. The MIT researchers found that LLMs use a similar mechanism by abstractly processing data from diverse modalities in a central, generalized way. For instance, a model that has English as its dominant language would rely on English as a central medium to process inputs in Japanese or to reason about arithmetic, computer code, and so on. Furthermore, the researchers demonstrate that they can intervene in a model’s semantic hub by using text in the model’s dominant language to change its outputs, even when the model is processing data in other languages.

These findings could help scientists train future LLMs that are better able to handle diverse data.

“LLMs are big black boxes. They have achieved very impressive performance, but we have very little knowledge about their internal working mechanisms. I hope this can be an early step to better understand how they work so we can improve upon them and better control them when needed,” says Zhaofeng Wu, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this research.

His co-authors include Xinyan Velocity Yu, a graduate student at the University of Southern California (USC); Dani Yogatama, an associate professor at USC; Jiasen Lu, a research scientist at Apple; and senior author Yoon Kim, an assistant professor of EECS at MIT and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the International Conference on Learning Representations.

 

Integrating diverse data

The researchers built the new study on prior work suggesting that English-centric LLMs use English to perform reasoning on text in other languages.

Wu and his collaborators expanded on this idea, launching an in-depth study of the mechanisms LLMs use to process diverse data.

An LLM, which is composed of many interconnected layers, splits input text into words or sub-words called tokens. The model assigns a representation to each token, which enables it to explore the relationships among tokens and generate the next word in a sequence. In the case of images or audio, those tokens correspond to particular regions of an image or sections of an audio clip.
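As a rough illustration of this tokenization step, the sketch below uses the Hugging Face transformers library to split a sentence into tokens and read off the vector representation the model assigns to each token at every layer. The choice of GPT-2 is purely illustrative, not the model studied in the paper.

```python
# A minimal sketch of tokenization and per-token representations, using the
# Hugging Face transformers library. "gpt2" is an illustrative stand-in, not
# the model studied in the paper.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

text = "Large language models reason about diverse data."
inputs = tokenizer(text, return_tensors="pt")

# Each sub-word token is mapped to an integer ID.
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist()))

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states holds one (1, seq_len, hidden_dim) tensor per layer:
# the representation assigned to each token as it moves through the model.
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)
```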

The researchers found that the model’s initial layers process data in its specific language or modality, much like the modality-specific spokes in the human brain. Then, the LLM converts tokens into modality-agnostic representations as it reasons about them in its internal layers, akin to how the brain’s semantic hub integrates diverse information.

The model assigns similar representations to inputs with similar meanings, regardless of their data type, including images, audio, computer code, and arithmetic problems. Even though an image and its text caption are distinct data types, because they share the same meaning, the LLM assigns them similar representations.

For example, an English-dominant LLM “thinks” about a Chinese-text input in English before producing an output in Chinese. The model has a similar reasoning tendency for non-text inputs like computer code, math problems, or even multimodal data.

To test this hypothesis, the researchers passed a pair of sentences with the same meaning, but written in two different languages, through the model. They measured how similar the model’s representations were for each sentence.

Then they ran a second set of experiments in which they fed an English-dominant model text in a different language, like Chinese, and measured how similar its internal representations were to English versus Chinese. The researchers performed similar experiments for other data types.
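A simplified sketch of this kind of measurement is shown below: it runs a translation pair through a model and compares the layer-wise hidden states. The mean-pooling over tokens, the cosine metric, and the GPT-2 model are simplifying assumptions rather than the paper’s exact setup.

```python
# A rough sketch of the comparison described above: run a translation pair
# through a model and measure how similar the layer-wise hidden states are.
# Mean-pooling, cosine similarity, and GPT-2 are simplifying assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

def layer_states(text):
    """Return one mean-pooled hidden-state vector per layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return [h.mean(dim=1).squeeze(0) for h in out.hidden_states]

english = layer_states("The weather is nice today.")
chinese = layer_states("今天天气很好。")  # same meaning, different language

for layer, (e, c) in enumerate(zip(english, chinese)):
    sim = torch.nn.functional.cosine_similarity(e, c, dim=0).item()
    print(f"layer {layer:2d}: cosine similarity = {sim:.3f}")
```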

They consistently found that the model’s representations were similar for sentences with similar meanings. In addition, across many data types, the tokens the model processed in its internal layers were more like English-centric tokens than the input data type.

“A lot of these input data types seem extremely different from language, so we were very surprised that we can probe out English tokens when the model processes, for example, mathematical or coding expressions,” Wu says.
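One common way to “probe out” tokens from intermediate layers is a logit-lens-style readout, which projects each layer’s hidden state through the model’s output embedding and checks which vocabulary token it is closest to. The sketch below illustrates that generic technique; it is not necessarily the exact probing procedure used in the paper.

```python
# A logit-lens-style readout: project each intermediate layer's hidden state
# through the model's final layer norm and output embedding to see which
# vocabulary token it is closest to. Generic probing sketch, not necessarily
# the paper's exact procedure.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative model
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("1 + 2 =", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

final_norm = model.transformer.ln_f   # GPT-2's final layer norm
unembed = model.lm_head               # maps hidden states to vocabulary logits

for layer, hidden in enumerate(out.hidden_states):
    logits = unembed(final_norm(hidden[0, -1]))   # readout at the last position
    top_id = logits.argmax().item()
    print(f"layer {layer:2d} reads out: {tokenizer.decode([top_id])!r}")
```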

 

Leveraging the semantic hub

The researchers think LLMs may learn this semantic hub strategy during training because it is an economical way to process varied data.

“There are thousands of languages out there, but a lot of the knowledge is shared, like commonsense knowledge or factual knowledge. The model doesn’t need to duplicate that knowledge across languages,” Wu says.

The researchers also tried intervening in the model’s internal layers using English text while it was processing other languages. They found that they could predictably change the model’s outputs, even though those outputs were in other languages.
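The sketch below conveys the general flavor of such an intervention: while the model processes a non-English prompt, a forward hook nudges one intermediate layer’s activations toward a vector derived from English text. The layer index, the steering strength, and the way the steering vector is built are illustrative assumptions, not the researchers’ actual procedure.

```python
# Steering via a forward hook: while the model processes a non-English prompt,
# shift one intermediate layer's activations toward a vector derived from
# English text. Layer, strength, and vector construction are all assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative model
model = AutoModelForCausalLM.from_pretrained("gpt2")

def mean_hidden(text, layer):
    """Mean-pooled hidden state of `text` at a given layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

LAYER, ALPHA = 6, 4.0                                     # hypothetical choices
steer = mean_hidden("The weather is very cold.", LAYER)   # English "concept" vector

def hook(module, args, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + ALPHA * steer                       # nudge the activations
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(hook)
prompt = tokenizer("今天的天气", return_tensors="pt")      # non-English prompt
with torch.no_grad():
    generated = model.generate(**prompt, max_new_tokens=10)
print(tokenizer.decode(generated[0]))
handle.remove()
```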

Scientists could leverage this phenomenon to encourage the model to share as much information as possible across diverse data types, potentially boosting performance.

On the other hand, there could be concepts or knowledge that are not translatable across languages or data types, like culturally specific knowledge. In those cases, scientists might want LLMs to have some language-specific processing mechanisms.

“How do you maximally share whenever possible but also allow languages to have some language-specific processing mechanisms? That could be explored in future work on model architectures,” Wu says.

In addition, researchers could use these insights to improve multilingual models. Often, an English-dominant model that learns to speak another language will lose some of its accuracy in English. A better understanding of an LLM’s semantic hub could help researchers prevent this language interference, he says.

“Understanding how language models process inputs across languages and modalities is a key question in artificial intelligence. This paper makes an interesting connection to neuroscience and shows that the proposed ‘semantic hub hypothesis’ holds in modern language models, where semantically similar representations of different data types are created in the model’s intermediate layers,” says Mor Geva Pipek, an assistant professor in the School of Computer Science at Tel Aviv University, who was not involved with this work. “The hypothesis and experiments nicely tie and extend findings from previous works and could be influential for future research on creating better multimodal models and studying the links between them and brain function and cognition in humans.”

This research is funded, in part, by the MIT-IBM Watson AI Lab.
