Improving Large Language Models

The rules of syntax are well known, but the rules of semantics (beyond predicate calculus) are not, and this limits the capability of Large Language Models (LLMs). My PhD thesis, “A Conversational Problem Solving System Represented as Procedures with Natural Language Conceptual Structure” (www.shafe.co.uk/wp-content/uploads/PhD-Thesis-A-Conversational-Problem-Solving-System.pdf), attempted to approach this problem. It was based on the early work of Roger Schank (en.wikipedia.org/wiki/Roger_Schank) on “Primitive Concepts Underlying Verbs of Thought” (apps.dtic.mil/sti/tr/pdf/AD0744634.pdf), explained more fully in Chapter 5 of “Computer Models of Thought and Language” (eds. R. Schank and K. Colby, 1973, W.H. Freeman).
The interesting aspect of his paper is that he reduced the number of verbs to about ten (later increased to seventeen). This is profound because it results in a semantic grammar that represents the underlying meaning of a statement. Research in this area progressed slowly over the years while neural networks took centre stage. Now seems to be the time to combine the best of both approaches.
My idea is to process the input during tokenisation and produce a semantic set of tokens, à la Schank, with each verb expanded into its appropriate linear semantic structure. The neural network weights would then be adjusted based on the meaning rather than the text, and the system could be said to understand what it was outputting.
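To make the idea concrete, here is a minimal sketch in Python of what such a semantic tokenisation pass might look like. The lexicon, the primitive TRANS, the placeholder names and the function semantic_tokenise are all illustrative assumptions, not part of any existing tokeniser.

```python
# A minimal sketch of a semantic tokenisation pass (all names are illustrative).
# Each verb in a small lexicon maps to a linear expansion built from a primitive
# act plus placeholder slots; every other token passes through unchanged.

VERB_LEXICON = {
    # verb  ->  linear semantic structure with placeholders
    "took": ["TRANS", "<object>", "FROM", "<source>", "TO", "<actor>"],
    "gave": ["TRANS", "<object>", "FROM", "<actor>", "TO", "<recipient>"],
}

def semantic_tokenise(words):
    """Replace each known verb with its expanded primitive structure."""
    out = []
    for w in words:
        out.extend(VERB_LEXICON.get(w.lower(), [w]))
    return out

print(semantic_tokenise("I took the money".split()))
# ['I', 'TRANS', '<object>', 'FROM', '<source>', 'TO', '<actor>', 'the', 'money']
```

In a full system these expanded tokens, rather than the surface words, would be what the model is trained on.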
Note that nouns can be handled using, for example, Word2Vec, but similar verbs are not close in a multi-dimensional space because they require a complex structure to represent their meaning. For example, “I took the money” and “I gave the money” are, according to Schank, something like “I TRANS money from place1 to myself” and “I TRANS money from myself to place2”. By explicating verbs we reveal their underlying structure and meaning; if we feed that into the neural network, it should learn faster, better ‘understand’ the data, and so produce better outputs.
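A small illustration of why this helps: once the two sentences are represented as explicit structures, “took” and “gave” share the same primitive and differ only in the direction of transfer, which is exactly the kind of relationship a plain word embedding does not expose. The field names below are my own shorthand for the example above, not Schank’s notation.

```python
from dataclasses import dataclass

@dataclass
class Trans:
    """Toy conceptual structure for a transfer-of-possession event."""
    actor: str
    obj: str
    source: str
    destination: str

# "I took the money"  ->  transfer towards the actor
took = Trans(actor="I", obj="money", source="place1", destination="I")

# "I gave the money"  ->  transfer away from the actor
gave = Trans(actor="I", obj="money", source="I", destination="place2")

# The two verbs now differ in a single, inspectable slot:
print(took.destination == took.actor)  # True  (taking: possession moves to the actor)
print(gave.source == gave.actor)       # True  (giving: possession moves from the actor)
```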
I asked ChatGPT, “Human understanding partly requires a knowledge of the way verbs change the world. If all verbs could be processed into a structure that represented such knowledge then LLMs would learn more quickly and have a better understanding of the world. Roger Schank proposed such a structure in his paper “Primitive Concepts Underlying Verbs of Thought”.”
It replied, “You’re right about the importance of understanding how verbs can change or reflect states in the world, and Roger Schank’s work in conceptual dependency theory addresses this aspect. In his paper “Primitive Concepts Underlying Verbs of Thought,” Schank proposes a structure for categorizing and understanding verbs, especially those related to cognitive processes, in a more systematic and meaningful way.
Schank’s approach involves breaking down verbs into primitive actions or concepts. This methodology aims to represent the essential meaning of verbs in a simplified, universal form. By doing so, it becomes easier for a system (like a language model or an AI) to understand and process the implications and nuances of different actions or states described by verbs.
Incorporating such structured knowledge into the training of Large Language Models (LLMs) could indeed enhance their understanding of language and the world. By having a more nuanced grasp of how verbs signify changes or states, LLMs could make more accurate inferences and generate more contextually appropriate responses.
However, implementing this approach in LLMs would require a significant amount of structured data and sophisticated algorithms to integrate this structured understanding of verbs with the model’s broader language understanding capabilities. It’s a challenging task but one that could lead to significant advancements in the field of natural language understanding.
Schank’s work is a pivotal contribution to the field of AI and language understanding, highlighting the importance of deep semantic processing in achieving more human-like understanding in machines.”
I believe the task is challenging but straightforward, as the number of verbs in the LLM’s dictionary is finite and each one has a fixed linear semantic structure that can replace it. The complication is that the structure will contain placeholders that must be filled by carrying out a syntactic analysis of the sentence.
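As a rough sketch of that last step, assume a syntactic analysis has already identified the grammatical roles in the sentence; the template, the role names and fill_template below are illustrative assumptions, and a real system would obtain the roles from a parser rather than a hand-written dictionary.

```python
# Hypothetical: fill the placeholders in a verb's linear semantic structure
# from grammatical roles produced by a prior syntactic analysis.

TEMPLATE_GAVE = ["<subject>", "TRANS", "<object>", "FROM", "<subject>", "TO", "<indirect_object>"]

def fill_template(template, roles):
    """Substitute each placeholder with the phrase found for that grammatical role."""
    return [roles.get(slot.strip("<>"), slot) if slot.startswith("<") else slot
            for slot in template]

# Roles for "I gave the money to the cashier", as a parser might report them.
roles = {"subject": "I", "object": "the money", "indirect_object": "the cashier"}

print(" ".join(fill_template(TEMPLATE_GAVE, roles)))
# I TRANS the money FROM I TO the cashier
```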