
Category: AI Future

A Conversation on the Future of AI

  • How AI Systems Can Be Improved

    AI should be designed to function like a new employee, not like an Oracle.

    Current large language models resemble Oracles: they sit passively, wait for input, respond once, and return to a dormant state. In contrast, an AI built on the substitute-employee model would work continuously and proactively on its assigned tasks, operating 24×7. Such an AI must be able to use a PC, read and respond to emails, consult manuals, and learn to operate arbitrary software. It should possess a continuous, persistent memory and think about problems over time, not just react to prompts. Importantly, it must understand and apply complex memory hierarchies: distinguishing between personal and corporate data, and ensuring information relevant to one client is never disclosed to another.

    Memory is implemented as a form of lifelong learning. To avoid each instance requiring a full copy of the neural network, learning occurs in a sparse shadow copy. This approach allows new weights to consume minimal physical memory while producing effects equivalent to directly modifying the main neural network.
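    As an illustration of the shadow-copy idea, here is a minimal sketch in the spirit of low-rank adapter methods such as LoRA; the class and parameter names are mine and purely illustrative. The shared base weights stay frozen, while each instance learns only a small low-rank delta whose effect is equivalent to editing the main network.

      import numpy as np

      class ShadowLayer:
          """A frozen base layer plus a sparse, per-instance 'shadow' update.

          The base weights are shared by every instance; each instance
          learns only a low-rank delta (A @ B), so its lifelong memory
          costs a small fraction of the full weight matrix.
          """

          def __init__(self, base_weights: np.ndarray, rank: int = 4):
              self.W = base_weights              # shared, never modified
              d_in, d_out = base_weights.shape
              self.A = np.zeros((d_in, rank))    # per-instance shadow factors
              self.B = np.zeros((rank, d_out))   # trained instead of W

          def forward(self, x: np.ndarray) -> np.ndarray:
              # Behaves as if the main network's weights had been edited,
              # without ever touching them.
              return x @ self.W + (x @ self.A) @ self.B

    At rank 4 against a 4,096 × 4,096 weight matrix, the shadow factors hold roughly 0.2% of the full parameter count per instance.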

    Large language models alone are insufficient.

    A more appropriate model is akin to a robot or a car: systems that are inherently multi-sensory and multi-functional. Each component is specialised for a particular function yet depends on the global state of the system. Such architectures are modular but tightly integrated, allowing for flexible and context-sensitive behaviour. No single component should operate in isolation. This area is already being driven by the demands of advanced humanoid robots, which integrate various forms of neural network to achieve fast, unified responses.

    The language component must be enhanced to uncover deep structure.

    Current language models operate primarily at the surface level of text. However, each sentence, paragraph, or document encodes a deeper, structured representation of meaning. Language-based cognition exhibits internal architecture: semantic roles, abstract picture producers, and constrained sets of action words and modifiers. AI systems should extract and work with these underlying structures rather than merely sequence words. This enables a form of attention grounded in meaning, rather than token co-occurrence. At higher levels, this facilitates recognition of plans, plots, and goal-directed narratives. Predicate logic is one approach to representation, but it is too rigid to capture the structure of certain forms of text. My doctoral thesis investigated an alternative solution based on the conceptual deep structure proposed by Roger Schank.
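    To make the idea of deep structure concrete, here is a minimal sketch of one conceptualisation as a data structure; the role names loosely follow Schank but are my own illustrative simplification.

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class Conceptualisation:
          """One unit of deep structure: semantic roles beneath the surface text."""
          actor: str
          act: str                     # drawn from a small, constrained set of acts
          obj: Optional[str] = None    # the 'picture producer' being acted upon
          source: Optional[str] = None
          destination: Optional[str] = None

      # Two different surface sentences, one underlying meaning:
      #   "Mary gave John the book"  /  "John was given the book by Mary"
      meaning = Conceptualisation(actor="Mary", act="TRANS", obj="book",
                                  source="Mary", destination="John")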

    Present systems are two-dimensional in their understanding.

    That is, they process information as a linear stream of tokens (e.g., text input), with little understanding of the spatial, temporal, or functional relationships found in the real world. In contrast, human cognition operates in a three-dimensional environment, particularly through stereo-visual and sensorimotor input. A richer understanding of complex physical systems requires awareness of relative position, form, and dynamic function—not merely linguistic descriptions.

    AI systems need multiple, often conflicting goals—and a way to prioritise them.

    The real world presents competing objectives that must be balanced dynamically. AI must learn to prioritise, defer, or resolve these goals appropriately. This is essential to avoid simplistic utility-maximisation behaviours such as the “paperclip problem,” where a single goal is pursued to pathological extremes.
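    A minimal sketch of what dynamic goal arbitration might look like; the goals, weights and scoring rule below are illustrative assumptions, not a proposal for the real mechanism.

      from dataclasses import dataclass

      @dataclass
      class Goal:
          name: str
          priority: float      # standing importance
          urgency: float       # time pressure right now (0..1)

      def choose(goals: list[Goal]) -> Goal:
          """Pick one goal to pursue next; the rest are deferred, not discarded."""
          return max(goals, key=lambda g: g.priority * g.urgency)

      goals = [
          Goal("answer client email", priority=0.6, urgency=0.9),
          Goal("make paperclips",     priority=0.9, urgency=0.1),
          Goal("respect data walls",  priority=1.0, urgency=0.8),
      ]
      print(choose(goals).name)   # 'respect data walls'

    Because deferred goals are re-scored on every cycle rather than discarded, no single objective can be pursued to pathological extremes.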

    The system architecture should reflect the principles of Integrated Information Theory (IIT).

    Not because IIT is a proven model of consciousness—consciousness remains elusive—but because IIT’s emphasis on high Phi (a measure of integrated complexity) offers a useful heuristic for system design. A high-functioning AI system should consist of specialised parts that are tightly and reciprocally integrated. These parts must depend entirely on one another and not operate autonomously. Feedforward architectures, though capable of intelligent-seeming behaviour, are fundamentally limited. They are appropriate for Oracle-style systems but inadequate for integrated, deliberative, and complex thought.
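    The contrast with feedforward processing can be made concrete. The toy fragment below is not a Phi calculator; it simply shows two specialised modules whose updates depend reciprocally on each other, so neither can settle on a state in isolation, whereas in a feedforward pipeline information flows one way only.

      import numpy as np

      rng = np.random.default_rng(0)

      W_ab = rng.normal(size=(8, 8)) * 0.3   # influence of module B on A
      W_ba = rng.normal(size=(8, 8)) * 0.3   # influence of module A on B

      a, b = rng.normal(size=8), rng.normal(size=8)
      for _ in range(10):
          # Each module's next state depends on the other's current state,
          # so the pair settles toward a joint, mutually constrained state.
          a, b = np.tanh(W_ab @ b), np.tanh(W_ba @ a)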

  • Do LLMs understand their output?

    You will often read statements to the effect that “LLMs do not understand what they produce as output”. My controversial view is that, in a way, we do not understand what we are saying either. Consider this: when we speak, words flow from us in a stream without ratiocination, that is, straight from our neural network. If we are stopped and asked to justify what we have said, we are able to construct a plausible answer. We believe we understand in the sense of always being able to justify our statements and answer questions about the topic, and we can test whether someone really understands a subject using a viva voce.

    We believe we have a logical conceptual model of the world from which our statements flow, but we actually have an impenetrable neural network. I maintain that our accessible mental models are a thin veneer of conceptual deep structures, but the real work is done by an inaccessible neural network. This means that at some level our explanations and justifications are illusory: they are constructed post hoc. In the same way, LLMs can explain their output, and will become better at doing so, and they thus demonstrate their understanding. If an LLM can pass a doctorate-level viva voce, can we plausibly argue that it does not really understand the subject?

    I am not arguing that LLMs are a model of our neural structures, far from it. They are potentially much more powerful, as we have many well-known limitations: our short-term memory is tiny, our ability to analyse deeply structured logic is very limited, and our speed of thought is slow. I believe it is a mistake to argue that LLMs are not conscious, are not sentient, and do not have feelings and emotions. All this may or may not be true, but it is irrelevant to everyone except philosophers. The key questions are: are they useful, and, beyond that, can they perform at a level that exceeds any human? If so, we have a very useful tool, to say the least, one that will amplify all that is good and all that is bad about our human species.

  • Improving Large Language Models

    The rules of syntax are well known but the rules of semantics (beyond predicate calculus) are not, and this limits the capability of Large Language Models (LLMs). My PhD thesis, “A Conversation Problem Solving System Represented as Procedures with Natural Language Conceptual Structure” (www.shafe.co.uk/wp-content/uploads/PhD-Thesis-A-Conversational-Problem-Solving-System.pdf), attempted to approach this problem. It was based on the early work of Roger Schank (en.wikipedia.org/wiki/Roger_Schank) on “Primitive Concepts Underlying Verbs of Thought” (apps.dtic.mil/sti/tr/pdf/AD0744634.pdf), explained more fully in Chapter 5 of “Computer Models of Thought and Language” (ed. R. Schank, 1973, W.H. Freeman).

    The interesting aspect of his paper is that he reduced the number of verbs to about ten (later increased to seventeen). This is profound, as it results in a semantic grammar that represents the underlying meaning of a statement. Research into this area progressed slowly over the years while neural networks took centre stage. Now seems to be the time to combine the best of both approaches.

    My idea is to process the input during tokenisation and produce a semantic set of tokens, à la Schank, with each verb expanded into its appropriate linear semantic structure. The neural network weights would then be adjusted based on the meaning rather than the text and the system could be said to understand what it was outputting.

    Note that nouns can be handled using, for example, Word2Vec, but similar verbs are not close in a multi-dimensional space because they require a complex structure to represent their meaning. For example, “I took the money” and “I gave the money” are, according to Schank, something like “I TRANS money from place1 to myself” and “I TRANS money from myself to place2”. By explicating verbs we reveal their underlying structure and meaning; if we feed that into the neural network, it should learn faster, better ‘understand’ the data, and so produce better outputs.
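    A minimal sketch of what this verb expansion might look like at tokenisation time; the template format and the two-entry lexicon are mine, for illustration only.

      # Hypothetical verb lexicon: each surface verb maps to a primitive act
      # plus a template saying how its slots are arranged.
      VERB_TEMPLATES = {
          "took": {"act": "TRANS", "source": "{other}", "destination": "{actor}"},
          "gave": {"act": "TRANS", "source": "{actor}", "destination": "{other}"},
      }

      def expand(actor: str, verb: str, obj: str, other: str) -> list:
          """Rewrite 'actor verb obj' as a linear stream of semantic tokens."""
          t = VERB_TEMPLATES[verb]
          return [actor, t["act"], obj,
                  "FROM", t["source"].format(actor=actor, other=other),
                  "TO",   t["destination"].format(actor=actor, other=other)]

      print(expand("I", "took", "money", "place1"))
      # ['I', 'TRANS', 'money', 'FROM', 'place1', 'TO', 'I']
      print(expand("I", "gave", "money", "place2"))
      # ['I', 'TRANS', 'money', 'FROM', 'I', 'TO', 'place2']

    The two opposite surface verbs now share the primitive TRANS, and their difference is carried explicitly by the FROM and TO slots.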

    I asked ChatGPT, “Human understanding partly requires a knowledge of the way verbs change the world. If all verbs could be processed into a structure that represented such knowledge then LLMs would learn more quickly and have a better understanding of the world. Roger Schank proposed such a structure in his paper “Primitive Concepts Underlying Verbs of Thought”.”

    It replied, “You’re right about the importance of understanding how verbs can change or reflect states in the world, and Roger Schank’s work in conceptual dependency theory addresses this aspect. In his paper “Primitive Concepts Underlying Verbs of Thought,” Schank proposes a structure for categorizing and understanding verbs, especially those related to cognitive processes, in a more systematic and meaningful way.

    Schank’s approach involves breaking down verbs into primitive actions or concepts. This methodology aims to represent the essential meaning of verbs in a simplified, universal form. By doing so, it becomes easier for a system (like a language model or an AI) to understand and process the implications and nuances of different actions or states described by verbs.

    Incorporating such structured knowledge into the training of Large Language Models (LLMs) could indeed enhance their understanding of language and the world. By having a more nuanced grasp of how verbs signify changes or states, LLMs could make more accurate inferences and generate more contextually appropriate responses.

    However, implementing this approach in LLMs would require a significant amount of structured data and sophisticated algorithms to integrate this structured understanding of verbs with the model’s broader language understanding capabilities. It’s a challenging task but one that could lead to significant advancements in the field of natural language understanding.

    Schank’s work is a pivotal contribution to the field of AI and language understanding, highlighting the importance of deep semantic processing in achieving more human-like understanding in machines.”

    I believe the task is challenging but tractable, as the number of verbs in the LLM’s dictionary is finite and each one has a fixed linear semantic structure that can replace it. The complication is that the structure will contain placeholders that must be filled by carrying out a syntactic analysis of the sentence.
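    As a sketch of that placeholder-filling step, the fragment below uses an off-the-shelf dependency parser to recover the actor and object slots from the syntax. It assumes spaCy and its en_core_web_sm English model are installed; a production system would need to handle far more constructions.

      import spacy

      nlp = spacy.load("en_core_web_sm")

      def fill_slots(sentence: str) -> dict:
          """Fill a verb template's placeholders from a syntactic analysis."""
          slots = {}
          for token in nlp(sentence):
              if token.dep_ == "nsubj":      # grammatical subject -> actor
                  slots["actor"] = token.text
              elif token.dep_ == "dobj":     # direct object -> object slot
                  slots["object"] = token.text
              elif token.dep_ == "ROOT":     # main verb, to be expanded
                  slots["verb"] = token.lemma_
          return slots

      print(fill_slots("I took the money"))
      # {'actor': 'I', 'verb': 'take', 'object': 'money'}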

  • Artificial Intelligence and Go

    Go is the oldest game in existence; it was invented in China around 2,500 years ago. The rules are simple, but the strategic thinking required to play well led many people to claim that a world-class human player would never be beaten by a computer. However, in 2016 one of the best Go players in the world was beaten by a computer program called AlphaGo. AlphaGo is an example of Artificial Intelligence (AI).

    The Approach

    The number of possible Go games is so vast that the number of atoms in the universe and the age of the universe look tiny in comparison. Therefore, for a computer to learn to play through brute force, that is, by examining millions of games, seems pointless, as countless trillions of games are less than a pin-prick in the vast universe of possible games. Yet that is exactly what AlphaGo did: it examined thousands of games played by humans and then played millions of games against itself, and from that blind analysis, without any strategic thinking, it achieved a level of play beyond any human player.

    DeepMind

    AlphaGo was produced by a London company called DeepMind, which was founded in 2010, acquired by Google in 2014, and is now part of the Alphabet group. The latest version of AlphaGo is called AlphaGo Zero and it takes the program to a new level of capability. AlphaGo Zero started with just the rules and no prior knowledge of the game. It did not analyse any human games and played only against itself, yet within 3 hours it played like a human beginner, within 19 hours like a human expert, and within 70 hours it surpassed all human players, and it never stops learning.
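    To illustrate tabula-rasa self-play at a scale that fits in a few lines, here is a toy learner for the game of Nim (take 1 to 3 stones from 15; taking the last stone wins). It starts with only the rules and improves purely by playing itself; it illustrates the principle and is in no way DeepMind's method.

      import random
      from collections import defaultdict

      values = defaultdict(lambda: 0.5)   # P(player to move wins) per position
      values[0] = 0.0                     # facing zero stones means you lost

      def choose(stones: int, explore: float = 0.1) -> int:
          moves = [m for m in (1, 2, 3) if m <= stones]
          if random.random() < explore:
              return random.choice(moves)              # keep exploring
          # leave the opponent the position that is worst for them
          return min(moves, key=lambda m: values[stones - m])

      def self_play() -> None:
          stones, player, history = 15, 0, []
          while stones > 0:
              history.append((player, stones))
              stones -= choose(stones)
              player = 1 - player
          winner = 1 - player                          # last mover wins
          for p, s in history:                         # learn from the outcome
              target = 1.0 if p == winner else 0.0
              values[s] += 0.1 * (target - values[s])

      for _ in range(20_000):
          self_play()

      # Self-play tends to rediscover Nim theory: positions that are
      # multiples of 4 end up valued as losing for the player to move.
      print({s: round(v, 2) for s, v in sorted(values.items())})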

    The next step is to apply the same approach to other complex problems, such as finding new drugs, diagnosing diseases, reducing energy consumption, or searching for revolutionary new materials. We can only speculate about its ability to solve other problems such as weather forecasting, economic forecasting, predicting human behaviour, and reading human thoughts. Recent AI research at Google has found ways to use the same mechanism to solve many types of problem and to combine different problem solvers in order to tackle a wide variety of problems.

    The Problem with Neural Nets

    AlphaGo Zero and most other recent artificial intelligence (AI) systems are based on neural nets and suffer from one major problem: they are unable to explain their actions. Neural nets are impenetrable, like the human brain. Why do I say ‘like the human brain’ when our thoughts appear open to us? When we are asked ‘why’, we can give a considered response that explains our reasoning. However, is this what is happening? Back in 1976, in my PhD thesis, I speculated that there are two mechanisms at work. There is an underlying brain machine that analyses our environment, controls our body and makes decisions, and there is a separate but integrated language system that believes it is in control and uses language to construct reasons why the other part of our brain has done something. The combination can model the world and forecast the future better than the brain system on its own. When we are asked ‘why’, it is our language system that constructs a set of reasons to explain what we have done or said.

    Free Will

    As a footnote, you may be wondering about free will. Well, that depends on what you mean. If you mean the ability to make decisions and control one’s behaviour without constraint and in response to reason, then that can be achieved through the combination of the brain machine and the language system. However, our actions and decisions are generally constrained by our unique genetic makeup and the circumstances of our upbringing, but this does not free us from responsibility or remove the appropriateness of praise or blame. We take what we consider to be the best action in all the circumstances of our individual makeup. So we are predisposed free agents, as is AlphaGo Zero.

    A Go Board

    See The Economist, 21 October 2017, pp. 80-81 and New Scientist, 21 October 2017, p. 9.