Compilation Is Not Understanding

Innovation is rightly greeted with amazement and is celebrated in many ways, including the inevitable host of copycats. That is the nature of evolution. However, before we jump to unfounded conclusions about sentient machines and Skynet domination, or mass unemployment at the hands of super-human machines, let us take a moment to reflect.

Large Language Models (LLMs) have given us a novel, conversational man-machine interface. We can now speak to machines, and they can respond and converse in a way that closely approximates “true” human dialogue. However, authentic interaction is not, and probably never will be, possible. Will it be close enough not to matter in most circumstances? Probably. But consider a study, conducted by the Departments of Linguistics and Neurobiology at the University of Chicago, of a subject, “Kim”, who was born with the rare condition of having no sense of touch or somatosensation. The study found that having no experience or memory of touch impairs the ability to understand linguistic elements referencing touch and tactile concepts.

So, will LLMs and machines ever be able to “understand” touch? Probably, once they have been fitted with appropriate sensors. But how will they ever be able to experience and interpret the subtler emotions and perceptions for which there are no sensors? How can empathy be expressed authentically, and not simply as a regurgitation of referenced empathic responses?

In the evolving landscape of artificial intelligence, LLMs straddle a fascinating boundary. While their ability to generate human-like text is impressive, it is a leap to equate this with true intelligence, as their capabilities largely hinge on complex patterns of search and summarisation. Thus, the debate continues: are LLMs truly intelligent, or are they simply masters of mimicry on a grand scale?

Standing on the Shoulders of Giants

A good way to understand LLMs, neural networks, and AI in general is to look at large-scale collective works, where many thousands of people have created a tool that is useful to individuals seeking to perform a task. Tools like Wikipedia. ChatGPT is really a conversing Wikipedia. A prompt or query causes it to reference a massive database of inputs from millions of people and give back a statistically derived answer that one hopes is correct. Unlike Wikipedia, tools like ChatGPT are “black boxes” that do not explicitly show references to the sources of their responses.

This makes it a great tool for tasks such as drafting blog articles, but a terrible choice for mission-critical work. Imagine asking for medical advice only to be handed back life-threatening quack therapies, elevated to gospel by a sort of “trust the machine” faith.

Herein also lies the difficult issue of data provenance and attribution. Unless existing AI and LLMs can be retrofitted with intrinsic and provable data provenance, or new systems built with it from the outset, we face potentially catastrophic outcomes for many people.

Digital Bandits

The issue is this: if a machine ingests your work (even if you are one of thousands of contributors) and spits it out somewhere for someone else to use in theirs, do you deserve to be recognised for your contribution? There is a spectrum of contributions, ranging from being a participant in a survey whose aggregated statistics someone later uses (clearly not worthy of mention or remuneration) to having your work plagiarised as some anonymous machine “invention” (clearly an act of intellectual property theft). But how would anyone know or prove theft without the provenance? It would be a case of your word against Big Machine.

Also worthy of consideration is whether your privacy can be maintained while providing some form of attribution.

The Full Circle of Trust and Fairness

A likely answer lies in the realm of Distributed Ledger Technologies and Zero-Knowledge Proofs. These give us fairly mature, well-tested tools for embedding anonymised provenance data into the massive ocean of data being ingested, so that the eventual product can come with a list of linked data points (contributors) to which a weighted portion of the attribution, or even compensation, can be directed.
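To make the idea concrete, here is a minimal, hypothetical sketch in Python. The names (ProvenanceRecord, commit, attribute, payout) are illustrative assumptions rather than the API of any actual system, and the zero-knowledge-proof layer is omitted: each contribution is recorded as a salted hash, so the contributor stays anonymous yet can later prove their claim, and any generated output carries a weighted list of those commitments for attribution or compensation.

```python
import hashlib
import uuid
from dataclasses import dataclass

# Illustrative sketch only: an anonymised provenance entry as it might be
# written to a ledger, plus its weighted share of a generated output.
@dataclass
class ProvenanceRecord:
    commitment: str   # hash(contributor_id + salt) recorded on the ledger
    weight: float     # relative share of influence on a generated output

def commit(contributor_id: str) -> tuple[ProvenanceRecord, str]:
    """Create an anonymised ledger entry; the salt stays with the contributor,
    who can later reveal it to prove the commitment is theirs."""
    salt = uuid.uuid4().hex
    digest = hashlib.sha256(f"{contributor_id}:{salt}".encode()).hexdigest()
    return ProvenanceRecord(commitment=digest, weight=0.0), salt

def attribute(records: list[ProvenanceRecord], raw_weights: list[float]) -> None:
    """Normalise raw influence scores into weighted attribution shares."""
    total = sum(raw_weights)
    for record, w in zip(records, raw_weights):
        record.weight = w / total if total else 0.0

def payout(records: list[ProvenanceRecord], pool: float) -> dict[str, float]:
    """Split a compensation pool across contributors by their weights."""
    return {r.commitment: round(pool * r.weight, 2) for r in records}

# Example: three anonymised contributors influence one generated answer.
recs_and_salts = [commit(f"contributor-{i}") for i in range(3)]
records = [rec for rec, _ in recs_and_salts]
attribute(records, raw_weights=[5.0, 3.0, 2.0])
print(payout(records, pool=100.0))   # a 50 / 30 / 20 split of the pool
```

In practice, the raw influence weights would have to come from the model's training or retrieval pipeline, and the commitments would live on a distributed ledger rather than in memory; this sketch only shows how anonymity, provenance, and weighted attribution can coexist.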

Our early work in constructing such a system, in a very tightly constrained subject area, is looking extremely promising. We are confident that systemic data provenance can be demonstrated in an LLM solution. It could also lead to practical answers to some of the questions of ethics and legality being asked in this bright new era of Really Useful Non-Intelligence (RUNI).

So, do yourself a favour. Stop talking about “Artificial Intelligence” and rather use RUNI (“Runny”), because we are not on solid ground just yet.