The Fufu Test and the Problem with LLMs
It started as an offhand comment, an exchange between friends. Did you know that if you ask one of the most popular image generators to “show me fufu and groundnut soup”, you get this abomination? Friends are often surprised. So I went further.
➡️ Show me rice and peas! Er, not that rice and peas I once saw served at Ikea stores; Ikea was heavily criticised and had to remove the dish from its menu.
➡️ Show me kenkey and fish! Er, what?
➡️ Show me fried plantain! OK, but what’s with the lime?
➡️ And show me fish and chips. Not a Ghanaian dish, but just for reference. Yep, that looks fine.
So what’s the problem?
Well, given these are national dishes from Ghana, enjoyed by tens of millions in the country and across the diaspora, you’d think the images would be a genuine reflection of the food.
Perhaps in time they will be.
But the lesson here is far more significant for the future.
By 2035, or even before, LLMs will be our external brains. We’ll rely on them to deliver facts, truth and reality, much as one generation does now with Google.
But search engines will soon be usurped by LLMs, and whole generations will become dependent on them. Ghanaian food representations may be corrected by then, but what else could be skewed?
The issue is a general one and resides in the very idea of LLMs (emphasis on “Large”). These models will invariably default to whatever is statistically dominant in the vast data they ingest.
So the key is to set up bespoke, smaller language models that cater for culturally specific knowledge like the dishes above. This merits the creation of models that are deeply encyclopedic about Ghanaians, made by Ghanaians.
That’s what the amazing Sean M. Tracey and I spoke about. Sean is Star Trek’s Data in real life, with one exception: he has the warmest personality you’ll find anywhere. He’s a brilliant friend from my DisLAB story days, when I headed up a programme in innovative storytelling.
He modelled, whilst I prodded with an emphasis on storytelling. Smaller LLMs can be augmented so they reflect specific knowledge that future generations can benefit from, as the sketch below illustrates.
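As a rough illustration of how that augmentation might work, here is a minimal retrieval-augmented sketch in Python. The documents, the embedding model choice and the prompt format are my assumptions for illustration, not anything Sean and I actually built.

```python
# A minimal retrieval-augmented sketch: culturally specific documents
# are retrieved and prepended to the prompt, so a smaller model answers
# from them rather than from whatever dominates its training data.
import numpy as np
from sentence_transformers import SentenceTransformer

# A tiny, hypothetical corpus of Ghanaian food knowledge.
documents = [
    "Fufu is pounded cassava and plantain, served in light or groundnut soup.",
    "Kenkey is fermented maize dough, eaten with fried fish and pepper sauce.",
    "Kelewele is Ghanaian fried plantain seasoned with ginger and chilli.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice
doc_vectors = encoder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = encoder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity, since vectors are normalised
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(question: str) -> str:
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is fufu served with?"))
# The resulting prompt would be passed to whichever small model you host.
```

The point of this design is that the knowledge lives in the community’s documents rather than in the model’s weights, so it can be corrected and extended without retraining.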
Take the AI film I made last year looking at the forgotten histories of Ghanaians, like my father, emigrating to the UK in the 1950s. Archive footage of these journeys is comparatively rare because filming by Africans was either prohibited or wilfully discouraged by the authorities.
But smaller LLMs pose a looming problem, much like social media’s individualism. Soon, everyone will crave their own smaller LLM, and that fragmentation could be a flaw in how knowledge develops and retains its historical context.
Take my own practice, cinema journalism, in which I’m an expert. I’d be ill-advised to set up an LLM based solely on what I know, despite years of practising, researching and presenting.
The key, I believe, is to form communities where like-minded individuals share knowledge across themes and narratives, complete with the “thick descriptions” that give them context.
Then, and this is the key, you can allow users to access this knowledge via a URL: your own internet, in a way (a rough sketch follows below). There’s an added point about how to access that information that will blow your mind, but I can’t speak about it yet.
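To make the URL idea concrete, here is a minimal sketch of serving such a knowledge base behind a single endpoint, assuming Flask; the /ask route, query parameter and placeholder answer are illustrative, not a real service.

```python
# A minimal sketch of exposing a community knowledge base at a URL.
# The route and response shape are assumptions for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.get("/ask")
def ask():
    question = request.args.get("q", "")
    if not question:
        return jsonify(error="Provide a question with ?q=..."), 400
    # In a real deployment this would call the augmented small model
    # (for instance, via build_prompt from the earlier sketch).
    return jsonify(question=question, answer="(model response goes here)")

if __name__ == "__main__":
    app.run(port=8000)  # e.g. http://localhost:8000/ask?q=What+is+kenkey
```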
Fufu and kenkey are allegories for the wider issue. In 2009, several students I taught had not heard of 9/11. By 2030, there will be a new generation who may not know who George Floyd is, let alone Black Lives Matter’s link to the lengthy struggle for racial equity (I wrote about this for the British Library’s exhibition 500 Years of News).
That’s a sombre thought. Fufu or not, that deserves attention. Now, there’s work to be done on my part.