There has been an explosion of services in the field of generative AI. These systems, typically using large language models at their core, are expensive to build and train but much cheaper to operate.
I want to historicise this trend. Back in July 2020, I wrote that large language models (I referred to them as transformers back then, like GPT-3) are "capable of synthesising the information and presenting it in a near usable form." They represented an improvement over previous knowledge technologies because they would present synthesised information rather than a list of disparate search results.
In my simplistic framing, transformers were about synthesis. I studiously avoided defining what I meant by synthesis, but 30 months later it's time for me to refine that model. I reckon I was trying to suggest that "synthesis" meant responses to a query drawn from many different sources, as we were seeing with GPT-3 over text. But I missed just how powerful these large models could become. I didn't pick up how quickly they would become multimodal, across text, images and audio; how they might be capable of de novo synthesis; or how verticalisation would make them more powerful within specific domains.
Let's go over these.
Search
I've been playing around with two "search"-style services: Metaphor, a general search engine, and Elicit, for academic research. Metaphor is a bit weird. I haven't been able to write good queries for it, but I have found it helpful in surfacing useful results, even on topics I know something about. (See this search on "technology transitions". Log in required.)
Elicit is really impressive. It searches academic papers, providing summary abstracts and structured analyses of each paper. For example, it tries to identify the outcomes analysed in a paper or the authors' conflicts of interest, and it makes tracking citations easy. (See a similar search on "technology transitions". Log in required.)
But I have nerdy research needs for my work.
For my common everyday search terms, like "helium network stats", "water flosser reviews", "best alternative to nest", "put a VPN on a separate SSID at home", "canada population pyramid", Google still works pretty well. I haven't quite been able to figure out how to use Metaphor to replace my Google searching.
It feels a bit like the Dvorak keyboard. Dvorak is better than QWERTY, but QWERTY had the lock-in, and we still use QWERTY today. Metaphor may be better than Google, but I can't yet grok it.
My sense is that Elicit's tight focus on a single use case is the more promising approach.
Cross-domain
While GPT-3 showed text generation capabilities, we're still getting used to cross-modal tools.
Text-to-image is now commonplace. But Google and others have already shown off text-to-video: type in a prompt and get a short movie.
Generative approaches are now finding their way into molecular design. Researchers at the University of Illinois have prototyped a system to translate between molecular structure and natural language. One could imagine such a system generating molecules that match highly specific requirements. ("Give me a molecule that is translucent in its solid form and smells of mint.")