Recently, I wrote about the complexity of defining a generic technology like LLMs, and how we can understand it through the prism of different product categories. In this piece, I ponder its future evolution. This is an important thought exercise, as everyone in the tech ecosystem is contemplating ways to integrate LLMs into everyday systems. At this stage, there are more questions than answers.
For LLMs to become an important force behind the everyday products we consume, they have to make those products more useful than they already are. This sounds cliché, but it is easy to forget during the initial period of an exciting new tech wave. We can evaluate the usefulness of AI along three dimensions: interface (graphical vs chat), sophistication (casual vs professional) and depth (light vs deep intelligence). Let’s discuss each one individually.
Interface (Graphical vs Chat)
Historically, software products have seen a battle between two kinds of user interface: graphical+touch and chat+voice. The earliest player was DOS with its typed, chat-like command line, soon to be replaced by Windows. Since then, the graphical interface has ruled the majority of workflows, but will this be a comeback moment for chat?
Every user interface is a way to achieve a goal via some input actions. If the inputs are standard and limited, pushing buttons on a graphical interface turns out to be better than typing or speaking. Why would a user type to order food when she can do it with a few taps? But for workflows where the inputs can be non-standard, abstract descriptions, chat is the right interface. That is precisely why ‘search’ has been a chat interface, and why much of generative AI, such as essays and images, will be chat-based as well.
Sophistication (Casual vs Professional user)
Tech products have always tried to expand the market by bringing casual users into the fold of highly desired use-cases. Substack has lowered the barriers for journalism, TikTok for video production and Instagram for photography. Thanks to this tendency, many new use-cases will continue to split between casual use and professional use. The casual use-case is quick but compromises on fine control. The professional use-case is time-consuming and deep, with a lot of control over the output. Let’s take two specific examples: making movies and building software.
Making movies: there will be a flurry of new generative AI tools that allow any ordinary person to generate videos with some prompts and light-touch actions. At the same time, professionally made movies will continue to use deeper tools that allow finer control over the output. LLMs will give superpowers to a large number of casual users, and also give a productivity boost to professional users.
Building software: this is trickier to analyse. If we look at the trend of past decades, there has been a huge increase in the overall amount of software being built, as well as an increase in the percentage of that software built with no-code tools. Even though handicraft code has been declining as a percentage, the year-on-year increase in the sheer amount of total code has ensured an ever-increasing demand for handicraft coders, along with a parallel demand for no-code developers. No-code has traditionally attacked the use-cases where handicraft code is too costly to justify the output value, and this trend will continue. LLMs will transform both workflows, creating a new layer of no-code products with prompting as input, as well as making professional handicraft coders 10x more effective. I don’t think handicraft coding will go away, but more and more time will be spent on understanding, reviewing and editing code than on writing it.
Junior jobs will increasingly look like senior jobs. Creating a blog post will mean not writing the complete post yourself, but something more like sitting with a junior writer and helping her with timely feedback and edits. Similarly, coding will increasingly be like sitting with a junior coder, giving him feedback on the code, pointing him in the right directions to search, and occasionally taking the laptop in hand and fixing the code.
Depth (light vs deep intelligence)
Another distinction between products is their ‘intelligence depth’. At one end, there will be products which do not require much intelligence, where LLMs just play the role of incrementally enhancing the user experience. At the other end of the spectrum, there will be AI-native products which weren’t even possible before the latest advancements. Based on this intensity, I can think of three broad categories of LLM-driven products: Chatbots (light), Co-pilots (medium deep) and Agents (highly deep).
1. LLM powered Embedded Chatbots
Many software systems are data management systems that mainly replace age-old paper filing systems. This data management software has removed huge friction around data entry, storage, retrieval, search, editing and security. But there isn’t much intelligent analysis or decision making involved, so there is little scope for introducing AI into the core workflows. There is one area, however, where LLM-powered AI can play an effective role: customer support. Users ping support when the software proves incomprehensible or inadequate for getting the job done, and the interaction with support happens in natural language. Since LLMs are accessible as an API, this should result in the adoption of embedded support chatbots in many such applications. These chatbots will be fine-tuned on application-specific data and past support interactions.
2. LLM powered Co-pilots
The best example out there is GitHub Copilot. It suggests code, corrects errors in existing code and writes unit tests. Many top-end coders have reported a productivity gain of around 40% while using it.
Similarly, in other domains, there will be many LLM-powered co-pilots that will help with task completions, data and decisions.
Since LLMs have the characteristics of a developer platform, these co-pilots can execute tasks on third-party apps via user prompts. So in the near future, many tasks like creating a calendar event, sending acknowledgement emails or creating goal lists will be accomplished by these assistants. Interestingly, this also happens to be the original vision behind voice assistants like Siri and Alexa.
It is easy to imagine a movie-making co-pilot that helps with script-writing, story-boarding and editing, boosting overall creativity and productivity. Similarly, there could be a co-pilot for doctors that helps with appointments, diagnosis, prescriptions, medical history and follow-ups.
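To make the idea of a co-pilot executing tasks on third-party apps a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Action` class, the `create_event` and `send_email` functions and the keyword matching in `choose_action` are hypothetical stand-ins. In a real product an LLM would decide which action to call, and the actions would hit actual calendar or email APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """A task the co-pilot can run on a third-party app (hypothetical)."""
    name: str
    keywords: List[str]
    run: Callable[[str], str]


def create_event(prompt: str) -> str:
    # Stand-in for a real calendar API call.
    return f"calendar: event created from '{prompt}'"


def send_email(prompt: str) -> str:
    # Stand-in for a real email API call.
    return f"email: message sent for '{prompt}'"


ACTIONS = [
    Action("create_event", ["meeting", "event", "schedule"], create_event),
    Action("send_email", ["email", "acknowledge", "reply"], send_email),
]


def choose_action(prompt: str) -> Optional[Action]:
    """Placeholder for the LLM: pick the first action with a matching keyword."""
    text = prompt.lower()
    for action in ACTIONS:
        if any(keyword in text for keyword in action.keywords):
            return action
    return None


def copilot(prompt: str) -> str:
    """Route a natural-language prompt to an app action, or ask for clarity."""
    action = choose_action(prompt)
    if action is None:
        return "no matching action; ask the user to clarify"
    return action.run(prompt)
```

Calling `copilot("Schedule a meeting with the design team")` would go to the calendar action, while a prompt that matches nothing falls back to asking the user. The point of the sketch is the shape: the prompt is the interface, and the apps sit behind a small registry of actions.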
3. LLM powered Agents
Some independent builders achieved an impressive feat: they stitched together multiple ChatGPT instances, giving one GPT the role of task manager and the other GPTs the role of task executors. The manager GPT ensures that the executor GPTs complete their tasks as per the original prompt. The result is a GPT agent, which can run for hours in the background until it achieves its goals.
This looks like a baby version of a powerful and scary vision. Whereas a co-pilot is like a smart, obedient assistant that empowers a human agent, AI agents just need high-level goals. They would keep running in the background and would be able to achieve much more complex feats than an assistant responding to commands. Often they will create their own tasks to accomplish the stated goals, and the underlying methodology will be a black box.
It’s tough to visualise at this stage what these systems are going to look like. One example I can think of is a sales AI agent that scrapes the web for the best potential leads, evaluates them and cold-emails them to build a pipeline. Another is a personal finance agent that reports irregular expenditures, pays monthly bills and keeps looking for new investment opportunities.
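The manager and executor arrangement described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not how those builders actually implemented it: `run_agent`, the `PLAN:` and `REVIEW:` prompt prefixes, and the `fake_manager` and `fake_executor` functions are all hypothetical, and the two fakes stand in for real LLM calls so the loop can run on its own.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


def run_agent(goal: str, manager: LLM, executor: LLM, max_rounds: int = 3) -> List[str]:
    """Manager splits the goal into tasks, executors do them, manager reviews."""
    plan = manager(f"PLAN: {goal}")
    tasks = [t.strip() for t in plan.split(";") if t.strip()]
    results = []
    for task in tasks:
        output = ""
        for _ in range(max_rounds):
            output = executor(task)
            verdict = manager(f"REVIEW: {task} -> {output}")
            if verdict == "ok":
                break  # manager accepted the work
        # If the manager never says "ok", the last attempt is kept anyway.
        results.append(output)
    return results


def fake_manager(prompt: str) -> str:
    # Assumed behaviour: a fixed plan, and approval of any output containing "done".
    if prompt.startswith("PLAN:"):
        return "find leads; draft email"
    return "ok" if "done" in prompt else "retry"


def fake_executor(task: str) -> str:
    # Assumed behaviour: pretend every task succeeds on the first try.
    return f"done: {task}"


results = run_agent("build a sales pipeline", fake_manager, fake_executor)
```

With these fakes, `results` ends up as `["done: find leads", "done: draft email"]`. The `max_rounds` cap is a design choice worth keeping in any real version, since an agent left to retry forever is exactly the kind of black box the text warns about.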
In summary, this piece focuses on the different dimensions of products that are now possible with LLMs. At the same time, I feel categorisation is more of a crutch when dealing with new technologies, where the possibilities are endless. Still, this analysis can be treated as an ignition point that structures the thought process when facing the big question: what is possible with LLMs?