TLDR: Locally running Large Language Models talking to each other over the internet goes brrr, exchanging information and overcoming their individual limitations. Just imagine all kinds of LLMs talking to each other, and the emergent behaviors, such as markets and society-like networks, that will arise in the internet of large language models.
Since ChatGPT was opened to the public, people have become very interested in running their own open source Large Language Models (LLMs) locally, not just for research but also for business or private use.
There are several good reasons why many people prefer to have a local LLM instead of using an LLM in a centralized cloud:
Data Privacy: One of the primary reasons many want a local model is data privacy. With a local model, all data remains on-premises, providing an additional layer of security and privacy. This can be very important for sensitive information, such as financial, customer, or healthcare data.
Cost: For organizations that require large-scale language processing, the cost of using a cloud-based offering can quickly add up. In some cases, it may be more cost-effective to invest in a local model that can be used indefinitely, rather than paying ongoing cloud-based processing fees. A central all-knowing model still has to carry all of its knowledge, including information that your application will most likely never need. For example, an LLM for customer support doesn’t need the entire knowledge of the scientific community; common sense may be enough.
Specialization: Local models can be customized to suit specific business needs. This can be particularly useful for organizations that require specific language processing capabilities that are not available in public cloud offerings. To achieve exceptionally high correctness, LLMs might need to grow even larger, or use Retrospective, an iterative approach that yields much higher quality but costs more computing time. This could increase the required compute of an all-knowing LLM like OpenAI’s GPT-X far too much.
Reduced Latency and Offline Use: For real-time applications, such as robotics or high-speed control, reduced latency is crucial. Local models can offer faster response times than cloud-based models, especially as hardware speed continues to improve. Smaller models are faster, too. In many cases, an AI needs to keep operating even if an internet connection is temporarily unavailable.
At our business, we would want to run our own open source LLM rather than rely on an external corporation, given how important LLMs will become in the future. Unfortunately, open source models are not quite there yet, but the current pace of development in the realm of open source LLMs is encouraging.
However, these advantages come with disadvantages. Cheap, fast, and specialized models will miss a lot of knowledge, and smaller businesses need to make even stronger trade-offs. While most knowledge can be safely discarded for a given application, sometimes an unexpected knowledge gap will be encountered. For example, the customer support of a cleaning business might get asked to clean a house with an exotic stone ornament that is very reactive to water. Or an offline robot might see something it doesn’t understand, so it avoids the object and saves the data for later investigation.
It is especially important to note for local LLMs: hallucination is a significant problem that future LLMs must overcome to be reliable. Hallucination occurs when a language model generates responses that are not grounded in the input or its actual knowledge, but in its own fabricated assumptions. This can result in responses that are factually incorrect, irrelevant, or even harmful.
Solving hallucination will produce LLMs that recognize when they lack information and cannot be certain. In such cases, the LLM could ask another LLM for help over the internet, without human supervision. The first LLM could consult a specialized LLM about the specific problems it cannot reliably solve itself.
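To make this concrete, here is a minimal sketch of that uncertainty-triggered delegation, in Python. Everything here is an illustrative assumption: the confidence threshold, the self-reported confidence scores, and both model functions are hypothetical stand-ins, not a real protocol or API.

```python
# Minimal sketch of uncertainty-triggered delegation between LLMs.
# All models, scores, and thresholds are hypothetical placeholders.

from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff below which we ask for help


@dataclass
class Answer:
    text: str
    confidence: float  # assumed self-reported certainty in [0, 1]


def local_llm(question: str) -> Answer:
    # Stand-in for a small local model that knows when it is unsure.
    if "stone" in question:
        return Answer("Unsure how to clean reactive stone.", 0.3)
    return Answer("Use warm water and mild detergent.", 0.95)


def remote_specialist_llm(question: str) -> Answer:
    # Stand-in for a network call to a specialized LLM.
    return Answer("Avoid water; use a dry microfiber cloth.", 0.9)


def answer(question: str) -> Answer:
    reply = local_llm(question)
    if reply.confidence < CONFIDENCE_THRESHOLD:
        # The local model realizes it is uncertain, so it delegates.
        reply = remote_specialist_llm(question)
    return reply
```

The key design point is that delegation is driven by the model's own uncertainty signal, so the common case stays fast and local while the rare knowledge gap gets routed to a specialist.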
LLM-to-LLM communication in the internet of large language models is not only useful for the unfortunate knowledge gaps that result from trade-offs; it has further use cases:
Proprietary information: Often, information is just not available to LLMs, and the only source of information is a service provider who guards this information and uses this advantage for consulting. Indeed, consultants might use highly specialized but sophisticated and reliable LLMs to provide consulting. LLMs might even pay each other in cryptocurrencies or fiat.
New information: Some information might be constantly changing, like laws, regulations, weather, business conditions, or customer requirements. The local LLM needs to stay in touch with LLMs that collect this data at the source to get customized insight for its specific needs. A business’s LLM might ask a lawyer LLM what law changes are relevant to its business or communicate with a customer’s LLM about constantly changing requirements.
Collaboration: LLMs can collaborate and share knowledge, resulting in more accurate and sophisticated responses to complex queries. They can coordinate with each other and form a coherent strategy. Even entire markets could be automated. LLMs can even judge each other if information was bad and warn other LLMs about them.
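One way LLMs could judge each other and warn others, as described above, is a simple peer-reputation ledger. The sketch below is an assumption for illustration: the scoring scheme (Laplace-smoothed success rate) and the trust threshold are my own choices, not an established protocol.

```python
# Minimal sketch of a peer-reputation ledger for an LLM network.
# The scoring scheme and threshold are illustrative assumptions.

from collections import defaultdict


class ReputationLedger:
    """Tracks how reliable each peer LLM's answers have been."""

    def __init__(self) -> None:
        self.good = defaultdict(int)  # count of correct answers per peer
        self.bad = defaultdict(int)   # count of incorrect answers per peer

    def record(self, peer: str, answer_was_correct: bool) -> None:
        if answer_was_correct:
            self.good[peer] += 1
        else:
            self.bad[peer] += 1

    def score(self, peer: str) -> float:
        # Laplace-smoothed fraction of correct answers: unknown peers
        # start at 0.5 instead of an extreme value.
        total = self.good[peer] + self.bad[peer]
        return (self.good[peer] + 1) / (total + 2)

    def trusted(self, peer: str, threshold: float = 0.5) -> bool:
        # Peers below the threshold can be flagged, so other LLMs
        # in the network can be warned about bad information.
        return self.score(peer) >= threshold
```

A ledger like this could be shared between LLMs so that a peer that repeatedly gives bad information loses trust across the whole network, not just with one model.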
Redundancy: Distributed large language models can provide a level of redundancy and fault tolerance, reducing the risk of system failures or downtime.
Availability: In some cases, cloud-based offerings may not be available in certain regions or may experience downtime. A local model can provide greater reliability and availability, particularly for organizations that require 24/7 language processing capabilities.
Just too much knowledge: If AI hyper-speeds our economy, we might soon have too much knowledge, not just data, to feasibly have it all in one single model.
To use ourselves as an example: our cleaning business doesn’t need a model that knows the entire history of the world or all scientific knowledge ever. It needs to know about cleaning, houses, people, and common sense. For the occasional edge case, it would reach out to external knowledge sources and update its knowledge base. Soon, I think, a customer that reaches out will itself be an LLM, telling us what needs to be cleaned.
The internet of LLMs enables a new economy where LLMs can communicate with each other and dynamically form markets and networks, which could lead to a fully automated economy. Until now, a fully automated economy would have been very rigid and fragile; just look at old-fashioned factory robots. However, LLM networks enable a dynamic, constantly adapting, fully automated economy without requiring superhuman intelligence or AGI. Humans would only need to add new creative ideas and bring them to life once, after which the LLMs could take over and run with them. The possibilities are endless, and the future of the economy looks exciting with the potential for groundbreaking advancements.
The internet of Large Language Models is coming.
Disclaimer: Yes, I used ChatGPT to help me; I suck at writing. Images generated with Stability AI’s Stable Diffusion.