AnythingLLM simplifies access to a new trend: chatting with your own data using a GPT model.
“Chatting with your documents is the ‘hello world’ of LLM use-cases, why not make it more accessible?” asks Tim Carambat, developer of AnythingLLM. Projects such as PrivateGPT and GPT4All pursue the same idea, but unlike them, Carambat eschews locally hosted language models and vector databases in favor of an easy-to-use chat interface, simple data collection, and integration with hosted services such as OpenAI’s GPT-3.5-turbo and GPT-4 models or the Pinecone vector database. If you wish, however, you can still replace these with local instances.
AnythingLLM is a comprehensive suite of applications and tools that can transform any document, resource, or piece of content into data that can be used by language models as a reference during a chat. For example, transcripts of entire YouTube channels, reference books, or business documents can be queried. By using external models and databases, AnythingLLM remains an application that can run in the background and does not require massive computing power.
AnythingLLM comes with data collection tools and a chat interface.
AnythingLLM lets you collect data from pre-defined sources or add your own, caches documents once they have been processed to save costs, and lets you set up multiple workspaces that can share pre-defined documents. This allows teams to collaborate while keeping certain content visible only to certain members.
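The cost saving from caching processed documents can be illustrated with a minimal sketch. It assumes the cache is keyed by a hash of the document text, which is one common approach and not necessarily how AnythingLLM implements it. `EmbeddingCache` and `fake_embed` are hypothetical names, and `fake_embed` merely stands in for a paid embedding call.

```python
import hashlib


class EmbeddingCache:
    """Hypothetical in-memory cache keyed by a hash of the document text."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # the costly call, e.g. a paid embedding API
        self._store = {}
        self.misses = 0  # counts how often the costly call was actually made

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text):
    # Stand-in for a real embedding call; returns a tiny deterministic vector.
    return [len(text), text.count(" ")]


cache = EmbeddingCache(fake_embed)
cache.get("quarterly report")
cache.get("quarterly report")  # served from the cache, no second embedding call
cache.get("meeting notes")
print(cache.misses)
```

Only unseen content triggers the expensive call; a persistent version would write the store to disk so the saving survives restarts.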
Currently, AnythingLLM provides data collection tools for YouTube, Substack, Medium and GitBook. URLs and local documents can also be vectorized. The system also cites the source of each response, such as a URL.
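How a vectorized document can lead back to its source is easiest to see in a small sketch. The example below is an illustration under loud assumptions: the bag-of-words `embed` function is a toy replacement for a real embedding model, the in-memory `index` replaces a vector database such as Pinecone or Chroma, and the chunk texts and URLs are made up.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy bag-of-words "embedding"; a real setup would call an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical chunks with made-up source URLs.
chunks = [
    {"text": "Workspaces let teams share documents.",
     "source": "https://example.com/workspaces"},
    {"text": "The collector turns YouTube transcripts into text files.",
     "source": "https://example.com/collector"},
]

# Each vector is stored together with its chunk, so the source travels with it.
index = [(embed(c["text"]), c) for c in chunks]


def retrieve(question):
    q = embed(question)
    return max(index, key=lambda pair: cosine(q, pair[0]))[1]


hit = retrieve("How are YouTube transcripts collected?")
print(hit["source"])
```

Because the source is stored as metadata next to each vector, whatever chunk is retrieved for the chat can be attributed without a second lookup.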
AnythingLLM is open source
Carambat plans to support vector databases beyond Pinecone and language models beyond OpenAI's products. Chroma support has recently been added. Further integrations with Google Drive or GitHub repos are planned.
To use AnythingLLM you need:
- Python 3.8+ (for data collection)
- Node.js 16+ (for the local server)
- OpenAI API key (for embedding + chatting)
- Pinecone DB API key or a locally running ChromaDB instance (for vector storage)
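The requirements above can be checked with a short script before installing. This is a rough sketch and not part of AnythingLLM: the environment variable names `OPENAI_API_KEY`, `PINECONE_API_KEY` and `CHROMA_URL` are assumptions chosen for illustration, so adjust them to whatever the project's configuration actually expects.

```python
import os
import re
import shutil
import subprocess
import sys


def python_ok(version_info=sys.version_info):
    # Data collection needs Python 3.8 or newer.
    return tuple(version_info[:2]) >= (3, 8)


def node_ok(version_string):
    # `node --version` prints something like "v16.20.2".
    match = re.match(r"v?(\d+)", version_string.strip())
    return bool(match) and int(match.group(1)) >= 16


def installed_node_version():
    # Returns an empty string when Node.js is not on the PATH.
    if shutil.which("node") is None:
        return ""
    result = subprocess.run(["node", "--version"], capture_output=True, text=True)
    return result.stdout


def report(env=os.environ):
    return {
        "python>=3.8": python_ok(),
        "node>=16": node_ok(installed_node_version()),
        # Assumed variable names, not confirmed by the project:
        "openai key": bool(env.get("OPENAI_API_KEY")),
        "vector store": bool(env.get("PINECONE_API_KEY") or env.get("CHROMA_URL")),
    }


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```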
AnythingLLM is open-source and available on GitHub.