Feed local data to LLM using Langchain Node.js
Recently I’ve been curious about Artificial Intelligence, especially since buzzwords like ChatGPT, OpenAI, and Large Language Models (LLMs) are all over the internet. These technologies have many use cases that can make everyone’s lives easier and more convenient.
What interests me is that we can use our own local data, such as txt, pdf, and json files, with Langchain and run our prompts against it.
Why use local data?
There are use cases where we want our prompts to query only a private, secured local source. Depending on the model used, this can even work without an internet connection.
E.g. summarising an applicant’s resume, or answering frequently asked questions about a company’s handbook.
We can query and pass information to LLMs without our data or responses going through third parties, and we retain total control of our data. Operating our own LLMs could have cost benefits as well.
LangChain is a framework for developing applications powered by language models. It allows AI developers to combine Large Language Models (LLMs) like GPT-4 with external data.
How does it work?
- Langchain loads documents from local files.
- It takes a large source of data, for example a 50-page PDF, and breaks it down into chunks. Each chunk is embedded and saved in a vector store, which serves as a database for retrieval.
- It also accepts other file formats.
- Langchain then answers user prompts using OpenAI or another LLM, passing the retrieved chunks in as context.
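The chunk-embed-retrieve idea above can be sketched without LangChain at all. The following is a simplified, self-contained illustration, not LangChain's actual implementation: real splitters chunk by character count with overlap, and real vector stores use proper embeddings (e.g. OpenAI embeddings) instead of the toy bag-of-words vectors used here.

```javascript
// Toy sketch of the chunk -> embed -> retrieve pipeline.

// 1. Split a document into chunks (here: one sentence per chunk).
function splitIntoChunks(text) {
  return (text.match(/[^.]+\./g) ?? []).map((s) => s.trim());
}

// 2. "Embed" a chunk as a bag-of-words frequency map.
function embed(text) {
  const vector = {};
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    vector[word] = (vector[word] ?? 0) + 1;
  }
  return vector;
}

// Cosine similarity between two sparse word-frequency vectors.
function similarity(a, b) {
  let dot = 0;
  for (const word in a) dot += a[word] * (b[word] ?? 0);
  const norm = (v) =>
    Math.sqrt(Object.values(v).reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
}

// 3. Retrieve the chunk closest to the query -- what a retriever does
//    before handing context to the LLM.
function retrieve(query, store) {
  const q = embed(query);
  return store.reduce((best, entry) =>
    similarity(q, entry.vector) > similarity(q, best.vector) ? entry : best
  );
}

const doc =
  "The applicant worked as a Senior IT Specialist at XYZ Global. " +
  "Hobbies include hiking and photography.";

const store = splitIntoChunks(doc).map((chunk) => ({
  chunk,
  vector: embed(chunk),
}));

console.log(retrieve("Where did the applicant work?", store).chunk);
// → "The applicant worked as a Senior IT Specialist at XYZ Global."
```

The LLM never sees the whole document: only the best-matching chunks are sent along with the prompt, which is why answers stay grounded in the local data.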
Sample Application
Clone this repository and follow the prerequisites in README.md:
- https://github.com/ChunAllen/langchain-local
- The sample application already has local data placed in docs/
- Comments have been added in index.js
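A typical setup looks like the following; the exact steps and environment variable are assumptions, so defer to the repository's README.md:

```shell
# Assumed setup; see README.md for the authoritative steps
git clone https://github.com/ChunAllen/langchain-local
cd langchain-local
npm install

# The OpenAI-backed chain needs an API key
export OPENAI_API_KEY="..."
```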
Running Prompts
- Asking questions related to the document
$ node index.js "Describe this applicant's employment history"
{
text: ' This applicant has 5+ years of experience in IT, with experience in System Administration, Network Configuration, Software Installation, Troubleshooting, Windows Environment, Customer Service, and Technical Support. They worked as a Senior IT Specialist at XYZ Global from 2018-Present, an IT Support Specialist at Zero Web from 2015-2017, and a Junior Desktop Support Engineer at Calumcoro Medical from 2014-2015.'
}
- Asking questions not related to the document (since our local data is only about the applicant’s resume)
$ node index.js "What is 1+1?"
{ text: " I don't know." }
- We can also choose whether prompts are answered from the local vector store alone, or by combining it with the LLM
// Refer only to the local vector store
// const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever())

// Combine the OpenAI LLM and the local vector store
const chain = new RetrievalQAChain({
  combineDocumentsChain: loadQARefineChain(model),
  retriever: vectorStore.asRetriever(),
});
$ node index.js "What is 25019 * 25?"
{ res: { output_text: '625,475' } }
Resources:
https://js.langchain.com/docs