Feed local data to an LLM using LangChain and Node.js

Allen Oliver M Chun
3 min read · Jul 10, 2023
An overview of processing local files and answering user prompts with LangChain.

Recently I’ve been curious about Artificial Intelligence, especially now that buzzwords like ChatGPT, OpenAI, and Large Language Models (LLMs) are all over the internet. There are many use cases for these technologies that can make everyone’s lives easier and more convenient.

What interests me is that we can feed our own local data, such as txt, pdf, and json files, to an LLM using LangChain and run our prompts against it.

Summarising an applicant’s resume

Why use local data?

There are use cases where we want our prompts to query only a private and secured local source. Depending on the model we run, this can also work without an internet connection.
E.g. summarising an applicant’s resume, or answering frequently asked questions about a company’s handbook.

We can query and pass information to LLMs without our data or responses going through third parties, and we retain total control of our data. Operating our own LLMs could have cost benefits as well.

LangChain is a framework for developing applications powered by language models. It allows AI developers to combine Large Language Models (LLMs) like GPT-4 with external data.

How does it work?

A diagram that shows creating vector store using local files, from LangChain Blog
  • LangChain starts by loading documents from local files.
  • It takes a large source of data, for example a 50-page PDF, breaks it down into chunks, and stores them (as embeddings) in a vector store, which serves as the database.
  • Other file formats such as txt and json are accepted as well.
  • LangChain can then answer user prompts using OpenAI or another LLM, as sketched below.
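To make this concrete, here is a minimal sketch of that pipeline using LangChain’s Node.js API (as of mid-2023). The file name resume.txt, the chunking parameters, and the HNSWLib store are my own illustrative choices; an OpenAI API key is assumed to be set in the environment, and HNSWLib needs the hnswlib-node package installed:

import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { HNSWLib } from "langchain/vectorstores/hnswlib";

// Load a local file (PDFLoader from "langchain/document_loaders/fs/pdf"
// works the same way for PDFs)
const docs = await new TextLoader("./resume.txt").load();

// Break the document into overlapping chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 100,
});
const chunks = await splitter.splitDocuments(docs);

// Embed each chunk and keep it in an in-memory vector store
const vectorStore = await HNSWLib.fromDocuments(chunks, new OpenAIEmbeddings());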

Sample Application

Clone this repository, and follow the prerequisites in README.md
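For context, an index.js like the one in the repository might look roughly like this. It is a sketch that reuses the vectorStore built in the earlier snippet, not the repository’s actual code:

import { OpenAI } from "langchain/llms/openai";
import { RetrievalQAChain } from "langchain/chains";

// `model` queries OpenAI; `vectorStore` is the store built earlier
const model = new OpenAI({ temperature: 0 });
const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever());

// Answer the question passed on the command line
const res = await chain.call({ query: process.argv[2] });
console.log(res);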

Running Prompts

  • Asking questions related to the document
$ node index.js "Describe this applicant's employment history"
{
text: ' This applicant has 5+ years of experience in IT, with experience in System Administration, Network Configuration, Software Installation, Troubleshooting, Windows Environment, Customer Service, and Technical Support. They worked as a Senior IT Specialist at XYZ Global from 2018-Present, an IT Support Specialist at Zero Web from 2015-2017, and a Junior Desktop Support Engineer at Calumcoro Medical from 2014-2015.'
}
  • Asking questions not related to the document (since our local data is only about the applicant’s resume)
$ node index.js "What is 1+1?"
{ text: " I don't know." }
  • We can also choose whether user prompts are answered from the local vector store alone or by combining it with the LLM
A diagram of the process used to create a chatbot on your data, from LangChain Blog
import { RetrievalQAChain, loadQARefineChain } from "langchain/chains";

// Answer strictly from the documents in the local vector store
// const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever())

// Combine the OpenAI LLM's own knowledge with the local vector store
const chain = new RetrievalQAChain({
  combineDocumentsChain: loadQARefineChain(model),
  retriever: vectorStore.asRetriever(),
});
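With the refine chain in the mix, the model is free to fall back on its own knowledge when the retrieved documents don’t contain the answer, which is why an arithmetic question now gets answered instead of returning “I don’t know”: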
$ node index.js "What is 25019 * 25?"
{ res: { output_text: '625,475' } }

Resources:
https://js.langchain.com/docs

https://www.youtube.com/watch?v=9AXP7tCI9PI


Allen Oliver M Chun

Full Stack Software Developer based in Singapore. Software developer since 2012. Specialized in Ruby on Rails, Vue.js, and AWS development.