### LangChain Overview

#### Connecting AI to External Data Sources

- **Problem**: Large language models (LLMs) can't answer questions about recent events or about specific data they were not trained on.
- **Solution**: LangChain fetches relevant information from sources such as websites, APIs, and databases and passes it to the model as context.

### Setting Up Retrieval Chain

- **File Creation**: Create a file named `retrieval_chain.js`.
- **Model Initialization**:
  - Import `ChatOpenAI` from `@langchain/openai`.
  - Instantiate the model with properties such as the model name (`gpt-3.5-turbo`) and `temperature`.
- **Prompt Template**:
  - Create a prompt template for user questions.
  - Use a placeholder named `input` for the user's question.

### Fetching Data from External Sources

- **Manual Context Injection**:
  - Hardcoding context in the prompt works as a demonstration but is not recommended.
- **Using Documents**:
  - Import `Document` from `@langchain/core/documents`.
  - Create documents with text and metadata.
  - Use `createStuffDocumentsChain` to format the documents and inject them into the prompt.

### Scraping Data Programmatically

- **Cheerio for Web Scraping**:
  - Install `cheerio` and import `CheerioWebBaseLoader` from `langchain/document_loaders/web/cheerio` (or `@langchain/community/document_loaders/web/cheerio` in newer releases).
  - Use `CheerioWebBaseLoader` to scrape website data.
- **Loading and Splitting Documents**:
  - Convert scraped content into documents.
  - Import and use `RecursiveCharacterTextSplitter` to divide documents into smaller chunks.

### Improving Context Relevance

- **Avoid Overloading Context**:
  - Split large document content to avoid exceeding the model's context limit.
  - Import `RecursiveCharacterTextSplitter`.

### Storing and Retrieving Documents

- **Vector Store**:
  - Use `OpenAIEmbeddings` to convert documents into embedding vectors.
  - Use `MemoryVectorStore` to keep the embedded documents in an in-memory vector store.
- **Creating a Retriever**:
  - Fetch relevant documents from the vector store based on the query.
  - Specify the number of documents to retrieve with the `k` value.

### Tying it All Together

- **Creating a Retrieval Chain**:
  - Import and use `createRetrievalChain`.
  - Combine document retrieval with context injection to answer queries accurately.
  - Ensure the prompt's context and user-input variables are named `context` and `input`.

### Final Notes

- **Testing and Debugging**:
  - Test responses to ensure accurate answers.
  - Adjust document splitting and retrieval parameters as needed.
- **Practice Makes Perfect**:
  - Familiarize yourself with retrieval processes, an essential AI development skill.
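
To make the flow above concrete (split, embed, store, retrieve the top `k`, then fill the `context` and `input` placeholders), here is a minimal dependency-free sketch. It is not LangChain's implementation: `splitText`, `embed`, `ToyVectorStore`, and `fillTemplate` are hypothetical stand-ins, the splitter is a plain fixed-size one, the "embedding" is a word-count map rather than a model-generated vector, and the sample documents are made up for illustration.

```javascript
// Dependency-free sketch; every function and class name here is hypothetical.

// Fixed-size splitter with overlap (simpler than a recursive character splitter).
function splitText(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - chunkOverlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// Toy "embedding": a word-count map. Real embeddings are dense numeric vectors.
function embed(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two word-count maps (0 when either is empty).
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, c) => sum + c * c, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// In-memory store: keeps each document next to its vector.
class ToyVectorStore {
  constructor() {
    this.entries = [];
  }
  addDocuments(docs) {
    for (const doc of docs) {
      this.entries.push({ doc, vector: embed(doc.pageContent) });
    }
  }
  // Retriever step: return the k documents most similar to the query.
  similaritySearch(query, k) {
    const queryVector = embed(query);
    return this.entries
      .map(({ doc, vector }) => ({ doc, score: cosineSimilarity(queryVector, vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map(({ doc }) => doc);
  }
}

// "Stuff" step: substitute the {context} and {input} placeholders in one pass.
function fillTemplate(template, values) {
  return template.replace(/\{(context|input)\}/g, (_, name) => values[name]);
}

const template = "Answer using only this context:\n{context}\n\nQuestion: {input}";

// Made-up sample documents with text and metadata.
const docs = [
  { pageContent: "LCEL stands for LangChain Expression Language.", metadata: { source: "notes" } },
  { pageContent: "Cheerio parses HTML on the server.", metadata: { source: "notes" } },
  { pageContent: "A retriever returns the k most similar documents.", metadata: { source: "notes" } },
];

// Split every document; these short samples each fit in a single chunk.
const chunks = docs.flatMap((doc) =>
  splitText(doc.pageContent, 200, 20).map((text) => ({ pageContent: text, metadata: doc.metadata }))
);

const store = new ToyVectorStore();
store.addDocuments(chunks);

const question = "What does LCEL stand for?";
const retrieved = store.similaritySearch(question, 2);
const prompt = fillTemplate(template, {
  context: retrieved.map((doc) => doc.pageContent).join("\n\n"),
  input: question,
});

console.log(prompt); // this string is what would be sent to the chat model
```

Raising `k` or the chunk size puts more text into `context`, which is the trade-off behind the advice to tune splitting and retrieval parameters: more context can help the model find the answer, but it also uses more of the model's context limit.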