When working with Retrieval-Augmented Generation (RAG) techniques, AI experts face a significant challenge: preparing and organizing documents in various media formats before loading them into graph databases. This tedious task can hinder the efficiency of AI models and slow down innovation. Fortunately, Intelligent Document Processing (IDP) solutions can help.
RAG has become a must-have feature of GenAI workflows. It is not easy to implement, but data teams can lean on dedicated platforms to help.
In this article, we’ll explore how IDP can take the reins in organizing and preparing documents, enabling AI experts to focus on what matters most – fine-tuning their RAG implementations.
What is RAG?
Retrieval-Augmented Generation (RAG) enhances the performance of a large language model by making it refer to an authoritative knowledge base outside of its original training data before it generates a response. Large Language Models (LLMs) are trained on huge amounts of data and use billions of parameters to create original outputs for tasks like answering questions, translating languages, and completing sentences. RAG boosts the powerful abilities of LLMs by tailoring them to specific fields or an organization’s internal knowledge base, without needing to retrain the model. This is a cost-effective way to keep LLM outputs relevant, accurate, and useful in different situations.
source: https://aws.amazon.com/what-is/retrieval-augmented-generation/
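The retrieve-then-generate flow described above can be sketched schematically. This is a minimal, model-free illustration: the two-document corpus and the word-overlap scoring function are made up for demonstration (real systems use embeddings, as shown later in this article), and no LLM is actually called.

```python
# Schematic RAG flow: retrieve the most relevant snippet from a knowledge
# base, then build an augmented prompt for the LLM. Toy corpus and scoring.
corpus = [
    "Refund requests must be filed within 30 days of purchase.",
    "Premium support is available to enterprise customers only.",
]

def score(query, doc):
    # Naive relevance: count shared words (real systems use embeddings)
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def build_prompt(query):
    # Retrieve the best-matching document and splice it into the prompt
    best = max(corpus, key=lambda doc: score(query, doc))
    return f"Answer using only this context:\n{best}\n\nQuestion: {query}"

print(build_prompt("How many days do I have to request a refund"))
```

The augmented prompt grounds the model in the retrieved snippet instead of relying on its pre-training data alone.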
What Are Graph Databases?
Graph databases are a new way of handling data that changes the game for how we store, access, and analyze information. Unlike traditional databases, which use rows and columns, graph databases use graph theory to organize data into nodes (which represent entities) and edges (which represent relationships). This section explores how graph databases are structured, compares them to conventional databases, and looks at their strengths and weaknesses.
Graph databases solve big problems we face every day. Modern data issues often involve complex, many-to-many relationships with diverse data, leading to needs like:
- Navigating deep hierarchies.
- Finding hidden connections between distant items.
- Discovering inter-relationships between items.
Whether it’s a social network, payment network, or road network, everything is an interconnected web of relationships. When you ask questions about the real world, many of those questions are about the relationships between things rather than the individual data points.
source: https://aws.amazon.com/nosql/graph/
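As a toy illustration of the node-and-edge model described above (the names and relationships here are invented), finding a hidden connection between distant items comes down to a simple graph traversal:

```python
from collections import deque

# Toy social graph: nodes are people, edges are "knows" relationships
edges = {
    "Ana": ["Ben"], "Ben": ["Caro", "Dan"],
    "Caro": ["Eve"], "Dan": [], "Eve": [],
}

def connection_path(start, goal):
    # Breadth-first search: returns the shortest chain of relationships,
    # or None if the two nodes are not connected
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in edges.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(connection_path("Ana", "Eve"))  # ['Ana', 'Ben', 'Caro', 'Eve']
```

A dedicated graph database performs this kind of relationship query natively and at scale, without the joins a row-and-column database would need.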
What is IDP?
Intelligent Document Processing (IDP) is a cutting-edge workflow automation technology that scans, reads, extracts, categorizes, and organizes meaningful information from large streams of data into accessible formats. This technology can handle various types of documents, including papers, PDFs, Word documents, spreadsheets, and many other formats. The main goal of IDP is to extract valuable information from large data sets without needing human input.
source: https://powerautomate.microsoft.com/en-us/intelligent-document-processing/
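To make the "categorize and extract" step concrete, here is a deliberately simplified sketch. The categories and keyword rules are invented for illustration only; real IDP platforms rely on OCR and trained ML classifiers rather than hand-written keyword lists.

```python
# Toy IDP step: route raw document text into a category.
# Real IDP products do this with OCR and machine-learning models.
CATEGORY_KEYWORDS = {
    "invoice": ["invoice", "amount due"],
    "contract": ["agreement", "party"],
}

def categorize(text):
    # Return the first category whose keywords appear in the text
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return category
    return "uncategorized"

print(categorize("Invoice #42: amount due $310"))  # invoice
```

Once documents are categorized like this, each bucket can be chunked and embedded separately, which is exactly the preparation a RAG pipeline needs.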
Why Is IDP an Essential Tool in a RAG Project?
RAG harnesses the power of existing company documents to enrich vector databases and ground AI models, enabling more accurate and context-specific responses. By leveraging company-specific information, RAG models move beyond generic responses based on pre-training data, instead providing tailored answers that reflect the company’s unique context and needs.
Generative AI models are designed to generate responses to user prompts, but they can sometimes go off the rails. When lacking sufficient context, these models can produce utterly irrelevant or inaccurate responses – a phenomenon known as ‘AI hallucinations’. However, by implementing RAG effectively, you can significantly reduce the occurrence of hallucinations and ensure that your AI models remain grounded in reality, providing accurate and relevant responses that align with your company’s specific context and needs.
The challenge lies in harnessing the vast array of documents within an organization, spanning diverse formats and including image-only files without text. To accurately categorize and vectorize this data for retrieval, a robust solution is needed. This is where Intelligent Document Processing (IDP) solutions come into play, capable of efficiently processing and extracting insights from disparate documents, unlocking their value for AI-driven applications.
IDP solutions have been handling a wide variety of document types and formats for years, so they have a well-developed set of features. These features can help with collecting documents, extracting text (even from images), categorizing by date, extracting important information (from both structured and unstructured documents), integrating with other systems, and more. When the IDP tool is enhanced with AI capabilities, these features become even more powerful and easier to configure, making the job of AI experts much easier during the RAG implementation.
That’s why IDP solutions are crucial for RAG implementations. These tools help define, prioritize, and process important company documents, extract valuable information, and integrate with various AI solutions. This enables the creation of vectors, identification of similarities, and storage in a vector database in a structured manner.
Advantages of Effective RAG Implementation
Let’s examine a brief example that illustrates the significance of incorporating RAG solutions into an AI project. Imagine a scenario where a bank has an AI solution that responds to inquiries about the bank’s offerings, aiding bank executives in providing more appealing products to their customers.
The Python code below uses the “gemini-1.5-pro-latest” model from Google’s AI platform to answer questions from bank executives.
import google.generativeai as genai

api_key = "YOUR API"
genai.configure(api_key=api_key)
model = genai.GenerativeModel('gemini-1.5-pro-latest')

while True:
    prompt = input("Enter your question: ")
    if prompt == "0":
        break
    print(model.generate_content(prompt).text)
Now, let’s execute this code and test it with various questions to verify the responses.
Question 01: Given a customer’s age of 35 and credit score of 750, which banking product would be the best fit for them?
The AI model suggests that more context is necessary to properly address the question and proposes generating a general overview of bank products based on customer information. However, this approach may not assist the bank executive in deciding which product to offer to the customer.
Question 02: What is the best credit card match for a 35-year-old with a 750 credit score?
Once more, the AI model emphasizes that additional context is needed to accurately address the question, and offers only generic advice of limited value to the bank executive.
Now, let’s modify the Python code to incorporate RAG capabilities, enabling it to build a small vector store containing valuable information about bank products, customer characteristics, etc. We’ll then instruct the Google AI model to use this vector store to respond to questions from bank executives.
In the code, we’ll include a small paragraph extracted from two different document types: one from the product policy, specifically the credit card section, and the other from finance news related to the stock market.
import pandas as pd
import numpy as np
import google.generativeai as genai

api_key = "YOUR API"
genai.configure(api_key=api_key)

product_policy_text = {'title': 'Credit Card Section',
                       'text': 'The available credit card options: Visa (under 21), Master (21+), Discover (21+ with credit score 750+), and Amex (21+ with credit score 800+)'}
finance_news_text = {'title': 'Finance Section',
                     'text': 'Meta stock has reached an all-time high due to its big push into the AI segment'}

def embed_text(text):
    # Returns the embedding vector for the given text
    return genai.embed_content(model='models/embedding-001',
                               content=text,
                               task_type='retrieval_document')['embedding']

documents = [product_policy_text, finance_news_text]
df = pd.DataFrame(documents)
df.columns = ['Title', 'Text']
df['Embeddings'] = df['Text'].apply(embed_text)

def query_similarity_score(query, vector):
    # Dot product between the query embedding and a document embedding
    query_embedding = embed_text(query)
    return np.dot(query_embedding, vector)

def most_similar_document(query):
    # Score every document against the query and return the best match
    df['Similarity'] = df['Embeddings'].apply(lambda vector: query_similarity_score(query, vector))
    best = df.sort_values('Similarity', ascending=False).iloc[0]
    return best['Title'], best['Text']

def RAG(query):
    title, text = most_similar_document(query)
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    prompt = f"Answer this query:\n{query}.\nOnly use this context to answer:\n{text}"
    response = model.generate_content(prompt)
    return f"{response.text}\n\nSource Doc Title: {title}"

while True:
    prompt = input("Enter your question: ")
    if prompt == "0":
        break
    print(RAG(prompt))
Let’s execute the code, pose the same question, and analyze the responses.
Question 01: Given a customer’s age of 35 and credit score of 750, which banking product would be the best fit for them?
Note that the AI model not only answers the question appropriately but also includes the source used to generate the answer, indicating that the response was grounded in the provided document rather than in generic pre-training data.
Question 02: What is the best credit card match for a 35-year-old with a 750 credit score?
Once more, a correct answer is provided along with a referenced source, giving the bank executive confidence in offering these products to the customer.
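One detail worth noting about the code above: it ranks documents with a raw dot product, which only behaves as a similarity measure when the embedding vectors are normalized. A safer variant is cosine similarity, sketched below with made-up vectors (no API call needed):

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalizes both vectors, so magnitude differences don't skew ranking
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up embeddings: same direction, different magnitude
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # ~1.0
```

Swapping `np.dot` for a function like this in the similarity scoring keeps the ranking stable even if the embedding model returns unnormalized vectors.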
Conclusion
In this article, we learned about the significance of RAG in preventing generic responses and hallucinations in AI implementations. We also saw the complexity involved in categorizing and prioritizing various company documents to ground AI models. One potential solution (though not the only one) to simplify this task is to incorporate IDP solutions into the project. This not only streamlines the work of AI experts but also improves the accuracy of AI models with precise information, resulting in the best possible responses to every question.
Key Acronyms Discussed in the Article
- RAG (Retrieval-Augmented Generation)
- IDP (Intelligent Document Processing)
- LLM (Large Language Model)
- AI (Artificial Intelligence)
- GenAI (Generative AI)
- PDF (Portable Document Format)
- API (Application Programming Interface)