RAG and Knowledge Bases
What is RAG?
RAG stands for Retrieval Augmented Generation. Behind this fancy name is a simple concept: RAG allows your foundation model to reference a data source outside of its training data without being fine-tuned.
How RAG Works
Now that we understand what RAG is, let's see how it actually works. A knowledge base is built and managed by Amazon Bedrock, and it relies on a data source, for example Amazon S3.
The RAG Process:
- Your data is stored in Amazon S3
- Bedrock automatically builds a knowledge base from this data
- A user asks a question to your foundation model (e.g., "Who is the product manager for John?")
- The foundation model doesn't know anything about John because this is specific company data
- A search happens automatically in the knowledge base (all behind the scenes)
- The knowledge base retrieves relevant information from the vector database
- Retrieved text is combined with the original query as an "augmented prompt" (see the sketch after this list)
- The foundation model generates a response using both the original question and the retrieved context
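To make the retrieve → augment → generate steps concrete, here is a minimal sketch using boto3, assuming a knowledge base already exists. The knowledge base ID and model ID are placeholders, and the prompt wording is just an illustration of how retrieved text is combined with the user's question.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")
bedrock_runtime = boto3.client("bedrock-runtime")

question = "Who is the product manager for John?"

# 1. Retrieve: vector search against the knowledge base
retrieval = agent_runtime.retrieve(
    knowledgeBaseId="KB123EXAMPLE",  # placeholder knowledge base ID
    retrievalQuery={"text": question},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
)
retrieved_chunks = [r["content"]["text"] for r in retrieval["retrievalResults"]]

# 2. Augment: combine the retrieved text with the original query
augmented_prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(retrieved_chunks) + f"\n\nQuestion: {question}"
)

# 3. Generate: send the augmented prompt to a foundation model
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # any Bedrock text model available to you
    messages=[{"role": "user", "content": [{"text": augmented_prompt}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```

With a Bedrock knowledge base, you normally don't have to write these steps yourself; the point here is only to show what happens behind the scenes.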
Example Response Flow:
- Query: "Who is the product manager for John?"
- Retrieved information: Support contacts, product manager Jesse Smith, engineer Sarah Ronald
- Final response: "Jesse Smith is the product manager for John"
This is called Retrieval Augmented Generation because we retrieve data from outside the foundation model, and it is augmented generation because we augment the prompt with that retrieved external data before the model generates a response.
Knowledge Bases in Amazon Bedrock
In Amazon Bedrock, RAG is implemented as a knowledge base. This is very helpful when you need data that is up-to-date, in real time, and must be fed into the foundation model.
Example Use Case: When you ask "Give me talking points for benefits of air travel," the response includes citations linking back to source documents like "Air travel.pdf" stored in Amazon S3.
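Bedrock exposes this managed flow as a single API call. Below is a hedged sketch using the retrieve_and_generate operation; the knowledge base ID and model ARN are placeholders, and the citation loop simply prints the source locations (for example, an S3 object such as the "Air travel.pdf" mentioned above).

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Give me talking points for benefits of air travel"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# Generated answer
print(response["output"]["text"])

# Citations link each part of the answer back to the source documents
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        print(ref["location"])
```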
Vector Databases
Everything goes into a vector database. Vector databases for knowledge bases in Amazon Bedrock can be of several kinds:
AWS Services:
- Amazon OpenSearch Service
- Amazon Aurora
Third-Party Options:
- MongoDB
- Redis
- Pinecone
If you don't specify one, Amazon Bedrock will automatically create an Amazon OpenSearch Serverless vector database for you.
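If you do want to bring your own vector store, the sketch below shows roughly what creating a knowledge base backed by OpenSearch Serverless looks like with boto3. This assumes the IAM role, the OpenSearch Serverless collection, and the vector index already exist; every ARN and field name here is a placeholder.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent")

kb = bedrock_agent.create_knowledge_base(
    name="company-directory-kb",
    roleArn="arn:aws:iam::111122223333:role/BedrockKnowledgeBaseRole",  # placeholder role
    knowledgeBaseConfiguration={
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            # Embeddings model used to vectorize the documents
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0",
        },
    },
    storageConfiguration={
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": "arn:aws:aoss:us-east-1:111122223333:collection/abc123",  # placeholder
            "vectorIndexName": "bedrock-kb-index",
            "fieldMapping": {
                "vectorField": "embedding",
                "textField": "text",
                "metadataField": "metadata",
            },
        },
    },
)
print(kb["knowledgeBase"]["knowledgeBaseId"])
```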
Choosing the Right Vector Database
High Performance Options:
- Amazon OpenSearch Service - Search and analytics database with scalable index management and very fast k-nearest neighbor (k-NN) search. Best for real-time similarity queries and storing millions of vector embeddings
- Amazon DocumentDB - NoSQL database with MongoDB compatibility, also excellent for real-time similarity queries and millions of vector embeddings
Relational Database Options:
- Amazon Aurora - AWS's proprietary, cloud-optimized relational database
- Amazon RDS for PostgreSQL - Open source relational database
Graph Database Option:
- Amazon Neptune - For graph database requirements
Embeddings Models
We need an embeddings model to convert the data into vectors. Options include Amazon Titan and Cohere. The embeddings model and the foundation model used for generation can be different - they don't need to match.
The Process:
- S3 documents are chunked (split into smaller parts)
- These parts are fed into the embeddings model
- The model generates vectors
- Vectors are placed in the vector database
- Vectors become easily searchable for RAG queries (a minimal embedding call is sketched below)
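A Bedrock knowledge base runs this chunking and embedding for you during ingestion, but the sketch below shows what a single embedding call looks like with Amazon Titan, to make the "chunk in, vector out" step concrete. The model ID refers to Titan Text Embeddings v2, and the request/response shape shown is specific to Titan embeddings models.

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# One chunk of text taken from a document (example content)
chunk = "Jesse Smith is the product manager for John's team."

response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": chunk}),
)

# The embedding is a list of floats - this is the vector stored in the vector database
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))
```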
Data Sources for Amazon Bedrock
Amazon Bedrock supports several data sources:
- Amazon S3 - Cloud file storage
- Confluence
- Microsoft SharePoint
- Salesforce
- Web pages - Including websites and social media feeds
Amazon Bedrock will likely add more sources over time, but from an exam perspective, remembering Amazon S3 and these core sources should be sufficient.
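For the common S3 case, attaching a data source to an existing knowledge base and syncing it looks roughly like the sketch below. The knowledge base ID and bucket ARN are placeholders; the ingestion job is what chunks the documents, embeds them, and writes the vectors to the vector store.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Attach an S3 bucket as a data source for the knowledge base
data_source = bedrock_agent.create_data_source(
    knowledgeBaseId="KB123EXAMPLE",  # placeholder
    name="company-docs",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::my-company-docs"},  # placeholder bucket
    },
)

# Start a sync (ingestion) job so the documents are chunked, embedded, and indexed
bedrock_agent.start_ingestion_job(
    knowledgeBaseId="KB123EXAMPLE",
    dataSourceId=data_source["dataSource"]["dataSourceId"],
)
```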
Use Cases for Amazon Bedrock RAG
Customer Service ChatBot:
- Knowledge base: Products, features, specifications, troubleshooting guides, FAQs
- Application: ChatBot that answers customer queries
Legal Research and Analysis:
- Knowledge base: Laws, regulations, case precedents, legal opinions, expert analysis
- Application: ChatBot for specific legal queries
Healthcare Question Answering:
- Knowledge base: Diseases, treatments, clinical guidelines, research papers, patient data
- Application: ChatBot for complex medical queries
RAG opens up many possibilities for generative AI on AWS, making it possible to create intelligent applications that can access and reason over your organization's specific knowledge.