Questions tagged [rag]

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) systems use a knowledge base (e.g. a database or a knowledge graph) to supply context alongside a query to an LLM, so that the LLM can draw on that context when answering the query.
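
The retrieve-then-prompt idea in this description can be sketched roughly as follows. This is only a toy illustration under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, and `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are hypothetical names that do not come from any particular library. A real system would send the resulting prompt to an LLM.

```python
import math
from collections import Counter

# Hypothetical in-memory knowledge base; a real one would be a database or vector store.
KNOWLEDGE_BASE = [
    "Check-in time is 3 pm and check-out time is 11 am.",
    "The wifi password is printed on the router.",
    "Pets are not allowed in the apartment.",
]

def embed(text):
    # Toy stand-in for an embedding model: lowercase whitespace-token counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, knowledge_base, k=2):
    # Rank every document by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, knowledge_base, k=2):
    # Place the retrieved context in front of the question for the LLM to use.
    context = "\n".join(f"- {doc}" for doc in retrieve(query, knowledge_base, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    print(build_prompt("what time is check-in", KNOWLEDGE_BASE, k=1))
```

Ranking every document on each query, as done here, only suits small collections; larger knowledge bases typically rely on an index rather than a full scan.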

13 questions
2
votes
1 answer

LangChain and Ollama endpoint: why is LLM considered obsolete when it's the only one that connects to the `/generate` endpoint?

Ollama has two endpoints: /api/chat and /api/generate. As stated in this Ollama GitHub issue: The /api/chat endpoint takes a history of messages and provides the next message in the conversation. This is ideal for conversations with history. The…
robertspierre
  • 216
  • 2
  • 11
2
votes
0 answers

In RAG, for a large dataset, which similarity measure works, and why? How to handle the size of the matrix in cosine similarity?

If we want to implement RAG for a large dataset, which similarity measure works, and why? Also, how do we handle the size of the matrix in cosine similarity?
user10296606
  • 1,906
  • 6
  • 18
  • 33
1
vote
0 answers

How to Get Started with Retrieval-Augmented Generation (RAG) for Research

Greetings, StackExchangers. I am a software engineer interested in exploring Retrieval-Augmented Generation (RAG) for my research. However, I am new to this field and do not have practical experience with NLP, NLU, or Deep Learning. I do have some…
1
vote
0 answers

RAG System Design: Context-Aware Customer Support for Property Management with Mixed Property-Specific and Global Information

Background I manage a property portfolio on platforms like Airbnb, handling customer support through the entire guest journey (pre-booking to post-stay). I'm building a RAG system to help automate responses to guest inquiries. Data Sources and…
dowjones123
  • 111
  • 2
1
vote
0 answers

Ollama with llama3.2-3b takes 3.5 minutes to finish a query

I built a RAG solution locally. I use two models downloaded from Ollama: nomic-embed-text (embedding model) and llama3.2:3b (LLM). For testing, I only have one PDF document of around 100 pages, which are…
Duy Bui
  • 261
  • 2
  • 5
1
vote
0 answers

Understanding the embeddings model (dunzhang/stella_en_400M_v5) by Alibaba. The details about the retrieve task and the s2s task

The model I am talking about is hosted here: From the documentation: We simplify usage of prompts, providing two prompts for most general tasks, one is for s2p, another one is for s2s. Prompt of s2p task (e.g. retrieve task): ..., Prompt of s2s…
1
vote
0 answers

Retrieval-Augmented Generation for Identifying Similar and Duplicate Controls in a Dataset

I'm exploring the feasibility of implementing a Retrieval-Augmented Generation (RAG) system to tackle a specific use case involving control identification in a dataset. The objective is to identify similar and duplicate controls within the dataset…
priyanka
  • 11
  • 4
1
vote
1 answer

Implementing Data Isolation in a RAG System in GCP using any of the LLM models

I am currently working on developing a Retrieval Augmented Generation (RAG) system where User-1 and User-2 each have their unique set of documents. My goal is to create a system where User-1's queries only receive responses from their own documents…
ENAT
  • 11
  • 2
1
vote
1 answer

How does RAG (Retrieval Augmented Generation) work around limited context length?

My understanding of the RAG pipeline can be summarized with the following diagram: I understand that steps 1-7 split and vectorize an external text data source into chunks and steps 8-11 retrieve n relevant chunks based off the user input query and…
0
votes
0 answers

Generate answer to my prompts via RAG techniques. Is it possible?

I have a document which I want to vectorize and then ask multiple prompts to it. I am expecting a natural language text generation rather than doing a cosine similarity to find the matching sentence vectors. RAG systems generally find the most…
Sand T
  • 29
  • 4
0
votes
0 answers

Document clustering and summarisation via GraphRAG

Suppose I have a corpus of documents that I want to cluster and summarise. There are an indeterminate number of parent clusters, and each parent may in turn have several tributary child clusters. I would like to identify both parent and child…
Jeff
  • 113
  • 3
0
votes
1 answer

How does RAG query affect the similarity search?

I have a RAG pipeline where I want to extract a piece of information called "X". In a regular RAG pipeline, there is a query entered by the user. Then, this query will be embedded, and the resulting embedding vector will be compared by some metric…