Highest Voted 'rag' Questions - Data Science Stack Exchange

2

votes

1 answer

LangChain and Ollama endpoint: why LLM is considered obsolete when it's the only one that connect to the `/generate` endpoint?

Ollama has two endpoints: /api/chat and /api/generate. As stated in this Ollama gitHub issue: The /api/chat endpoint takes a history of messages and provides the next message in the conversation. This is ideal for conversations with history. The…

asked Jun 10 '25 at 15:51

robertspierre

216
2
11

2

votes

0 answers

in RAG, for large dataset, which similarity works? Why? how to handle problem with size of matrix in cosine similarity?

If we want to implement RAG for large dataset, which similarity works? Why? Also, how to handle problem with size of matrix in cosine similarity?

similarity llm semantic-similarity rag vector-database

asked Feb 11 '25 at 05:15

user10296606

1,906
6
18
33

1

vote

0 answers

How to Get Started with Retrieval-Augmented Generation (RAG) for Research

Greeting StackExchanger's, I am a software engineer interested in exploring Retrieval-Augmented Generation (RAG) for my research. However, I am new to this field and do not have practical experience with NLP, NLU, or Deep Learning. I do have some…

data-science-model information-retrieval llm research rag

asked Dec 01 '24 at 14:58

Usama Bukhari

11
2

1

vote

0 answers

RAG System Design: Context-Aware Customer Support for Property Management with Mixed Property-Specific and Global Information

Background I manage a property portfolio on platforms like Airbnb, handling customer support through the entire guest journey (pre-booking to post-stay). I'm building a RAG system to help automate responses to guest inquiries. Data Sources and…

embeddings llm vector-database rag

asked Nov 12 '24 at 20:08

dowjones123

111
2

1

vote

0 answers

Ollama with llama3.2-3b - take 3.5 minutes to finish a query

I build a RAG solution on local. I use 2 models, downloaded from Ollama nomic-embed-text (embedding model) llama3.2:3b (llm model) For testing, I only have one pdf document of around 100 pages, which are…

llm rag

asked Nov 05 '24 at 16:24

Duy Bui

261
2
5

1

vote

0 answers

Understanding the embeddings model (dunzhang/stella_en_400M_v5) by Alibaba. The details about the retrieve task and the s2s task

The model I am talking about is hosted here: From the documentation: We simplify usage of prompts, providing two prompts for most general tasks, one is for s2p, another one is for s2s.Prompt of s2p task(e.g. retrieve task): ..., Prompt of s2s…

embeddings generative-models information-retrieval rag

asked Aug 21 '24 at 07:24

figs_and_nuts

903
1
5
17

1

vote

0 answers

Retrieval-Augmented Generation for Identifying Similar and Duplicate Controls in a Dataset

I'm exploring the feasibility of implementing a Retrieval-Augmented Generation (RAG) system to tackle a specific use case involving control identification in a dataset. The objective is to identify similar and duplicate controls within the dataset…

llm rag

asked Apr 29 '24 at 06:21

priyanka

11
4

1

vote

1 answer

Implementing Data Isolation in an RAG System in GCP using any of the LLM models

I am currently working on developing a Retrieval Augmented Generation (RAG) system where User-1 and User-2 each have their unique set of documents. My goal is to create a system where User-1's queries only receive responses from their own documents…

nlp information-retrieval llm rag

asked Apr 24 '24 at 07:45

ENAT

11
2

1

vote

1 answer

How does RAG (Retrieval Augmented Generation ) work around limited context length?

My understanding of the RAG pipeline can be summarized with the following diagram: I understand steps 1-7 splits and vectorizes an external text data source into chunks and steps 8-11 retrieves n relevant chunks based off the user input query and…

information-retrieval llm text-generation prompt-engineering rag

asked Jan 02 '24 at 22:24

Clement

11
3

0

votes

0 answers

Generate answer to my prompts via RAG techniques. Is it possible?

I have a document which I want to vectorize and then ask multiple prompts to it. I am expecting a natural language text generation rather than doing a cosine similarity to find the matching sentence vectors. RAG systems generally find the most…

llm rag

asked Sep 17 '24 at 06:55

Sand T

29
4

0

votes

0 answers

Document clustering and summarisation via GraphRAG

Suppose I have a corpus of documents that I want to cluster and summarise. There are an indeterminate number of parent clusters, and each parent may in turn have several tributary child clusters. I would like to identify both parent and child…

embeddings llm knowledge-graph rag

asked Jul 24 '24 at 05:01

Jeff

113
3

0

votes

1 answer

How does RAG query affect the similarity search?

I have a RAG pipeline where I want to extract a piece of information called "X" In a regular RAG pipeline, there is a query entered by the user. Then, this query will be embedded, and the resulting embedding vector will be compared by some metric…

similarity information-retrieval llm semantic-similarity rag

asked Jan 12 '24 at 20:26

ahmedmoh123

3
1

Questions tagged [rag]