I have a RAG pipeline where I want to extract a piece of information called "X". In a regular RAG pipeline, the user enters a query, the query is embedded, and the resulting embedding vector is compared, using some metric (e.g., cosine similarity), to the embeddings of the stored documents.
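To make the retrieval step concrete, here is a minimal sketch of what I mean. The vectors are made up by hand and the names (`cosine_similarity`, `retrieve`, `doc_vecs`) are only illustrative; in a real pipeline the vectors would come from an embedding model and would likely live in a vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors, avoids division by zero
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, top_k=2):
    """Return the ids of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]


# Hand-made 3-dimensional "embeddings" (assumption: purely illustrative).
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

top = retrieve(query_vec, doc_vecs)
print(top)
```

Whatever text goes into the query embedding decides which documents end up at the top of this ranking, which is why the wording of the query matters to me.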
If I write the query like this: "What information does this document contain about X?", I would expect the similarity search to return worse results than with a query containing just "X".
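The intuition behind that expectation can be sketched with a toy bag-of-words "embedding" (word counts compared by cosine similarity). This is an assumption made only for illustration: a neural embedding model is not a word counter and may treat the extra words very differently, so the sketch shows the intuition, not how a real model behaves. The example chunk and the helper names `bow` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Toy 'embedding': lowercase word counts with basic punctuation removed."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stored chunk that talks about X.
chunk = bow("X is measured in volts and X is stored in the header")

keyword_query = bow("X")
question_query = bow("What information does this document contain about X?")

score_keyword = cosine(keyword_query, chunk)
score_question = cosine(question_query, chunk)

print(f"keyword query:  {score_keyword:.3f}")
print(f"question query: {score_question:.3f}")
```

In this toy setting the extra words in the question do not occur in the chunk, so they only lengthen the query vector and lower its similarity score. Whether the same thing happens with a trained embedding model is exactly what I am unsure about.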
My question is: why is the query entered in question form? And if it is not in question form, will it produce better or worse results, and why?