Do LLMs actually want to be retrieved? Or are we just forcing them to fake memory? #19539
Unanswered
onestardao asked this question in Q&A
Replies: 1 comment
-
100% agreed. The issue is that we are confusing search with cognition. Vector stores are great at finding "similar-sounding" text, but terrible at understanding relationships. That's where the context spam comes from: the model gets five chunks that share keywords but lack the causal link the user is actually asking about.

I've found that moving away from pure vector stores toward Knowledge Graphs (GraphRAG) helps a bit. It structures the data in a way that maps closer to how an LLM (and a human) actually follows a thread of thought, rather than just doing a fuzzy keyword match.
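To make that concrete, here's a toy sketch in plain Python with networkx — no particular GraphRAG library, and the triples, entity names, and `graph_retrieve` helper are all made up for illustration. The idea is that retrieval walks explicit relations outward from a query entity instead of ranking chunks by embedding similarity:

```python
import networkx as nx

# Toy knowledge graph built from explicit (subject, relation, object)
# triples. In a real GraphRAG pipeline an LLM extraction pass would
# produce these from the source documents.
G = nx.DiGraph()
triples = [
    ("checkout outage", "caused_by", "bad deploy"),
    ("bad deploy", "introduced_in", "release 4.2"),
    ("release 4.2", "rolled_back_by", "on-call team"),
]
for subj, rel, obj in triples:
    G.add_edge(subj, obj, relation=rel)

def graph_retrieve(entity: str, hops: int = 3) -> list[str]:
    """Walk relations outward from the query entity, returning the
    chain of facts in order: the causal thread, not just lookalikes."""
    facts, frontier = [], [entity]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for _, obj, data in G.out_edges(node, data=True):
                facts.append(f"{node} --{data['relation']}--> {obj}")
                next_frontier.append(obj)
        frontier = next_frontier
    return facts

print(graph_retrieve("checkout outage"))
# ['checkout outage --caused_by--> bad deploy',
#  'bad deploy --introduced_in--> release 4.2',
#  'release 4.2 --rolled_back_by--> on-call team']
```

The toy code obviously isn't the point; the point is that the retriever hands back a connected chain (outage -> deploy -> release) instead of k disconnected chunks that merely share the word "outage".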
-
Hey folks — I’ve been spending a lot of time trying to get RAG stacks to feel... well, natural.
But the more I build, the more it feels like I’m forcing the model to pretend it remembers stuff — when in reality it never asked to remember anything at all.
Like, we're injecting these retrieval chunks mid-convo, praying they make sense...
but it often just feels like it’s hallucinating politely.
“Thank you for the irrelevant context, I’ll now proceed to make up something nice about it.”
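Concretely, the loop I'm describing is something like this — a generic sketch, where `store`, `llm`, and their methods are stand-ins rather than any specific framework's API:

```python
# A generic sketch of the "indexed chunk + vector store" loop.
# `store`, `llm`, `similarity_search`, and `complete` are stand-ins,
# not any particular framework's API.
def answer(query: str, store, llm, k: int = 5) -> str:
    # 1. Embed the query and pull the k most similar chunks.
    #    "Similar" means shared surface features, nothing deeper.
    chunks = store.similarity_search(query, k=k)

    # 2. Stuff those chunks into the prompt mid-conversation.
    #    The model never chose to remember any of this; it just appears.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

    # 3. The model is implicitly pressured to use whatever it was
    #    handed, relevant or not: the "hallucinating politely" failure.
    return llm.complete(prompt)
```

Nothing in that loop ever asks whether the chunks form a coherent thought; step 2 just concatenates whatever step 1 scored highest.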
Is this an alignment issue? Or just the wrong retrieval paradigm?
I get that retrieval is powerful.
But what if the whole "indexed chunk + vector store" model is fundamentally misaligned with how LLMs actually process the flow of a conversation?
Are there alternatives being explored here — or ways to make it feel more like real cognition and less like context spam?
Would love to hear from folks actually shipping things with LlamaIndex — what pain have you run into?
And is it just me, or do the elegant demos start to fall apart when you push it past toy scale?
No links, no plug, no pitch — just trying to think through the shape of the problem.