As new model releases support longer and longer context windows, there is a lot of discussion around whether RAG is still relevant.
RAG is here to stay for a while:
(1) Enterprises have far more data than will reasonably fit in a context window any time soon
(2) Even if you can technically fit 1M tokens into the context, that does not mean the model can effectively use all of them
(3) Longer input = higher latency and cost for inference
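To put rough numbers on point (3), here is a back-of-the-envelope sketch. The price and token counts are made-up placeholders (not any vendor's real pricing), and it assumes input cost scales linearly with token count:

```python
# All numbers below are hypothetical placeholders, not real vendor pricing.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # assumed USD price, for illustration only


def input_cost(tokens_per_query: int, queries: int) -> float:
    """Input-token cost in USD, assuming cost scales linearly with tokens."""
    return tokens_per_query * queries * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000


# Scenario A: stuff a full 1M-token context into every query.
full_context = input_cost(tokens_per_query=1_000_000, queries=1_000)

# Scenario B: retrieve a few relevant chunks (assumed ~4K tokens) per query.
retrieved = input_cost(tokens_per_query=4_000, queries=1_000)

print(f"full context:     ${full_context:,.2f}")
print(f"retrieved chunks: ${retrieved:,.2f}")
print(f"ratio:            {full_context / retrieved:.0f}x")
```

Under these assumptions the gap is simply the ratio of tokens sent per query, so the exact price matters less than how much context each call carries. Latency is not modeled here.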
Would love any other thoughts on the topic!
I think there is a strong use case for custom knowledge bases. Organizing knowledge has always been hard, and wikis and similar tools only go so far.