The problem with CAG is not just that it hogs memory, but that keeping it fresh means constant re-indexing. If the corpus is large and dynamic, the cache can easily fall out of date and, at runtime, overflow the context window.
GraphRAG has some promise. NVIDIA has a playbook for converting text into a knowledge graph: https://build.nvidia.com/spark/txt2kg
It’ll probably have the same re-indexing issues, but that will be a common problem until someone comes up with better incremental training/indexing.