This is a cool addition.
It’s more than just chunking too.
Generally, lexical search algorithms like BM25 and TF-IDF are tailored for a world of whole documents. Meanwhile, many modern embedding models and semantic retrieval methods benefit from smaller text chunks (and maybe prefix/suffix… https://x.com/vespaengine/status/1924494592205791450
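
To make the whole-document vs. chunk contrast concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the helper names `chunk_words` and `bm25_scores` are hypothetical, the overlapping word-window chunking is just one possible policy, and the BM25 formula is the common variant with a non-negative IDF. It is not how Vespa or any particular engine implements this.

```python
import math
from collections import Counter


def chunk_words(text, size=4, overlap=1):
    """Split text into overlapping word windows (one hypothetical chunking policy)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the previous window.
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each whole document against the query with a basic BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "BM25 ranks whole documents using term frequency and document length normalization",
    "Embedding models often work better on short passages than on long pages",
]

# Lexical side: statistics are computed over whole documents.
doc_scores = bm25_scores("bm25 term frequency", docs)

# Semantic side: these chunks are the units you would hand to an embedding model.
chunks = chunk_words(docs[0], size=4, overlap=1)
```

The point of the sketch is only that the two sides want different units: BM25's length normalization and document frequencies assume whole documents, while the chunk list is what would be embedded.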