sort ngrams before looking them up (#617)
We believe this will improve the performance of the btree lookups. We are
investigating this to make it faster to rule out a shard (when freq==0).
Testing locally on a large corpus, sorting halved the time spent in IO.
Locally, Sort shows up significantly in the profiles, but two facts
mitigate that:
- Locally my file page cache is primed, so IO rarely goes to disk.
- We will likely implement an IR for Zoekt, which will amortize the Sort
to once per search rather than once per shard.
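For illustration only, a minimal sketch of the idea, not the code in this
change. It assumes a toy sorted index standing in for the btree; the names
ngram, index and canMatch are hypothetical. Sorting the query ngrams means
the lookups walk the index in ascending order, so each search can start
where the previous one ended, and a shard is ruled out on the first ngram
with freq==0.

```go
package main

import (
	"fmt"
	"sort"
)

// ngram is a hypothetical stand-in for the ngram key type.
type ngram uint64

// index is a toy sorted index (assumption: a stand-in for the btree).
// keys is ascending and freqs[i] is the frequency of keys[i].
type index struct {
	keys  []ngram
	freqs []uint32
}

// canMatch reports whether every ngram has a non-zero frequency in idx.
// It returns false as soon as one ngram is missing or has freq==0.
func canMatch(idx index, ngrams []ngram) bool {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]ngram(nil), ngrams...)
	sort.Slice(sorted, func(a, b int) bool { return sorted[a] < sorted[b] })

	lo := 0 // lookups are ascending, so never search before this position
	for _, ng := range sorted {
		// Binary search restricted to keys[lo:].
		i := lo + sort.Search(len(idx.keys)-lo, func(j int) bool {
			return idx.keys[lo+j] >= ng
		})
		if i == len(idx.keys) || idx.keys[i] != ng || idx.freqs[i] == 0 {
			return false // rule out the shard
		}
		lo = i
	}
	return true
}

func main() {
	idx := index{
		keys:  []ngram{10, 20, 30, 40},
		freqs: []uint32{3, 1, 0, 5},
	}
	fmt.Println(canMatch(idx, []ngram{40, 10, 20})) // true
	fmt.Println(canMatch(idx, []ngram{40, 30}))     // false: freq of 30 is 0
}
```

In this sketch the Sort cost is paid on every call, which mirrors the
once-per-shard cost noted above.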
Test Plan: go test ./... and performance profiling via ./cmd/zoekt.
Co-authored-by: Stefan Hengl <stefan@sourcegraph.com>