boltless.me/zoekt
fork of https://github.com/sourcegraph/zoekt
zoekt/testdata/shards at 456196a74a627db053099c7cb7795380bf1be645
5 files
Keegan Carruthers-Smith
repoID bitmap for speeding up findShard in compound shards (#899)
1y ago
456196a7
ctagsrepo_v16.00000.zoekt
repoID bitmap for speeding up findShard in compound shards (#899)

We add a new section to shards which contains a roaring bitmap for quickly checking if a shard contains a repo ID. We then can load just this (small amount) of data to rule out a compound shard. We use roaring bitmaps since we already have that dependency in our codebase.

The reason we speed up this operation is we found on a large instance which contained thousands of tiny repos we spent so much time in findShard that our indexing queue would always fall behind. It is possible this new section won't speed this up enough and we need some sort of global oracle (or in-memory cache in indexserver?). This is noted in the code for future travellers.

Test Plan: the existing unit tests already cover if this is forwards and backwards compatible. Additionally I added some logging to zoekt to test if older version of shards still work correctly in findShard, as well as if older versions of zoekt can read the new shards. Added a benchmark to check the impact. See comments in the code.

Co-authored-by: Stefan Hengl <stefan@sourcegraph.com>
1 year ago
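The idea in the commit message above (consult a small per-shard set of repo IDs before doing any expensive work on a compound shard) can be sketched as follows. This is a minimal illustration under stated assumptions, not zoekt's code: `repoIDSet` is a plain bitset standing in for the roaring bitmap, and `shard`, `newShard`, and `candidateShards` are hypothetical names invented for this example.

```go
package main

import "fmt"

// repoIDSet is a plain bitset used here as a stand-in for a roaring bitmap.
// It is a hypothetical illustration type, not part of zoekt.
type repoIDSet []uint64

// add marks id as present, growing the bitset as needed.
func (s *repoIDSet) add(id uint32) {
	word := int(id / 64)
	for len(*s) <= word {
		*s = append(*s, 0)
	}
	(*s)[word] |= uint64(1) << (id % 64)
}

// contains reports whether id was added to the set.
func (s repoIDSet) contains(id uint32) bool {
	word := int(id / 64)
	return word < len(s) && s[word]&(uint64(1)<<(id%64)) != 0
}

// shard is a hypothetical stand-in for a compound shard: only its name and
// the small repo ID section are modelled.
type shard struct {
	name    string
	repoIDs repoIDSet
}

// newShard builds a shard whose repo ID section contains ids.
func newShard(name string, ids ...uint32) shard {
	s := shard{name: name}
	for _, id := range ids {
		s.repoIDs.add(id)
	}
	return s
}

// candidateShards returns the names of shards whose repo ID section
// contains id. Every other shard is ruled out without further inspection.
func candidateShards(shards []shard, id uint32) []string {
	var out []string
	for _, s := range shards {
		if s.repoIDs.contains(id) {
			out = append(out, s.name)
		}
	}
	return out
}

func main() {
	shards := []shard{
		newShard("compound-a", 1, 2, 3),
		newShard("compound-b", 70, 200),
	}
	fmt.Println(candidateShards(shards, 70))
}
```

A real roaring bitmap compresses sparse ID ranges far better than this flat bitset; the sketch only shows the membership check used to skip shards.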
ctagsrepo_v17.00000.zoekt
all: compound shard support (#95)

This commit adds support for compound shards. A shard now has multiple repositories associated with it, rather than always one. Most of zoekt is document based, so minimal changes are required in the core search evaluation codepath. The only change here is the addition of a mapping from document to repo and storing the subrepo paths per repo.

The other change is the addition of tombstones. A tombstoned repository is hidden from List and Search results. This was added so we can index a new version of a repository in a compound shard without the need of recomputing the whole shard.

This commit is mostly focussed on the read path. It ensures everything keeps working correctly once compound shards are introduced. However, the write path is mostly missing. We add a merge command for manual merging. However, zoekt-sourcegraph-indexserver is mostly unaware of compound shards and has no way to mutate them. This will be follow-up work.

To support compound shards we had to bump the indexed format version. This is since the repoMetaData field is not backwards compatible. However, we know we plan on making other changes to the index format. So we introduced NextIndexFormatVersion. With this change we will continue to use v16, unless a common opts into v17. This will allow us to effectively feature flag the new format while we work on it.

Co-authored-by: Stefan Hengl <stefan@sourcegraph.com>
4 years ago
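The two read-path changes described in this commit message, a document-to-repo mapping and tombstones that hide a repository from List and Search, can be sketched roughly as below. All type and method names (`repo`, `document`, `compoundShard`, `list`, `search`) are hypothetical and chosen for illustration; zoekt's actual index structures differ.

```go
package main

import (
	"fmt"
	"strings"
)

// repo is a hypothetical repository entry inside a compound shard.
type repo struct {
	name       string
	tombstoned bool // a tombstoned repo is hidden from list and search
}

// document carries the index of the repo it belongs to, which models the
// document-to-repo mapping mentioned in the commit message.
type document struct {
	path    string
	content string
	repoIdx int // index into compoundShard.repos
}

// compoundShard holds several repos and the documents of all of them.
type compoundShard struct {
	repos []repo
	docs  []document
}

// list returns the names of all repos that are not tombstoned.
func (s compoundShard) list() []string {
	var names []string
	for _, r := range s.repos {
		if !r.tombstoned {
			names = append(names, r.name)
		}
	}
	return names
}

// search returns "repo:path" for every document containing substr,
// skipping documents whose repo is tombstoned.
func (s compoundShard) search(substr string) []string {
	var hits []string
	for _, d := range s.docs {
		r := s.repos[d.repoIdx]
		if r.tombstoned {
			continue
		}
		if strings.Contains(d.content, substr) {
			hits = append(hits, r.name+":"+d.path)
		}
	}
	return hits
}

func main() {
	s := compoundShard{
		repos: []repo{{name: "stale", tombstoned: true}, {name: "fresh"}},
		docs: []document{
			{path: "main.go", content: "func main() {}", repoIdx: 0},
			{path: "main.go", content: "func main() {}", repoIdx: 1},
		},
	}
	fmt.Println(s.list())
	fmt.Println(s.search("func main"))
}
```

Setting the tombstone flag hides a repository's documents without rewriting the rest of the shard, which matches the motivation the commit message gives for tombstones.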
repo17_v17.00000.zoekt
compute precise language information with go-enry for lang: queries (#220)

use go-enry to compute more precise language information than ctags
make lang: use filename fallback for older index versions
4 years ago
repo2_v16.00000.zoekt
repoID bitmap for speeding up findShard in compound shards (#899)
1 year ago
repo_v16.00000.zoekt
repoID bitmap for speeding up findShard in compound shards (#899)
1 year ago