boltless.me/zoekt
fork of https://github.com/sourcegraph/zoekt
zoekt/index at 83afd4cccb397632f22699feeb7b8acc56d8de86
41 files
John Mason
Cache docMatchTree results for Meta conditions (#982)
9mo ago
98307ca8
bits.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
bits_test.go
Fix compilation on 32 bit architectures (#936) This PR fixes a bug where Zoekt would not compile on 32-bit architectures. It also takes the opportunity to start using the `math` library everywhere instead of our own constants like `maxUInt32` to help prevent this sort of issue in the future by encouraging devs to select the most accurate "max" type for their specific situation. Closes https://github.com/sourcegraph/zoekt/issues/935
1 year ago
btree.go
all: run modernize across codebase (#919) The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs. https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0 Generated with: go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./... Test Plan: CI
1 year ago
btree_test.go
all: run modernize across codebase (#919) The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs. https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0 Generated with: go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./... Test Plan: CI
1 year ago
builder.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
builder_test.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
contentprovider.go
all: run modernize across codebase (#919) The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs. https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0 Generated with: go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./... Test Plan: CI
1 year ago
contentprovider_test.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
ctags.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
ctags_test.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
docmatchtreecache.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
docmatchtreecache_test.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
document.go
ranking: downweight binary files (#924) In testing, I noticed another problem with BM25: sometimes a binary file is ranked highly because of a match on its filename. In classic Zoekt scoring, these are ranked low because they are skipped, and we always sort skipped docs to the end of the index. This PR ensures they're also ranked low for BM25 by adding a 'binary' category, and marking it low priority. Adding this category required updating `SkipReason` to track the reason a document was skipped. This is necessary because we set the content of skipped docs to `nil`, and `SkipReason` is the only lasting indication that it was binary.
1 year ago
eval.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
eval_test.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
file_category.go
ranking: downweight binary files (#924) In testing, I noticed another problem with BM25: sometimes a binary file is ranked highly because of a match on its filename. In classic Zoekt scoring, these are ranked low because they are skipped, and we always sort skipped docs to the end of the index. This PR ensures they're also ranked low for BM25 by adding a 'binary' category, and marking it low priority. Adding this category required updating `SkipReason` to track the reason a document was skipped. This is necessary because we set the content of skipped docs to `nil`, and `SkipReason` is the only lasting indication that it was binary.
1 year ago
file_category_test.go
ranking: incorporate file signals into BM25F (#922) This PR reworks the way we incorporate file signals into BM25. Previously, we were applying them as a tie-breaker. But in dogfooding, we found that these rarely impact results, because it's so rare to have a tie in BM25 scores. Now, we take the file signal into account when computing BM25F. The interpretation is that this data lives in a separate 'field' that is half the priority of regular content.
1 year ago
hititer.go
Fix compilation on 32 bit architectures (#936) This PR fixes a bug where Zoekt would not compile on 32-bit architectures. It also takes the opportunity to start using the `math` library everywhere instead of our own constants like `maxUInt32` to help prevent this sort of issue in the future by encouraging devs to select the most accurate "max" type for their specific situation. Closes https://github.com/sourcegraph/zoekt/issues/935
1 year ago
hititer_test.go
refactor(all): goimports -w -local github.com/sourcegraph/zoekt (#948)
1 year ago
index_test.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
indexdata.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
indexdata_test.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
indexfile.go
Allow building on FreeBSD (#968) This fixes https://github.com/sourcegraph/zoekt/issues/967. I reported that also here: https://gitlab.com/gitlab-org/gitlab-zoekt-indexer/-/issues/91
11 months ago
limit.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
limit_test.go
refactor(all): goimports -w -local github.com/sourcegraph/zoekt (#948)
1 year ago
matchiter.go
Fix compilation on 32 bit architectures (#936) This PR fixes a bug where Zoekt would not compile on 32-bit architectures. It also takes the opportunity to start using the `math` library everywhere instead of our own constants like `maxUInt32` to help prevent this sort of issue in the future by encouraging devs to select the most accurate "max" type for their specific situation. Closes https://github.com/sourcegraph/zoekt/issues/935
1 year ago
matchiter_test.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
matchtree.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
matchtree_test.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
merge.go
Add support for indexing and searching custom fields for repositories (#962) At GitLab, we encountered limitations when searching within large namespaces containing thousands of repositories. Specifically, we cannot pass a complete list of RepoIDs due to size constraints. This change introduces support for indexing and searching on custom repository metadata by extending Repository to include an additional Metadata field. All fields within Repository.Metadata are searchable using a regular expression evaluator. This enables more scalable filtering by allowing clients to express regular expression prefix queries on metadata fields, such as: traversal_ids:123-456-.* Or any field really: haystack:nee.*le
1 year ago
merge_test.go
refactor(all): goimports -w -local github.com/sourcegraph/zoekt (#948)
1 year ago
read.go
Cache docMatchTree results for Meta conditions (#982)
9 months ago
read_test.go
index: decide between tenant and non-tenant shard name in one place (#953) We have two places with duplicated logic around how it decides the layout of shards on disk. This now moves that decision into one place. Additionally we can now unexport index.ShardName. It was only used in one place outside the package, and that was easy to replace with a hardcoded string since it is just a test. Test Plan: Just CI. This has no actual change in functionality, just refactoring.
1 year ago
score.go
refactor(all): goimports -w -local github.com/sourcegraph/zoekt (#948)
1 year ago
section.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
shard_builder.go
copy languages package from Sourcegraph to Zoekt (#979) We want Zoekt and Sourcegraph to use the same language package. In this PR we move the languages package from Sourcegraph to Zoekt, so that Zoekt can use it and Sourcegraph can import it. Notes: - Zoekt doesn't need to fetch content async which is why I added a little helper func `GetLanguagesFromContent` to make the call sites in Zoekt less awkward. - Sourcegraph's languages package always classified .cls files as Apex, while Zoekt did a content based check. With this PR we follow Zoekt's approach. Specifically, I removed .cls from `unsupportedByEnryExtensionToNameMap`. I added an additional unit test to cover this case. Test plan: I appended the test cases from the old Zoekt languages packages to the tests I copied over from Sourcegraph
10 months ago
shard_builder_test.go
index: decide between tenant and non-tenant shard name in one place (#953) We have two places with duplicated logic around how it decides the layout of shards on disk. This now moves that decision into one place. Additionally we can now unexport index.ShardName. It was only used in one place outside the package, and that was easy to replace with a hardcoded string since it is just a test. Test Plan: Just CI. This has no actual change in functionality, just refactoring.
1 year ago
toc.go
ranking: incorporate file signals into BM25F (#922) This PR reworks the way we incorporate file signals into BM25. Previously, we were applying them as a tie-breaker. But in dogfooding, we found that these rarely impact results, because it's so rare to have a tie in BM25 scores. Now, we take the file signal into account when computing BM25F. The interpretation is that this data lives in a separate 'field' that is half the priority of regular content.
1 year ago
tombstones.go
all: run modernize across codebase (#919) The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs. https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0 Generated with: go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./... Test Plan: CI
1 year ago
tombstones_test.go
Move root-level index code to index package (#902) In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`. This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
write.go
all: run modernize across codebase (#919) The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs. https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0 Generated with: go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./... Test Plan: CI
1 year ago