boltless.me/zoekt
fork of https://github.com/sourcegraph/zoekt
zoekt/internal/e2e at 5c1352c15efb3fd226d314ea8347d000de98dc0c
2 folders
5 files
Keegan Carruthers-Smith
all: run modernize across codebase (#919)
1y ago
36d24194
examples
ranking: downweight binary files (#924)

In testing, I noticed another problem with BM25: sometimes a binary file is ranked highly because of a match on its filename. In classic Zoekt scoring, these are ranked low because they are skipped, and we always sort skipped docs to the end of the index. This PR ensures they're also ranked low for BM25 by adding a 'binary' category, and marking it low priority.

Adding this category required updating `SkipReason` to track the reason a document was skipped. This is necessary because we set the content of skipped docs to `nil`, and `SkipReason` is the only lasting indication that it was binary.
1 year ago
testdata
scoring: use repo freshness as tiebreaker (#832)

We ignore priority and instead use the latest commit date as repo rank. This has a big impact for Sourcegraph because it means we switch from star count to repo freshness as tiebreaker.

As a minor tweak, we also separate query-based scores from tiebreakers. To achieve this we reserve the last 7 digits of a score for tiebreakers:
- 5 digits (maxUint16) for repo rank
- 2 digits ([0,10]) for file order

Example:
Before: score: 8775.35 <- atom(2):200, fragment:8550.00, repo-rank: 19, doc-order:6.35
After: score: 8750_00019_06.35 <- atom(2):200, fragment:8550.00, repo-rank: 19, doc-order:6.35
2 years ago
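The digit layout described in the #832 commit message can be sketched in Go as follows. This is only an illustration of the arithmetic, not Zoekt's actual implementation: the names `composeScore`, `formatScore`, `queryScale`, and `repoRankScale` are hypothetical, and the sketch assumes the query-based score is simply shifted left by seven decimal digits.

```go
package main

import "fmt"

// Hypothetical constants, named here for illustration only.
const (
	repoRankScale = 100        // leaves two integer digits for doc order
	queryScale    = 10_000_000 // leaves seven integer digits for tiebreakers
)

// composeScore packs a query-based score and the two tiebreakers into a
// single float64, so the query score always dominates, then repo rank,
// then doc order.
func composeScore(queryScore float64, repoRank uint16, docOrder float64) float64 {
	return queryScore*queryScale + float64(repoRank)*repoRankScale + docOrder
}

// formatScore renders the three parts with underscores, mirroring the
// "After" notation in the commit message.
func formatScore(queryScore float64, repoRank uint16, docOrder float64) string {
	return fmt.Sprintf("%.0f_%05d_%05.2f", queryScore, repoRank, docOrder)
}

func main() {
	// atom(2):200 + fragment:8550.00 gives a query score of 8750.
	fmt.Println(formatScore(8750, 19, 6.35)) // 8750_00019_06.35

	// A higher query score wins even against the largest tiebreakers.
	fmt.Println(composeScore(8751, 0, 0) > composeScore(8750, 65535, 10)) // true
}
```

Because the tiebreakers occupy their own digits, two documents with the same query score are ordered first by repo rank and only then by doc order, which matches the separation the commit message describes.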
doc.go
all: run modernize across codebase (#919)

The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs.

https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0

Generated with:
go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...

Test Plan: CI
1 year ago
e2e_index_test.go
all: run modernize across codebase (#919)

The latest release of gopls has a feature called modernize which will update your code where it can to use modern go features/pkgs.

https://github.com/golang/tools/releases/tag/gopls%2Fv0.18.0

Generated with:
go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...

Test Plan: CI
1 year ago
e2e_rank_test.go
Move root-level index code to index package (#902)

In the repo root, we have a bunch of low level logic around index building and searching. So we end up exposing internal logic through the main public `zoekt` package, for example `zoekt.Merge(...)`.

This PR moves it into the `build` package, so all code related to index building lives together. It then renames `build` to `index` to reflect the broader focus on indexing and searching the index.
1 year ago
e2e_test.go
zoekt-archive-index: split out ranking tests and archive indexing (#712)

We had ranking e2e tests living in the zoekt-archive-index cmd for convenience since that contained useful functions for indexing a remote tarball from the GitHub API. This commit splits the archive functionality into a new internal/archive package and the ranking tests into a new internal/e2e package.

The zoekt-archive-index code is now quite minimal. This is similar to how zoekt-git-index mostly just calls out to the gitindex package. What is different is that the archive package is marked internal, unlike gitindex. gitindex should also be internal, but the code predates Go's support for internal.

I suspect more of our e2e tests will end up in this package.

Test Plan: go test ./...
2 years ago
scoring_test.go
ranking: downweight binary files (#924)

In testing, I noticed another problem with BM25: sometimes a binary file is ranked highly because of a match on its filename. In classic Zoekt scoring, these are ranked low because they are skipped, and we always sort skipped docs to the end of the index. This PR ensures they're also ranked low for BM25 by adding a 'binary' category, and marking it low priority.

Adding this category required updating `SkipReason` to track the reason a document was skipped. This is necessary because we set the content of skipped docs to `nil`, and `SkipReason` is the only lasting indication that it was binary.
1 year ago