fork of https://github.com/sourcegraph/zoekt

fix: don't modify finalCands (#773)

While working on ranking, I noticed that sum-tf is wrong when we have both filename and content matches.

We use `finalCands` in our BM25 scoring. However, `fillChunkMatches` and `fillMatches` modify `finalCands` in place (`contentMatches := ms[:0]` shares its backing array with `ms`), which can lead to surprising scores.

Test plan:
updated unit test

+4 -4
+2 -2
build/scoring_test.go
@@ -77,8 +77,8 @@
 		query: &query.Substring{Pattern: "example"},
 		content: exampleJava,
 		language: "Java",
-		// keyword-score:1.63 (sum-tf: 6.00, length-ratio: 2.00)
-		wantScore: 1.63,
+		// keyword-score:1.69 (sum-tf: 7.00, length-ratio: 2.00)
+		wantScore: 1.69,
 	}, {
 		// Matches only on content
 		fileName: "example.java",
+2 -2
contentprovider.go
@@ -147,7 +147,7 @@
 // returned by the API it needs to be copied.
 func (p *contentProvider) fillMatches(ms []*candidateMatch, numContextLines int, language string, debug bool) []LineMatch {
 	var filenameMatches []*candidateMatch
-	contentMatches := ms[:0]
+	contentMatches := make([]*candidateMatch, 0, len(ms))
 
 	for _, m := range ms {
 		if m.fileName {
@@ -194,7 +194,7 @@
 // returned by the API it needs to be copied.
 func (p *contentProvider) fillChunkMatches(ms []*candidateMatch, numContextLines int, language string, debug bool) []ChunkMatch {
 	var filenameMatches []*candidateMatch
-	contentMatches := ms[:0]
+	contentMatches := make([]*candidateMatch, 0, len(ms))
 
 	for _, m := range ms {
 		if m.fileName {