fork of https://github.com/sourcegraph/zoekt

gitindex: correctly skip UTF-8 BOM (#230)

Previously we called buf.ReadRune to detect if we had a BOM. However,
buf.ReadRune on the BOM just consumes the first byte and returns
"\uFFFD", so this code worked only by accident. In the case of the BOM,
the UnreadRune call actually returned an error which we didn't check.

This updates the code to correctly detect a BOM. Additionally, it no
longer has to rely on reading and then unreading. Instead we can peek at
what is remaining, since this is a bytes.Buffer.

There is a risk that some submodule files start with bytes other than
the exact BOM we now detect, and that the previous code happened to skip
over those bytes and parse the files anyway. So when this code rolls out
we should monitor production.

+10 -8
gitindex/submodule.go
@@ -17,7 +17,6 @@
 import (
 	"bytes"
 	"fmt"
-	"io"
 
 	"github.com/go-git/go-git/v5/plumbing/format/config"
 )
@@ -36,13 +35,8 @@
 	// Handle the possibility that .gitmodules has a UTF-8 BOM, which would
 	// otherwise break the scanner.
 	// https://stackoverflow.com/a/21375405
-	r, _, err := buf.ReadRune()
-	if err != nil && err != io.EOF {
-		return nil, fmt.Errorf("buf.ReadRune: %w", err)
-	}
-	if r != '\uFEFF' {
-		buf.UnreadRune()
-	}
+	skipIfPrefix(buf, []byte("\uFEFF"))
+
 	dec := config.NewDecoder(buf)
 	cfg := &config.Config{}
 
@@ -76,3 +70,11 @@
 
 	return result, nil
 }
+
+// skipIfPrefix will detect if the unread portion of buf starts with
+// prefix. If it does, it will read over those bytes.
+func skipIfPrefix(buf *bytes.Buffer, prefix []byte) {
+	if bytes.HasPrefix(buf.Bytes(), prefix) {
+		buf.Next(len(prefix))
+	}
+}