boltless.me/zoekt
fork of https://github.com/sourcegraph/zoekt
zoekt/build at e068116194eadd9ef4d619fc53371289bf317d58
1 folder
7 files
Julie Tibshirani
Indexing: improve skipped doc handling (#687)
2 years ago
e0681161
testdata
Add benchmark for ctags conversion (#679)

This change adds a benchmark for the conversion from ctags output to Zoekt document data, plus a tiny optimization to presize the symbol slices.
2 years ago
builder.go
Indexing: improve skipped doc handling (#687)

This change makes a couple small improvements to how we handle skipped docs:
* Immediately skip ctags parsing if the content is `nil`
* Always sort skipped docs to the end of the shard. This seems like a nice invariant. And generally it's good for performance to group data that is expected to be accessed together and has similar content.
2 years ago
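The two behaviours described in the builder.go commit above (skipping symbol parsing when content is `nil`, and ordering skipped documents last) can be sketched as follows. This is a minimal illustration, not Zoekt's code: the `doc` type, `sortSkippedLast`, and `needsSymbols` are hypothetical names introduced here, and the use of a stable sort is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// doc is a simplified, hypothetical stand-in for a buffered document.
// It is not Zoekt's actual document type.
type doc struct {
	Name       string
	Content    []byte
	SkipReason string // non-empty means the document was skipped
}

// sortSkippedLast stably reorders docs so that every skipped document
// comes after every indexed one. Relative order inside each group is kept.
func sortSkippedLast(docs []doc) {
	sort.SliceStable(docs, func(i, j int) bool {
		return docs[i].SkipReason == "" && docs[j].SkipReason != ""
	})
}

// needsSymbols reports whether symbol (ctags) parsing is worth attempting.
// Documents without content are skipped immediately.
func needsSymbols(d doc) bool {
	return d.Content != nil
}

func main() {
	docs := []doc{
		{Name: "a.go", Content: []byte("package a")},
		{Name: "logo.png", SkipReason: "binary file"},
		{Name: "b.go", Content: []byte("package b")},
	}
	sortSkippedLast(docs)
	for _, d := range docs {
		fmt.Println(d.Name, needsSymbols(d))
	}
}
```

Grouping the skipped entries at the end keeps the documents that carry content contiguous, which matches the locality argument made in the commit message.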
builder_test.go
Indexing: respect indexing buffer limit (#686)

When indexing documents, we buffer up documents until we reach the shard size limit (100MB), then flush the shard. If we decide to skip a document because it's a binary file, then (naturally) we don't count its content size towards the shard limit. But we still buffered the full document. So if there are a large number of binary files, we could easily blow past the 100MB limit and run into memory issues.

This change simply clears `Content` whenever `SkipReason` is set. The invariant: a buffered document should only ever have `SkipReason` or `Content`, not both.
2 years ago
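The invariant from the builder_test.go commit above (a buffered document has `SkipReason` or `Content`, never both) can be sketched like this. The `buffer` type, its fields, and the tiny limit are hypothetical and chosen for illustration. Treating any NUL byte as a sign of binary content is likewise an assumption of this sketch, not a statement about how Zoekt detects binary files.

```go
package main

import (
	"bytes"
	"fmt"
)

// doc is a simplified, hypothetical stand-in for a buffered document.
type doc struct {
	Name       string
	Content    []byte
	SkipReason string
}

// buffer is a hypothetical document buffer that flushes once the
// buffered content exceeds limit bytes.
type buffer struct {
	limit   int
	size    int
	docs    []doc
	flushes int
}

// add buffers d. A skipped document keeps its SkipReason but drops its
// Content, so it never occupies memory that the size counter ignores.
func (b *buffer) add(d doc) {
	// Assumption for this sketch: a NUL byte marks binary content.
	if d.SkipReason == "" && bytes.IndexByte(d.Content, 0) >= 0 {
		d.SkipReason = "binary file"
	}
	if d.SkipReason != "" {
		d.Content = nil // invariant: SkipReason or Content, never both
	}
	b.docs = append(b.docs, d)
	b.size += len(d.Content)
	if b.size > b.limit {
		b.flush()
	}
}

// flush stands in for writing a shard and resets the buffer.
func (b *buffer) flush() {
	b.docs = nil
	b.size = 0
	b.flushes++
}

func main() {
	b := &buffer{limit: 16}
	b.add(doc{Name: "blob.bin", Content: []byte{0x00, 0x01, 0x02}})
	fmt.Println(b.size, b.docs[0].SkipReason)
	b.add(doc{Name: "main.go", Content: []byte("package main // hello")})
	fmt.Println(b.flushes, len(b.docs))
}
```

With the content cleared at buffering time, the memory held by the buffer tracks the counted size, so a run of binary files cannot grow the buffer past the limit unnoticed.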
builder_unix.go
Swap out all usages of the `syscall` package (#513)

Replaces all usages of the `syscall` package with the `golang.org/x/sys/unix` package. `syscall` has been frozen since Go 1.3 and is deprecated (https://go.dev/doc/go1.4#major_library_changes). Using `golang.org/x/sys/unix` brings in the bug fixes and enhancements made since `syscall` was frozen, and paves the way for multi-platform builds (which will most likely affect only the single-program local install).
3 years ago
ctags.go
Indexing: improve skipped doc handling (#687)

This change makes a couple small improvements to how we handle skipped docs:
* Immediately skip ctags parsing if the content is `nil`
* Always sort skipped docs to the end of the shard. This seems like a nice invariant. And generally it's good for performance to group data that is expected to be accessed together and has similar content.
2 years ago
ctags_test.go
build: faster newLinesIndices via bytes.IndexByte and buffer re-use (#680)

Firstly we use bytes.IndexByte for faster newLinesIndices. On my machine this reduces wall clock time of BenchmarkTagsToSections by 38%. This is faster since bytes.IndexByte relies on CPU-specific optimizations to find the next newline (e.g. it uses AVX2 if available).

Secondly we reuse the nls slice between calls to tagsToSections. I noticed in the profiler a non-trivial chunk of time spent in the garbage collector. The slice built by newLinesIndices is allocated and thrown away for each call to tagsToSections. This means we can re-use it, which this commit implements by introducing a struct storing the buffer. We now use this buffer per shard of symbols we analyse.

    old time/op    new time/op    delta
    188µs ± 7%     101µs ± 3%     -46.10%  (p=0.000 n=10+10)

    old alloc/op   new alloc/op   delta
    79.3kB ± 0%    36.3kB ± 0%    -54.24%  (p=0.000 n=9+10)

    old allocs/op  new allocs/op  delta
    443 ± 0%       441 ± 0%       -0.45%  (p=0.000 n=10+10)

Test Plan: go test -bench BenchmarkTagsToSections
2 years ago
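The two optimizations in the ctags_test.go commit above, scanning with `bytes.IndexByte` and reusing the result slice between calls, can be sketched as below. The `lineIndexer` struct and the `uint32` offset type are assumptions made for this sketch. Only the names `newLinesIndices` and `nls` come from the commit message, and the real implementation may differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineIndexer is a hypothetical holder for the reusable newline buffer.
type lineIndexer struct {
	nls []uint32
}

// newLinesIndices returns the byte offset of every '\n' in in.
// The result aliases the internal buffer and is only valid until the
// next call on the same lineIndexer.
func (l *lineIndexer) newLinesIndices(in []byte) []uint32 {
	l.nls = l.nls[:0] // keep capacity, drop old contents
	off := 0
	for {
		// bytes.IndexByte is typically backed by optimized assembly.
		i := bytes.IndexByte(in[off:], '\n')
		if i < 0 {
			break
		}
		l.nls = append(l.nls, uint32(off+i))
		off += i + 1
	}
	return l.nls
}

func main() {
	var idx lineIndexer
	fmt.Println(idx.newLinesIndices([]byte("a\nbb\n\nc")))
	// The second call reuses the slice allocated by the first.
	fmt.Println(idx.newLinesIndices([]byte("x\n")))
}
```

The trade-off of reusing the buffer is that callers must not hold on to a returned slice across calls. That fits a design where one indexer is used per shard and each result is consumed before the next document is processed.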
e2e_test.go
Indexing: improve skipped doc handling (#687)

This change makes a couple small improvements to how we handle skipped docs:
* Immediately skip ctags parsing if the content is `nil`
* Always sort skipped docs to the end of the shard. This seems like a nice invariant. And generally it's good for performance to group data that is expected to be accessed together and has similar content.
2 years ago
scoring_test.go
Scoring: test against local scip-ctags (#677)

This change refactors our end-to-end scoring tests and enables local testing using the scip-ctags binary:
* Split scoring tests out of `e2e_test` and into their own file `scoring_test`
* Split huge test methods into targeted ones like `TestFileNameMatch`, `TestJava`, `TestGo`, etc.
* For languages that scip-ctags supports, rerun the same cases using the scip-ctags binary

To run scip-ctags tests locally, you can set the env variable
```
SCIP_CTAGS_COMMAND=<sourcegraph-repo>/dev/scip-ctags-dev
```
This doesn't yet update Zoekt CI to run scip-ctags tests. That will be tackled in a follow-up.
2 years ago
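The scoring_test.go commit above makes the scip-ctags cases opt-in through the `SCIP_CTAGS_COMMAND` environment variable. How the real tests read that variable is not shown here, so the following is only a sketch of one way to gate optional cases on it. The `scipCtagsCommand` helper and its injectable `getenv` parameter are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// scipCtagsEnv is the variable named in the commit message.
const scipCtagsEnv = "SCIP_CTAGS_COMMAND"

// scipCtagsCommand returns the configured scip-ctags binary path and
// whether one was set. getenv is injected so the lookup is easy to test.
func scipCtagsCommand(getenv func(string) string) (string, bool) {
	cmd := getenv(scipCtagsEnv)
	return cmd, cmd != ""
}

func main() {
	if cmd, ok := scipCtagsCommand(os.Getenv); ok {
		fmt.Println("would rerun scoring cases with", cmd)
	} else {
		fmt.Println(scipCtagsEnv, "not set; skipping scip-ctags cases")
	}
}
```

In a test file, the same check could guard a call to `t.Skip`, so that the default `go test` run stays independent of a locally built scip-ctags binary.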