Commits · boltless.me/zoekt · Tangled

boltless.me / zoekt

fork of https://github.com/sourcegraph/zoekt

Commits

web: informative and verbose error message when watchdog fails (#647)

adf376d3

Keegan Carruthers-Smith

2y

indexserver: delete tmp dir on startup (#646)

af126653

Stefan Hengl +2

2y

grpc: zoekt-sourcegraph-indexserver: support retries when frontend isn't available (#645)

48ed5ac5

Geoffrey Gilmore

2y

grpc: RepoList: actually persist "repos" field when converting to protobuf message (#644)

2d1affd4

Geoffrey Gilmore

2y

grpc: add prometheus server and client prometheus metrics (#642)

3ce1f2b2

Geoffrey Gilmore

2y

grpc: FileMatch: tweak file_name to be bytes instead of string (#641)

40a9a23b

Geoffrey Gilmore

2y

grpc: port messagesize interceptors and raise default client message size to 90mb (#640)

f75df3d8

Geoffrey Gilmore

2y

grpc: port internal error interceptors from sourcegraph/sourcegraph (#639)

993cfdb2

Geoffrey Gilmore

2y

grpc: zoekt-webserver: stream search: break up file matches across multiple messages (#636)

fcb279ae

Geoffrey Gilmore

2y

Extract samplingSender and use it for gRPC (#637)

956d775e

Camden Cheek

2y

remove bazel (#634)

d5723536

Dave Try

2y

stat: introduce timing stats around shard search (#633)

63da184a

Keegan Carruthers-Smith

2y

DisplayTruncator: always apply both limits (#632)

9559422b

Ian Kerins

2y

gofmt -s -w .

eede1229

Keegan Carruthers-Smith

2y

introduce DisplayTruncator (#630)

626c7d8f

Keegan Carruthers-Smith

2y

SearchOptions: add MaxMatchDisplayCount (#615)

All clients of zoekt have a shared problem: they have no reliable way to
bound the size of the SearchResult. The primary dimension that
determines the size of a SearchResult is the number of matches. None of
the existing levers zoekt provides sufficiently limit this size:
- MaxDocDisplayCount is a hard limit on the number of Files in the
  SearchResult. But when a single File can have an arbitrary number of
  matches for the query, you can still end up with enormous
  SearchResults when this parameter is 1.

The existing *MaxMatchCount parameters are more about limiting the
amount of work zoekt does when executing queries than they are about
limiting the response size:
- TotalMaxMatchCount is a soft limit on the number of matches
  across shards. But it is only evaluated after handling each shard, so
  if a single shard has an enormous number of matches, the SearchResult
  will be enormous.
- ShardMaxMatchCount is a soft limit on the number of matches from a
  single shard. But it is only evaluated after handling each document, so
  if a single document has an enormous number of matches, the
  SearchResult will be enormous.
- ShardRepoMaxMatchCount, well, you get the idea.
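
The overshoot described for these soft limits can be illustrated with a minimal Go sketch. It is not zoekt's code: `collect` and its parameters are hypothetical names chosen for illustration, and matches are modelled as plain strings. The only point is that a limit checked after each shard cannot bound the size of the result.

```go
package main

import "fmt"

// collect is a hypothetical helper, not part of zoekt. It gathers matches
// shard by shard and evaluates softLimit only after a shard has been
// handled in full.
func collect(shards [][]string, softLimit int) []string {
	var out []string
	for _, shard := range shards {
		// Every match from this shard is added first.
		out = append(out, shard...)
		// Only then is the limit checked, so one large shard overshoots it.
		if len(out) >= softLimit {
			break
		}
	}
	return out
}

func main() {
	big := make([]string, 1000) // one shard with 1000 matches
	shards := [][]string{big, {"a", "b"}}
	got := collect(shards, 10)
	fmt.Println(len(got)) // prints 1000, far above the soft limit of 10
}
```

The second shard is skipped, so the limit does reduce work, but the first shard's 1000 matches are all returned.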

Different clients have a differing ability to tolerate enormous
SearchResults. Sourcegraph, for example, is apparently doing just fine;
they put hard limits on the number of matches in their own server, which
is itself a client of zoekt. They're presumably able to tolerate large
responses from zoekt as it's running colocated in a datacenter
environment.

But clients that are, for example, running in browsers, and using the
less-compact JSON-encoded API, are much less able to cope with enormous
SearchResults, which can be multiple megabytes large even with the most
conservative applications of the existing parameters.

Enter MaxMatchDisplayCount, which has similar semantics to
MaxDocDisplayCount, and is used by zoekt in the exact same places as
that parameter. With this, clients can get a much better handle on the
size of zoekt SearchResults.
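
As a rough illustration of what a hard display limit on matches buys a client, the following Go sketch caps both the number of files and the total number of matches in a result. It is an assumption-laden sketch rather than zoekt's implementation: the `File` type, the `truncate` function, and the convention that a limit of 0 means "unlimited" are all hypothetical.

```go
package main

import "fmt"

// File is a hypothetical stand-in for one file in a search result.
type File struct {
	Name    string
	Matches []string
}

// truncate applies a hard cap on the number of files (maxDocs) and on the
// total number of matches across all files (maxMatches). In this sketch a
// limit of 0 means "no limit"; that convention is an assumption.
func truncate(files []File, maxDocs, maxMatches int) []File {
	if maxDocs > 0 && len(files) > maxDocs {
		files = files[:maxDocs]
	}
	if maxMatches <= 0 {
		return files
	}
	out := make([]File, 0, len(files))
	remaining := maxMatches
	for _, f := range files {
		if remaining == 0 {
			break // match budget exhausted: drop the remaining files
		}
		if len(f.Matches) > remaining {
			// f is a copy, so the caller's File keeps its full slice.
			f.Matches = f.Matches[:remaining]
		}
		remaining -= len(f.Matches)
		out = append(out, f)
	}
	return out
}

func main() {
	files := []File{
		{Name: "a.go", Matches: []string{"m1", "m2", "m3", "m4", "m5"}},
		{Name: "b.go", Matches: []string{"m6", "m7", "m8"}},
	}

	// With maxDocs = 1 alone, all 5 matches of a.go would still be returned.
	got := truncate(files, 1, 2)
	fmt.Println(len(got), len(got[0].Matches)) // prints "1 2"

	// With only a match limit, the budget is spent across files in order.
	got = truncate(files, 0, 6)
	fmt.Println(len(got), len(got[1].Matches)) // prints "2 1"
}
```

Unlike the soft limits, the result here can never hold more than `maxMatches` matches, whatever the distribution of matches across files.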

6a428ad6

Ian Kerins

2y

fix tracing (#627)

9c20a034

Stefan Hengl

2y

trace: add service.instance.id (#629)

0f6564bd

Stefan Hengl

2y

Create buf-breaking-check.yml (#625)

99233243

Manuel Ucles +1

2y

Revert "indexdata: read posting list iff all ng exist (#619)" (#626)

f9b3ea5d

Keegan Carruthers-Smith

2y

indexdata: read posting list iff all ng exist (#619)

b7e5070b

Stefan Hengl

2y

rename ngrams to contentNgrams (#623)

0aefb15e

Keegan Carruthers-Smith

2y

remove ZOEKT_ENABLE_LAZY_DOC_SECTIONS (#620)

cbe083c9

Keegan Carruthers-Smith

2y

ci: remove sync-zoekt step (#621)

1d71fd02

Keegan Carruthers-Smith

2y

maximise distance between ngrams (#618)

34f694c3

Keegan Carruthers-Smith

2y

rm ngramoffset.go from BUILD.bazel

2632acf4

Keegan Carruthers-Smith

2y

sort ngrams before looking them up (#617)

45f608ff

Keegan Carruthers-Smith +1

2y

remove ngram offset code (#616)

3d0bdd5c

Keegan Carruthers-Smith

2y

zoekt: add fgprof for full profiling (#614)

f9d3a0e2

Keegan Carruthers-Smith

2y

zoekt-indexserver: use value format directive for bad conf warning

008a775b

Keegan Carruthers-Smith

2y

zoekt-indexserver: Prevent invalid config from causing an NPE (#612)

9abbb8b0

Philipp Wollermann

3y

all: observe missing Stats RegexpsConsidered and FlushReason (#611)

25c1ea51

Keegan Carruthers-Smith

3y

Fix template documentation comments (#610)

e2e8aede

Ian Kerins

3y

go get -u -t ./... (#609)

a176bde1

Keegan Carruthers-Smith

3y

matchiter: capture metric NgramLookups (#608)

7643f3b3

Keegan Carruthers-Smith

3y

matchtree: capture Stats before pruning (#607)

93f7b0c9

Keegan Carruthers-Smith

3y

zoekt-indexserver: Check stderr for git fetch (#603)

b9e6d943

Rodrigo Silva Mendoza

3y

shards: populate RepoList.Stats.Repos (#605)

7078a585

Keegan Carruthers-Smith

3y

indexserver: remove unused GetRepoRank (#604)

1686b50d

Keegan Carruthers-Smith

3y

Update the zoekt config for a repo every mirror_interval (#600)

63241cb1

Rodrigo Silva Mendoza

3y

web: e2e test for RPC (#602)

8e309eb6

Keegan Carruthers-Smith

3y

fix panic when tracing is not enabled (#601)

edf0c8b4

Camden Cheek

3y

Tracing: fix for HTTP requests (#599)

fb27b377

Camden Cheek

3y

indexserver: include index time when pushing sg index updates (#598)

0148e024

Keegan Carruthers-Smith

3y

ci: only run sg PR creation after docker push (#597)

a4e18dd2

Keegan Carruthers-Smith

3y

list: add indextime to MinimalRepoListEntry (#596)

e1876ff4

Keegan Carruthers-Smith

3y

Add scip-ctags to docker (#594)

ae9c94df

Auguste Rame

3y

Use normalizeLanguage to properly map langs (#593)

68aa74e2

Auguste Rame

3y

Add more debug info for keyword scoring (#592)

b8b67221

Julie Tibshirani

3y

Tracing: add grpc tracing interceptors (#591)

88def9b1

Camden Cheek

3y

Fix bazel build (#590)

bba2733d

Camden Cheek

3y

Add a gRPC API (#577)

b45da912

Camden Cheek

3y

update buildfiles (#589)

c4c4a21d

Dave Try

3y

Add alternate ctags parser and language map (#581)

ffc7feb6

Auguste Rame

3y

gomod: bump cloudflare/circl to v1.3.3 for CVE-2023-1732 (#588)

70af1120

Keegan Carruthers-Smith

3y

gha: fix images not being pushed (#587)

579a9a1e

Jean-Hadrien Chabran

3y

Fix typo in GHA

8c4dead1

Jean-Hadrien Chabran

3y

Update "ci" worflow (#586)

a1afc5d0

Jean-Hadrien Chabran

3y

ctags: remove support for exuberant-ctags (#585)

47f830cc

Keegan Carruthers-Smith

3y

Add experimental option for keyword scoring (#583)

5250e0e5

Julie Tibshirani

3y