fork of https://github.com/sourcegraph/zoekt
# Zoekt: fast code search

    "Zoekt, en gij zult spinazie eten" - Jan Eertink

    ("seek, and ye shall eat spinach" - My primary school teacher)

Zoekt is a text search engine intended for use with source
code. (Pronunciation: roughly as you would pronounce "zooked" in English)

**Note:** This has been the maintained source for Zoekt since 2017, when it was forked from the
original repository [github.com/google/zoekt](https://github.com/google/zoekt).

## Background

Zoekt supports fast substring and regexp matching on source code, with a rich query language
that includes boolean operators (and, or, not). It can search individual repositories, and search
across many repositories in a large codebase. Zoekt ranks search results using a combination of code-related signals
like whether the match is on a symbol. Because of its general design based on trigram indexing and syntactic
parsing, it works well for a variety of programming languages.

The two main ways to use the project are
* Through individual commands, to index repositories and perform searches through Zoekt's [query language](doc/query_syntax.md)
* Or, through the indexserver and webserver, which support syncing repositories from a code host and searching them through a web UI or API

For more details on Zoekt's design, see the [docs directory](doc/).

## Usage

### Installation

    go get github.com/sourcegraph/zoekt/

**Note**: It is also recommended to install [Universal ctags](https://github.com/universal-ctags/ctags), as symbol
information is a key signal in ranking search results. See [ctags.md](doc/ctags.md) for more information.

### Command-based usage

Zoekt supports indexing and searching repositories on the command line. This is most helpful
for simple local usage, or for testing and development.
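
As background for the indexing and search commands in this section, the trigram indexing mentioned under Background can be sketched roughly as follows. This is a simplified, byte-based illustration and not Zoekt's actual index format or API: the names `TrigramIndex`, `NewTrigramIndex`, `Search`, and `intersect` are hypothetical and exist only for this example. The idea is to map every three-byte substring to the documents that contain it, intersect those lists for the trigrams of a query, and then verify the remaining candidates with a real substring check.

```go
package main

import (
	"fmt"
	"strings"
)

// TrigramIndex is a hypothetical, simplified index for illustration only.
// It maps each 3-byte substring to the ascending IDs of documents containing it.
type TrigramIndex struct {
	docs     []string
	postings map[string][]int
}

// NewTrigramIndex builds posting lists for every trigram in every document.
func NewTrigramIndex(docs []string) *TrigramIndex {
	idx := &TrigramIndex{docs: docs, postings: make(map[string][]int)}
	for id, doc := range docs {
		seen := make(map[string]bool) // record each trigram once per document
		for i := 0; i+3 <= len(doc); i++ {
			tri := doc[i : i+3]
			if !seen[tri] {
				seen[tri] = true
				idx.postings[tri] = append(idx.postings[tri], id)
			}
		}
	}
	return idx
}

// Search returns the IDs of documents that contain query as a substring.
func (idx *TrigramIndex) Search(query string) []int {
	var result []int
	if len(query) < 3 {
		// Too short to have a trigram: fall back to scanning every document.
		for id, doc := range idx.docs {
			if strings.Contains(doc, query) {
				result = append(result, id)
			}
		}
		return result
	}
	// Candidates must contain every trigram of the query.
	candidates := idx.postings[query[:3]]
	for i := 1; i+3 <= len(query) && len(candidates) > 0; i++ {
		candidates = intersect(candidates, idx.postings[query[i:i+3]])
	}
	// Trigram presence is necessary but not sufficient, so verify each candidate.
	for _, id := range candidates {
		if strings.Contains(idx.docs[id], query) {
			result = append(result, id)
		}
	}
	return result
}

// intersect returns the values common to two ascending slices.
func intersect(a, b []int) []int {
	var out []int
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	idx := NewTrigramIndex([]string{
		"func main() { fmt.Println(\"hello\") }",
		"def hello(): pass",
		"int main(void) { return 0; }",
	})
	fmt.Println(idx.Search("hello")) // [0 1]
	fmt.Println(idx.Search("main(")) // [0 2]
}
```

The final verification step matters because a document can contain all of a query's trigrams without containing the query itself; for example, `abcXbcd` contains both `abc` and `bcd` but not `abcd`.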

#### Indexing a local git repo

    go install github.com/sourcegraph/zoekt/cmd/zoekt-git-index
    $GOPATH/bin/zoekt-git-index -index ~/.zoekt /path/to/repo

#### Indexing a local directory (not git-specific)

    go install github.com/sourcegraph/zoekt/cmd/zoekt-index
    $GOPATH/bin/zoekt-index -index ~/.zoekt /path/to/repo

#### Searching an index

    go install github.com/sourcegraph/zoekt/cmd/zoekt
    $GOPATH/bin/zoekt 'hello'
    $GOPATH/bin/zoekt 'hello file:README'

### Zoekt services

Zoekt also contains an index server and web server to support larger-scale indexing and searching
of remote repositories. The index server can be configured to periodically fetch and reindex repositories
from a code host. The webserver can be configured to serve search results through a web UI or API.

#### Indexing a GitHub organization

    go install github.com/sourcegraph/zoekt/cmd/zoekt-indexserver

    echo YOUR_GITHUB_TOKEN_HERE > token.txt
    echo '[{"GitHubOrg": "apache", "CredentialPath": "token.txt"}]' > config.json

    $GOPATH/bin/zoekt-indexserver -mirror_config config.json -data_dir ~/.zoekt/

This will fetch all repos under 'github.com/apache', then index the repositories. The indexserver takes care of
periodically fetching and indexing new data, and cleaning up logfiles. See [config.go](cmd/zoekt-indexserver/config.go)
for more details on this configuration.

#### Starting the web server

    go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver
    $GOPATH/bin/zoekt-webserver -index ~/.zoekt/

This will start a web server with a simple search UI at http://localhost:6070.
See the [query syntax docs](doc/query_syntax.md) for more details on the query
language.

#### Container image

Zoekt publishes a single container image at `ghcr.io/sourcegraph/zoekt`. It
includes the Zoekt binaries, `git`, and `universal-ctags`.
By default it runs
`zoekt-webserver` against `/data/index`:

    docker run --rm -p 6070:6070 -v "$PWD/index:/data/index" ghcr.io/sourcegraph/zoekt

You can override the default command to run `zoekt-indexserver` instead. This
example stores cloned repositories, logs, and indexes under `/data` and reads a
mounted mirror config file:

    docker run --rm \
      -v "$PWD/config.json:/config.json:ro" \
      -v "$PWD/token.txt:/home/zoekt/token.txt:ro" \
      -v "$PWD/zoekt-data:/data" \
      ghcr.io/sourcegraph/zoekt \
      zoekt-indexserver -mirror_config /config.json -data_dir /data

If you start the web server with `-rpc`, it exposes a [simple JSON search
API](doc/json-api.md) at `http://localhost:6070/api/search`.

The JSON API supports advanced features including:
- Streaming search results (using the `FlushWallTime` option)
- Alternative BM25 scoring (using the `UseBM25Scoring` option)
- Context lines around matches (using the `NumContextLines` option)

Finally, the web server exposes a gRPC API that supports [structured query objects](query/query.go) and advanced search options.

## Acknowledgements

Thanks to Han-Wen Nienhuys for creating Zoekt. Thanks to Alexander Neubeck for
coming up with this idea, and helping Han-Wen Nienhuys flesh it out.