fork of https://github.com/sourcegraph/zoekt
# Zoekt: fast code search

    "Zoekt, en gij zult spinazie eten" - Jan Eertink

    ("seek, and ye shall eat spinach" - My primary school teacher)

Zoekt is a text search engine intended for use with source
code. (Pronunciation: roughly as you would pronounce "zooked" in English)

**Note:** This has been the maintained source for Zoekt since 2017, when it was forked from the
original repository [github.com/google/zoekt](https://github.com/google/zoekt).

## Background

Zoekt supports fast substring and regexp matching on source code, with a rich query language
that includes boolean operators (and, or, not). It can search individual repositories, and search
across many repositories in a large codebase. Zoekt ranks search results using a combination of code-related signals,
such as whether the match is on a symbol. Because of its general design based on trigram indexing and syntactic
parsing, it works well for a variety of programming languages.

The two main ways to use the project are:
* Through individual commands, to index repositories and perform searches through Zoekt's [query language](doc/query_syntax.md)
* Or, through the indexserver and webserver, which support syncing repositories from a code host and searching them through a web UI or API

For more details on Zoekt's design, see the [docs directory](doc/).
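
To make the trigram idea mentioned above concrete, here is a minimal shell sketch. It is not Zoekt's implementation: the function names `trigrams` and `candidate_files` are hypothetical, and `grep` stands in for the lookup that a real trigram index answers without scanning files. It illustrates one property only: a file can contain a substring only if it contains every three-character window of that substring, so most files can be ruled out before any exact matching is done.

```shell
#!/bin/sh
# Illustrative sketch only; not how Zoekt stores or queries its index.

# Print every overlapping three-character window (trigram) of a string.
trigrams() {
    s=$1
    while [ "${#s}" -ge 3 ]; do
        rest=${s#???}                 # s minus its first three characters
        printf '%s\n' "${s%"$rest"}"  # the first three characters of s
        s=${s#?}                      # slide the window by one character
    done
}

# Print the files that contain every trigram of the query.
# grep is a stand-in for an index lookup (assumption for illustration).
candidate_files() {
    query=$1
    shift
    for f in "$@"; do
        keep=yes
        while IFS= read -r tri; do
            if ! grep -qF -e "$tri" "$f"; then
                keep=no
                break
            fi
        done <<EOF
$(trigrams "$query")
EOF
        if [ "$keep" = yes ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `candidate_files hello a.go b.go` prints only the files that contain `hel`, `ell`, and `llo`. Files that survive this filter still need an exact match check, because the trigrams may occur in different places.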

## Usage

### Installation

    go get github.com/sourcegraph/zoekt/

**Note**: It is also recommended to install [Universal ctags](https://github.com/universal-ctags/ctags), as symbol
information is a key signal in ranking search results. See [ctags.md](doc/ctags.md) for more information.

### Command-based usage

Zoekt supports indexing and searching repositories on the command line. This is most helpful
for simple local usage, or for testing and development.

#### Indexing a local git repo

    go install github.com/sourcegraph/zoekt/cmd/zoekt-git-index
    $GOPATH/bin/zoekt-git-index -index ~/.zoekt /path/to/repo

#### Indexing a local directory (not git-specific)

    go install github.com/sourcegraph/zoekt/cmd/zoekt-index
    $GOPATH/bin/zoekt-index -index ~/.zoekt /path/to/repo

#### Searching an index

    go install github.com/sourcegraph/zoekt/cmd/zoekt
    $GOPATH/bin/zoekt 'hello'
    $GOPATH/bin/zoekt 'hello file:README'

### Zoekt services

Zoekt also contains an index server and web server to support larger-scale indexing and searching
of remote repositories. The index server can be configured to periodically fetch and reindex repositories
from a code host. The webserver can be configured to serve search results through a web UI or API.

#### Indexing a GitHub organization

    go install github.com/sourcegraph/zoekt/cmd/zoekt-indexserver

    echo YOUR_GITHUB_TOKEN_HERE > token.txt
    echo '[{"GitHubOrg": "apache", "CredentialPath": "token.txt"}]' > config.json

    $GOPATH/bin/zoekt-indexserver -mirror_config config.json -data_dir ~/.zoekt/

This will fetch all repos under 'github.com/apache', then index the repositories. The indexserver takes care of
periodically fetching and indexing new data, and cleaning up logfiles. See [config.go](cmd/zoekt-indexserver/config.go)
for more details on this configuration.
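
The mirror config above is a JSON array, so a sketch for mirroring several organizations could generate one entry per organization. This assumes that additional entries use the same two keys shown above (`GitHubOrg` and `CredentialPath`); the helper name `mirror_config` is hypothetical, and config.go remains the reference for what is actually accepted.

```shell
#!/bin/sh
# Print a mirror config with one entry per GitHub organization.
# Usage: mirror_config TOKEN_PATH ORG [ORG...]
# Assumption: every entry uses only the GitHubOrg and CredentialPath keys.
# Values are not JSON-escaped, so avoid quotes and backslashes in them.
mirror_config() {
    token_path=$1
    shift
    sep=''
    printf '['
    for org in "$@"; do
        printf '%s{"GitHubOrg": "%s", "CredentialPath": "%s"}' "$sep" "$org" "$token_path"
        sep=', '
    done
    printf ']\n'
}
```

Called as `mirror_config token.txt apache > config.json`, this writes the same config as the `echo` command above; extra organization names add further entries.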

#### Starting the web server

    go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver
    $GOPATH/bin/zoekt-webserver -index ~/.zoekt/

This will start a web server with a simple search UI at http://localhost:6070.
See the [query syntax docs](doc/query_syntax.md) for more details on the query
language.

#### Container image

Zoekt publishes a single container image at `ghcr.io/sourcegraph/zoekt`. It
includes the Zoekt binaries, `git`, and `universal-ctags`. By default it runs
`zoekt-webserver` against `/data/index`:

    docker run --rm -p 6070:6070 -v "$PWD/index:/data/index" ghcr.io/sourcegraph/zoekt

You can override the default command to run `zoekt-indexserver` instead. This
example stores cloned repositories, logs, and indexes under `/data` and reads a
mounted mirror config file:

    docker run --rm \
      -v "$PWD/config.json:/config.json:ro" \
      -v "$PWD/token.txt:/home/zoekt/token.txt:ro" \
      -v "$PWD/zoekt-data:/data" \
      ghcr.io/sourcegraph/zoekt \
      zoekt-indexserver -mirror_config /config.json -data_dir /data

If you start the web server with `-rpc`, it exposes a [simple JSON search
API](doc/json-api.md) at `http://localhost:6070/api/search`.

The JSON API supports advanced features including:
- Streaming search results (using the `FlushWallTime` option)
- Alternative BM25 scoring (using the `UseBM25Scoring` option)
- Context lines around matches (using the `NumContextLines` option)

Finally, the web server exposes a gRPC API that supports [structured query objects](query/query.go) and advanced search options.
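
For the JSON API described above, a request can be sketched in shell as follows. The request shape (a POST whose body has a `Q` query string and an `Opts` object) is an assumption here and should be checked against [doc/json-api.md](doc/json-api.md). The helper names `json_escape`, `zoekt_search_body`, and `zoekt_search`, and the `ZOEKT_URL` variable, are hypothetical. `NumContextLines` is the option listed above.

```shell
#!/bin/sh
# Escape backslashes and double quotes so the query fits inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body.
# Assumed shape: {"Q": "<query>", "Opts": {"NumContextLines": <n>}}
zoekt_search_body() {
    printf '{"Q":"%s","Opts":{"NumContextLines":%d}}' "$(json_escape "$1")" "${2:-0}"
}

# POST the body to a web server that was started with -rpc.
# ZOEKT_URL is a hypothetical override for the default local address.
zoekt_search() {
    curl -sS -X POST -H 'Content-Type: application/json' \
        -d "$(zoekt_search_body "$1" "${2:-0}")" \
        "${ZOEKT_URL:-http://localhost:6070}/api/search"
}

# Usage (needs a running zoekt-webserver with -rpc):
#   zoekt_search 'hello file:README' 3
```

Keeping the body builder separate from the `curl` call makes it possible to inspect the JSON that would be sent without a server running.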

## Acknowledgements

Thanks to Han-Wen Nienhuys for creating Zoekt. Thanks to Alexander Neubeck for
coming up with this idea, and helping Han-Wen Nienhuys flesh it out.