fork of https://github.com/sourcegraph/zoekt

    "Zoekt, en gij zult spinazie eten" - Jan Eertink

    ("seek, and ye shall eat spinach" - My primary school teacher)

This is a fast text search engine, intended for use with source
code. (Pronunciation: roughly as you would pronounce "zooked" in English)

**Note:** This is a [Sourcegraph](https://github.com/sourcegraph/zoekt) fork
of [github.com/google/zoekt](https://github.com/google/zoekt). It is now the
main maintained source of Zoekt.

# INSTRUCTIONS

## Downloading

    go get github.com/sourcegraph/zoekt/

## Indexing

### Directory

    go install github.com/sourcegraph/zoekt/cmd/zoekt-index
    $GOPATH/bin/zoekt-index .

### Git repository

    go install github.com/sourcegraph/zoekt/cmd/zoekt-git-index
    $GOPATH/bin/zoekt-git-index -branches master,stable-1.4 -prefix origin/ .

### Repo repositories

    go install github.com/sourcegraph/zoekt/cmd/zoekt-{repo-index,mirror-gitiles}
    zoekt-mirror-gitiles -dest ~/repos/ https://gfiber.googlesource.com
    zoekt-repo-index \
        -name gfiber \
        -base_url https://gfiber.googlesource.com/ \
        -manifest_repo ~/repos/gfiber.googlesource.com/manifests.git \
        -repo_cache ~/repos \
        -manifest_rev_prefix=refs/heads/ --rev_prefix= \
        master:default_unrestricted.xml

## Searching

### Web interface

    go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver
    $GOPATH/bin/zoekt-webserver -listen :6070

### JSON API

You can retrieve search results as JSON by sending a GET request to zoekt-webserver.

    curl --get \
        --url "http://localhost:6070/search" \
        --data-urlencode "q=ngram f:READ" \
        --data-urlencode "num=50" \
        --data-urlencode "format=json"

The response data is a JSON object. You can refer to [web.ApiSearchResult](https://sourcegraph.com/github.com/sourcegraph/zoekt@6b1df4f8a3d7b34f13ba0cafd8e1a9b3fc728cf0/-/blob/web/api.go?L23:6&subtree=true) to learn about the structure of the object.

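The same request can be issued from Go. The following is a minimal sketch, not part of Zoekt: `searchURL` and `topLevelKeys` are hypothetical helpers, and the code assumes zoekt-webserver is listening on `localhost:6070` as in the web interface example. It decodes the response generically and only lists the top-level keys, rather than assuming any field names; consult `web.ApiSearchResult` for the actual structure.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"sort"
	"strconv"
)

// searchURL builds the GET URL for the /search endpoint, using the same
// q, num and format parameters as the curl example.
// (Hypothetical helper, not part of Zoekt.)
func searchURL(base, query string, num int) string {
	v := url.Values{}
	v.Set("q", query)
	v.Set("num", strconv.Itoa(num))
	v.Set("format", "json")
	return base + "/search?" + v.Encode()
}

// topLevelKeys decodes a JSON object without assuming a schema and
// returns its top-level keys in sorted order.
func topLevelKeys(body []byte) ([]string, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(body, &obj); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Assumption: zoekt-webserver is running locally on port 6070.
	u := searchURL("http://localhost:6070", "ngram f:READ", 50)

	resp, err := http.Get(u)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading response failed:", err)
		return
	}

	keys, err := topLevelKeys(body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "response is not a JSON object:", err)
		return
	}
	fmt.Println("top-level keys:", keys)
}
```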
### CLI

    go install github.com/sourcegraph/zoekt/cmd/zoekt
    $GOPATH/bin/zoekt 'ngram f:READ'

## Installation

A more organized installation on a Linux server should use a systemd unit file,
e.g.:

    [Unit]
    Description=zoekt webserver

    [Service]
    ExecStart=/zoekt/bin/zoekt-webserver -index /zoekt/index -listen :443 --ssl_cert /zoekt/etc/cert.pem --ssl_key /zoekt/etc/key.pem
    Restart=always

    [Install]
    WantedBy=default.target


# SEARCH SERVICE

Zoekt comes with a small service management program:

    go install github.com/sourcegraph/zoekt/cmd/zoekt-indexserver

    cat << EOF > config.json
    [{"GithubUser": "username"},
     {"GithubOrg": "org"},
     {"GitilesURL": "https://gerrit.googlesource.com", "Name": "zoekt" }
    ]
    EOF

    $GOPATH/bin/zoekt-indexserver -mirror_config config.json

This mirrors all repos under 'github.com/username' and 'github.com/org', as
well as the 'zoekt' repository, and indexes them.

The indexserver takes care of fetching and indexing new data and cleaning up
logfiles.

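Before starting the indexserver, it can help to check that `config.json` parses and lists what you expect. The following is a minimal sketch under stated assumptions: `mirrorEntry`, `parseConfig` and `describe` are hypothetical names, and the struct covers only the four keys used in the example above. It is not the indexserver's own configuration type, which may accept more keys.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// mirrorEntry is a hypothetical struct that covers only the keys shown
// in the example config.json; it is not Zoekt's own type.
type mirrorEntry struct {
	GithubUser string
	GithubOrg  string
	GitilesURL string
	Name       string
}

// parseConfig decodes the JSON array of mirror entries.
func parseConfig(data []byte) ([]mirrorEntry, error) {
	var entries []mirrorEntry
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	return entries, nil
}

// describe summarizes what an entry asks to be mirrored, following the
// description in this README.
func describe(e mirrorEntry) string {
	switch {
	case e.GithubUser != "":
		return "all repos under github.com/" + e.GithubUser
	case e.GithubOrg != "":
		return "all repos under github.com/" + e.GithubOrg
	case e.GitilesURL != "":
		return fmt.Sprintf("repo %q from %s", e.Name, e.GitilesURL)
	default:
		return "unrecognized entry"
	}
}

func main() {
	// Assumption: config.json is in the current directory.
	data, err := os.ReadFile("config.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read config:", err)
		return
	}
	entries, err := parseConfig(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid config:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(describe(e))
	}
}
```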
The webserver can be started from a standard service management framework, such
as systemd.


# SYMBOL SEARCH

It is recommended to install [Universal
ctags](https://github.com/universal-ctags/ctags) to improve
ranking. See [here](doc/ctags.md) for more information.


# ACKNOWLEDGEMENTS

Thanks to Han-Wen Nienhuys for creating Zoekt. Thanks to Alexander Neubeck for
coming up with this idea, and helping Han-Wen Nienhuys flesh it out.


# FORK DETAILS

Originally this fork contained some changes that did not make sense to upstream
and/or had not yet been upstreamed. However, this is now the de facto source
for Zoekt. This section remains for historical reasons and contains
outdated information. It can be removed once the dust settles on moving from
google/zoekt to sourcegraph/zoekt. Differences:

- [zoekt-sourcegraph-indexserver](cmd/zoekt-sourcegraph-indexserver/main.go)
  is a Sourcegraph-specific command which indexes all enabled repositories on
  Sourcegraph and keeps the indexes up to date.
- We have exposed the API via
  [keegancsmith/rpc](https://github.com/keegancsmith/rpc) (a fork of `net/rpc`
  which supports cancellation).
- Query primitive `BranchesRepos` to efficiently specify a set of repositories to
  search.
- Allow empty shard directories on startup. Needed when starting a fresh
  instance which hasn't indexed anything yet.
- We can return symbol/ctag data in results. Additionally we can run symbol regex queries.
- We search shards in order of repo name and ignore shard ranking.
- Other minor changes.

Assuming you have the gerrit upstream configured, a useful way to see what we
changed is:

``` shellsession
$ git diff gerrit/master -- ':(exclude)vendor/' ':(exclude)Gopkg*'
```

# DISCLAIMER

This is not an official Google product.