fork of https://github.com/sourcegraph/zoekt
0

Configure Feed

Select the types of activity you want to include in your feed.

index: use ID based shard names only on multi-tenant instances (#956)

We only want to use ID based shard names on multi-tenant instances.
Before this change we decided this based on if tenant enforcement was
turned on. However, we would like to always have tenant enforcment on at
Sourcegraph but not migrate all shards to the new naming scheme (just
yet).

This change is quite simple. However, it works since we never actually
read the filenames to extract tenant ID or repository ID. We only ever
use the data that is persisted inside of the shard to do the
enforcement.

The environment variable we read (WORKSPACES_API_URL) is the same one we
use in the Sourcegraph codebase to determine multi-tenant specific code.

Test Plan: Added a unit test. Will do a much larger manual E2E test
with the Sourcegraph project.

+58 -2
+3 -2
index/builder.go
··· 336 336 func (o *Options) shardNameVersion(version, n int) string { 337 337 var prefix string 338 338 339 - // If tenant enforcement is enabled and we have tenant/repo IDs, use those to generate the prefix 340 - if o.RepositoryDescription.TenantID != 0 && o.RepositoryDescription.ID != 0 && tenant.EnforceTenant() { 339 + // Sourcegraph specific: We use IDs in shard names on multi-tenant 340 + // instances to prevent conflicts. 341 + if tenant.UseIDBasedShardNames() { 341 342 prefix = fmt.Sprintf("%09d_%09d", o.RepositoryDescription.TenantID, o.RepositoryDescription.ID) 342 343 } else { 343 344 prefix = o.RepositoryDescription.Name
+33
index/builder_test.go
··· 1165 1165 }) 1166 1166 } 1167 1167 } 1168 + 1169 + func TestOptions_shardName(t *testing.T) { 1170 + opts := Options{ 1171 + IndexDir: "/data", 1172 + RepositoryDescription: zoekt.Repository{ 1173 + Name: "a/b", 1174 + TenantID: 123, 1175 + ID: 456, 1176 + }, 1177 + } 1178 + 1179 + t.Setenv("WORKSPACES_API_URL", "") 1180 + if got, want := opts.shardNameVersion(16, 0), "/data/a%2Fb_v16.00000.zoekt"; got != want { 1181 + t.Fatalf("expected shard name to be repo name based:\ngot: %q\nwant: %q", got, want) 1182 + } 1183 + 1184 + t.Setenv("WORKSPACES_API_URL", "http://example.com") 1185 + if got, want := opts.shardNameVersion(16, 0), "/data/000000123_000000456_v16.00000.zoekt"; got != want { 1186 + t.Fatalf("expected shard name to be ID based:\ngot: %q\nwant: %q", got, want) 1187 + } 1188 + 1189 + // If something goes wrong and TenantID and RepoID is unset, we create a 1190 + // name which won't be visible by any tenant. 1191 + opts = Options{ 1192 + IndexDir: "/data", 1193 + RepositoryDescription: zoekt.Repository{ 1194 + Name: "a/b", 1195 + }, 1196 + } 1197 + if got, want := opts.shardNameVersion(16, 0), "/data/000000000_000000000_v16.00000.zoekt"; got != want { 1198 + t.Fatalf("expected shard name to be with no tenant:\ngot: %q\nwant: %q", got, want) 1199 + } 1200 + }
+22
internal/tenant/enforcement.go
··· 1 1 package tenant 2 2 3 3 import ( 4 + "os" 5 + 4 6 "github.com/sourcegraph/zoekt/internal/tenant/internal/enforcement" 5 7 ) 6 8 ··· 12 14 return false 13 15 } 14 16 } 17 + 18 + // UseIDBasedShardNames returns true if the on disk layout of shards should 19 + // instead use tenant ID and repository IDs in the names instead of the actual 20 + // repository names. 21 + // 22 + // It is possible for repositories to have the same name, but have different 23 + // content in a multi-tenant setup. As such, this implementation only returns 24 + // true in those situations. 25 + // 26 + // Note: We could migrate all on-disk layout to only be ID based. However, 27 + // ID's are a Sourcegraph specific feature so we will always need the two code 28 + // paths. As such we only return true in multitenant setups. 29 + // 30 + // This is Sourcegraph specific. 31 + func UseIDBasedShardNames() bool { 32 + // We use the presence of this environment variable to tell if we are in a 33 + // multi-tenant setup. This is the same check that is done in the 34 + // Sourcegraph monorepo. 35 + return os.Getenv("WORKSPACES_API_URL") != "" 36 + }