Crate objectstore_server

§Server Architecture

The objectstore server is an axum-based HTTP server that exposes the objectstore_service storage layer to clients. It handles authentication, authorization, rate limiting, and traffic control on top of the core storage operations.

§Endpoints

All object operations live under the /v1/ prefix:

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/objects/{usecase}/{scopes}/ | Insert with server-generated key |
| GET | /v1/objects/{usecase}/{scopes}/{key} | Retrieve object |
| HEAD | /v1/objects/{usecase}/{scopes}/{key} | Retrieve metadata only |
| PUT | /v1/objects/{usecase}/{scopes}/{key} | Insert or overwrite with key |
| DELETE | /v1/objects/{usecase}/{scopes}/{key} | Delete object |
| POST | /v1/objects:batch/{usecase}/{scopes}/ | Batch operations (multipart) |

Scopes are encoded in the URL path using Matrix URI syntax: org=123;project=456. An underscore (_) represents empty scopes.
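Decoding the scope segment amounts to splitting on semicolons; the sketch below is illustrative, not the crate's actual extractor (percent-decoding and validation are elided):

```rust
/// Parse a matrix-URI scope segment such as "org=123;project=456".
/// The underscore segment "_" denotes empty scopes.
fn parse_scopes(segment: &str) -> Vec<(String, String)> {
    if segment == "_" {
        return Vec::new();
    }
    segment
        .split(';')
        .filter_map(|pair| {
            // Skip malformed pairs without an '='; real code would reject them.
            let (k, v) = pair.split_once('=')?;
            Some((k.to_string(), v.to_string()))
        })
        .collect()
}
```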

§Internal Endpoints

Internal endpoints are exempt from authentication, rate limiting, and the web concurrency limit so they remain available when the server is under load.

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Liveness probe (always returns 200) |
| GET | /ready | Readiness probe (returns 503 when /tmp/objectstore.down exists, enabling graceful drain) |
| GET | /keda | Prometheus text-format gauges for KEDA autoscaling (see KEDA Metrics) |
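The /ready behavior described above reduces to a file-existence check; a minimal sketch with the handler wiring omitted (the function name is illustrative):

```rust
use std::path::Path;

/// Returns the readiness status code: 503 while the drain marker file
/// exists (graceful drain), 200 otherwise.
fn ready_status(drain_marker: &Path) -> u16 {
    if drain_marker.exists() { 503 } else { 200 }
}
```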

§Request Flow

A request flows through several layers before reaching the storage service:

  1. Middleware: metrics collection, in-flight request tracking, panic recovery, Sentry transaction tracing, distributed tracing.
  2. Extractors: path parameters are parsed into an ObjectId or ObjectContext. The Authorization header is validated and decoded into an AuthContext. The optional x-downstream-service header is extracted for killswitch matching.
  3. Admission control: killswitches and rate limits are checked during extraction. Rejected requests never reach the handler.
  4. Handler: the endpoint handler calls the AuthAwareService, which checks permissions before delegating to the underlying StorageService. The service enforces its own backpressure before executing the operation.
  5. Response: metadata is mapped to HTTP headers (see objectstore-types docs for the header mapping) and the payload is streamed back.
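The flow above can be condensed into a sketch where extraction and admission failures short-circuit before the handler runs. All names here are illustrative, not the crate's actual API:

```rust
#[derive(Debug)]
enum Rejection {
    Unauthorized,       // extractor failure
    KillswitchMatched,  // admission failure
}

struct AuthContext;

// 2. Extractors: validate the Authorization header (stubbed here).
fn extract_auth(header: Option<&str>) -> Result<AuthContext, Rejection> {
    match header {
        Some(token) if !token.is_empty() => Ok(AuthContext),
        _ => Err(Rejection::Unauthorized),
    }
}

// 3. Admission control: killswitches and rate limits run before the handler.
fn check_admission(usecase: &str) -> Result<(), Rejection> {
    if usecase == "blocked" {
        return Err(Rejection::KillswitchMatched);
    }
    Ok(())
}

// Rejected requests never reach the handler body.
fn handle(header: Option<&str>, usecase: &str) -> Result<&'static str, Rejection> {
    let _auth = extract_auth(header)?; // 2. extractors
    check_admission(usecase)?;         // 3. admission control
    Ok("200 OK")                       // 4-5. handler + response
}
```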

§Authentication & Authorization

Objectstore uses JWT tokens with EdDSA signatures (Ed25519) for authentication. Auth enforcement is optional and controlled by the auth.enforce config flag, allowing unauthenticated development setups.

§Token Structure

Tokens must include:

  • Header: kid (key ID) and alg: EdDSA
  • Claims: aud: "objectstore", iss: "sentry" or "relay", exp (expiration timestamp)
  • Resource claims (res): the usecase and scope values the token grants access to (e.g., {"os:usecase": "attachments", "org": "123"})
  • Permissions: array of granted operations (object.read, object.write, object.delete)

§Key Management

The PublicKeyDirectory maps key IDs (kid) to public keys. Each key entry supports multiple key versions for rotation — the server tries each version when verifying a token. Keys also carry max_permissions that are intersected with the token’s claimed permissions, limiting what any token signed by that key can do.
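The permission intersection amounts to a set filter; a sketch with assumed names (the real types live in the auth module):

```rust
/// Intersect a key's max_permissions with the token's claimed permissions.
/// A token can never exceed what its signing key allows.
fn effective_permissions<'a>(max: &[&'a str], claimed: &[&'a str]) -> Vec<&'a str> {
    claimed.iter().copied().filter(|p| max.contains(p)).collect()
}
```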

§Authorization Check

On every operation, AuthAwareService verifies that the token’s scopes and permissions cover the requested ObjectContext and operation type. Scope values in the token can use wildcards to grant broad access.
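Assuming * as the wildcard form (the exact wildcard syntax is not specified here), the scope check can be sketched as: every scope the request carries must be covered by a matching token scope.

```rust
// A token scope value of "*" covers any requested value (assumed syntax).
fn scope_covers(token_value: &str, requested: &str) -> bool {
    token_value == "*" || token_value == requested
}

/// All requested scope key-value pairs must be covered by the token.
fn authorized(token_scopes: &[(&str, &str)], requested: &[(&str, &str)]) -> bool {
    requested.iter().all(|(k, v)| {
        token_scopes
            .iter()
            .any(|(tk, tv)| tk == k && scope_covers(tv, v))
    })
}
```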

§Configuration

Configuration uses figment for layered merging with this precedence (highest wins):

  1. Environment variables — prefixed with OS__, using __ as a nested separator. Example: OS__LONG_TERM_STORAGE__TYPE=gcs
  2. YAML file — passed via the -c / --config CLI flag
  3. Defaults — sensible development defaults (local filesystem backends, auth disabled)
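As a small illustration of the environment-variable convention, here is a sketch (not the crate's code; the server uses figment's providers for the actual merge) of translating an OS__-prefixed variable into a dotted config path:

```rust
/// Translate e.g. OS__LONG_TERM_STORAGE__TYPE into "long_term_storage.type".
/// Returns None for variables without the OS__ prefix.
fn env_to_key(var: &str) -> Option<String> {
    let rest = var.strip_prefix("OS__")?;
    Some(
        rest.split("__")
            .map(str::to_lowercase)
            .collect::<Vec<_>>()
            .join("."),
    )
}
```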

Key configuration sections:

  • high_volume_storage / long_term_storage — backend type and connection parameters
  • auth — key directory and enforcement toggle
  • rate_limits — throughput and bandwidth limits
  • http — HTTP layer parameters (concurrency limit)
  • service — storage service parameters (backend concurrency limit)
  • killswitches — traffic blocking rules
  • runtime — worker threads, metrics interval
  • sentry / metrics / logging — observability

See the config module for the full configuration schema.

§Rate Limiting

Rate limiting operates at two levels:

§Throughput

Throughput limits use token bucket rate limiting with configurable burst. Limits can be set at multiple granularities:

  • Global: a maximum requests-per-second across all traffic
  • Per-usecase: a percentage of the global limit allocated to each usecase
  • Per-scope: a percentage of the global limit for specific scope values
  • Custom rules: specific RPS or percentage overrides matching usecase/scope combinations
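A minimal token-bucket admission check, sketched with the standard library only; the crate's limiter additionally layers the per-usecase, per-scope, and custom-rule granularities listed above:

```rust
struct TokenBucket {
    capacity: f64,       // burst size: maximum stored tokens
    tokens: f64,         // currently available tokens
    refill_per_sec: f64, // sustained requests per second
}

impl TokenBucket {
    fn new(rps: f64, burst: f64) -> Self {
        Self { capacity: burst, tokens: burst, refill_per_sec: rps }
    }

    /// `elapsed_secs` is the time since the previous call; returns
    /// whether this request is admitted.
    fn try_admit(&mut self, elapsed_secs: f64) -> bool {
        // Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```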

§Bandwidth

Bandwidth limiting uses an exponentially weighted moving average (EWMA) to estimate current throughput. Payload streams are wrapped in a MeteredPayloadStream that reports bytes consumed. When the estimated bandwidth exceeds the configured limit, new requests are rejected.

Like throughput, bandwidth limits can be set at multiple granularities:

  • Global: a maximum bytes-per-second across all traffic (global_bps)
  • Per-usecase: a percentage of the global limit for each usecase
  • Per-scope: a percentage of the global limit for each scope value

Each granularity maintains its own EWMA estimator. The MeteredPayloadStream increments all applicable accumulators (global + per-usecase + per-scope) for every chunk polled. For non-streamed payloads (e.g., batch INSERT where the size is known upfront), bytes are recorded directly via record_bandwidth.
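The EWMA update itself is a one-liner; the sketch below uses the α = 0.2 smoothing factor mentioned under KEDA Metrics, with tick scheduling and the per-granularity accumulators omitted:

```rust
struct Ewma {
    rate: f64,  // smoothed bytes-per-tick estimate
    alpha: f64, // smoothing factor (0.2 in the KEDA section)
}

impl Ewma {
    fn new(alpha: f64) -> Self {
        Self { rate: 0.0, alpha }
    }

    /// Fold one tick's accumulated bytes into the estimate:
    /// rate = alpha * sample + (1 - alpha) * rate.
    fn update(&mut self, bytes_this_tick: f64) {
        self.rate = self.alpha * bytes_this_tick + (1.0 - self.alpha) * self.rate;
    }

    /// New requests are rejected while the estimate exceeds the limit.
    fn over_limit(&self, limit: f64) -> bool {
        self.rate > limit
    }
}
```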

Rate-limited requests receive HTTP 429.

§Web Concurrency Limit

Before requests reach the storage service, a web-tier concurrency limit protects against connection floods. When the number of in-flight HTTP requests reaches http.max_requests (default: 10,000), new requests are rejected immediately with HTTP 503. Health and readiness endpoints (/health, /ready) are excluded from this limit. Rejections are counted in the web.concurrency.rejected metric.

Direct 503 rejection is preferred over readiness-based backpressure:

  • Instant response and recovery: direct 503 responds in milliseconds and frees capacity the moment any request completes. Readiness probes run on periodic intervals, leaving a window of continued overload and wasting capacity during recovery.
  • No cascade risk: multiple pods failing readiness probes simultaneously concentrates traffic onto remaining pods. Direct rejection keeps every pod in the pool and self-regulating.
  • Correct health semantics: a busy pod is still ready — its dependencies are reachable and it can serve traffic. Conflating load with readiness muddies alerting and incident response.
  • Environment-independent: works in any deployment, not just Kubernetes.
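The direct-rejection scheme boils down to an atomic in-flight counter; this sketch uses illustrative names, not the crate's actual types:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct ConcurrencyLimit {
    in_flight: AtomicUsize,
    max: usize, // corresponds to http.max_requests
}

impl ConcurrencyLimit {
    fn new(max: usize) -> Self {
        Self { in_flight: AtomicUsize::new(0), max }
    }

    /// Returns false when the limit is reached; the caller then responds
    /// 503 immediately instead of queueing the request.
    fn try_acquire(&self) -> bool {
        let mut current = self.in_flight.load(Ordering::Relaxed);
        loop {
            if current >= self.max {
                return false;
            }
            // CAS loop: only increment if no other thread won the slot first.
            match self.in_flight.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Capacity frees the moment any request completes.
    fn release(&self) {
        self.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}
```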

§Service Backpressure

Beyond rate limiting and the web concurrency limit, the StorageService enforces a second layer of backpressure through a concurrency limit on in-flight backend operations, configured via service.max_concurrency. When exceeded, requests receive HTTP 429. See the service architecture docs for details.

§KEDA Metrics

GET /keda serves a Prometheus text-format (version 0.0.4) snapshot of all four rate-limited resources for use with KEDA Prometheus scalers. The endpoint is exempt from the web concurrency limit and request metrics so that it remains available when the server is at capacity.

§Exposed Metrics

§EWMA Gauges

Pre-smoothed rates, self-contained per scrape (no irate() arithmetic needed):

| Resource | Utilization | Limit |
| --- | --- | --- |
| Bandwidth | objectstore_bandwidth_ewma | objectstore_bandwidth_limit (only when global_bps is set) |
| Throughput | objectstore_throughput_ewma | objectstore_throughput_limit (only when global_rps is set) |
| HTTP concurrency | objectstore_requests_in_flight | objectstore_requests_limit |
| Task concurrency | objectstore_tasks_running | objectstore_tasks_limit |

Throughput uses an EWMA with a 50 ms tick and α = 0.2, matching the existing bandwidth estimator. The accumulator counts fully admitted requests (requests that pass all throughput checks).

§Counters

Monotonically increasing totals since startup; use irate(counter[window]) in KEDA queries for an unsmoothed, immediately responsive rate:

| Counter | Description |
| --- | --- |
| objectstore_bytes_total | Total bytes transferred since startup |
| objectstore_requests_total | Total admitted requests since startup |

§Example KEDA ScaledObject Triggers

§Using EWMA gauges (backward-compatible)

Scale on the highest utilization across all four resources:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      query: |
        max(
          objectstore_bandwidth_ewma / objectstore_bandwidth_limit
          or objectstore_throughput_ewma / objectstore_throughput_limit
          or objectstore_requests_in_flight / objectstore_requests_limit
          or objectstore_tasks_running / objectstore_tasks_limit
        )
      threshold: "0.7"
```

§Using counters with irate() (more responsive)

Uses the last two scraped values for an instantaneous rate with no smoothing lag:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      query: |
        max(
          irate(objectstore_bytes_total[2m]) / objectstore_bandwidth_limit
          or irate(objectstore_requests_total[2m]) / objectstore_throughput_limit
          or objectstore_requests_in_flight / objectstore_requests_limit
          or objectstore_tasks_running / objectstore_tasks_limit
        )
      threshold: "0.7"
```

Unconfigured limits produce no series, so the corresponding terms drop out of the or expressions automatically.

§Killswitches

Killswitches provide emergency traffic blocking without redeployment. Each killswitch is a set of conditions that, when all matched, cause requests to be rejected with HTTP 403:

  • Usecase: exact match on the usecase string
  • Scopes: all specified scope key-value pairs must be present
  • Service: a glob pattern matched against the x-downstream-service request header (e.g., "relay-*" to block all relay instances)

A killswitch with no conditions matches all traffic. Multiple killswitches are evaluated with OR semantics — any match triggers rejection. Killswitches are checked during request extraction, before the handler runs.
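The matching semantics can be sketched as follows: AND within a rule, OR across rules. The types are illustrative, and the glob here supports only the trailing-* form shown in the example above:

```rust
struct Killswitch<'a> {
    usecase: Option<&'a str>,      // None = condition absent, matches anything
    service_glob: Option<&'a str>, // matched against x-downstream-service
}

// Trailing-'*' prefix glob only (e.g. "relay-*"); exact match otherwise.
fn glob_matches(pattern: &str, value: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => value.starts_with(prefix),
        None => pattern == value,
    }
}

impl Killswitch<'_> {
    // All present conditions must match (AND); absent conditions match all.
    fn matches(&self, usecase: &str, service: Option<&str>) -> bool {
        self.usecase.map_or(true, |u| u == usecase)
            && self
                .service_glob
                .map_or(true, |g| service.map_or(false, |s| glob_matches(g, s)))
    }
}

/// Any matching rule triggers rejection (OR across killswitches).
fn blocked(rules: &[Killswitch<'_>], usecase: &str, service: Option<&str>) -> bool {
    rules.iter().any(|r| r.matches(usecase, service))
}
```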

§Modules

  • auth: Authorization logic for objectstore.
  • batch: HTTP header names used in batch request and response processing.
  • cli: Command-line interface for the objectstore server.
  • config: Configuration for the objectstore server.
  • endpoints: Contains all HTTP endpoint handlers.
  • extractors: Axum request extractors for objectstore endpoints.
  • healthcheck: CLI healthcheck subcommand implementation.
  • killswitches: Runtime killswitches for disabling access to specific object contexts.
  • multipart: Types and utilities to support Multipart streaming responses.
  • observability: Initialization of error reporting and distributed tracing.
  • rate_limits: Admission-based rate limiting for throughput and bandwidth.
  • state: Shared server state passed to all HTTP request handlers.
  • web: Module implementing the Objectstore API webserver.