Crate objectstore_server

§Server Architecture

The objectstore server is an axum-based HTTP server that exposes the objectstore_service storage layer to clients. It handles authentication, authorization, rate limiting, and traffic control on top of the core storage operations.

§Endpoints

All object operations live under the /v1/ prefix:

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/objects/{usecase}/{scopes}/ | Insert with server-generated key |
| GET | /v1/objects/{usecase}/{scopes}/{key} | Retrieve object |
| HEAD | /v1/objects/{usecase}/{scopes}/{key} | Retrieve metadata only |
| PUT | /v1/objects/{usecase}/{scopes}/{key} | Insert or overwrite with key |
| DELETE | /v1/objects/{usecase}/{scopes}/{key} | Delete object |
| POST | /v1/objects:batch/{usecase}/{scopes}/ | Batch operations (multipart) |

Scopes are encoded in the URL path using Matrix URI syntax: org=123;project=456. An underscore (_) represents empty scopes.
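Decoding the scope segment amounts to splitting on semicolons; the sketch below is illustrative, not the crate's actual extractor (percent-decoding and validation are elided):

```rust
/// Parse a matrix-URI scope segment such as "org=123;project=456".
/// The underscore segment "_" denotes empty scopes.
fn parse_scopes(segment: &str) -> Vec<(String, String)> {
    if segment == "_" {
        return Vec::new();
    }
    segment
        .split(';')
        .filter_map(|pair| {
            // Skip malformed pairs without an '='; real code would reject them.
            let (k, v) = pair.split_once('=')?;
            Some((k.to_string(), v.to_string()))
        })
        .collect()
}
```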

§Internal Endpoints

Internal endpoints are exempt from authentication, rate limiting, and the web concurrency limit so they remain available when the server is under load.

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Liveness probe (always returns 200) |
| GET | /ready | Readiness probe (returns 503 when /tmp/objectstore.down exists, enabling graceful drain) |
| GET | /keda | Prometheus text-format gauges for KEDA autoscaling (see KEDA Metrics) |
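The /ready behavior described above reduces to a file-existence check; a minimal sketch with the handler wiring omitted (the function name is illustrative):

```rust
use std::path::Path;

/// Returns the readiness status code: 503 while the drain marker file
/// exists (graceful drain), 200 otherwise.
fn ready_status(drain_marker: &Path) -> u16 {
    if drain_marker.exists() { 503 } else { 200 }
}
```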

§Request Flow

A request flows through several layers before reaching the storage service:

  1. Middleware: metrics collection, in-flight request tracking, panic recovery, Sentry transaction tracing, distributed tracing.
  2. Extractors: path parameters are parsed into an ObjectId or ObjectContext. The Authorization header is validated and decoded into an AuthContext. The optional x-downstream-service header is extracted for killswitch matching.
  3. Admission control: killswitches and rate limits are checked during extraction. Rejected requests never reach the handler.
  4. Handler: the endpoint handler calls the AuthAwareService, which checks permissions before delegating to the underlying StorageService. The service enforces its own backpressure before executing the operation.
  5. Response: metadata is mapped to HTTP headers (see objectstore-types docs for the header mapping) and the payload is streamed back.
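The flow above can be condensed into a sketch where extraction and admission failures short-circuit before the handler runs. All names here are illustrative, not the crate's actual API:

```rust
#[derive(Debug)]
enum Rejection {
    Unauthorized,       // extractor failure
    KillswitchMatched,  // admission failure
}

struct AuthContext;

// 2. Extractors: validate the Authorization header (stubbed here).
fn extract_auth(header: Option<&str>) -> Result<AuthContext, Rejection> {
    match header {
        Some(token) if !token.is_empty() => Ok(AuthContext),
        _ => Err(Rejection::Unauthorized),
    }
}

// 3. Admission control: killswitches and rate limits run before the handler.
fn check_admission(usecase: &str) -> Result<(), Rejection> {
    if usecase == "blocked" {
        return Err(Rejection::KillswitchMatched);
    }
    Ok(())
}

// Rejected requests never reach the handler body.
fn handle(header: Option<&str>, usecase: &str) -> Result<&'static str, Rejection> {
    let _auth = extract_auth(header)?; // 2. extractors
    check_admission(usecase)?;         // 3. admission control
    Ok("200 OK")                       // 4-5. handler + response
}
```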

§Authentication & Authorization

Objectstore uses JWT tokens with EdDSA signatures (Ed25519) for authentication. Auth enforcement is optional and controlled by the auth.enforce config flag, allowing unauthenticated development setups.

§Token Structure

Tokens must include:

  • Header: kid (key ID) and alg: EdDSA
  • Claims: aud: "objectstore", iss: "sentry" or "relay", exp (expiration timestamp)
  • Resource claims (res): the usecase and scope values the token grants access to (e.g., {"os:usecase": "attachments", "org": "123"})
  • Permissions: array of granted operations (object.read, object.write, object.delete)

§Key Management

The PublicKeyDirectory maps key IDs (kid) to public keys. Each key entry supports multiple key versions for rotation — the server tries each version when verifying a token. Keys also carry max_permissions that are intersected with the token’s claimed permissions, limiting what any token signed by that key can do.
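The permission intersection amounts to a set filter; a sketch with assumed names (the real types live in the auth module):

```rust
/// Intersect a key's max_permissions with the token's claimed permissions.
/// A token can never exceed what its signing key allows.
fn effective_permissions<'a>(max: &[&'a str], claimed: &[&'a str]) -> Vec<&'a str> {
    claimed.iter().copied().filter(|p| max.contains(p)).collect()
}
```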

§Authorization Check

On every operation, AuthAwareService verifies that the token’s scopes and permissions cover the requested ObjectContext and operation type. Scope values in the token can use wildcards to grant broad access.
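Assuming * as the wildcard form (the exact wildcard syntax is not specified here), the scope check can be sketched as: every scope the request carries must be covered by a matching token scope.

```rust
// A token scope value of "*" covers any requested value (assumed syntax).
fn scope_covers(token_value: &str, requested: &str) -> bool {
    token_value == "*" || token_value == requested
}

/// All requested scope key-value pairs must be covered by the token.
fn authorized(token_scopes: &[(&str, &str)], requested: &[(&str, &str)]) -> bool {
    requested.iter().all(|(k, v)| {
        token_scopes
            .iter()
            .any(|(tk, tv)| tk == k && scope_covers(tv, v))
    })
}
```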

§Configuration

Configuration uses figment for layered merging with this precedence (highest wins):

  1. Environment variables — prefixed with OS__, using __ as a nested separator. Example: OS__LONG_TERM_STORAGE__TYPE=gcs
  2. YAML file — passed via the -c / --config CLI flag
  3. Defaults — sensible development defaults (local filesystem backends, auth disabled)
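As a small illustration of the environment-variable convention, here is a sketch (not the crate's code; the server uses figment's providers for the actual merge) of translating an OS__-prefixed variable into a dotted config path:

```rust
/// Translate e.g. OS__LONG_TERM_STORAGE__TYPE into "long_term_storage.type".
/// Returns None for variables without the OS__ prefix.
fn env_to_key(var: &str) -> Option<String> {
    let rest = var.strip_prefix("OS__")?;
    Some(
        rest.split("__")
            .map(str::to_lowercase)
            .collect::<Vec<_>>()
            .join("."),
    )
}
```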

Key configuration sections:

  • high_volume_storage / long_term_storage — backend type and connection parameters
  • auth — key directory and enforcement toggle
  • rate_limits — throughput and bandwidth limits
  • http — HTTP layer parameters (concurrency limit)
  • service — storage service parameters (backend concurrency limit)
  • killswitches — traffic blocking rules
  • runtime — worker threads, metrics interval
  • sentry / metrics / logging — observability

See the config module for the full configuration schema.

§Rate Limiting

Rate limiting operates at two levels:

§Throughput

Throughput limits use token bucket rate limiting with configurable burst. Limits can be set at multiple granularities:

  • Global: a maximum requests-per-second across all traffic
  • Per-usecase: a percentage of the global limit allocated to each usecase
  • Per-scope: a percentage of the global limit for specific scope values
  • Custom rules: specific RPS or percentage overrides matching usecase/scope combinations
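A minimal token-bucket admission check, sketched with the standard library only; the crate's limiter additionally layers the per-usecase, per-scope, and custom-rule granularities listed above:

```rust
struct TokenBucket {
    capacity: f64,       // burst size: maximum stored tokens
    tokens: f64,         // currently available tokens
    refill_per_sec: f64, // sustained requests per second
}

impl TokenBucket {
    fn new(rps: f64, burst: f64) -> Self {
        Self { capacity: burst, tokens: burst, refill_per_sec: rps }
    }

    /// `elapsed_secs` is the time since the previous call; returns
    /// whether this request is admitted.
    fn try_admit(&mut self, elapsed_secs: f64) -> bool {
        // Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```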

§Bandwidth

Bandwidth limiting uses an exponentially weighted moving average (EWMA) to estimate current throughput. Payload streams are wrapped in a MeteredPayloadStream that reports bytes consumed. When the estimated bandwidth exceeds the configured limit, new requests are rejected.

Like throughput, bandwidth limits can be set at multiple granularities:

  • Global: a maximum bytes-per-second across all traffic (global_bps)
  • Per-usecase: a percentage of the global limit for each usecase
  • Per-scope: a percentage of the global limit for each scope value

Each granularity maintains its own EWMA estimator. The MeteredPayloadStream increments all applicable accumulators (global + per-usecase + per-scope) for every chunk polled. For non-streamed payloads (e.g., batch INSERT where the size is known upfront), bytes are recorded directly via record_bandwidth.
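The EWMA update itself is a one-liner; the sketch below uses the α = 0.2 smoothing factor mentioned under KEDA Metrics, with tick scheduling and the per-granularity accumulators omitted:

```rust
struct Ewma {
    rate: f64,  // smoothed bytes-per-tick estimate
    alpha: f64, // smoothing factor (0.2 in the KEDA section)
}

impl Ewma {
    fn new(alpha: f64) -> Self {
        Self { rate: 0.0, alpha }
    }

    /// Fold one tick's accumulated bytes into the estimate:
    /// rate = alpha * sample + (1 - alpha) * rate.
    fn update(&mut self, bytes_this_tick: f64) {
        self.rate = self.alpha * bytes_this_tick + (1.0 - self.alpha) * self.rate;
    }

    /// New requests are rejected while the estimate exceeds the limit.
    fn over_limit(&self, limit: f64) -> bool {
        self.rate > limit
    }
}
```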

Rate-limited requests receive HTTP 429.

§Web Concurrency Limit

Before requests reach the storage service, a web-tier concurrency limit protects against connection floods. When the number of in-flight HTTP requests reaches http.max_requests (default: 10,000), new requests are rejected immediately with HTTP 503. Health and readiness endpoints (/health, /ready) are excluded from this limit. Rejections are counted in the web.concurrency.rejected metric.

Direct 503 rejection is preferred over readiness-based backpressure:

  • Instant response and recovery: direct 503 responds in milliseconds and frees capacity the moment any request completes. Readiness probes run on periodic intervals, leaving a window of continued overload and wasting capacity during recovery.
  • No cascade risk: multiple pods failing readiness probes simultaneously concentrates traffic onto remaining pods. Direct rejection keeps every pod in the pool and self-regulating.
  • Correct health semantics: a busy pod is still ready — its dependencies are reachable and it can serve traffic. Conflating load with readiness muddies alerting and incident response.
  • Environment-independent: works in any deployment, not just Kubernetes.
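The direct-rejection scheme boils down to an atomic in-flight counter; this sketch uses illustrative names, not the crate's actual types:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct ConcurrencyLimit {
    in_flight: AtomicUsize,
    max: usize, // corresponds to http.max_requests
}

impl ConcurrencyLimit {
    fn new(max: usize) -> Self {
        Self { in_flight: AtomicUsize::new(0), max }
    }

    /// Returns false when the limit is reached; the caller then responds
    /// 503 immediately instead of queueing the request.
    fn try_acquire(&self) -> bool {
        let mut current = self.in_flight.load(Ordering::Relaxed);
        loop {
            if current >= self.max {
                return false;
            }
            // CAS loop: only increment if no other thread won the slot first.
            match self.in_flight.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Capacity frees the moment any request completes.
    fn release(&self) {
        self.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}
```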

§Service Backpressure

Beyond rate limiting and the web concurrency limit, the StorageService enforces a second layer of backpressure through a concurrency limit on in-flight backend operations, configured via service.max_concurrency. When exceeded, requests receive HTTP 429. See the service architecture docs for details.

§KEDA Metrics

GET /keda serves a Prometheus text-format (version 0.0.4) snapshot of all four rate-limited resources for use with KEDA Prometheus scalers. The endpoint is exempt from the web concurrency limit and request metrics so that it remains available when the server is at capacity.

§Exposed Metrics

§EWMA Gauges

Pre-smoothed rates, self-contained per scrape (no irate() arithmetic needed):

| Resource | Utilization | Limit |
| --- | --- | --- |
| Bandwidth | objectstore_bandwidth_ewma | objectstore_bandwidth_limit (only when global_bps is set) |
| Throughput | objectstore_throughput_ewma | objectstore_throughput_limit (only when global_rps is set) |
| HTTP concurrency | objectstore_requests_in_flight | objectstore_requests_limit |
| Task concurrency | objectstore_tasks_running | objectstore_tasks_limit |

Throughput uses an EWMA with a 50 ms tick and α = 0.2, matching the existing bandwidth estimator. The accumulator counts fully admitted requests (requests that pass all throughput checks).

§Counters

Monotonically increasing totals since startup; use irate(counter[window]) in KEDA queries for an unsmoothed, immediately responsive rate:

| Counter | Description |
| --- | --- |
| objectstore_bytes_total | Total bytes transferred since startup |
| objectstore_requests_total | Total admitted requests since startup |

§Example KEDA ScaledObject Triggers

§Using EWMA gauges (backward-compatible)

Scale on the highest utilization across all four resources:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      query: |
        max(
          objectstore_bandwidth_ewma / objectstore_bandwidth_limit
          or objectstore_throughput_ewma / objectstore_throughput_limit
          or objectstore_requests_in_flight / objectstore_requests_limit
          or objectstore_tasks_running / objectstore_tasks_limit
        )
      threshold: "0.7"
```

§Using counters with irate() (more responsive)

Uses the last two scraped values for an instantaneous rate with no smoothing lag:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      query: |
        max(
          irate(objectstore_bytes_total[2m]) / objectstore_bandwidth_limit
          or irate(objectstore_requests_total[2m]) / objectstore_throughput_limit
          or objectstore_requests_in_flight / objectstore_requests_limit
          or objectstore_tasks_running / objectstore_tasks_limit
        )
      threshold: "0.7"
```

Unconfigured limits produce no series, so the corresponding terms drop out of the or expressions automatically.

§Killswitches

Killswitches provide emergency traffic blocking without redeployment. Each killswitch is a set of conditions that, when all matched, cause requests to be rejected with HTTP 403:

  • Usecase: exact match on the usecase string
  • Scopes: all specified scope key-value pairs must be present
  • Service: a glob pattern matched against the x-downstream-service request header (e.g., "relay-*" to block all relay instances)

A killswitch with no conditions matches all traffic. Multiple killswitches are evaluated with OR semantics — any match triggers rejection. Killswitches are checked during request extraction, before the handler runs.
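The matching semantics can be sketched as follows: AND within a rule, OR across rules. The types are illustrative, and the glob here supports only the trailing-* form shown in the example above:

```rust
struct Killswitch<'a> {
    usecase: Option<&'a str>,      // None = condition absent, matches anything
    service_glob: Option<&'a str>, // matched against x-downstream-service
}

// Trailing-'*' prefix glob only (e.g. "relay-*"); exact match otherwise.
fn glob_matches(pattern: &str, value: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => value.starts_with(prefix),
        None => pattern == value,
    }
}

impl Killswitch<'_> {
    // All present conditions must match (AND); absent conditions match all.
    fn matches(&self, usecase: &str, service: Option<&str>) -> bool {
        self.usecase.map_or(true, |u| u == usecase)
            && self
                .service_glob
                .map_or(true, |g| service.map_or(false, |s| glob_matches(g, s)))
    }
}

/// Any matching rule triggers rejection (OR across killswitches).
fn blocked(rules: &[Killswitch<'_>], usecase: &str, service: Option<&str>) -> bool {
    rules.iter().any(|r| r.matches(usecase, service))
}
```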

§Modules

  • auth: Authorization logic for objectstore.
  • batch: HTTP header names used in batch request and response processing.
  • cli: Command-line interface for the objectstore server.
  • config: Configuration for the objectstore server.
  • endpoints: Contains all HTTP endpoint handlers.
  • extractors: Axum request extractors for objectstore endpoints.
  • healthcheck: CLI healthcheck subcommand implementation.
  • killswitches: Runtime killswitches for disabling access to specific object contexts.
  • multipart: Types and utilities to support Multipart streaming responses.
  • observability: Initialization of error reporting and distributed tracing.
  • rate_limits: Admission-based rate limiting for throughput and bandwidth.
  • state: Shared server state passed to all HTTP request handlers.
  • web: Module implementing the Objectstore API webserver.