Writable Storage Schema

Properties

  • version: Version of schema.

  • kind: Component kind.

  • name (string): Name of the writable storage.

  • storage (object):

    • key (string): A unique key identifier for the storage.

    • set_key (string): A unique key identifier for a collection of storages located in the same cluster.

  • readiness_state (string): The readiness state defines the availability of the storage in various environments. Internally, this label is used to determine which environments the storage is released in. The readiness states are limited, deprecate, partial, and complete; the schema additionally accepts experimental. Each environment supports a set of these readiness states. For a new storage, start with limited, which exposes the storage only to CI and local development. Must be one of: [‘limited’, ‘deprecate’, ‘partial’, ‘complete’, ‘experimental’].
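
As a minimal sketch of the readiness_state constraint, the check below accepts only the values from the "Must be one of" list. The helper name validate_readiness_state is hypothetical and not part of the schema tooling; how each environment maps to a set of readiness states is not covered here.

```python
# Allowed values copied from the "Must be one of" list for readiness_state.
ALLOWED_READINESS_STATES = frozenset(
    {"limited", "deprecate", "partial", "complete", "experimental"}
)


def validate_readiness_state(value: str) -> str:
    """Return the value unchanged if it is allowed (hypothetical helper)."""
    if value not in ALLOWED_READINESS_STATES:
        allowed = ", ".join(sorted(ALLOWED_READINESS_STATES))
        raise ValueError(
            f"readiness_state must be one of: {allowed}; got {value!r}"
        )
    return value
```

A new storage would typically pass "limited" here; a misspelling such as "deprecrate" raises ValueError.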

  • schema (object):

    • columns (array): Objects (or nested objects) representing columns, each containing a name, type, and args.

    • local_table_name (string): The local table name in a single-node ClickHouse.

    • dist_table_name (string): The distributed table name in distributed ClickHouse.

    • not_deleted_mandatory_condition (string): The name of the column flagging a deletion, e.g. the deleted column in Errors. Defining this column here ensures that any query served by this storage explicitly filters out ‘deleted’ rows. Should only be used for storages supporting deletion replacement.

    • partition_format (array): The format of the partitions in ClickHouse. Used in the cleanup job.

  • stream_loader (object): The stream loader for writing to ClickHouse. This provides what is needed to start a Kafka consumer and fill in the ClickHouse table.

    • processor (string): Name of the Processor class config key and its arguments. Responsible for converting an incoming message body from the event stream into a row or statement to be inserted or executed against ClickHouse.

    • default_topic (string): Name of the Kafka topic to consume from.

    • commit_log_topic ([‘string’, ‘null’]): Name of the commit log Kafka topic.

    • subscription_scheduled_topic ([‘string’, ‘null’]): Name of the subscription scheduled Kafka topic.

    • subscription_result_topic ([‘string’, ‘null’]): Name of the subscription result Kafka topic.

    • subscription_scheduler_mode ([‘string’, ‘null’]): The subscription scheduler mode used (e.g. partition or global). This must be specified if subscriptions are supported for this storage.

    • subscription_synchronization_timestamp: Field to be used for timestamp synchronization by the scheduler.

    • subscription_delay_seconds ([‘integer’, ‘null’]): Additional delay in seconds to be added before scheduling. The amount added depends on the synchronization timestamp used. For orig_message_ts, we typically add 60 seconds to account for ingest time, which that timestamp does not include. Minimum: 0. Maximum: 120.

    • replacement_topic ([‘string’, ‘null’]): Name of the replacements Kafka topic.

    • dlq_topic ([‘string’, ‘null’]): Name of the DLQ Kafka topic.

    • pre_filter (object): The class that filters messages incoming from the stream.

      • type (string): Name of StreamMessageFilter class config key.

      • args (object): Key/value mappings required to instantiate StreamMessageFilter class.
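
The stream_loader constraints listed above can be sketched as a small check. This is an illustration, not the project's validator: check_stream_loader, the topic names, and the processor name are hypothetical, and treating a configured subscription topic as "subscriptions are supported" is an assumption.

```python
from typing import Any, List, Mapping

STRING_FIELDS = ("processor", "default_topic")
NULLABLE_STRING_FIELDS = (
    "commit_log_topic",
    "subscription_scheduled_topic",
    "subscription_result_topic",
    "subscription_scheduler_mode",
    "replacement_topic",
    "dlq_topic",
)


def check_stream_loader(loader: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in a stream_loader mapping."""
    problems: List[str] = []
    for field in STRING_FIELDS:
        if field in loader and not isinstance(loader[field], str):
            problems.append(f"{field} must be a string")
    for field in NULLABLE_STRING_FIELDS:
        value = loader.get(field)
        if value is not None and not isinstance(value, str):
            problems.append(f"{field} must be a string or null")

    # Documented range for subscription_delay_seconds: 0 to 120, or null.
    delay = loader.get("subscription_delay_seconds")
    if delay is not None:
        if isinstance(delay, bool) or not isinstance(delay, int):
            problems.append("subscription_delay_seconds must be an integer or null")
        elif not 0 <= delay <= 120:
            problems.append("subscription_delay_seconds must be between 0 and 120")

    # Assumption: a subscription topic being set means subscriptions are supported.
    uses_subscriptions = any(
        loader.get(topic) is not None
        for topic in ("subscription_scheduled_topic", "subscription_result_topic")
    )
    if uses_subscriptions and loader.get("subscription_scheduler_mode") is None:
        problems.append("subscription_scheduler_mode is required for subscriptions")
    return problems


# Hypothetical values; only the field names come from the schema.
example_loader = {
    "processor": "ExampleMessageProcessor",
    "default_topic": "example-events",
    "subscription_scheduled_topic": "example-scheduled-subscriptions",
    "subscription_result_topic": "example-subscription-results",
    "subscription_scheduler_mode": "partition",
    "subscription_synchronization_timestamp": "orig_message_ts",
    "subscription_delay_seconds": 60,
}
```

Calling check_stream_loader(example_loader) returns an empty list; raising the delay above 120 or setting the scheduler mode to null yields one problem each.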

  • query_processors (array)

    • processor (string): Name of ClickhouseQueryProcessor class config key. Responsible for the transformation applied to a query.

    • args (object): Key/value mappings required to instantiate QueryProcessor class.

  • deletion_settings (object):

    • is_enabled (integer)

    • tables (array): Names of the tables to delete from.

    • allowed_columns (array): Columns allowed in WHERE clause.

    • max_rows_to_delete (integer)
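
To illustrate how the deletion_settings fields might interact, the sketch below checks a delete request against them. The semantics are assumptions inferred from the field names: a non-zero is_enabled is treated as enabled and max_rows_to_delete as an upper bound. The function check_delete_request and the table and column names are hypothetical.

```python
from typing import Any, List, Mapping, Sequence


def check_delete_request(
    settings: Mapping[str, Any],
    table: str,
    where_columns: Sequence[str],
    estimated_rows: int,
) -> List[str]:
    """Return the reasons a delete would be rejected (empty list if none)."""
    problems: List[str] = []
    # Assumption: is_enabled is 0 for disabled and non-zero for enabled.
    if not settings.get("is_enabled"):
        problems.append("deletes are not enabled for this storage")
    if table not in settings.get("tables", []):
        problems.append(f"table {table} is not listed in tables")
    # Only columns listed in allowed_columns may appear in the WHERE clause.
    disallowed = sorted(set(where_columns) - set(settings.get("allowed_columns", [])))
    if disallowed:
        problems.append("columns not allowed in WHERE clause: " + ", ".join(disallowed))
    # Assumption: max_rows_to_delete caps the rows a single delete may touch.
    max_rows = settings.get("max_rows_to_delete")
    if max_rows is not None and estimated_rows > max_rows:
        problems.append(f"delete would affect more than {max_rows} rows")
    return problems


# Hypothetical settings; only the field names come from the schema.
example_settings = {
    "is_enabled": 1,
    "tables": ["example_local"],
    "allowed_columns": ["project_id", "group_id"],
    "max_rows_to_delete": 100000,
}
```

For example, check_delete_request(example_settings, "example_local", ["project_id"], 10) returns an empty list, while filtering on a column outside allowed_columns returns one problem.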

  • deletion_processors (array)

    • processor (string): This processor should validate the query against the storage.

    • args (object): Key/value mappings required to instantiate DeletionProcessor class.

  • mandatory_condition_checkers (array)

    • condition (string): Name of ConditionChecker class config key. Responsible for running final checks on a query to ensure that transformations haven’t impacted/removed conditions required for security reasons.

    • args (object): Key/value mappings required to instantiate ConditionChecker class.

  • allocation_policies (array)

    • name (string): Name of the AllocationPolicy used for allocating read resources per query on this storage.

    • args (object): Key/value mappings required to instantiate AllocationPolicy class.

  • delete_allocation_policies (array)

    • name (string): Name of the AllocationPolicy used for allocating read resources per query on this storage.

    • args (object): Key/value mappings required to instantiate AllocationPolicy class.

  • replacer_processor (object):

    • processor (string): Name of ReplacerProcessor class config key. Responsible for optimizing queries on a storage which can have replacements, e.g. deletions or updates.

    • args (object): Key/value mappings required to instantiate ReplacerProcessor class.

  • writer_options (object): Extra ClickHouse fields that are used for consumer writes.

  • required_time_column ([‘string’, ‘null’]): The name of the required time column specified in schema.
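
Putting the properties together, a storage definition can be pictured as a nested mapping. The sketch below uses a Python dict with placeholder values (the version, kind, key, table, column, and topic values are all assumptions for illustration) and a hypothetical helper that checks that required_time_column names a top-level column in schema.

```python
from typing import Any, Mapping

# Placeholder values throughout; only the property names come from the schema.
example_storage = {
    "version": "v1",
    "kind": "writable_storage",
    "name": "example",
    "storage": {"key": "example", "set_key": "example_set"},
    "readiness_state": "limited",
    "schema": {
        "columns": [
            {"name": "project_id", "type": "UInt", "args": {"size": 64}},
            {"name": "timestamp", "type": "DateTime"},
        ],
        "local_table_name": "example_local",
        "dist_table_name": "example_dist",
    },
    "stream_loader": {
        "processor": "ExampleMessageProcessor",
        "default_topic": "example-events",
    },
    "required_time_column": "timestamp",
}


def has_required_time_column(storage: Mapping[str, Any]) -> bool:
    """Check that required_time_column, if set, is a top-level schema column."""
    required = storage.get("required_time_column")
    if required is None:
        # The property is nullable, so there is nothing to check.
        return True
    columns = storage.get("schema", {}).get("columns", [])
    return required in {column.get("name") for column in columns}
```

Here has_required_time_column(example_storage) is true because the schema declares a timestamp column; nested columns are not searched in this sketch.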