QBit Data Type

The QBit data type reorganizes vector storage for faster approximate searches. Instead of storing each vector's elements together, it groups the same bit position across all vectors, so that each group forms one bit plane. Vectors are stored at full precision, and you choose the quantization level at search time: read fewer bits for less I/O and faster calculations, or more bits for higher accuracy. You get the reduced data transfer and computation of quantization, while all the original data remains available when needed.
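The bit-grouped layout described above can be illustrated with a small Python sketch. It is a conceptual model only, not ClickHouse's implementation: the helper names (`to_bit_planes`, `from_bit_planes`) are hypothetical, and treating unread low-order bits as zero is an assumption made for illustration.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float32(pattern: int) -> float:
    """Interpret a 32-bit unsigned integer as an IEEE 754 Float32."""
    return struct.unpack(">f", struct.pack(">I", pattern))[0]

def to_bit_planes(vector, width=32):
    """Transpose a vector: plane k (0-based) holds bit k of every element,
    counting from the most significant bit."""
    patterns = [float32_bits(x) for x in vector]
    return [[(p >> (width - 1 - k)) & 1 for p in patterns] for k in range(width)]

def from_bit_planes(planes, precision, width=32):
    """Rebuild elements from the first `precision` planes only.
    Assumption: planes that are not read contribute zero bits."""
    patterns = [0] * len(planes[0])
    for k in range(precision):
        for i, bit in enumerate(planes[k]):
            patterns[i] |= bit << (width - 1 - k)
    return [bits_to_float32(p) for p in patterns]

vec = [1.0, 2.5, -3.75, 0.1]
planes = to_bit_planes(vec)

full = from_bit_planes(planes, 32)    # all 32 planes: exact Float32 values
coarse = from_bit_planes(planes, 16)  # top 16 planes: coarser approximation

print(full)    # [1.0, 2.5, -3.75, 0.10000000149011612]
print(coarse)  # [1.0, 2.5, -3.75, 0.099609375]
```

Reading half of the planes here halves the data touched per vector, and only the last element loses precision, which is the trade-off the type lets you choose at query time.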

Note: The QBit data type and its associated distance functions are currently experimental. To enable them, first run SET allow_experimental_qbit_type = 1. If you run into problems, please open an issue in the ClickHouse repository.

To declare a column of QBit type, use the following syntax:

column_name QBit(element_type, dimension)

element_type – the type of each vector element. The allowed types are BFloat16, Float32 and Float64.

dimension – the number of elements in each vector.

Using the QBit type in table column definition:

CREATE TABLE test (id UInt32, vec QBit(Float32, 8)) ENGINE = Memory;
INSERT INTO test VALUES (1, [1, 2, 3, 4, 5, 6, 7, 8]), (2, [9, 10, 11, 12, 13, 14, 15, 16]);
SELECT vec FROM test ORDER BY id;

┌─vec──────────────────────┐
│ [1,2,3,4,5,6,7,8]        │
│ [9,10,11,12,13,14,15,16] │
└──────────────────────────┘

QBit implements a subcolumn access pattern that allows you to access individual bit planes of the stored vectors. Each bit position can be accessed using the .N syntax, where N is the bit position:

CREATE TABLE test (id UInt32, vec QBit(Float32, 8)) ENGINE = Memory;
INSERT INTO test VALUES (1, [0, 0, 0, 0, 0, 0, 0, 0]);
INSERT INTO test VALUES (1, [-0, -0, -0, -0, -0, -0, -0, -0]);
SELECT bin(vec.1) FROM test;

┌─bin(tupleElement(vec, 1))─┐
│ 00000000                  │
│ 11111111                  │
└───────────────────────────┘
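The two rows differ because the first bit of an IEEE 754 value is the sign bit: it is 0 for 0 and 1 for -0, and all remaining bits are zero in both cases. The Python sketch below mirrors that result outside ClickHouse. The helper `sign_plane` is hypothetical, and the order in which elements are packed into the output string is an assumption (it does not affect these uniform vectors).

```python
import struct

def sign_plane(vector):
    """Collect the most significant (sign) bit of each Float32 element
    into a string of '0'/'1' characters, one character per element."""
    bits = []
    for x in vector:
        pattern = struct.unpack(">I", struct.pack(">f", x))[0]
        bits.append(str(pattern >> 31))  # keep only the top bit
    return "".join(bits)

zeros = sign_plane([0.0] * 8)
neg_zeros = sign_plane([-0.0] * 8)

print(zeros)      # 00000000
print(neg_zeros)  # 11111111
```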

The number of accessible subcolumns depends on the element type:

BFloat16: 16 subcolumns (1-16)

Float32: 32 subcolumns (1-32)

Float64: 64 subcolumns (1-64)