Alexey Milovidov, CTO, ClickHouse
Did you know that ClickHouse can be used as a #vectordatabase?
ClickHouse isn't memory-bound, allowing multi-TB datasets containing embeddings to be queried!
Over the past year, Large Language Models (LLMs) and products like ChatGPT have captured the world's imagination and driven a new wave of functionality built on top of them. Vectors and vector search are core to powering features like recommendations, question answering, image/video search, and much more.
As a result, we've seen a significant increase in vector search interest in the ClickHouse community, in particular around understanding when a specialized vector database becomes necessary and when it doesn't.
With these models in focus, we take the opportunity to revisit search before vectors, explore what vectors (and embeddings) are, understand vector search, its applications, and how this functionality fits into the broader data landscape.
This presentation was given at the ClickHouse Meetup in Barcelona on May 25th.
To learn more about the topic, we have prepared a series on Vector Search with ClickHouse at https://clickhouse.com/blog/vector-search-clickhouse-p1 and https://clickhouse.com/blog/vector-search-clickhouse-p2
Slides are available at: https://github.com/ClickHouse/clickhouse-presentations/tree/master/meetup74
Building a Modern Data Warehouse for Real-Time Analytics and AI
Mark Mezzapelli, VP Business Development & Partnerships at Shakudo
Monitoring ClickHouse using OpenTelemetry (with IBM Instana Observability)
Joshua Hildred, Software Developer at IBM
Accelerating ML Workflows using ClickHouse
Wasim Ismail, Sr. Data Engineer & Jawad Ateeq, Machine Learning Engineering Lead at Borealis AI