
Exploring Parquet Metadata with ClickHouse

Mark Needham, Principal Product Marketing Manager, ClickHouse

In this video, we'll learn how to query the metadata of Parquet files using ClickHouse. We'll work with DiffusionDB, a dataset from Hugging Face that contains metadata for 14 million images generated by the Stable Diffusion AI tool. We'll look at metadata details such as row groups, column data, and compressed and uncompressed sizes. We'll also see how to use ARRAY JOIN and untuple to reshape the nested metadata so it's easier to read and interpret.
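
To make those last two operations a bit more concrete, here's a rough sketch in plain Python of what ARRAY JOIN and untuple do conceptually: the first turns one row holding an array into one row per array element, and the second spreads a tuple out into separate columns. This is not ClickHouse code or output. The record, its values, and the names `file_row`, `array_join`, `untuple`, and the `rg_*` columns are all made up for illustration.

```python
# Hypothetical metadata record: one row per file, with its row groups
# nested as a list of tuples. All values are invented.
file_row = {
    "num_rows": 5,
    "row_groups": [
        # (num_rows, total_uncompressed_size, total_compressed_size)
        (3, 300, 120),
        (2, 200, 90),
    ],
}


def array_join(row, key):
    """Emit one output row per element of row[key], copying the other fields."""
    for element in row[key]:
        yield {**row, key: element}


def untuple(row, key, names):
    """Replace the tuple stored at row[key] with one named column per element."""
    flat = {k: v for k, v in row.items() if k != key}
    flat.update(zip(names, row[key]))
    return flat


# Illustrative column names for the unpacked tuple elements.
names = ("rg_num_rows", "rg_uncompressed", "rg_compressed")

flat_rows = [
    untuple(row, "row_groups", names)
    for row in array_join(file_row, "row_groups")
]

for row in flat_rows:
    print(row)
```

The result is one flat row per row group, which is the shape that makes per-row-group sizes easy to scan and compare.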

