Exploring Parquet Metadata with ClickHouse

Mark Needham

Principal Product Marketing Manager, ClickHouse

In this video, we'll learn how to query the metadata of Parquet files using ClickHouse. We'll work with DiffusionDB, a dataset from Hugging Face that contains metadata for 14 million images generated by the Stable Diffusion AI tool. We'll look at metadata details such as row groups, per-column information, and compressed and uncompressed sizes. We'll also see how to use ARRAY JOIN and untuple to turn the nested metadata into rows and columns that are easier to read.
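To give a rough idea of what these queries look like, here is a minimal sketch that builds two of them as strings. It assumes ClickHouse's ParquetMetadata input format and its documented columns (num_rows, num_row_groups, columns, and so on). The file name metadata.parquet and the helper names are hypothetical, and the exact queries in the video may differ.

```python
import shutil
import subprocess

# Hypothetical local file name; the video reads a DiffusionDB Parquet file.
PARQUET_PATH = "metadata.parquet"


def file_level_query(path: str) -> str:
    """Build a query returning one row of file-level Parquet metadata."""
    return (
        "SELECT num_columns, num_rows, num_row_groups, "
        "total_uncompressed_size, total_compressed_size "
        f"FROM file('{path}', ParquetMetadata)"
    )


def per_column_query(path: str) -> str:
    """Build a query returning one row per Parquet column.

    ARRAY JOIN unrolls the `columns` array into one row per element,
    and untuple expands each tuple into separate result columns.
    """
    return (
        "SELECT untuple(col) "
        f"FROM file('{path}', ParquetMetadata) "
        "ARRAY JOIN columns AS col"
    )


def run_local(query: str) -> str:
    """Run a query with clickhouse-local, assuming `clickhouse` is on PATH."""
    exe = shutil.which("clickhouse")
    if exe is None:
        raise RuntimeError("clickhouse binary not found on PATH")
    result = subprocess.run(
        [exe, "local", "--query", query, "--format", "Vertical"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With ClickHouse installed and a Parquet file in place, print(run_local(per_column_query(PARQUET_PATH))) should list each column's name, types, compression, and sizes. The same query strings can also be pasted directly into clickhouse-local or the ClickHouse client.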
