跳到主要内容
跳到主要内容

过多分区片段问题

本指南是从社区会议中获得的一系列发现的一部分。有关更多实际解决方案和见解,您可以按特定问题浏览 需要更多性能优化建议吗?查看性能优化社区见解指南。

理解问题

ClickHouse 会抛出“过多分区片段”错误,以防止严重的性能下降。小分区片段会导致多个问题:读取和合并更多文件时查询性能较差,内存使用增加因为每个分区片段都需要在内存中存储元数据,较小的数据块压缩效率降低,更多文件句柄和寻址操作加大了 I/O 开销,以及较慢的后台合并使合并调度器的工作量增加。

相关文档

及早识别问题

此查询通过分析所有活动表中的分区片段数量和大小来监控表碎片。它识别出可能需要合并优化的分区片段过多或过小的表。定期使用此查询,以在对查询性能产生影响之前捕捉到碎片问题。

-- Challenge: Replace with your actual database and table names for production use
-- Experiment: Adjust the part count thresholds (1000, 500, 100) based on your system
SELECT 
    database,
    table,
    count() as total_parts,
    sum(rows) as total_rows,
    round(avg(rows), 0) as avg_rows_per_part,
    min(rows) as min_rows_per_part,
    max(rows) as max_rows_per_part,
    round(sum(bytes_on_disk) / 1024 / 1024, 2) as total_size_mb,
    CASE 
        WHEN count() > 1000 THEN 'CRITICAL - Too many parts (>1000)'
        WHEN count() > 500 THEN 'WARNING - Many parts (>500)'
        WHEN count() > 100 THEN 'CAUTION - Getting many parts (>100)'
        ELSE 'OK - Reasonable part count'
    END as parts_assessment,
    CASE 
        WHEN avg(rows) < 1000 THEN 'POOR - Very small parts'
        WHEN avg(rows) < 10000 THEN 'FAIR - Small parts'
        WHEN avg(rows) < 100000 THEN 'GOOD - Medium parts'
        ELSE 'EXCELLENT - Large parts'
    END as part_size_assessment
FROM system.parts
WHERE active = 1
  AND database NOT IN ('system', 'information_schema')
GROUP BY database, table
ORDER BY total_parts DESC
LIMIT 20;

视频来源