OPTIMIZE Statement
https://clickhouse.com/docs/sql-reference/statements/optimize
Introduction
Merge Multiple Parts into Fewer or Single Parts
- Merges multiple data parts (files) into one, reducing fragmentation and improving query performance.
- Only affects parts within the same partition .
Apply TTL Deletes and Moves
- During a merge, expired rows (per TTL rules) are removed.
- TTL MOVE operations (e.g. to different disks/volumes) may also be executed.
Re-Aggregate in SummingMergeTree
- When using
SummingMergeTree
, the merge will:- Detect rows with the same primary key
- Recompute sums during the merge
- This reduces multiple rows with the same key to a single aggregated row .
Trigger Deduplication (DEDUPLICATE
)
- In
ReplicatedMergeTree
,OPTIMIZE ... DEDUPLICATE
removes duplicate rows based on primary key + version. - Useful when manually inserted duplicate data needs cleanup.
Specify Partition
sql
OPTIMIZE TABLE table_name PARTITION '2025-05-30' FINAL;
DEDUPLICATE
sql
OPTIMIZE TABLE table DEDUPLICATE; -- all columns
OPTIMIZE TABLE table DEDUPLICATE BY *; -- excludes MATERIALIZED and ALIAS columns
OPTIMIZE TABLE table DEDUPLICATE BY colX,colY,colZ;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT (colX, colY);
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex');
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT (colX, colY);
sql
CREATE TABLE default.example
(
`primary_key` Int32,
`secondary_key` Int32,
`value` UInt32,
`partition_key` UInt32,
`materialized_value` UInt32 MATERIALIZED 12345,
`aliased_value` UInt32 ALIAS 2
)
ENGINE = MergeTree
PARTITION BY partition_key
PRIMARY KEY primary_key
ORDER BY (primary_key, secondary_key);
sql
INSERT INTO example (primary_key, secondary_key, value, partition_key)
VALUES (0, 0, 0, 0), (0, 0, 0, 0), (1, 1, 2, 2), (1, 1, 2, 3), (1, 1, 3, 3);
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 1 │ 1 │ 2 │ 2 │
2. │ 0 │ 0 │ 0 │ 0 │
3. │ 0 │ 0 │ 0 │ 0 │
4. │ 1 │ 1 │ 2 │ 3 │
5. │ 1 │ 1 │ 3 │ 3 │
└─────────────┴───────────────┴───────┴───────────────┘
All Columns
Performs deduplication on all columns.
sql
OPTIMIZE TABLE example FINAL DEDUPLICATE;
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 0 │ 0 │ 0 │ 0 │
2. │ 1 │ 1 │ 2 │ 2 │
3. │ 1 │ 1 │ 2 │ 3 │
4. │ 1 │ 1 │ 3 │ 3 │
└─────────────┴───────────────┴───────┴───────────────┘
Exclude MATERIALIZED/ALIAS
Performs deduplication on all columns except MATERIALIZED
and ALIAS
.
sql
OPTIMIZE TABLE example FINAL DEDUPLICATE BY *;
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 0 │ 0 │ 0 │ 0 │
2. │ 1 │ 1 │ 2 │ 3 │
3. │ 1 │ 1 │ 3 │ 3 │
4. │ 1 │ 1 │ 2 │ 2 │
└─────────────┴───────────────┴───────┴───────────────┘
Specific Column
Performs deduplication only on specified columns.
sql
OPTIMIZE TABLE table FINAL DEDUPLICATE BY colX,colY,colZ;
Exclude Single Column
Performs deduplication on all columns except the specified single column.
sql
OPTIMIZE TABLE example FINAL DEDUPLICATE BY * EXCEPT value;
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 0 │ 0 │ 0 │ 0 │
2. │ 1 │ 1 │ 2 │ 3 │
3. │ 1 │ 1 │ 2 │ 2 │
└─────────────┴───────────────┴───────┴───────────────┘
Exclude Multiple Columns
Performs deduplication on all columns except the specified multiple columns.
sql
OPTIMIZE TABLE example FINAL DEDUPLICATE BY primary_key, secondary_key, partition_key;
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 1 │ 1 │ 2 │ 2 │
2. │ 1 │ 1 │ 2 │ 3 │
3. │ 0 │ 0 │ 0 │ 0 │
└─────────────┴───────────────┴───────┴───────────────┘
Regex Column Matching
Uses regular expression to match column names for deduplication.
sql
OPTIMIZE TABLE example FINAL DEDUPLICATE BY COLUMNS('.*_key');
sql
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
1. │ 1 │ 1 │ 2 │ 3 │
2. │ 1 │ 1 │ 2 │ 2 │
3. │ 0 │ 0 │ 0 │ 0 │
└─────────────┴───────────────┴───────┴───────────────┘