To parse url-encoded form data and query its fields with SQL, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the form body with the Form format:
clickhouse local -q "SELECT * FROM file('payload.txt', Form)"
Row 1:
──────
event: signup
user_id: 1001
plan: pro
amount: 49.00
country: GB
The Form format reads one application/x-www-form-urlencoded body and turns each key=value pair into a column, with the file read in place and no import step. One body becomes one queryable row.
A form body carries no type information; it is all text after the =. So Form infers every column as Nullable(String). DESCRIBE confirms it:
clickhouse local -q "DESCRIBE file('payload.txt', Form)"
event Nullable(String)
user_id Nullable(String)
plan Nullable(String)
amount Nullable(String)
country Nullable(String)
The keys became column names, which is what makes the format pleasant to work with. If you want to compute on a field (sum the amounts, filter by a numeric id), give it a real type by passing an explicit structure as the third argument to file():
clickhouse local -q "
SELECT event, user_id, plan, amount, country
FROM file('payload.txt', Form, 'event String, user_id UInt32, plan String, amount Decimal(10,2), country String')
FORMAT Vertical"
Row 1:
──────
event: signup
user_id: 1001
plan: pro
amount: 49
country: GB
Now user_id is an integer and amount is a decimal, ready for arithmetic.
This is the one thing to know about Form. Real form bodies escape special characters: a space becomes + or %20, an ampersand becomes %26, accented letters become percent sequences like %C3%A3. Form decodes the percent escapes for you, but it does not convert a literal + back into a space. Watch what comes out of a body that uses both:
clickhouse local -q "SELECT * FROM file('encoded.txt', Form)"
The body was name=Jane+Doe&city=S%C3%A3o+Paulo¬e=a%26b%3Dc&amount=19.99:
Row 1:
──────
name: Jane+Doe
city: São+Paulo
note: a&b=c
amount: 19.99
%C3%A3 decoded to ã, %26 to & and %3D to = (so the & and = inside note did not split into new fields). But Jane+Doe kept its +. If your source encodes spaces as +, fix it in the query with replaceAll:
clickhouse local -q "
SELECT replaceAll(name, '+', ' ') AS name, replaceAll(city, '+', ' ') AS city, note, amount
FROM file('encoded.txt', Form)"
Row 1:
──────
name: Jane Doe
city: São Paulo
note: a&b=c
amount: 19.99
One row, cleaned, no scripting.
Because a Form file is exactly one row, a directory of captured webhook bodies is a table waiting to happen. Point file() at a glob and ClickHouse reads each file as a row. The virtual _file column tells you which body each row came from:
clickhouse local -q "
SELECT _file AS src, event, user_id, plan, amount
FROM file('hooks/*.txt', Form, 'event String, user_id UInt32, plan String, amount Decimal(10,2)')
ORDER BY user_id
FORMAT PrettyCompact"
┌─src───────┬─event───┬─user_id─┬─plan─┬─amount─┐
1. │ hook1.txt │ signup │ 1001 │ pro │ 49 │
2. │ hook2.txt │ upgrade │ 1002 │ team │ 99 │
3. │ hook3.txt │ signup │ 1003 │ free │ 0 │
└───────────┴─────────┴─────────┴──────┴────────┘
From there the full SQL surface applies (WHERE, GROUP BY, aggregates, joins) across the whole folder at once:
clickhouse local -q "
SELECT event, count() AS n, sum(amount) AS revenue
FROM file('hooks/*.txt', Form, 'event String, user_id UInt32, plan String, amount Decimal(10,2)')
GROUP BY event
ORDER BY revenue DESC
FORMAT PrettyCompact"
┌─event───┬─n─┬─revenue─┐
1. │ upgrade │ 1 │ 99 │
2. │ signup │ 2 │ 49 │
└─────────┴───┴─────────┘
That turns a heap of raw POST bodies on disk into an answer. For more on the glob pattern across formats, see run SQL across multiple files.
The cost here is opening many small files, not crunching numbers. On a folder of 2,000 one-body form files, parsing every file and grouping by plan runs in:
clickhouse local -q "
SELECT plan, count() AS n, sum(amount) AS revenue
FROM file('perf/*.txt', Form, 'event String, user_id UInt32, plan String, amount Decimal(10,2), country String')
GROUP BY plan ORDER BY revenue DESC"
~0.18 seconds, best of three with the files warm in the OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM; clickhouse local 26.6.1.117). That includes opening all 2,000 files from scratch on every run — there is no cached table.
The query you ran against a folder of form bodies is ordinary ClickHouse SQL. When those webhook captures grow past what a laptop folder can hold, you swap the file() glob for a table (or an s3() glob over a bucket) and the SELECT ... GROUP BY stays exactly the same. You prototype against files on your machine and ship the identical logic to a ClickHouse server or to ClickHouse Cloud.
The complete, runnable example lives here. It has generate.sh (creates the sample bodies and the 2,000-file perf set), run.sh (every command above), and expected_output.txt:
github.com/ClickHouse/examples → local-analytics/clickhouse-local-form
git clone https://github.com/ClickHouse/examples
cd examples/local-analytics/clickhouse-local-form
./generate.sh && ./run.sh