© 2022 Altinity, Inc.
All About JSON and ClickHouse
Tips, Tricks, and New Features
Robert Hodges and Diego Nieto
26 July 2022
Copyright © Altinity Inc 2022
Let’s make some introductions
ClickHouse support and services including Altinity.Cloud
Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects
Robert Hodges
Database geek with 30+ years on DBMS systems. Day job: Altinity CEO
Diego Nieto
Database engineer focused on ClickHouse, PostgreSQL, and DBMS applications
Reading and writing JSON - the basics
JSON is pervasive as raw data
head http_logs.json
{"@timestamp": 895873059, "clientip":"54.72.5.0", "request": "GET /images/home_bg_stars.gif HTTP/1.1", "status": 200, "size": 2557}
{"@timestamp": 895873059, "clientip":"53.72.5.0", "request": "GET /images/home_tool.gif HTTP/1.0", "status": 200, "size": 327}
...
Web server log data
Reading and writing JSON data to/from tables
{"@timestamp":"1998-05-22 21:37:39","clientip":"54.72.5.0",...} <-> SQL Table (every key is a column)
Loading raw JSON using JSONEachRow input format
CREATE TABLE http_logs_tabular (
`@timestamp` DateTime,
`clientip` IPv4,
`status` UInt16,
`request` String,
`size` UInt32
) ENGINE = MergeTree
PARTITION BY toStartOfDay(`@timestamp`)
ORDER BY `@timestamp`
clickhouse-client --query \
'INSERT INTO http_logs_tabular Format JSONEachRow' \
< http_logs_tabular
Writing JSON using JSONEachRow output format
SELECT * FROM http_logs_tabular
LIMIT 2
FORMAT JSONEachRow
{"@timestamp":"1998-05-22 21:37:39","clientip":"54.72.5.0","status":200,"request":"GET /images/home_bg_stars.gif HTTP/1.1","size":2557}
{"@timestamp":"1998-05-22 21:37:39","clientip":"53.72.5.0","status":200,"request":"GET /images/home_tool.gif HTTP/1.0","size":327}
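For readers handling this data outside ClickHouse, the JSONEachRow layout (one JSON object per row) can be sketched in Python. This only illustrates the format, not how ClickHouse implements it; the helper names `parse_json_each_row` and `to_json_each_row` are hypothetical, and the sample rows are shortened versions of the log lines above.

```python
import json

def parse_json_each_row(text):
    """Parse newline-delimited JSON: one object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def to_json_each_row(rows):
    """Serialize dicts back to one compact JSON object per line."""
    return "\n".join(json.dumps(row, separators=(",", ":")) for row in rows)

# Shortened sample rows (the "request" key is omitted for brevity).
sample = (
    '{"@timestamp": 895873059, "clientip":"54.72.5.0", "status": 200, "size": 2557}\n'
    '{"@timestamp": 895873059, "clientip":"53.72.5.0", "status": 200, "size": 327}\n'
)

rows = parse_json_each_row(sample)
print(rows[0]["clientip"], rows[1]["size"])
print(to_json_each_row(rows))
```

Because every line is a complete object, a loader can process the file line by line without holding the whole document in memory.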
Storing JSON data in Strings
Mapping JSON to a blob with optional derived columns
{"@timestamp":"1998-05-22 21:37:39","clientip":"54.72.5.0",...} -> SQL Table with a JSON String column
JSON String (“blob”) with derived header values
Start by storing the JSON as a String
CREATE TABLE http_logs
(
`file` String,
`message` String -- the JSON “blob”
)
ENGINE = MergeTree
PARTITION BY file
ORDER BY tuple()
SETTINGS index_granularity = 8192
Load data whatever way is easiest...
head http_logs.csv
"file","message"
"documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""54.72.5.0"", ""request"": ""GET /images/home_bg_stars.gif HTTP/1.1"", ""status"": 200, ""size"": 2557}"
"documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""53.72.5.0"", ""request"": ""GET /images/home_tool.gif HTTP/1.0"", ""status"": 200, ""size"": 327}"
...
clickhouse-client --query \
'INSERT INTO http_logs Format CSVWithNames' \
< http_logs.csv
You can query using JSON* functions
-- Get a JSON string value
SELECT JSONExtractString(message, 'request') AS request
FROM http_logs LIMIT 3
-- Get a JSON numeric value
SELECT JSONExtractInt(message, 'status') AS status
FROM http_logs LIMIT 3
-- Use values to answer useful questions.
SELECT JSONExtractInt(message, 'status') AS status, count() AS count
FROM http_logs
WHERE status >= 400
AND toDateTime(JSONExtractUInt(message, '@timestamp')) BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
JSON* vs visitParam functions
-- Get using JSON* function: SLOWER, but a complete JSON parser
SELECT JSONExtractString(message, 'request')
FROM http_logs LIMIT 3
-- Get using visitParam function: FASTER, but cannot distinguish the same
-- name in different structures
SELECT visitParamExtractString(message, 'request')
FROM http_logs LIMIT 3
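The trade-off between the two function families can be illustrated with a toy Python comparison. This is a sketch under assumptions, not ClickHouse's code: `naive_extract_string` is a hypothetical text scanner that takes the first occurrence of a key at any nesting level and ignores escapes, while `json.loads` stands in for a complete parser. The sample document is made up.

```python
import json

def naive_extract_string(doc, key):
    """Return the string value after the first '"key":' in the text, or ''.

    Works on raw text only. It does not track nesting, so it cannot tell
    a top-level key from a nested key with the same name.
    """
    marker = '"' + key + '":'
    start = doc.find(marker)
    if start == -1:
        return ""
    pos = start + len(marker)
    if pos >= len(doc) or doc[pos] != '"':
        return ""
    end = doc.find('"', pos + 1)  # no escape handling in this sketch
    return doc[pos + 1:end] if end != -1 else ""

# Hypothetical document with the same key at two nesting levels.
doc = '{"meta":{"request":"inner"},"request":"GET /index.html"}'

fast_result = naive_extract_string(doc, "request")  # picks the nested value
full_result = json.loads(doc)["request"]            # picks the top-level value
print(fast_result, "|", full_result)
```

The scanner avoids building the whole object, which is where the speed comes from, but it silently returns the nested value here.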
We can improve usability by ordering data
CREATE TABLE http_logs_sorted (
`file` String,
`message` String,
timestamp DateTime DEFAULT
toDateTime(JSONExtractUInt(message, '@timestamp'))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
INSERT INTO http_logs_sorted
SELECT file, message FROM http_logs
And still further by adding more columns
ALTER TABLE http_logs_sorted
ADD COLUMN `status` Int16 DEFAULT JSONExtractInt(message,
'status') CODEC(ZSTD(1))
ALTER TABLE http_logs_sorted
ADD COLUMN `request` String DEFAULT
JSONExtractString(message, 'request')
-- Force columns to be materialized
ALTER TABLE http_logs_sorted
UPDATE status=status, request=request
WHERE 1
Our query is now simpler...
SELECT
status, count() as count
FROM http_logs_sorted WHERE status >= 400 AND
timestamp BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
And MUCH faster!
SELECT
status, count() as count
FROM http_logs_sorted WHERE status >= 400 AND
timestamp BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
0.014 seconds vs 9.8 seconds!
Can use primary key index to drop blocks
100x less I/O to read
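Why ordering by timestamp lets the engine drop blocks can be sketched with a toy sparse index in Python. This is a simplified model under assumptions, not ClickHouse's actual implementation: the names `build_sparse_index` and `granules_to_read` are hypothetical, and the granule size of 4 is chosen only to keep the example small (the CREATE TABLE earlier in this deck uses index_granularity = 8192).

```python
from bisect import bisect_left, bisect_right

GRANULE = 4  # toy granule size for illustration only

def build_sparse_index(sorted_keys, granule=GRANULE):
    """Keep only the first key of every granule (one 'mark' per granule)."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule)]

def granules_to_read(index, lo, hi):
    """Return the granule numbers that may hold keys in the range [lo, hi]."""
    # Start one granule before the first mark >= lo, because that granule
    # can still end with keys equal to lo (duplicates are common in logs).
    first = max(bisect_left(index, lo) - 1, 0)
    # The last candidate is the granule whose mark is the last one <= hi.
    last = bisect_right(index, hi) - 1
    return list(range(first, last + 1)) if last >= first else []

keys = list(range(100, 132))        # 32 sorted "timestamps"
index = build_sparse_index(keys)    # 8 marks: 100, 104, ..., 128
selected = granules_to_read(index, 110, 113)
print(selected, "of", len(index), "granules")
```

Only the selected granules need to be read from disk; every other granule is skipped without being decompressed, which is the source of the I/O savings.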
Using paired arrays and maps for JSON
Representing JSON as paired arrays and maps
{"@timestamp":"1998-05-22 21:37:39","clientip":"54.72.5.0",...}
Arrays: Header values with key-value pairs (SQL Table with Array of Keys and Array of Values)
Map: Header values with mapped key-value pairs (SQL Table with Map with Key/Values)
Storing JSON in paired arrays
CREATE TABLE http_logs_arrays (
`file` String,
`keys` Array(String),
`values` Array(String),
timestamp DateTime CODEC(Delta, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
Loading JSON to paired arrays
-- Load data. Might be better to format outside ClickHouse.
INSERT into http_logs_arrays(file, keys, values, timestamp)
SELECT file,
arrayMap(x -> x.1,
JSONExtractKeysAndValues(message, 'String')) keys,
arrayMap(x -> x.2,
JSONExtractKeysAndValues(message, 'String')) values,
toDateTime(JSONExtractUInt(message, '@timestamp'))
timestamp
FROM http_logs limit 30000000
Querying values in arrays
-- Run a query.
SELECT values[indexOf(keys, 'status')] status, count()
FROM http_logs_arrays
GROUP BY status ORDER BY status
status|count() |
------|--------|
200 |24917090|
206 | 64935|
302 | 1941|
304 | 4899616|
400 | 888|
404 | 115005|
500 | 525|
4-5x faster than accessing JSON string objects
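The paired-array layout and the values[indexOf(keys, ...)] lookup can be mimicked in a few lines of Python. This is a minimal sketch, assuming flat JSON objects; `to_paired_arrays` and `lookup` are hypothetical helper names, and returning an empty string for a missing key is a choice made for this sketch.

```python
import json
from collections import Counter

def to_paired_arrays(message):
    """Split a flat JSON object into parallel key and value lists.

    Values are stored as strings, mirroring the Array(String) columns.
    """
    obj = json.loads(message)
    keys = list(obj.keys())
    values = [v if isinstance(v, str) else json.dumps(v) for v in obj.values()]
    return keys, values

def lookup(keys, values, name):
    """Rough analogue of values[indexOf(keys, name)]; '' when the key is absent."""
    try:
        return values[keys.index(name)]
    except ValueError:
        return ""

messages = [
    '{"@timestamp": 895873059, "clientip":"54.72.5.0", "status": 200, "size": 2557}',
    '{"@timestamp": 895873059, "clientip":"53.72.5.0", "status": 200, "size": 327}',
    '{"@timestamp": 895873060, "clientip":"53.72.5.0", "status": 404, "size": 0}',
]
rows = [to_paired_arrays(m) for m in messages]

# Equivalent of: SELECT status, count() ... GROUP BY status
status_counts = Counter(lookup(k, v, "status") for k, v in rows)
print(sorted(status_counts.items()))
```

The JSON text is parsed once at load time; each query then does a position lookup in the key list instead of parsing again.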
Another way to store JSON objects: Maps
CREATE TABLE http_logs_map (
`file` String, `message` Map(String, String),
timestamp DateTime
DEFAULT toDateTime(toUInt32(message['@timestamp']))
CODEC(Delta, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
Loading and querying JSON in Maps
-- Load data
INSERT into http_logs_map(file, message)
SELECT file,
JSONExtractKeysAndValues(message, 'String') message
FROM http_logs
-- Run a query.
SELECT message['status'] status, count()
FROM http_logs_map
GROUP BY status ORDER BY status
4-5x faster than accessing JSON string objects
The JSON Data Type
New in 22.3
Mapping complex data to a JSON data type column
{Complex JSON} -> SQL Table with a JSON Data Type column
JSON data type (“blob”) with other column values
How did JSON work until now?
● Storing JSON using String datatypes
● 2 Parsers:
○ Simple parser
○ Full-fledged parser
● Two sets of functions, one per parser:
○ Family of simpleJSON* functions that only work for simple, non-nested JSON
■ visitParamExtractUInt = simpleJSONExtractUInt
○ Family of JSONExtract* functions that can parse any JSON object completely
■ JSONExtractUInt, JSONExtractString, JSONExtractArrayRaw …
Query Time!
How did JSON work until now?
WITH JSONExtract(json, 'Tuple(a UInt32, b UInt32, c Nested(d UInt32, e String))') AS parsed_json
SELECT
JSONExtractUInt(json, 'a') AS a,
JSONExtractUInt(json, 'b') AS b,
JSONExtractArrayRaw(json, 'c') AS array_c,
tupleElement(parsed_json, 'a') AS a_tuple,
tupleElement(parsed_json, 'b') AS b_tuple,
tupleElement(parsed_json, 'c') AS array_c_tuple,
tupleElement(tupleElement(parsed_json, 'c'), 'd') AS `c.d`,
tupleElement(tupleElement(parsed_json, 'c'), 'e') AS `c.e`
FROM ( SELECT '{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}, {"d":3,"e":"str_1"}, {"d":4,"e":"str_1"}, {"d":7,"e":"str_9"}]}' AS json )
FORMAT Vertical
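To see what the nested tupleElement calls are reaching for, the same document can be taken apart in Python. This is only an analogy for the shape of the result (one array per nested field), assuming the `c.d` and `c.e` columns come back as parallel arrays; it says nothing about how ClickHouse evaluates the query.

```python
import json

doc = ('{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}, '
       '{"d":3,"e":"str_1"}, {"d":4,"e":"str_1"}, {"d":7,"e":"str_9"}]}')

parsed = json.loads(doc)                    # parse once, like the WITH clause
a, b = parsed["a"], parsed["b"]             # top-level scalar fields
c_d = [item["d"] for item in parsed["c"]]   # analogue of the `c.d` column
c_e = [item["e"] for item in parsed["c"]]   # analogue of the `c.e` column

print(a, b)
print(c_d)
print(c_e)
```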
Let’s dive in!
How did JSON work until now?
1. Approach A: Using tuples
1.1. Parse the JSON with the JSONExtract function in a CTE (WITH clause) to get its structure as a tuple
1.2. Use the tupleElement function to extract values; nest the calls (tupleElement -> tupleElement) to get nested fields
2. Approach B: Direct
2.1. Use JSONExtractUInt/JSONExtractArrayRaw to extract the values directly
Both require multiple passes:
● Tuple approach = 2 passes (CTE + query)
● Direct approach = 3 passes: two ints (a and b) and an array (array_c)
New JSON
● ClickHouse parses JSON data at INSERT time.
● Automatic inference and creation of the underlying table structure
● JSON object stored in a columnar ClickHouse native format
● Named tuple and array notation to query JSON objects: array[x] | tuple.element
Ingestor -> Parsing -> Conversion -> Storage Layer
Raw JSON -> Extracted fields -> Columns with ClickHouse type definitions
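The pipeline above (parse at INSERT time, then store extracted fields as columns) can be sketched in Python. This is a toy model under assumptions, not the ClickHouse storage format: `flatten` and `insert_rows` are hypothetical names, nested objects become dotted paths, and missing values are filled with None where a real column would hold a type default.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested objects into dotted paths; arrays are kept as values."""
    flat = {}
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat

def insert_rows(columns, row_count, json_lines):
    """Parse each JSON line once and append its fields to per-path columns."""
    for line in json_lines:
        flat = flatten(json.loads(line))
        for path in flat:
            if path not in columns:
                # New key: create the column and backfill earlier rows.
                columns[path] = [None] * row_count
        for path, column in columns.items():
            column.append(flat.get(path))  # None when this row lacks the key
        row_count += 1
    return row_count

columns = {}
n = insert_rows(columns, 0, [
    '{"qid": "1", "user": {"name": "ann"}, "tag": ["sql"]}',
    '{"foo": "10", "bar": 10}',
])
print(n, sorted(columns))
```

A query that touches one path then reads one column instead of every JSON document. The second sample row also shows how a single unexpected document adds new columns for all rows.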
New JSON storage format
New JSON
SET allow_experimental_object_type = 1;
CREATE TABLE json_test.stack_overflow_js (`raw` JSON)
ENGINE = MergeTree ORDER BY tuple();
INSERT INTO stack_overflow_js
SELECT json
FROM file('stack_overflow_nested.json.gz', JSONAsObject);
SELECT count(*) FROM stack_overflow_js;
11203029 rows in set. Elapsed: 2.323 sec. Processed 11.20 million rows, 3.35 GB (4.82 million rows/s., 1.44 GB/s.)
New JSON useful settings
SET describe_extend_object_types = 1;
DESCRIBE TABLE stack_overflow_js;
--Basic structure
SET describe_include_subcolumns = 1;
DESCRIBE TABLE stack_overflow_js FORMAT Vertical;
--Columns included
SET output_format_json_named_tuples_as_objects = 1;
SELECT raw FROM stack_overflow_js LIMIT 1 FORMAT JSONEachRow;
--JSON full structure
New vs Old-school
stack_overflow_js vs stack_overflow_str:
CREATE TABLE nested_json.stack_overflow_js (`raw` JSON)
ENGINE = MergeTree ORDER BY tuple();
CREATE TABLE nested_json.stack_overflow_str (`raw` String)
ENGINE = MergeTree ORDER BY tuple();
● topK stack_overflow_str:
SELECT topK(100)(arrayJoin(JSONExtract(raw, 'tag','Array(String)')))
FROM stack_overflow_str;
1 rows in set. Elapsed: 2.101 sec. Processed 11.20 million rows, 3.73 GB (5.33 million rows/s., 1.77 GB/s.)
● topK stack_overflow_js:
SELECT topK(100)(arrayJoin(raw.tag)) FROM stack_overflow_js
1 rows in set. Elapsed: 0.331 sec. Processed 11.20 million rows, 642.07 MB (33.90 million rows/s., 1.94 GB/s.)
Limitations:
● What happens if there are schema changes?
○ Column type changes, new keys, deleted keys …
○ Insert a new JSON like this { "foo": "10", "bar": 10 }:
■ ClickHouse will create a new part for this JSON
■ ClickHouse will create a tuple structure: raw.foo and raw.bar
■ OPTIMIZE TABLE FINAL
● New mixed tuple = stack_overflow tuple + foobar tuple
● Problems:
○ No errors or warnings during insertions
○ Malformed JSON will pollute our data
○ We cannot select slices like raw.answers.*
○ ClickHouse creates a dynamic column per JSON key (our JSON has 1K keys, so 1K columns)
Check tuple structure:
INSERT INTO stack_overflow_js VALUES ('{ "bar": "hello", "foo": 1 }');
SELECT table,
column,
name AS part_name,
type,
subcolumns.names,
subcolumns.types
FROM system.parts_columns
WHERE table = 'stack_overflow_js'
FORMAT Vertical
Check tuple structure:
Row 1:
──────
table: stack_overflow_js
column: raw
part_name: all_12_22_5
type: Tuple(answers Nested(date String, user String), creationDate String, qid String, tag Array(String), title String, user String)
subcolumns.names: ['answers','answers.size0','answers.date','answers.user','creationDate','qid','tag','tag.size0','title','user']
subcolumns.types: ['Nested(date String, user String)','UInt64','Array(String)','Array(String)','String','String','Array(String)','UInt64','String','String']
subcolumns.serializations: ['Default','Default','Default','Default','Default','Default','Default','Default','Default','Default']
Row 2:
──────
table: stack_overflow_js
column: raw
part_name: all_23_23_0
type: Tuple(bar String, foo Int8)
subcolumns.names: ['bar','foo']
subcolumns.types: ['String','Int8']
subcolumns.serializations: ['Default','Default']
Improvements:
● CODEC Changes: LZ4 vs ZSTD
SELECT table, column,
formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed
FROM system.parts_columns
WHERE table IN ('stack_overflow_js', 'stack_overflow_str') AND column IN ('raw')
GROUP BY table, column
● ALTER TABLEs
ALTER TABLE stack_overflow_str MODIFY COLUMN raw CODEC(ZSTD(3));
ALTER TABLE stack_overflow_js MODIFY COLUMN raw CODEC(ZSTD(3));
table               column  LZ4       ZSTD        uncompressed
stack_overflow_str  raw     1.73 GiB  1.23 GiB    3.73 GiB
stack_overflow_js   raw     1.30 GiB  886.77 MiB  2.29 GiB
Improvements
● Query times: LZ4 vs ZSTD
○ LZ4
■ 0.3s New vs 2.1s Old
○ ZSTD
■ 0.4s New vs 2.8s Old
table               column  LZ4   ZSTD  comp.ratio
stack_overflow_js   raw     0.3s  0.4s  12%
stack_overflow_str  raw     2.1s  2.8s  10%
Wrap-up and References
Secrets to JSON happiness in ClickHouse
● Use JSON formats to read and write JSON data
● Fetch JSON String data with JSONExtract*/visitParam* functions
● Store JSON in paired arrays or maps
● (NEW) The new JSON data type stores data efficiently and offers convenient query syntax
○ It’s still experimental
More things to look at by yourself
● Using materialized views to populate JSON data
● Indexing JSON data
○ Indexes on JSON data type columns
○ Bloom filters on blobs
● More compression and codec tricks
Where to get more information
ClickHouse Docs: https://clickhouse.com/docs/
Altinity Knowledge Base: https://kb.altinity.com/
Altinity Blog: https://altinity.com
ClickHouse Source Code and Tests: https://github.com/ClickHouse/ClickHouse
● Especially tests
Thank you!
Questions?
https://altinity.com
Altinity.Cloud
Altinity Support
Altinity Stable Builds
We’re hiring!
Copyright © Altinity Inc 2022

Weitere ähnliche Inhalte

Was ist angesagt?

A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...Altinity Ltd
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesAltinity Ltd
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOAltinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides Altinity Ltd
 
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...Altinity Ltd
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Ltd
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Ltd
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Altinity Ltd
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesAltinity Ltd
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house queryCristinaMunteanu43
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Altinity Ltd
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseAltinity Ltd
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAltinity Ltd
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...Altinity Ltd
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouseAltinity Ltd
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxData
 

Was ist angesagt? (20)

A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1...
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 

Ähnlich wie All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINAL.pdf

Going Native: Leveraging the New JSON Native Datatype in Oracle 21c
Going Native: Leveraging the New JSON Native Datatype in Oracle 21cGoing Native: Leveraging the New JSON Native Datatype in Oracle 21c
Going Native: Leveraging the New JSON Native Datatype in Oracle 21cJim Czuprynski
 
IT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptxIT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptxAndrei Negruti
 
MongoDB for Analytics
MongoDB for AnalyticsMongoDB for Analytics
MongoDB for AnalyticsMongoDB
 
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015datastaxjp
 
Http4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackHttp4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackGaryCoady
 
Introduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraIntroduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraDataStax Academy
 
The rise of json in rdbms land jab17
The rise of json in rdbms land jab17The rise of json in rdbms land jab17
The rise of json in rdbms land jab17alikonweb
 
Leap Ahead with Redis 6.2
Leap Ahead with Redis 6.2Leap Ahead with Redis 6.2
Leap Ahead with Redis 6.2VMware Tanzu
 
How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6Maxime Beugnet
 
GeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxGeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxDatabricks
 
03 2017Emea_RoadshowMilan-WhatsNew-Mariadbserver10_2andmaxscale 2_1
03 2017Emea_RoadshowMilan-WhatsNew-Mariadbserver10_2andmaxscale 2_103 2017Emea_RoadshowMilan-WhatsNew-Mariadbserver10_2andmaxscale 2_1
03 2017Emea_RoadshowMilan-WhatsNew-Mariadbserver10_2andmaxscale 2_1mlraviol
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB
 
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)Dan Robinson
 
The Ring programming language version 1.9 book - Part 53 of 210
The Ring programming language version 1.9 book - Part 53 of 210The Ring programming language version 1.9 book - Part 53 of 210
The Ring programming language version 1.9 book - Part 53 of 210Mahmoud Samir Fayed
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsAlexander Rubin
 
BGOUG15: JSON support in MySQL 5.7
BGOUG15: JSON support in MySQL 5.7BGOUG15: JSON support in MySQL 5.7
BGOUG15: JSON support in MySQL 5.7Georgi Kodinov
 

Ähnlich wie All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINAL.pdf (20)

Java and xml
Java and xmlJava and xml
Java and xml
 
Going Native: Leveraging the New JSON Native Datatype in Oracle 21c
Going Native: Leveraging the New JSON Native Datatype in Oracle 21cGoing Native: Leveraging the New JSON Native Datatype in Oracle 21c
Going Native: Leveraging the New JSON Native Datatype in Oracle 21c
 
IT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptxIT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptx
 
MongoDB for Analytics
MongoDB for AnalyticsMongoDB for Analytics
MongoDB for Analytics
 
Codable routing
Codable routingCodable routing
Codable routing
 
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
 
Http4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackHttp4s, Doobie and Circe: The Functional Web Stack
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINAL.pdf

  • 1. © 2022 Altinity, Inc. All About JSON and ClickHouse Tips, Tricks, and New Features Robert Hodges and Diego Nieto 26 July 2022 1 Copyright © Altinity Inc 2022
  • 2. © 2022 Altinity, Inc. Let’s make some introductions ClickHouse support and services including Altinity.Cloud Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects Robert Hodges Database geek with 30+ years on DBMS systems. Day job: Altinity CEO Diego Nieto Database engineer focused on ClickHouse, PostgreSQL, and DBMS applications 2
  • 3. © 2022 Altinity, Inc. Reading and writing JSON - the basics 3
  • 4. © 2022 Altinity, Inc. JSON is pervasive as raw data head http_logs.json {"@timestamp": 895873059, "clientip":"54.72.5.0", "request": "GET /images/home_bg_stars.gif HTTP/1.1", "status": 200, "size": 2557} {"@timestamp": 895873059, "clientip":"53.72.5.0", "request": "GET /images/home_tool.gif HTTP/1.0", "status": 200, "size": 327} ... Web server log data
  • 5. © 2022 Altinity, Inc. Reading and writing JSON data to/from tables SQL Table Every key is a column {"@timestamp":" 1998-05-22 21:37:39","clienti p":"54.72.5.0",...} {"@timestamp":" 1998-05-22 21:37:39","clienti p":"54.72.5.0",...}
  • 6. © 2022 Altinity, Inc. Loading raw JSON using JSONEachRow input format CREATE TABLE http_logs_tabular ( `@timestamp` DateTime, `clientip` IPv4, `status` UInt16, `request` String, `size` UInt32 ) ENGINE = MergeTree PARTITION BY toStartOfDay(`@timestamp`) ORDER BY `@timestamp` clickhouse-client --query 'INSERT INTO http_logs_tabular FORMAT JSONEachRow' < http_logs.json
  • 7. © 2022 Altinity, Inc. Writing JSON using JSONEachRow output format SELECT * FROM http_logs_tabular LIMIT 2 FORMAT JSONEachRow {"@timestamp":"1998-05-22 21:37:39","clientip":"54.72.5.0","status":200,"request":"GET /images/home_bg_stars.gif HTTP/1.1","size":2557} {"@timestamp":"1998-05-22 21:37:39","clientip":"53.72.5.0","status":200,"request":"GET /images/home_tool.gif HTTP/1.0","size":327}
  • 8. © 2022 Altinity, Inc. Storing JSON data in Strings 8
  • 9. © 2022 Altinity, Inc. Mapping JSON to a blob with optional derived columns {"@timestamp":" 1998-05-22 21:37:39","clienti p":"54.72.5.0",...} SQL Table JSON String JSON String (“blob”) with derived header values
  • 10. © 2022 Altinity, Inc. Start by storing the JSON as a String CREATE TABLE http_logs ( `file` String, `message` String ) ENGINE = MergeTree PARTITION BY file ORDER BY tuple() SETTINGS index_granularity = 8192 “Blob”
  • 11. © 2022 Altinity, Inc. Load data whatever way is easiest... head http_logs.csv "file","message" "documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""54.72.5.0"", ""request"": ""GET /images/home_bg_stars.gif HTTP/1.1"", ""status"": 200, ""size"": 2557}" "documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""53.72.5.0"", ""request"": ""GET /images/home_tool.gif HTTP/1.0"", ""status"": 200, ""size"": 327}" ... clickhouse-client --query 'INSERT INTO http_logs Format CSVWithNames' < http_logs.csv
  • 12. © 2022 Altinity, Inc. You can query using JSON* functions -- Get a JSON string value SELECT JSONExtractString(message, 'request') AS request FROM http_logs LIMIT 3 -- Get a JSON numeric value SELECT JSONExtractInt(message, 'status') AS status FROM http_logs LIMIT 3 -- Use values to answer useful questions. SELECT JSONExtractInt(message, 'status') AS status, count() as count FROM http_logs WHERE status >= 400 AND toDateTime(JSONExtractUInt(message, '@timestamp')) BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59' GROUP BY status ORDER BY status
  • 13. © 2022 Altinity, Inc. JSON* vs visitParam functions -- Get using a JSON* function (SLOWER: complete JSON parser) SELECT JSONExtractString(message, 'request') FROM http_logs LIMIT 3 -- Get using a visitParam function (FASTER, but cannot distinguish the same name in different structures) SELECT visitParamExtractString(message, 'request') FROM http_logs LIMIT 3
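The trade-off on this slide can be modelled outside ClickHouse. The sketch below is plain Python, not ClickHouse code: `full_extract` stands in for the JSON* family (a complete parse), and `simple_extract` stands in for the visitParam* family, which is assumed here to scan the text for the first occurrence of the quoted key. Both function names are hypothetical.

```python
import json
import re

def full_extract(doc: str, key: str):
    """Parse the whole document, then read a top-level key."""
    return json.loads(doc).get(key)

def simple_extract(doc: str, key: str):
    """Scan for the first "key":"value" pair anywhere in the text, ignoring nesting."""
    match = re.search(r'"' + re.escape(key) + r'"\s*:\s*"([^"]*)"', doc)
    return match.group(1) if match else None

# The same key name appears at two nesting levels.
doc = '{"user": {"name": "inner"}, "name": "outer"}'
top_level = full_extract(doc, "name")      # value of the top-level key
first_match = simple_extract(doc, "name")  # value of the first textual match
print(top_level, first_match)
```

The scanner never builds a parse tree, which is why it can be cheaper, and also why it returns the nested value when a key name repeats.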
  • 14. © 2022 Altinity, Inc. We can improve usability by ordering data CREATE TABLE http_logs_sorted ( `file` String, `message` String, timestamp DateTime DEFAULT toDateTime(JSONExtractUInt(message, '@timestamp')) ) ENGINE = MergeTree PARTITION BY toStartOfMonth(timestamp) ORDER BY timestamp INSERT INTO http_logs_sorted SELECT file, message FROM http_logs 14
  • 15. © 2022 Altinity, Inc. And still further by adding more columns ALTER TABLE http_logs_sorted ADD COLUMN `status` Int16 DEFAULT JSONExtractInt(message, 'status') CODEC(ZSTD(1)) ALTER TABLE http_logs_sorted ADD COLUMN `request` String DEFAULT JSONExtractString(message, 'request') -- Force columns to be materialized ALTER TABLE http_logs_sorted UPDATE status=status, request=request WHERE 1 15
  • 16. © 2022 Altinity, Inc. Our query is now simpler... SELECT status, count() as count FROM http_logs_sorted WHERE status >= 400 AND timestamp BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59' GROUP BY status ORDER BY status 16
  • 17. © 2022 Altinity, Inc. And MUCH faster! SELECT status, count() as count FROM http_logs_sorted WHERE status >= 400 AND timestamp BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59' GROUP BY status ORDER BY status 0.014 seconds vs 9.8 seconds! Can use primary key index to drop blocks 100x less I/O to read 17
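Why sorting by timestamp lets the engine skip most blocks can be sketched in a few lines. This is a plain-Python model under stated assumptions, not ClickHouse internals: the index is assumed to keep only the first key of each fixed-size block of rows, and `build_sparse_index`, `granules_to_read`, and the block size of 10 are illustrative names and values.

```python
from bisect import bisect_left, bisect_right

def build_sparse_index(sorted_keys, granule_size):
    """Keep only the first key of every block (granule) of rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_size)]

def granules_to_read(index, lo, hi):
    """Return the granule numbers that may hold keys in the range [lo, hi]."""
    # Step back one granule from the first mark >= lo: the range may start inside it.
    first = max(bisect_left(index, lo) - 1, 0)
    # Granules whose first key is greater than hi cannot match.
    last = bisect_right(index, hi)
    return list(range(first, last))

timestamps = list(range(100))                 # already sorted, like ORDER BY timestamp
index = build_sparse_index(timestamps, 10)    # 10 marks instead of 100 keys
needed = granules_to_read(index, 35, 47)      # only these blocks are read
print(needed)
```

Without the sort order, matching rows could sit in any block, so every block would have to be read and parsed.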
  • 18. © 2022 Altinity, Inc. Using paired arrays and maps for JSON 18
  • 19. © 2022 Altinity, Inc. Representing JSON as paired arrays and maps {"@timestamp":" 1998-05-22 21:37:39","clienti p":"54.72.5.0",...} SQL Table Array of Keys Arrays: Header values with key-value pairs Array of Values SQL Table Map with Key/Values Map: Header values with mapped key value pairs
  • 20. © 2022 Altinity, Inc. Storing JSON in paired arrays CREATE TABLE http_logs_arrays ( `file` String, `keys` Array(String), `values` Array(String), timestamp DateTime CODEC(Delta, ZSTD(1)) ) ENGINE = MergeTree PARTITION BY toStartOfMonth(timestamp) ORDER BY timestamp 20
  • 21. © 2022 Altinity, Inc. Loading JSON to paired arrays -- Load data. Might be better to format outside ClickHouse. INSERT into http_logs_arrays(file, keys, values, timestamp) SELECT file, arrayMap(x -> x.1, JSONExtractKeysAndValues(message, 'String')) keys, arrayMap(x -> x.2, JSONExtractKeysAndValues(message, 'String')) values, toDateTime(JSONExtractUInt(message, '@timestamp')) timestamp FROM http_logs limit 30000000 21
  • 22. © 2022 Altinity, Inc. Querying values in arrays -- Run a query. SELECT values[indexOf(keys, 'status')] status, count() FROM http_logs_arrays GROUP BY status ORDER BY status Result (status: count()): 200: 24917090; 206: 64935; 302: 1941; 304: 4899616; 400: 888; 404: 115005; 500: 525. 4-5x faster than accessing JSON string objects 22
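The `values[indexOf(keys, 'status')]` lookup can be modelled in plain Python. This is a sketch, not ClickHouse code: `index_of` mirrors the 1-based `indexOf` that returns 0 when the key is absent, and a position of 0 is assumed here to yield an empty string. The sample rows are made up for illustration.

```python
from collections import Counter

def index_of(keys, wanted):
    """1-based position of wanted in keys, or 0 when it is absent."""
    try:
        return keys.index(wanted) + 1
    except ValueError:
        return 0

def lookup(keys, values, wanted):
    """Read the value stored at the same position as the key."""
    pos = index_of(keys, wanted)
    return values[pos - 1] if pos else ""  # assumption: missing keys read as ""

# Each row carries two parallel arrays; key order may differ between rows.
rows = [
    (["status", "size"], ["200", "2557"]),
    (["status", "size"], ["404", "0"]),
    (["size", "status"], ["327", "200"]),
]
counts = Counter(lookup(keys, values, "status") for keys, values in rows)
print(sorted(counts.items()))
```

The lookup is a linear scan of a short array per row, with no JSON parsing at query time.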
  • 23. © 2022 Altinity, Inc. Another way to store JSON objects: Maps CREATE TABLE http_logs_map ( `file` String, `message` Map(String, String), timestamp DateTime DEFAULT toDateTime(toUInt32(message['@timestamp'])) CODEC(Delta, ZSTD(1)) ) ENGINE = MergeTree PARTITION BY toStartOfMonth(timestamp) ORDER BY timestamp 23
  • 24. © 2022 Altinity, Inc. Loading and querying JSON in Maps -- Load data INSERT into http_logs_map(file, message) SELECT file, JSONExtractKeysAndValues(message, 'String') message FROM http_logs -- Run a query. SELECT message['status'] status, count() FROM http_logs_map GROUP BY status ORDER BY status 4-5x faster than accessing JSON string objects 24
  • 25. © 2022 Altinity, Inc. The JSON Data Type 25 New in 22.3
  • 26. © 2022 Altinity, Inc. Mapping complex data to a JSON data type column {Complex JSON} SQL Table JSON Data Type JSON data type (“blob”) with other column values
  • 27. © 2022 Altinity, Inc. How did JSON work until now? ● Storing JSON using String datatypes ● 2 parsers: ○ Simple parser ○ Full-fledged parser ● One set of functions for each parser: ○ Family of simpleJSON* functions that only work for simple, non-nested JSON ■ visitParamExtractUInt = simpleJSONExtractUInt ○ Family of JSONExtract* functions that can parse any JSON object completely ■ JSONExtractUInt, JSONExtractString, JSONExtractArrayRaw … ● Both parse the JSON at query time! 27
  • 28. © 2022 Altinity, Inc. How did JSON work until now? WITH JSONExtract(json, 'Tuple(a UInt32, b UInt32, c Nested(d UInt32, e String))') AS parsed_json SELECT JSONExtractUInt(json, 'a') AS a, JSONExtractUInt(json, 'b') AS b, JSONExtractArrayRaw(json, 'c') AS array_c, tupleElement(parsed_json, 'a') AS a_tuple, tupleElement(parsed_json, 'b') AS b_tuple, tupleElement(parsed_json, 'c') AS array_c_tuple, tupleElement(tupleElement(parsed_json, 'c'), 'd') AS `c.d`, tupleElement(tupleElement(parsed_json, 'c'), 'e') AS `c.e` FROM ( SELECT '{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}, {"d":3,"e":"str_1"}, {"d":4,"e":"str_1"}, {"d":7,"e":"str_9"}]}' AS json ) FORMAT Vertical 28 Let’s dive in!
  • 29. © 2022 Altinity, Inc. How did JSON work until now? 1. Approach A: Using tuples 1.1. Parse the JSON with the JSONExtract function in a CTE (WITH clause) to get a named tuple that mirrors its structure 1.2. Use the tupleElement function to extract fields from the tuple; nest the calls (tupleElement of tupleElement) to reach nested fields 2. Approach B: Direct 2.1. Use JSONExtractUInt/JSONExtractArrayRaw to extract the values directly Both require multiple passes: ● Tuple approach = 2 passes (CTE + query) ● Direct approach = 3 passes: two ints (a and b) and an array (array_c). 29
  • 30. © 2022 Altinity, Inc. New JSON ● ClickHouse parses JSON data at INSERT time. ● Automatic inference and creation of the underlying table structure ● JSON object stored in a columnar ClickHouse native format ● Named tuple and array notation to query JSON objects: array[x] | tuple.element ● Pipeline: Ingestor → raw JSON → Parsing → extracted fields → Conversion → columns with ClickHouse type definitions → Storage Layer 30
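The idea of parsing once at INSERT time and storing each JSON path as its own column can be sketched in plain Python. This is a simplified model, not the ClickHouse storage format: `flatten` and `insert_rows` are hypothetical names, nested objects become dotted paths, and a missing path is stored as `None`.

```python
import json

def flatten(obj, prefix=""):
    """Turn nested objects into dotted paths: {"a": {"b": 1}} -> {"a.b": 1}."""
    out = {}
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out

def insert_rows(raw_rows):
    """Parse every row once and store the values column by column."""
    parsed = [flatten(json.loads(raw)) for raw in raw_rows]
    paths = sorted({path for row in parsed for path in row})
    return {path: [row.get(path) for row in parsed] for path in paths}

columns = insert_rows([
    '{"qid": "1", "user": {"name": "ann"}, "tag": ["sql"]}',
    '{"qid": "2", "user": {"name": "bob"}, "tag": []}',
])
# A query on one path reads a single column and does no JSON parsing.
print(columns["tag"])
```

A query that touches only `tag` reads one column, which is consistent with the much smaller volume of data read on the comparison slide.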
  • 31. © 2022 Altinity, Inc. New JSON storage format 31
  • 32. © 2022 Altinity, Inc. New JSON SET allow_experimental_object_type = 1; CREATE TABLE json_test.stack_overflow_js (`raw` JSON) ENGINE = MergeTree ORDER BY tuple(); INSERT INTO stack_overflow_js SELECT json FROM file('stack_overflow_nested.json.gz', JSONAsObject); SELECT count(*) FROM stack_overflow_js; 11203029 rows in set. Elapsed: 2.323 sec. Processed 11.20 million rows, 3.35 GB (4.82 million rows/s., 1.44 GB/s.) 32
  • 33. New JSON useful settings
      SET describe_extend_object_types = 1;
      DESCRIBE TABLE stack_overflow_js; -- Basic structure
      SET describe_include_subcolumns = 1;
      DESCRIBE TABLE stack_overflow_js FORMAT Vertical; -- Columns included
      SET output_format_json_named_tuples_as_objects = 1;
      SELECT raw FROM stack_overflow_js LIMIT 1 FORMAT JSONEachRow; -- JSON full structure
  • 34. New vs Old-school
      stack_overflow_js vs stack_overflow_str:
      CREATE TABLE nested_json.stack_overflow_js (`raw` JSON) ENGINE = MergeTree ORDER BY tuple();
      CREATE TABLE nested_json.stack_overflow_str (`raw` String) ENGINE = MergeTree ORDER BY tuple();
      ● topK stack_overflow_str:
        SELECT topK(100)(arrayJoin(JSONExtract(raw, 'tag', 'Array(String)'))) FROM stack_overflow_str;
        1 rows in set. Elapsed: 2.101 sec. Processed 11.20 million rows, 3.73 GB (5.33 million rows/s., 1.77 GB/s.)
      ● topK stack_overflow_js:
        SELECT topK(100)(arrayJoin(raw.tag)) FROM stack_overflow_js;
        1 rows in set. Elapsed: 0.331 sec. Processed 11.20 million rows, 642.07 MB (33.90 million rows/s., 1.94 GB/s.)
  • 35. Limitations
      ● What happens if there are schema changes?
        ○ Column type changes, new keys, deleted keys ...
        ○ Insert a new JSON like this: { "foo": "10", "bar": 10 }
          ■ ClickHouse creates a new part for this JSON
          ■ ClickHouse creates a tuple structure for it: raw.foo and raw.bar
          ■ After OPTIMIZE TABLE FINAL, the parts are merged:
            ● New mixed tuple = stack_overflow tuple + foobar tuple
      ● Problems:
        ○ No errors or warnings during insertions
        ○ Malformed JSON will pollute our data
        ○ We cannot select slices like raw.answers.*
        ○ ClickHouse creates a dynamic column per JSON key (our JSON has 1K keys, so 1K columns)
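The "new mixed tuple" can be pictured as a union of the keys seen in each part. The Python sketch below is a toy model of that idea, under stated assumptions: the function `merge_tuples` is hypothetical, the top-level key list is abbreviated, and the fallback to String when two parts disagree on a type is a guess for illustration, not documented ClickHouse behavior.

```python
def merge_tuples(left: dict, right: dict) -> dict:
    """Toy merge: union of keys, merging nested structures recursively."""
    merged = dict(left)
    for key, typ in right.items():
        if key not in merged:
            merged[key] = typ
        elif isinstance(merged[key], dict) and isinstance(typ, dict):
            merged[key] = merge_tuples(merged[key], typ)
        elif merged[key] != typ:
            # Assumption for this sketch: conflicting types fall back to String.
            merged[key] = "String"
    return merged


# Abbreviated structure of the original data (nested answers omitted).
stack_overflow = {
    "creationDate": "String",
    "qid": "String",
    "tag": "Array(String)",
    "title": "String",
    "user": "String",
}
# Structure inferred from { "foo": "10", "bar": 10 }.
foobar = {"foo": "String", "bar": "Int8"}

mixed = merge_tuples(stack_overflow, foobar)
print(sorted(mixed))
```

Every row now carries all seven keys, so one stray document widens the schema for the whole table. That is why the absence of errors or warnings at insert time is listed as a problem.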
  • 36. Check tuple structure:
      INSERT INTO stack_overflow_js VALUES ('{ "bar": "hello", "foo": 1 }');
      SELECT
          table,
          column,
          name AS part_name,
          type,
          subcolumns.names,
          subcolumns.types
      FROM system.parts_columns
      WHERE table = 'stack_overflow_js'
      FORMAT Vertical
  • 37. Check tuple structure:
      Row 1:
      ──────
      table:                     stack_overflow_js
      column:                    raw
      part_name:                 all_12_22_5
      type:                      Tuple(answers Nested(date String, user String), creationDate String, qid String, tag Array(String), title String, user String)
      subcolumns.names:          ['answers','answers.size0','answers.date','answers.user','creationDate','qid','tag','tag.size0','title','user']
      subcolumns.types:          ['Nested(date String, user String)','UInt64','Array(String)','Array(String)','String','String','Array(String)','UInt64','String','String']
      subcolumns.serializations: ['Default','Default','Default','Default','Default','Default','Default','Default','Default','Default']

      Row 2:
      ──────
      table:                     stack_overflow_js
      column:                    raw
      part_name:                 all_23_23_0
      type:                      Tuple(bar String, foo Int8)
      subcolumns.names:          ['bar','foo']
      subcolumns.types:          ['String','Int8']
      subcolumns.serializations: ['Default','Default']
  • 38. Improvements:
      ● CODEC changes: LZ4 vs ZSTD
        SELECT
            table,
            column,
            formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
            formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed
        FROM system.parts_columns
        WHERE table IN ('stack_overflow_js', 'stack_overflow_str') AND column IN ('raw')
        GROUP BY table, column
      ● ALTER TABLEs
        ALTER TABLE stack_overflow_str MODIFY COLUMN raw CODEC(ZSTD(3));
        ALTER TABLE stack_overflow_js MODIFY COLUMN raw CODEC(ZSTD(3));
      table               column  LZ4       ZSTD        uncompressed
      stack_overflow_str  raw     1.73 GiB  1.23 GiB    3.73 GiB
      stack_overflow_js   raw     1.30 GiB  886.77 MiB  2.29 GiB
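The sizes in the table translate into compression ratios with a little arithmetic. The Python sketch below works them out from the figures on the slide. It assumes the ZSTD size of the JSON-typed table is 886.77 MiB (a GiB value would exceed the uncompressed size) and that the second row refers to stack_overflow_js, the table name used in the queries.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Sizes as reported on the slide, converted to bytes.
sizes = {
    "stack_overflow_str": {"lz4": 1.73 * GIB, "zstd": 1.23 * GIB, "raw": 3.73 * GIB},
    "stack_overflow_js": {"lz4": 1.30 * GIB, "zstd": 886.77 * MIB, "raw": 2.29 * GIB},
}

report = {}
for table, s in sizes.items():
    report[table] = {
        # Compression ratio = uncompressed bytes / compressed bytes.
        "lz4_ratio": s["raw"] / s["lz4"],
        "zstd_ratio": s["raw"] / s["zstd"],
        # Extra space saved by switching the codec from LZ4 to ZSTD(3).
        "zstd_saving_pct": 100 * (1 - s["zstd"] / s["lz4"]),
    }

for table, r in report.items():
    print(f"{table}: LZ4 {r['lz4_ratio']:.2f}x, ZSTD {r['zstd_ratio']:.2f}x, "
          f"ZSTD saves {r['zstd_saving_pct']:.1f}% over LZ4")
```

With these inputs, ZSTD(3) shrinks both columns by roughly 30% relative to LZ4, and the JSON-typed column is smaller than the String column under either codec.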
  • 39. Improvements
      ● Query times: LZ4 vs ZSTD
        ○ LZ4
          ■ 0.3s New vs 2.1s Old
        ○ ZSTD
          ■ 0.4s New vs 2.8s Old
      table               column  LZ4   ZSTD  comp.ratio
      stack_overflow_str  raw     2.1s  2.8s  12%
      stack_overflow_js   raw     0.3s  0.4s  10%
  • 40. Wrap-up and References
  • 41. Secrets to JSON happiness in ClickHouse
      ● Use JSON formats to read and write JSON data
      ● Fetch JSON String data with JSONExtract*/visitParam* functions
      ● Store JSON in paired arrays or maps
      ● (NEW) The new JSON data type stores data efficiently and offers convenient query syntax
        ○ It’s still experimental
  • 42. More things to look at by yourself
      ● Using materialized views to populate JSON data
      ● Indexing JSON data
        ○ Indexes on JSON data type columns
        ○ Bloom filters on blobs
      ● More compression and codec tricks
  • 43. Where to get more information
      ClickHouse Docs: https://clickhouse.com/docs/
      Altinity Knowledge Base: https://kb.altinity.com/
      Altinity Blog: https://altinity.com
      ClickHouse Source Code and Tests: https://github.com/ClickHouse/ClickHouse
      ● Especially tests
  • 44. Thank you! Questions?
      https://altinity.com
      Altinity.Cloud, Altinity Support, Altinity Stable Builds
      We’re hiring!