Analytic Platforms Should Be Columnar Orientation
1. Slide 1
Unlock Potential
William McKnight
President
McKnight Consulting Group
www.mcknightcg.com
@williammcknight
Analytic Databases Should be Columnar
2. Copyright © 2021 McKnight Consulting Group Slide 2
William McKnight
President, McKnight Consulting Group
Consulted to Pfizer, Scotiabank, Fidelity, TD
Ameritrade, Teva Pharmaceuticals, Verizon, and
many other Global 1000 companies
Frequent keynote speaker and trainer internationally
Hundreds of articles, blogs and white papers in
publication
Focused on delivering business value and solving
business problems utilizing proven, streamlined
approaches to information management
Former Database Engineer, Fortune 50 Information
Technology executive and Ernst & Young
Entrepreneur of the Year Finalist
Owner/consultant: Data strategy and implementation
consulting firm
Book: Information Management: Strategies for Gaining a Competitive Advantage with Data (The Savvy Manager’s Guide), William McKnight
4. Slide 4
RDBMS Design over the years
RDBMS design is virtually unchanged, except for parallelism
Hardware, however:
§ Storage capacity has increased tremendously (and got far
cheaper)
§ CPU performance has improved
§ Transfer rates and seek times have improved only modestly
5. Slide 5
Row-Wise DBMS Stores Data in Rows
CustomerID CompanyName ContactFirstName ContactLastName ContactTitle PhoneNumber
1119 m4ii dhamotharan achaiyan solutions architect 91222507176
1120 Aris Doug Johnson Practice Director 206-676-5636
1121 Stolt Offshore MS Ltd Craig Lennox Mr +66 1226 712519
1122 Medtronic, Inc. Mark Kohls Principle Database Administrator 763.516.2557
1123 Beckman Coulter Tim Parsons Business Systems Manager +61 22 996 0963
1124 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
1126 The Boeing Company Mike Roberts Senior Business Process Architect (206)655-7155
1127 IT/1 Consulting Leif B. Soerensen Data Warehouse Consultant +65 26236691
1128 Banco de Bogotá JOSE ALFREDO LOPEZ ARIAS Administrador DWH 5713320032
1133 The HArtford Jimmy Chen Business System Analyst 215-653-2662
1134 CGI Group Terry Petherick Senior Consultant 613-236-2155
1135 Metavante Corporation Ron Kundinger Assistant Vice President 616-577-9227
1138 CP Associates Wilson Mak Consultant 252-92593731
1142 PRSB Ming Long Wu Assistant Administrator 226-2-23931261 ext 719
1143 aft greg tanner cto 303.233.6122
1144 Zamba Solutions Jeff McCall Executive Vice President 602-626-6125
1146 MR Consultancy Mukesh Rughani Mr +66 (0)1379 662219
1147 Intellor Group Robin Martin Project Coordinator 301-202-6766
1148 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
6. Slide 6
L2 Cache Misses
[Diagram: memory hierarchy from the CPU through L1 cache, L2 cache, and memory to storage]
7. Slide 7
Data Block Layout
© McKnight Consulting Group, 2010
Page Header
Records:
1120 Aris Doug Johnson Practice Director 206-676-5636 doug.johnson@aris.com
1121 Stolt Offshore MS Ltd Craig Lennox Mr +66 1226 71269 craig.lennox@stoltoffshore.com
1122 Medtronic, Inc. Mark Kohls Principle Database Administrator 763.516.2557 mark.kohls@medtronic.com
Row IDs
Page Footer
8. Slide 8
Columnar Data Block Layout
Block Header
Records:
1120
1121
1122
1123
1124
1125
…
Page Footer
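The difference between the two block layouts can be sketched in a few lines. This is a minimal illustration, assuming three records taken from the row-wise example; the names `rows`, `column_names`, and `to_columns` are hypothetical and do not come from any particular DBMS.

```python
# Three records from the row-wise example (CustomerID, CompanyName, ContactTitle).
rows = [
    (1120, "Aris", "Practice Director"),
    (1121, "Stolt Offshore MS Ltd", "Mr"),
    (1122, "Medtronic, Inc.", "Principle Database Administrator"),
]
column_names = ("CustomerID", "CompanyName", "ContactTitle")


def to_columns(rows, column_names):
    """Pivot row-wise records into one contiguous list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


columns = to_columns(rows, column_names)

# Row layout: reading one attribute still walks every full record.
ids_from_rows = [row[0] for row in rows]
# Columnar layout: the attribute is already stored as one contiguous run.
ids_from_columns = columns["CustomerID"]
print(ids_from_columns)
```

Both reads return the same values; the point is only that the columnar read touches none of the other attributes.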
9. Slide 9
Traditional databases
Date Store # State Class Sales Category
3/1/21 32 NY A 6 Gen
3/1/21 35 CT A 9 Spec
3/1/21 36 CT C 11 Gen
3/1/21 39 SD D 8 Gen
3/1/21 42 KY A 5 Spec
3/1/21 43 VT C 14 Spec
3/1/21 47 GA A 31 Gen
3/1/21 51 MD A 4 Sub
3/1/21 55 DC D 16 Gen
3/1/21 59 NY B 7 Gen
3/1/21 62 NJ C 9 Spec
Query: calculate the average sales for the class “A” stores in “NY”
Traditional approach:
• Data is stored by row in data blocks (4K … 32K)
• For queries, select a ‘filter’:
- Build a B-tree index for the filters
- BUT if the filter is not selective enough, scan the table
- Go to the selected blocks and add up the sales numbers
- Randomly distributed data will result in most blocks being read
- Irrelevant data in each block still has to be read
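The cost of reading irrelevant data can be made concrete with the eleven rows above. The following is a rough sketch, not a model of any real storage engine: it counts "values read" as a stand-in for I/O, and the function names are made up for illustration. The column scan pulls its lists out of the row tuples only to keep the example short; a real column store would already hold them separately.

```python
# The 11 rows from the slide: (date, store, state, class, sales, category).
table = [
    ("3/1/21", 32, "NY", "A", 6, "Gen"),
    ("3/1/21", 35, "CT", "A", 9, "Spec"),
    ("3/1/21", 36, "CT", "C", 11, "Gen"),
    ("3/1/21", 39, "SD", "D", 8, "Gen"),
    ("3/1/21", 42, "KY", "A", 5, "Spec"),
    ("3/1/21", 43, "VT", "C", 14, "Spec"),
    ("3/1/21", 47, "GA", "A", 31, "Gen"),
    ("3/1/21", 51, "MD", "A", 4, "Sub"),
    ("3/1/21", 55, "DC", "D", 16, "Gen"),
    ("3/1/21", 59, "NY", "B", 7, "Gen"),
    ("3/1/21", 62, "NJ", "C", 9, "Spec"),
]


def average_row_scan(table):
    """Table scan over rows: every column of every row is read."""
    values_read = sum(len(row) for row in table)
    sales = [row[4] for row in table if row[2] == "NY" and row[3] == "A"]
    return sum(sales) / len(sales), values_read


def average_column_scan(table):
    """Scan only the three columns the query references."""
    state = [row[2] for row in table]
    store_class = [row[3] for row in table]
    sales = [row[4] for row in table]
    values_read = len(state) + len(store_class) + len(sales)
    hits = [s for st, c, s in zip(state, store_class, sales) if st == "NY" and c == "A"]
    return sum(hits) / len(hits), values_read


row_avg, row_values_read = average_row_scan(table)
col_avg, col_values_read = average_column_scan(table)
print(row_avg, row_values_read)
print(col_avg, col_values_read)
```

Only store 32 is both class A and in NY, so both scans return the same average while the column scan reads half as many values on this six-column table.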
10. Slide 10
Mixing Columns in Containers
11. Slide 11
Vertical Partitioning of Data
Columnar: columns are stored independently
Date Store # State Class Sales Category
3/1/13 32 NY A 6 Gen
3/1/13 35 CT A 9 Spec
3/1/13 36 CT C 11 Gen
3/1/13 39 SD D 8 Gen
3/1/13 42 KY A 5 Spec
3/1/13 43 VT C 14 Spec
3/1/13 47 GA A 31 Gen
3/1/13 51 MD A 4 Sub
3/1/13 55 DC D 16 Gen
3/1/13 59 NY B 7 Gen
3/1/13 62 NJ C 9 Spec
Benefits:
• Consistent data types are easy to compress
• Resulting storage size is typically less than 50% of the
size of the raw data
12. Slide 12
Columnar Compression
• Positional Representation
• Run-Length Encoding
• Dictionary Encoding
• Delta from Median
• NULL suppression, and trimming of leading or trailing zeros or blanks
• UTF8 Compression
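As one illustration from the list, "delta from median" can be read as storing the column's median once and then each value as a small offset from it. The sketch below follows that reading, which is an assumption: the slides do not define the scheme, and real products differ in detail. The sample values are the Sales column from the earlier store table, and the function names are hypothetical.

```python
import statistics


def delta_from_median_encode(values):
    """Store the median once, plus each value's offset from it."""
    # median_low always returns an actual member of the data, so offsets stay integers.
    median = statistics.median_low(values)
    return median, [v - median for v in values]


def delta_from_median_decode(median, deltas):
    """Rebuild the original values from the median and the offsets."""
    return [median + d for d in deltas]


sales = [6, 9, 11, 8, 5, 14, 31, 4, 16, 7, 9]
median, deltas = delta_from_median_encode(sales)
print(median, deltas)
```

The offsets cluster around zero, so they can usually be stored in fewer bits than the original values.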
13. Slide 13
Run-Length
Uncompressed columns:
Qtr Store# Sales
Q1 32 6
Q1 35 9
Q1 36 11
Q1 39 8
Q1 42 5
Q1 43 14
Q2 32 31
Q2 35 4
Q2 36 16
Q2 39 7
Q2 42 9
Run-length encoded Qtr column:
Q1 1 500
Q2 501 999
Q3 1000 1498
(Value, StartPosition, Count)
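Run-length encoding in the (Value, StartPosition, Count) form named on the slide can be sketched as follows. This is a minimal illustration with hypothetical function names, applied only to the eleven Qtr values listed on the slide. The encoded figures on the slide appear to come from a larger column, and their third number reads more like an end position than a count, so the triples below will not match them.

```python
def rle_encode(values):
    """Encode a column as (value, start_position, count) triples; positions are 1-based."""
    runs = []
    for position, value in enumerate(values, start=1):
        if runs and runs[-1][0] == value:
            # Same value as the current run: extend its count.
            run_value, start, count = runs[-1]
            runs[-1] = (run_value, start, count + 1)
        else:
            # Value changed: open a new run at this position.
            runs.append((value, position, 1))
    return runs


def rle_decode(runs):
    """Expand (value, start_position, count) triples back into the full column."""
    column = []
    for value, _start, count in runs:
        column.extend([value] * count)
    return column


qtr = ["Q1"] * 6 + ["Q2"] * 5  # the Qtr values listed on the slide
runs = rle_encode(qtr)
print(runs)  # [('Q1', 1, 6), ('Q2', 7, 5)]
```

The encoding pays off when the column is sorted or clustered so that equal values sit next to each other, as Qtr does here.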
14. Slide 14
Dictionary Encoding Example
Original data value | Orig. size* (bytes) | Compressed value | New size (bytes)
England | 30 | 0 | 1
England | 30 | 0 | 1
United States of America | 30 | 1 | 1
United States of America | 30 | 1 | 1
Japan | 30 | 2 | 1
Argentina | 30 | 3 | 1
Sri Lanka | 30 | 4 | 1
Japan | 30 | 2 | 1
United States of America | 30 | 1 | 1
Totals | 270 | | 9
* Fixed length, 30 bytes per value
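The table above can be reproduced with a short sketch. It assumes codes are assigned in order of first appearance, which matches the codes shown, and it uses the slide's sizes of 30 bytes per original value and 1 byte per code. The function name is hypothetical. Note that the dictionary itself also takes space, which the slide's totals do not count.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code, assigned on first appearance."""
    dictionary = {}
    codes = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        codes.append(dictionary[value])
    return dictionary, codes


countries = [
    "England", "England",
    "United States of America", "United States of America",
    "Japan", "Argentina", "Sri Lanka", "Japan",
    "United States of America",
]
dictionary, codes = dictionary_encode(countries)

original_bytes = 30 * len(countries)  # fixed length, 30 bytes per value
encoded_bytes = 1 * len(codes)        # one byte per code, as on the slide
print(codes)                          # [0, 0, 1, 1, 2, 3, 4, 2, 1]
print(original_bytes, encoded_bytes)  # 270 9
```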
16. Slide 16
Materialization Strategies
Function of ‘projection’
§ Row-stores = removes unneeded columns from the result set
§ Column-stores = decides when to glue columns back into rows
Early Materialization
§ Construct rows before processing
§ Decompress all compressed columns first
Late Materialization
§ Wait until the end of the operation to construct rows
17. Slide 17
Late Materialization
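SELECT custID, price
FROM Sales
WHERE (prodID = 4) AND (storeID = 1)
[Diagram: late materialization plan]
• prodID column, run-length encoded as (4,1,4): Select prodID = 4 yields position bits 1, 1, 1, 1
• storeID column with values 2, 1, 3, 1: Select storeID = 1 yields position bits 0, 1, 0, 1
• AND combines the two sets of position bits
• The matching positions yield custID values 3, 3 and price values 13, 80
• Construct builds the result rows (3, 13) and (3, 80)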
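The plan for this query can be sketched in a few lines. This is a hedged illustration, not any engine's actual operator code: the function names are hypothetical, and the custID and price values at the non-matching positions (the first and third) are made up, since the slide shows only the values that survive the filter.

```python
prod_id_rle = [(4, 1, 4)]   # (value, start_position, count), as on the slide
store_id = [2, 1, 3, 1]     # as on the slide
cust_id = [1, 3, 2, 3]      # first and third values are hypothetical filler
price = [7, 13, 42, 80]     # first and third values are hypothetical filler


def select_rle(runs, wanted, length):
    """Evaluate value == wanted directly on run-length encoded data."""
    bits = [0] * length
    for value, start, count in runs:
        if value == wanted:
            for position in range(start - 1, start - 1 + count):
                bits[position] = 1
    return bits


def select_plain(column, wanted):
    """Evaluate value == wanted on an uncompressed column."""
    return [1 if value == wanted else 0 for value in column]


length = len(store_id)
prod_bits = select_rle(prod_id_rle, 4, length)   # [1, 1, 1, 1]
store_bits = select_plain(store_id, 1)           # [0, 1, 0, 1]
both = [a & b for a, b in zip(prod_bits, store_bits)]

# Only now are the output columns touched, and only at the surviving positions.
positions = [i for i, bit in enumerate(both) if bit]
result = [(cust_id[i], price[i]) for i in positions]
print(result)  # [(3, 13), (3, 80)]
```

The prodID predicate is evaluated without decompressing the column, and custID and price are never read for rows that fail the filter. Those are the two savings late materialization is after.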
18. Slide 18
Workload Splitting
Same data in both structures
Optimizer or user determines which to use
Row-based
CustomerID CompanyName ContactFirstName ContactLastName ContactTitle PhoneNumber
1119 m4ii dhamotharan achaiyan solutions architect 91222507176
1120 Aris Doug Johnson Practice Director 206-676-5636
1121 Stolt Offshore MS Ltd Craig Lennox Mr +66 1226 712519
1122 Medtronic, Inc. Mark Kohls Principle Database Administrator 763.516.2557
1123 Beckman Coulter Tim Parsons Business Systems Manager +61 22 996 0963
1124 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
1126 The Boeing Company Mike Roberts Senior Business Process Architect (206)655-7155
1127 IT/1 Consulting Leif B. Soerensen Data Warehouse Consultant +65 26236691
1128 Banco de Bogotá JOSE ALFREDO LOPEZ ARIAS Administrador DWH 5713320032
1133 The HArtford Jimmy Chen Business System Analyst 215-653-2662
1134 CGI Group Terry Petherick Senior Consultant 613-236-2155
1135 Metavante Corporation Ron Kundinger Assistant Vice President 616-577-9227
1138 CP Associates Wilson Mak Consultant 252-92593731
1142 PRSB Ming Long Wu Assistant Administrator 226-2-23931261 ext 719
1143 aft greg tanner cto 303.233.6122
1144 Zamba Solutions Jeff McCall Executive Vice President 602-626-6125
1146 MR Consultancy Mukesh Rughani Mr +66 (0)1379 662219
1147 Intellor Group Robin Martin Project Coordinator 301-202-6766
1148 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
Columnar
CustomerID CompanyName ContactFirstName ContactLastName ContactTitle PhoneNumber
1119 m4ii dhamotharan achaiyan solutions architect 91222507176
1120 Aris Doug Johnson Practice Director 206-676-5636
1121 Stolt Offshore MS Ltd Craig Lennox Mr +66 1226 712519
1122 Medtronic, Inc. Mark Kohls Principle Database Administrator 763.516.2557
1123 Beckman Coulter Tim Parsons Business Systems Manager +61 22 996 0963
1124 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
1126 The Boeing Company Mike Roberts Senior Business Process Architect (206)655-7155
1127 IT/1 Consulting Leif B. Soerensen Data Warehouse Consultant +65 26236691
1128 Banco de Bogotá JOSE ALFREDO LOPEZ ARIAS Administrador DWH 5713320032
1133 The HArtford Jimmy Chen Business System Analyst 215-653-2662
1134 CGI Group Terry Petherick Senior Consultant 613-236-2155
1135 Metavante Corporation Ron Kundinger Assistant Vice President 616-577-9227
1138 CP Associates Wilson Mak Consultant 252-92593731
1142 PRSB Ming Long Wu Assistant Administrator 226-2-23931261 ext 719
1143 aft greg tanner cto 303.233.6122
1144 Zamba Solutions Jeff McCall Executive Vice President 602-626-6125
1146 MR Consultancy Mukesh Rughani Mr +66 (0)1379 662219
1147 Intellor Group Robin Martin Project Coordinator 301-202-6766
1148 Banco de Bogotá José Alfredo López Arias Administrador DWH 5713320032
19. Slide 19
Benchmark
SQLite for row-oriented
DuckDB for columnar
20. Slide 20
Test 1 : Insert 100000 high cardinality
customers
CREATE TABLE customer (
  id INTEGER PRIMARY KEY,
  lastname VARCHAR(20),
  firstname VARCHAR(30),
  street VARCHAR(30),
  city VARCHAR(20),
  state VARCHAR(2),
  zip VARCHAR(10),
  country VARCHAR(20),
  phone VARCHAR(10)
)
First 10 rows:
0|Jordan|Katherine|5832 Degan St|Freyer|AZ|86285|USA|(691)551-1092
1|Andrade|Sterling|1047 Clark St|Michaels|MT|83750|USA|(665)579-6921
2|Frederick|John|5807 Travis St|Jones|AK|95733|USA|(896)790-5223
3|Binette|Jimmy|8629 Kester St|Booker|LA|62854|USA|(569)203-5537
4|Caswell|Stefanie|4165 Green St|Champagne|TN|11565|USA|(926)189-1496
5|Palmer|Neil|3340 Mohabir St|Callahan|IN|49647|USA|(986)595-8182
6|Silva|Carol|8553 Hamilton St|Lanzi|GA|93518|USA|(238)814-9708
7|Folkers|Robert|1984 Beebe St|Sprenger|OK|06495|USA|(488)334-2533
8|Moultrie|Bernard|14 Armstrong St|Taus|NV|61668|USA|(357)688-8420
9|King|Giuseppe|5864 Goede St|White|TN|26195|USA|(651)345-7210
Row -> 100000 rows inserted in 80920.816 ms Average : 809.208 μs
Columnar -> 100000 rows inserted in 110879.347 ms Average : 1108.793 μs
21. Slide 21
Test 2 : Insert 10000 low cardinality
items
CREATE TABLE item (
  id INTEGER PRIMARY KEY,
  name VARCHAR(30),
  department VARCHAR(30),
  status VARCHAR(1),
  price DECIMAL(8,2)
)
First 10 rows...
0|Bunker|Clothing|B|789.64
1|Creighton|Clothing|B|390.59
2|Cole|Clothing|A|625.07
3|Jantzen|House goods|B|827.39
4|Lopez|Clothing|B|194.08
5|Dery|House goods|B|199.29
6|Flores|Electronics|B|552.61
7|Crigger|Clothing|B|172.15
8|Kidder|Clothing|B|30.97
9|Marion|Clothing|A|228.73
Row -> 10000 rows inserted in 7379.863 ms Average : 737.986 μs
Columnar -> 10000 rows inserted in 7930.747 ms Average : 793.075 μs
22. Slide 22
Test 3 : Insert 1000000 Narrow Fact
Table Data
CREATE TABLE sales ( id INTEGER PRIMARY KEY, customerid INTEGER, itemid INTEGER )
First 10 rows...
0|15340|1000
1|15443|9490
2|37370|1805
3|65986|2084
4|69930|7926
5|89665|3421
6|49097|5176
7|16616|6072
8|39226|5486
9|64665|3398
Row -> 1000000 rows inserted in 519258.013 ms Average : 519.258 μs
Columnar -> 1000000 rows inserted in 535044.747 ms Average : 535.045 μs
23. Slide 23
Test 4 : Single Table Select
SELECT lastname FROM customer WHERE state='AL';
SELECT lastname FROM customer WHERE state='AK';
SELECT lastname FROM customer WHERE state='AZ';
SELECT lastname FROM customer WHERE state='AR';
SELECT lastname FROM customer WHERE state='CA';
SELECT lastname FROM customer WHERE state='CO';
SELECT lastname FROM customer WHERE state='CT';
SELECT lastname FROM customer WHERE state='DE';
SELECT lastname FROM customer WHERE state='FL';
SELECT lastname FROM customer WHERE state='GA';
Row -> 50 queries in 392.256 ms Average : 7845.130 μs
Columnar -> 50 queries in 165.821 ms Average : 3316.412 μs
24. Slide 24
Test 5 : Single Table Aggregation
SELECT department, sum(price) FROM item GROUP BY department;
SELECT status, sum(price) FROM item GROUP BY status;
SELECT substring(name, 1, 1), sum(price) FROM item GROUP BY substring(name, 1, 1);
Row -> 3 queries in 7.833 ms Average : 2611.001 μs
Columnar -> 3 queries in 2.115 ms Average : 704.924 μs
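The slides do not show the harness that produced these timings. The sketch below is a hypothetical reconstruction of the timing loop for the row-oriented side only, using Python's built-in `sqlite3` module, an in-memory database, and the first four item rows from Test 2. The function name `time_queries` is made up, and `substr` stands in for `substring`, which older SQLite builds do not accept. Running the columnar side would need a DuckDB connection, which is not shown here.

```python
import sqlite3
import time


def time_queries(conn, queries):
    """Run each query once, fetch all rows, and return (total_ms, average_us)."""
    start = time.perf_counter()
    for sql in queries:
        conn.execute(sql).fetchall()
    total_ms = (time.perf_counter() - start) * 1000.0
    return total_ms, total_ms * 1000.0 / len(queries)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name VARCHAR(30), "
    "department VARCHAR(30), status VARCHAR(1), price DECIMAL(8,2))"
)
# First four rows of the Test 2 sample data; the real test loaded 10000 rows.
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
    [
        (0, "Bunker", "Clothing", "B", 789.64),
        (1, "Creighton", "Clothing", "B", 390.59),
        (2, "Cole", "Clothing", "A", 625.07),
        (3, "Jantzen", "House goods", "B", 827.39),
    ],
)

queries = [
    "SELECT department, sum(price) FROM item GROUP BY department",
    "SELECT status, sum(price) FROM item GROUP BY status",
    "SELECT substr(name, 1, 1), sum(price) FROM item GROUP BY substr(name, 1, 1)",
]
total_ms, average_us = time_queries(conn, queries)
print(f"Row -> {len(queries)} queries in {total_ms:.3f} ms Average : {average_us:.3f} μs")
```

With only four rows the absolute numbers are meaningless; the sketch shows only how the total and per-query average in the report lines relate to each other.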
25. Slide 25
Test 6 : Analytics Join Aggregation
SELECT department, sum(price) FROM customer c, item i, sales s WHERE s.customerid
= c.id AND s.itemid = i.id GROUP BY department;
SELECT status, sum(price) FROM customer c, item i, sales s WHERE s.customerid = c.id
AND s.itemid = i.id GROUP BY status;
SELECT city, sum(price) FROM customer c, item i, sales s WHERE s.customerid = c.id
AND s.itemid = i.id GROUP BY city;
SELECT state, sum(price) FROM customer c, item i, sales s WHERE s.customerid = c.id
AND s.itemid = i.id GROUP BY state;
SELECT substring(lastname, 1, 1), sum(price) FROM customer c, item i, sales s WHERE
s.customerid = c.id AND s.itemid = i.id GROUP BY substring(lastname, 1, 1);
Row -> 5 queries in 8744.340 ms Average : 1748867.941 μs
Columnar -> 5 queries in 1209.917 ms Average : 241983.461 μs
26. Slide 26
Benchmark Conclusions
Columnar is a little slower to load, but much
faster on queries
2.3x faster on simple single column scans
3.7x on simple aggregations
7.2x on an analytics query with a 3-table join
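These factors follow from the totals reported for Tests 4 to 6. A quick check, using only the timings from the slides (the dictionary and its labels are illustrative):

```python
# (row_ms, columnar_ms) totals as reported for Tests 4, 5 and 6.
timings_ms = {
    "single-column scans": (392.256, 165.821),
    "simple aggregations": (7.833, 2.115),
    "3-table join analytics": (8744.340, 1209.917),
}

# Speedup = row-store time divided by columnar time.
speedups = {name: row_ms / col_ms for name, (row_ms, col_ms) in timings_ms.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

The first ratio comes out near 2.37, which the slide truncates to 2.3x; the other two round to the 3.7x and 7.2x shown.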
27. Slide 27
Summary: Columnar Databases
§ Is an alternative to row storage
§ Stores each container independently
§ Addresses idle CPUs and disk bottlenecks
§ Is great for compression
§ Is best when there is a lot of data, long rows, and when you
can isolate the loads
§ Is great for queries with high column selectivity (queries that read only a few of a table’s columns)