[B14] A MySQL Replacement by Colin Charles
1. MariaDB - a MySQL replacement
Colin Charles, Team MariaDB, SkySQL Ab
colin@mariadb.org | http://mariadb.org/
http://bytebot.net/blog/ | @bytebot on Twitter
DB Tech Showcase, Osaka, Japan
18 June 2014
2. whoami
• Work on MariaDB at SkySQL Ab
• Merged with Monty Program Ab, makers of
MariaDB
• Formerly MySQL AB (exit: Sun Microsystems)
• Past lives include Fedora Project (FESCO),
OpenOffice.org
3. Who are you?
• Developer?
• Operator? (DBA, sysadmin)
• A bit of both?
5. 5W1H is MariaDB
• Drop-in compatible MySQL replacement
• Community developed, Foundation backed, feature
enhanced, backwards compatible, GPLv2 licensed
• Steady stream of releases in 4 years 4 months: 5.1, 5.2, 5.3,
5.5, 10.0, MariaDB Galera Cluster 5.5, MariaDB with TokuDB
5.5
• Enterprise features open: PAM authentication plugin,
threadpool, audit plugin
• Default in Red Hat Enterprise Linux, Fedora, openSUSE, etc.
6. MariaDB in Japan
• Contributions:
• SPIDER storage engine
• Groonga full-text search engine
• HandlerSocket key/value store
• Translations
• 13.6% (6/44) maria-captains from Japan
• 24/7/365 support by Ashisuto
9. Microseconds & 5.6
• TIME_TO_SEC(), UNIX_TIMESTAMP() preserve
microseconds of the argument
MariaDB 10.0:
SELECT TIME_TO_SEC('10:10:10.12345');
+-------------------------------+
| TIME_TO_SEC('10:10:10.12345') |
+-------------------------------+
|                   36610.12345 |
+-------------------------------+
1 row in set (0.01 sec)
MySQL 5.6:
SELECT TIME_TO_SEC('10:10:10.12345');
+-------------------------------+
| TIME_TO_SEC('10:10:10.12345') |
+-------------------------------+
|                         36610 |
+-------------------------------+
1 row in set (0.00 sec)
10. Virtual Columns
• A column whose value is calculated automatically,
either from a pre-calculated/deterministic expression
or from the values of other fields in the table
• VIRTUAL - computed on the fly when data is
queried (like a VIEW)
• PERSISTENT - computed when data is inserted
and stored in a table
MariaDB 5.2+
11. Virtual Columns
CREATE TABLE table1 (
a INT NOT NULL,
b VARCHAR(32),
c INT AS (a mod 10) VIRTUAL,
d VARCHAR(5) AS (left(b,5)) PERSISTENT);
12. Virtual columns example
CREATE TABLE product (
productname VARCHAR(25),
price_eur DOUBLE,
xrate DOUBLE,
price_cny DOUBLE AS (price_eur*xrate) VIRTUAL);
INSERT INTO product VALUES ('toothpaste', 1.5, 1.39, default);
INSERT INTO product VALUES ('shaving cream', 3.59, 1.39, default);
13. Virtual columns example II
select * from product;
+---------------+-----------+-------+-------------------+
| productname   | price_eur | xrate | price_cny         |
+---------------+-----------+-------+-------------------+
| toothpaste    |       1.5 |  1.39 |             2.085 |
| shaving cream |      3.59 |  1.39 | 4.990099999999999 |
+---------------+-----------+-------+-------------------+
2 rows in set (0.00 sec)
14. PCRE Regular Expressions
• Powerful REGEXP/RLIKE operator
• New functions:
• REGEXP_REPLACE(subject, pattern, replace)
• REGEXP_INSTR(subject, pattern)
• REGEXP_SUBSTR(subject, pattern)
• Works with multi-byte character sets that MariaDB
supports, including East-Asian sets
MariaDB 10.0+
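A minimal sketch of the three functions listed above, assuming a MariaDB 10.0 server built with PCRE and a utf8 connection character set. The string literals are made up for illustration.

```sql
-- Replace every run of digits with '#'
SELECT REGEXP_REPLACE('ab12cd345', '[0-9]+', '#');

-- 1-based position of the first match (0 when nothing matches)
SELECT REGEXP_INSTR('abc123', '[0-9]');

-- First substring that matches the pattern
SELECT REGEXP_SUBSTR('ab12cd345', '[0-9]+');

-- Multi-byte subject and pattern: '.' matches one character, not one byte
SELECT REGEXP_SUBSTR('東京都大阪府', '大阪.');
```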
15. GIS
• MariaDB implements a subset of SQL with
Geometry Types
• No longer just minimum bounding rectangles
(MBR) - shapes considered
CREATE TABLE geom (g GEOMETRY NOT NULL,
SPATIAL INDEX(g)) ENGINE=MyISAM;
• ST_ prefix - as per OpenGIS requirements
MariaDB 5.3+
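The difference between an MBR test and a shape-aware ST_ test can be sketched as follows, assuming MariaDB 5.3 or later. The triangle and the point are invented for the example; the point lies inside the triangle's bounding rectangle but outside the triangle itself.

```sql
-- A right triangle and a point near the corner the triangle does not cover
SET @tri = ST_GeomFromText('POLYGON((0 0, 10 0, 0 10, 0 0))');
SET @pt  = ST_GeomFromText('POINT(8 8)');

-- MBRContains compares bounding rectangles only;
-- ST_Contains considers the actual shape
SELECT MBRContains(@tri, @pt) AS mbr_test,    -- 1: inside the rectangle
       ST_Contains(@tri, @pt) AS shape_test;  -- 0: outside the triangle
```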
16. Dynamic columns
• Allows you to create virtual columns with dynamic content for each
row in a table, e.g. to store different attributes for each item (as in
a web store).
• Basically a BLOB with handling functions: COLUMN_CREATE,
COLUMN_ADD, COLUMN_GET, COLUMN_DELETE,
COLUMN_EXISTS, COLUMN_LIST, COLUMN_CHECK,
COLUMN_JSON
• In MariaDB 10.0: name support (refer to columns by name instead of
by number), conversion of all dynamic column content to JSON,
interface with Cassandra
INSERT INTO tbl SET
dyncol_blob=COLUMN_CREATE("column_name", "value");
MariaDB 5.3+
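A short sketch of the handling functions in use, assuming MariaDB 10.0 (named dynamic columns). The table `items` and its attributes are hypothetical, in the spirit of the web-store example.

```sql
-- Hypothetical table: one BLOB holds the per-row attributes
CREATE TABLE items (
  id    INT PRIMARY KEY,
  attrs BLOB
);

-- Each row can carry a different set of attributes
INSERT INTO items VALUES
  (1, COLUMN_CREATE('color', 'blue', 'size', 'XL')),
  (2, COLUMN_CREATE('voltage', 220));

-- Add or overwrite one attribute without touching the others
UPDATE items SET attrs = COLUMN_ADD(attrs, 'price', 100) WHERE id = 1;

-- Read one attribute back; the target type must be given explicitly
SELECT id, COLUMN_GET(attrs, 'color' AS CHAR) AS color FROM items;

-- Inspect what each row contains
SELECT id, COLUMN_LIST(attrs), COLUMN_JSON(attrs) FROM items;
```

COLUMN_GET returns NULL for rows that do not have the requested attribute.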
17. Query Cassandra
• Data is mapped: rowkey, static columns, dynamic
columns
• super columns aren’t supported
• No 1-1 direct map for data types
• Read from and write to Cassandra from SQL (SELECT,
INSERT, UPDATE, DELETE)
MariaDB 10.0+
18. Cassandra II
CREATE TABLE t1 (  -- t1: illustrative table name
pk varchar(36) primary key,
data1 varchar(60),
data2 bigint
) engine=cassandra keyspace='ks1' column_family='cf1'
• Table must have a primary key
• name/type must match Cassandra’s rowkey
• Columns map to Cassandra’s static columns
• name must be same as in Cassandra, datatypes must match, can be
subset of CF’s columns
19. Mapping
• Datatype mapping - complete table in the MariaDB Knowledge Base (KB)
• Data mapping is safe - engine will refuse incorrect
mappings
• Command mapping: INSERT overwrites rows,
UPDATE reads then writes, DELETE reads then
writes
20. Typical use cases
• Web page hits collection, streaming data
• Sensor data
• Reads served with a lookup
• Want an auto-replicated, fault-tolerant table?
21. CONNECT
• Target: ETL for BI or analytics
• Import data from CSV, XML, ODBC, MS Access,
etc.
• WHERE conditions pushed to ODBC source
• DROP TABLE just removes the stored definition, not
data itself
• “Virtual” tables cannot be indexed
MariaDB 10.0+
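A minimal sketch of a CONNECT table over a CSV file, assuming MariaDB 10.0 with the CONNECT plugin available. The table name, columns and the path /tmp/hits.csv are hypothetical.

```sql
-- Load the engine if it is not already installed
INSTALL SONAME 'ha_connect';

-- Describe an existing CSV file as a table; no data is copied
CREATE TABLE hits_csv (
  url  VARCHAR(255) NOT NULL,
  hits INT NOT NULL
) ENGINE=CONNECT TABLE_TYPE=CSV FILE_NAME='/tmp/hits.csv'
  HEADER=1 SEP_CHAR=',';

-- Query it like any other table
SELECT url, SUM(hits) AS total FROM hits_csv GROUP BY url;

-- Removes only the stored definition; /tmp/hits.csv is left in place
DROP TABLE hits_csv;
```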
22. SPIDER
• Horizontal partitioning, built on top of PARTITIONs
• Associates a partition with a remote server
• Transparent to user, easy to expand
• Has index condition pushdown support enabled
MariaDB 10.0+
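Associating partitions with remote servers might look roughly like this, assuming MariaDB 10.0 with the SPIDER engine installed and an `orders` table already created on each backend. Server names, addresses and credentials are placeholders.

```sql
-- Two hypothetical backends
CREATE SERVER backend1 FOREIGN DATA WRAPPER mysql
  OPTIONS (HOST '192.0.2.11', DATABASE 'shop',
           USER 'spider', PASSWORD 'secret', PORT 3306);
CREATE SERVER backend2 FOREIGN DATA WRAPPER mysql
  OPTIONS (HOST '192.0.2.12', DATABASE 'shop',
           USER 'spider', PASSWORD 'secret', PORT 3306);

-- The SPIDER table stores no rows itself; each partition points at a backend
CREATE TABLE orders (
  id       INT NOT NULL,
  customer VARCHAR(32),
  PRIMARY KEY (id)
) ENGINE=SPIDER COMMENT='wrapper "mysql", table "orders"'
PARTITION BY KEY (id) (
  PARTITION pt1 COMMENT = 'srv "backend1"',
  PARTITION pt2 COMMENT = 'srv "backend2"'
);
```

Applications query `orders` on the SPIDER node as if it were a local table.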
23. TokuDB
• Open source - separate MariaDB 5.5+TokuDB build,
integrated in 10.0.5
• Improved insert (10-20x faster) & query speed,
compression (up to 90% space reduction),
replication performance and online schema
flexibility
• Uses Fractal Tree Indexes instead of B-Tree
• Tests & builds of TokuDB on multiple platforms
24. Engines, etc
• Plan for backups - TokuDB can be cool for your uses as
an example
• Galera: study your workload patterns, your application,
etc.
• SPIDER (built-in sharding capabilities, partitioning & XA
transaction capable with multiple backends including
Oracle)
• It's not going to be straightforward to “just start” -
you need to know the right tables to implement, etc.
25. Threadpool
• Modified from 5.1 (libevent based), great for
CPU bound loads and short running queries
• Windows (threadpool), Linux (epoll), Solaris
(event ports), FreeBSD/OSX (kevents)
• No minimization of concurrent transactions
with dynamic pool size
• thread_handling=pool-of-threads
• https://mariadb.com/kb/en/thread-pool-in-mariadb-55/
MariaDB 5.5+
26. PAM Authentication
• Authentication using /etc/shadow
• Authentication using LDAP, SSH pass phrases, password
expiration, username mapping, logging every login attempt, etc.
• INSTALL PLUGIN pam SONAME 'auth_pam.so';
• CREATE USER foo@host IDENTIFIED VIA pam
• Remember to configure PAM (/etc/pam.d or /etc/pam.conf)
• http://www.mysqlperformanceblog.com/2013/02/24/using-two-factor-authentication-with-percona-server/
MariaDB 5.2+
27. SQL Error Logging Plugin
• Log errors sent to clients in a log file that can be
analysed later. Log file can be rotated
(recommended)
• a MYSQL_AUDIT_PLUGIN
install plugin SQL_ERROR_LOG soname
'sql_errlog.so';
MariaDB 5.5+
28. Audit Plugin
• Log server activity - who connects to the server,
what queries run, what tables touched - rotating log
file or syslogd
• a MYSQL_AUDIT_PLUGIN
INSTALL PLUGIN server_audit SONAME 'server_audit.so';
MariaDB 10.0+
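Once the plugin is installed, it can be configured at runtime. A sketch assuming the server_audit plugin's standard system variables; the sizes and counts are illustrative values, not recommendations.

```sql
-- Choose what to record: connections, queries, tables touched
SET GLOBAL server_audit_events = 'CONNECT,QUERY,TABLE';

-- Write to a rotating file (the alternative is 'syslog')
SET GLOBAL server_audit_output_type = 'file';
SET GLOBAL server_audit_file_rotate_size = 1000000;  -- bytes per file
SET GLOBAL server_audit_file_rotations = 9;          -- old files to keep

-- Start logging and check the resulting settings
SET GLOBAL server_audit_logging = ON;
SHOW GLOBAL VARIABLES LIKE 'server_audit%';
```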
29. Replication made better
• Selective skipping of replication events (session-
based or on master or slave)
• Dynamic control of replication variables (no
restarts!)
• Using row-based replication? Annotate the binary
log with SQL statements
• Slaves perform checksums on binary log events
MariaDB 5.3+
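The features above map to ordinary system variables. A rough sketch, assuming a recent MariaDB 5.5 or 10.0 server and a user with the SUPER privilege; the database and table names are hypothetical.

```sql
-- On the master: annotate row events with the original SQL text
SET GLOBAL binlog_annotate_row_events = ON;

-- On the master: checksum binary log events so slaves can verify them
SET GLOBAL binlog_checksum = 'CRC32';

-- In one session: mark the following changes as skippable
SET SESSION skip_replication = 1;
DELETE FROM scratch.tmp_results;  -- hypothetical table
SET SESSION skip_replication = 0;

-- On the slave: decide what to do with marked events and change a
-- replication filter without restarting the server
STOP SLAVE;
SET GLOBAL replicate_events_marked_for_skip = 'FILTER_ON_SLAVE';
SET GLOBAL replicate_do_db = 'shop';
START SLAVE;
```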
30. Replication made better II
• Group commit in the binary log - finally, sync_binlog=1,
innodb_flush_log_at_trx_commit=1 performs well
• START TRANSACTION WITH CONSISTENT SNAPSHOT
• mysqldump --single-transaction --master-data -
full non-blocking backup
• Slaves crash-safe (data stored inside transactional tables)
• Multi-source replication - (real-time) analytics, shard
provisioning, backups, etc.
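Multi-source replication gives each master a named connection on the slave. A minimal sketch, assuming MariaDB 10.0 with GTID enabled; connection names, hosts and credentials are placeholders.

```sql
-- One named connection per master
CHANGE MASTER 'shard1' TO
  MASTER_HOST = '192.0.2.21',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_USE_GTID = slave_pos;

CHANGE MASTER 'shard2' TO
  MASTER_HOST = '192.0.2.22',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_USE_GTID = slave_pos;

-- Start and inspect every connection at once
START ALL SLAVES;
SHOW ALL SLAVES STATUS;

-- Or manage a single connection by name
STOP SLAVE 'shard2';
```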
31. New KILL syntax
• HARD | SOFT & USER USERNAME are MariaDB-specific (5.3.2)
• KILL QUERY ID query_id (10.0.5) - kill by query id, rather than
thread id
• SOFT ensures things that may leave a table in an inconsistent
state aren’t interrupted (like REPAIR or INDEX creation for
MyISAM or Aria)
KILL [HARD | SOFT] [CONNECTION | QUERY]
[thread_id | USER user_name]
MariaDB 5.3+
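A few variations of the syntax, for illustration only. The thread id 42, the query id 123456 and the user `report_user` are hypothetical; real ids come from SHOW PROCESSLIST or INFORMATION_SCHEMA.PROCESSLIST.

```sql
-- Stop the running statement of thread 42, but let operations that could
-- leave a table inconsistent (e.g. REPAIR) finish first
KILL SOFT QUERY 42;

-- Terminate the whole connection
KILL HARD CONNECTION 42;

-- Abort the running statements of every connection of one user
KILL QUERY USER report_user;

-- Disconnect every connection of that user
KILL USER report_user;

-- MariaDB 10.0.5+: target a specific query id instead of a thread id
KILL QUERY ID 123456;
```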
32. Statistics
• Understand server activity better to understand database loads
• SET GLOBAL userstat=1;
• SHOW CLIENT_STATISTICS; SHOW USER_STATISTICS;
• # of connections, CPU usage, bytes received/sent, row
statistics
• SHOW INDEX_STATISTICS; SHOW TABLE_STATISTICS;
• # rows read, changed, indexes
• INFORMATION_SCHEMA.PROCESSLIST has MEMORY_USED,
EXAMINED_ROWS (similar to SHOW STATUS output)
MariaDB 5.2+
MariaDB 10.0+
33. EXPLAIN enhanced
• Explain analyser: https://mariadb.org/explain_analyzer/analyze/
• SHOW EXPLAIN FOR <thread_id>
• EXPLAIN output in the slow query log
• EXPLAIN not just for SELECT but also for INSERT/UPDATE/DELETE
MariaDB 10.0+
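A short sketch of these features, assuming MariaDB 10.0.5 or later. It reuses the `product` table from the virtual columns example; the thread id 42 is hypothetical.

```sql
-- Find a long-running statement, then ask for its plan while it runs
SHOW PROCESSLIST;
SHOW EXPLAIN FOR 42;

-- EXPLAIN for data-modifying statements
EXPLAIN UPDATE product SET xrate = 1.40 WHERE productname = 'toothpaste';
EXPLAIN DELETE FROM product WHERE price_eur > 3;

-- Write the plan of each slow query into the slow query log
SET GLOBAL log_slow_verbosity = 'query_plan,explain';
```

SHOW EXPLAIN returns an error if the target thread is not running a statement that can be explained at that moment.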
34. Roles
• Bundles users together, with similar privileges -
follows the SQL standard
CREATE ROLE audit_bean_counters;
GRANT SELECT ON accounts.* TO audit_bean_counters;
GRANT audit_bean_counters TO ceo;
MariaDB 10.0+
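A granted role is not active until the user enables it in a session. A sketch of what the `ceo` user from the example above would run, assuming MariaDB 10.0.5 or later; the table `accounts.invoices` is hypothetical.

```sql
-- Right after connecting, no role is active
SELECT CURRENT_ROLE();

-- Enable the role; its privileges now apply to this session
SET ROLE audit_bean_counters;
SELECT CURRENT_ROLE();
SELECT COUNT(*) FROM accounts.invoices;  -- hypothetical table

-- Drop back to the user's own privileges
SET ROLE NONE;
```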
35. Connectors
• The MariaDB project provides LGPL connectors
(client libraries) for:
• C
• Java
• ODBC
• Embedding a connector? Makes sense to use
these LGPL licensed ones…
37. MariaDB Galera Cluster
• MariaDB Galera Cluster is made for today’s cloud
based environments. It is fully read-write scalable,
comes with synchronous replication, allows multi-
master topologies, and guarantees no lag or lost
transactions.
• Currently 5.5-based
• 10.0 is in beta (almost ready for release)
38. Trusted by many
• Google
• Wikipedia
• Tumblr
• SpamExperts
• Limelight Networks
• KakaoTalk
• Paybox Services
40. Resources
• We moved to github! https://github.com/MariaDB/server
• We’re still on launchpad for older branches: https://launchpad.net/maria
• maria-discuss@lists.launchpad.net
• maria-developers@lists.launchpad.net
• #maria on freenode
• facebook.com/MariaDB.dbms
• @mariadb / +MariaDB