Indexing the MySQL Index: Key to performance tuning
1. Indexing the MySQL Index: Guide to Performance Enhancement
Presented by – Sonali Minocha
OSSCube
2. Who Am I?
Chief Technology Officer (MySQL) with OSSCube
MySQL Consulting, Implementation & Training
MySQL Certified DBA & Cluster DBA
4. What is Index?
A database index is a data structure that improves the speed of data retrieval operations on a database table.
It is a mechanism to locate and access data within a database. An index may cover one or more columns and be a means of enforcing uniqueness on their values.
5. More about Index
• Speedy data retrieval.
• Speeds up SELECTs.
• Rapid random lookups.
• Efficient for reporting, OLAP and read-intensive applications.
• However, it is expensive:
– Slows down writes.
– Be careful with write-heavy applications (OLTP).
– More disk space is used.
6. Properties
An index can be created on:
• One or more columns.
An index contains only the key fields according to which the table is arranged.
An index may cover one or more columns and be a means of enforcing uniqueness of their values.
An index may be unique or non-unique.
7. EMPLOYEE TABLE
EMPLOYEE ID  FIRSTNAME  LASTNAME  AGE  SALARY  GENDER
001          Ashish     Kataria   25   10000   M
002          Rony       Felix     28   20000   M
003          Namita     Misra     24   10000   F
004          Ankur      Aeran     30   25000   M
005          Priyanka   Jain      30   20000   F
006          Pradeep    Pandey    31   30000   M
007          Pankaj     Gupta     25   12000   M
008          Ankit      Garg      30   15000   M
8. Cont.
In this table, if we have to search for the employee whose first name is Rony, the code will look like:
for each row in table
    if row[FIRSTNAME] = 'Rony' then
        results.append(row)
    else
        move to the next row
So we are checking each row against the condition.
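The row-by-row check above can be sketched in Python. This is only an illustration of a search without an index: the `EMPLOYEES` list and the `full_scan` helper are hypothetical names, and only a few columns of the EMPLOYEE table are copied.

```python
# Toy copy of the EMPLOYEE table (id, first name, last name, age).
EMPLOYEES = [
    ("001", "Ashish", "Kataria", 25),
    ("002", "Rony", "Felix", 28),
    ("003", "Namita", "Misra", 24),
    ("004", "Ankur", "Aeran", 30),
    ("005", "Priyanka", "Jain", 30),
    ("006", "Pradeep", "Pandey", 31),
    ("007", "Pankaj", "Gupta", 25),
    ("008", "Ankit", "Garg", 30),
]


def full_scan(rows, first_name):
    """Return (matches, rows_examined) for a search that has no index to use."""
    matches = []
    examined = 0
    for row in rows:
        examined += 1              # every row is visited, match or not
        if row[1] == first_name:   # column 1 holds the first name
            matches.append(row)
    return matches, examined


matches, examined = full_scan(EMPLOYEES, "Rony")
print(matches, examined)
```

One matching row is found, but all eight rows are examined, which is the cost an index is meant to avoid.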
10. Types of Indexes
• Column Index
• Concatenated Index
• Covering Index
• Partial Index
• Clustered/Non-clustered Index
11. Column Index
Index on a single column.
Only those queries whose criteria match the indexed column will be optimized.
Eg:
SELECT employeeid, firstname
FROM Employee
WHERE employeeid = 001
By adding an index to employeeid, the query is optimized to only look at records that satisfy your criteria.
12. Concatenated Index
Index on multiple columns.
Use the appropriate index:
SELECT employeeid, lastname
FROM Employee
WHERE employeeid = 002
AND lastname = 'Felix';
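As a rough model of why column order matters in a concatenated index, the sketch below keeps index entries sorted by (employeeid, lastname) and searches them with binary search. The sorted list only stands in for the ordered keys of a real index; `INDEX` and `lookup_prefix` are hypothetical names.

```python
from bisect import bisect_left

# A few (employeeid, lastname) pairs from the EMPLOYEE table.
ROWS = [("001", "Kataria"), ("002", "Felix"), ("003", "Misra"), ("004", "Aeran")]

# Index entries sorted by (employeeid, lastname); the last element is the row position.
INDEX = sorted((emp_id, last, pos) for pos, (emp_id, last) in enumerate(ROWS))


def lookup_prefix(index, *key_parts):
    """Return row positions whose leading index columns equal key_parts."""
    key = tuple(key_parts)
    start = bisect_left(index, key)      # binary search works on the sort order
    hits = []
    for entry in index[start:]:
        if entry[:len(key)] != key:      # left the matching run, stop early
            break
        hits.append(entry[-1])
    return hits


print(lookup_prefix(INDEX, "002", "Felix"))  # both columns of the index
print(lookup_prefix(INDEX, "002"))           # leftmost column only
```

A search on lastname alone cannot use this ordering, because the entries are sorted by employeeid first; it would have to inspect every entry.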
13. Covering Index
Covers all columns in a query.
The benefit of a covering index is that the lookup of the various B-Tree index pages necessarily satisfies the query, and no additional data page lookups are necessary.
SELECT employeeid
FROM Employee
WHERE employeeid = 001
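A small sketch of the difference a covering index makes, using dictionaries as stand-ins for the index and the table. The names `TABLE`, `INDEX_ON_ID`, `select_employeeid` and `select_firstname` are hypothetical, and counting table reads this way is a simplification.

```python
# Row position -> full row (the "data pages").
TABLE = {0: ("001", "Ashish", "Kataria"), 1: ("002", "Rony", "Felix")}
# Index on employeeid: key -> row position.
INDEX_ON_ID = {"001": 0, "002": 1}


def select_employeeid(emp_id):
    """SELECT employeeid ... WHERE employeeid = ?  (answered by the index alone)."""
    table_reads = 0
    result = [emp_id] if emp_id in INDEX_ON_ID else []
    return result, table_reads


def select_firstname(emp_id):
    """SELECT firstname ... WHERE employeeid = ?  (index finds the row, table supplies the value)."""
    table_reads = 0
    result = []
    if emp_id in INDEX_ON_ID:
        row = TABLE[INDEX_ON_ID[emp_id]]   # extra lookup in the table
        table_reads += 1
        result.append(row[1])
    return result, table_reads


print(select_employeeid("001"))
print(select_firstname("001"))
```

The first query needs no table read because every column it asks for is already in the index; the second needs one.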
14. Partial Index
Uses a subset (prefix) of a column for the index.
Use on CHAR, VARCHAR, TEXT etc.
Creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required.
CREATE TABLE t ( name CHAR(255), INDEX ( name(15) ) );
Eg:
SELECT employeeid, firstname, lastname
FROM Employee WHERE lastname LIKE 'A%'
We should add an index to lastname to improve performance.
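The trade-off behind the prefix length can be sketched as follows. A dictionary keyed by the first characters of each value stands in for the partial index; this is an assumption made for illustration, not how MySQL stores it, and `build_prefix_index` and `find` are hypothetical helpers.

```python
from collections import defaultdict

LASTNAMES = ["Kataria", "Felix", "Misra", "Aeran", "Jain", "Pandey", "Gupta", "Garg"]


def build_prefix_index(values, length):
    """Map the first `length` characters of each value to its row positions."""
    index = defaultdict(list)
    for pos, value in enumerate(values):
        index[value[:length]].append(pos)
    return index


def find(values, index, length, wanted):
    """Use the prefix to narrow candidates, then verify against the full value."""
    candidates = index.get(wanted[:length], [])
    return [pos for pos in candidates if values[pos] == wanted]


one_char = build_prefix_index(LASTNAMES, 1)
two_char = build_prefix_index(LASTNAMES, 2)
print(len(one_char), len(two_char))
print(find(LASTNAMES, one_char, 1, "Garg"))
```

With a one-character prefix, Gupta and Garg share an entry and both must be checked; with two characters every name gets its own entry, at the cost of a slightly larger index.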
15. Clustered vs. Non-clustered
Describes whether the data records are stored on disk in sorted order.
• MyISAM - non-clustered.
• InnoDB - clustered.
Secondary indexes are built upon the clustering key.
16. The primary key is added to every secondary index.
Because the data resides within the leaf nodes of the clustered index, more memory is needed to search through the same number of records.
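A minimal sketch of a lookup through a secondary index on a clustered table: the secondary index stores the primary key, and the primary key is then used to fetch the row. Dictionaries replace the real tree structures here, and `CLUSTERED`, `SECONDARY_FIRSTNAME` and `find_by_firstname` are hypothetical names.

```python
# Clustered index: primary key -> full row (the row lives with the key).
CLUSTERED = {1: ("Ashish", 25), 2: ("Rony", 28), 3: ("Namita", 24)}

# Secondary index on firstname: stores the primary key, not a row address.
SECONDARY_FIRSTNAME = {"Ashish": 1, "Rony": 2, "Namita": 3}


def find_by_firstname(name):
    """Two steps: secondary index -> primary key -> row in the clustered index."""
    primary_key = SECONDARY_FIRSTNAME.get(name)
    if primary_key is None:
        return None
    return CLUSTERED[primary_key]


print(find_by_firstname("Rony"))
```

Because every secondary index entry carries the primary key, a long primary key makes all secondary indexes larger.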
18. How it can be faster?
Suppose we create a hash table. The key of the hash table would be based on empname and the values would be pointers to the database rows.
This is a Hash Index:
• Hash indexes are good for equality searches.
• Hash indexes are not good for range searches.
So what should be the solution for range searches?
B-Tree
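The two bullet points about hash indexes can be illustrated with a Python dictionary, which is itself a hash table. The data and the helper names `find_equal` and `find_name_range` are hypothetical.

```python
ROWS = [("Ashish", 25), ("Rony", 28), ("Namita", 24), ("Ankur", 30)]

# Hash index: employee name -> list of row positions.
hash_index = {}
for pos, (name, _age) in enumerate(ROWS):
    hash_index.setdefault(name, []).append(pos)


def find_equal(name):
    """Equality search: a single hash lookup."""
    return [ROWS[pos] for pos in hash_index.get(name, [])]


def find_name_range(low, high):
    """Range search: hash keys have no order, so every key must be checked."""
    keys_checked = 0
    hits = []
    for name, positions in hash_index.items():
        keys_checked += 1
        if low <= name <= high:
            hits.extend(ROWS[pos] for pos in positions)
    return sorted(hits), keys_checked


print(find_equal("Rony"))
print(find_name_range("A", "B"))
```

The equality search touches one key, while the range search has to visit all four, which is no better than scanning the table.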
19. B-Tree
B-Tree (balanced tree): stores data in an ordered way.
• Nodes in a B-Tree contain an index field and a pointer to a data row.
• Allows logarithmic selections, insertions and deletions.
• Allows faster range searches.
• Each node takes up one disk block, so reading a node is a single disk operation.
• So, as in the example above, if we create an index on age, a node of the B-Tree will look like:
Age: 30    Location of the data: 0x775800
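To show why ordered keys help range searches, the sketch below uses a sorted list with binary search as a stand-in for the ordered keys of a B-Tree. It is not a real B-Tree (there are no nodes or disk blocks), and `age_index` and `rows_with_age_between` are hypothetical names.

```python
from bisect import bisect_left, bisect_right

# Ages of employees 001..008, in table order.
AGES = [25, 28, 24, 30, 30, 31, 25, 30]

# Ordered index entries: (age, row position).
age_index = sorted((age, pos) for pos, age in enumerate(AGES))
keys = [age for age, _pos in age_index]


def rows_with_age_between(low, high):
    """Find the first and last matching keys by binary search, then read the run between them."""
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [pos for _age, pos in age_index[start:end]]


print(rows_with_age_between(28, 30))
print(rows_with_age_between(30, 30))
```

Both the range search and the equality search locate their starting point in logarithmic time and then read only the matching entries.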
20. B-Tree Diagram
                 003 | 006
  001 002        004 005        007 008
EMPLOYEE ID  FIRSTNAME  LASTNAME  AGE  SALARY  GENDER
001          Ashish     Kataria   25   10000   M
002          Rony       Felix     28   20000   M
003          Namita     Misra     24   10000   F
004          Ankur      Aeran     30   25000   M
005          Priyanka   Jain      30   20000   F
006          Pradeep    Pandey    31   30000   M
007          Pankaj     Gupta     25   12000   M
008          Ankit      Garg      30   15000   M
21. R-Tree
MySQL supports another type of index called the Spatial Index. Spatial indexes are created the same way as other indexes; the only difference is the additional keyword SPATIAL.
22. Fulltext Indexes
Ability to search for text.
Only available in MyISAM (InnoDB also supports FULLTEXT indexes from MySQL 5.6 onwards).
Can be created for TEXT, CHAR or VARCHAR columns.
Important points of fulltext search:
• Searches are not case sensitive.
• Short words are ignored; the default minimum length is 4 characters. The limits are set by:
– ft_min_word_len
– ft_max_word_len
• Words called stopwords are ignored:
– ft_stopword_file = ''
• If a word is present in more than 50% of the rows it will have a weight of zero. This is an advantage on large data sets.
23. Hash, B-Tree and R-Tree indexes use different strategies to speed up data retrieval.
The best algorithm is picked depending on the data expected and the algorithms supported.
24. Query is using Index or Not?
Query Execution Plan (EXPLAIN)
With EXPLAIN the query is sent all the way to the optimizer, but not to the storage engine.
25. mysql> EXPLAIN SELECT * FROM citylist\G
           id: 1
  select_type: SIMPLE
        table: citylist
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra:
1 row in set (0.01 sec)
26. Selectivity
• Selectivity of a column is the ratio between the number of distinct values and the total number of values.
• A primary key has selectivity 1.
Eg: the Employee table has 10,000 users with fields employeeid, email, firstname, lastname, salary, gender.
Our application searches on the following fields:
• employeeid
• firstname, lastname, gender
• email
So employeeid, email, firstname and lastname can be candidates for indexes.
27. Since employeeid is unique, its selectivity will be equal to the primary key selectivity.
Gender has only two values, M and F:
selectivity = 2/10,000 = 0.0002
Dropping this index would be more beneficial than keeping it.
For an index on firstname and lastname, selectivity is a function of the name you are searching for.
An index with selectivity above 15% is a good index.
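The selectivity ratio is easy to compute by hand; the following sketch does so for a hypothetical 10,000-row table with unique ids and an evenly split gender column. The `selectivity` helper and the generated data are assumptions made for illustration.

```python
def selectivity(values):
    """Number of distinct values divided by the total number of values."""
    values = list(values)
    if not values:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(values)) / len(values)


# Hypothetical columns of a 10,000-row Employee table.
employee_ids = range(1, 10001)                              # all unique
genders = ["M" if i % 2 else "F" for i in range(10000)]     # two values only

print(selectivity(employee_ids))
print(selectivity(genders))
```

The unique column reaches the maximum of 1, while the gender column stays far below the 15% guideline.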
28. SQL script to grab the worst performing indexes in the whole server
/*
SQL script to grab the worst performing indexes
in the whole server
*/
SELECT
    t.TABLE_SCHEMA AS `db`
  , t.TABLE_NAME AS `table`
  , s.INDEX_NAME AS `index name`
  , s.COLUMN_NAME AS `field name`
  , s.SEQ_IN_INDEX `seq in index`
  , s2.max_columns AS `# cols`
  , s.CARDINALITY AS `card`
  , t.TABLE_ROWS AS `est rows`
  , ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
FROM INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t
    ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
    AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
    SELECT
        TABLE_SCHEMA
      , TABLE_NAME
      , INDEX_NAME
      , MAX(SEQ_IN_INDEX) AS max_columns
    FROM INFORMATION_SCHEMA.STATISTICS
    WHERE TABLE_SCHEMA != 'mysql'
    GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS s2
    ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
    AND s.TABLE_NAME = s2.TABLE_NAME
    AND s.INDEX_NAME = s2.INDEX_NAME
WHERE t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
    AND t.TABLE_ROWS > 10 /* Only tables with some rows */
    AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
    AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* Selectivity < 1.0 b/c unique indexes are perfect anyway */
ORDER BY `sel %`, s.TABLE_SCHEMA, s.TABLE_NAME /* Switch to `sel %` DESC for best non-unique indexes */
30. Where to add index
WHERE clauses (columns on which data is filtered)
• Good distribution and selectivity in field values.
• It is a BAD IDEA to index gender or columns like status.
Index join columns.
Try to create as many covering indexes as possible.
GROUP BY clauses
• Field order is important.
31. Avoid Redundant Indexes
Example:
KEY(a)
KEY(a,b)
KEY(a(10));
KEY(a) and KEY(a(10)) are redundant because they are prefixes of KEY(a,b).
Redundant indexes may be useful:
a: integer column
b: varchar(255)
KEY(a) will be faster than using KEY(a,b).
Indexes on short columns are faster; however, an index on a longer column can be beneficial as a covering index.
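The prefix rule for redundant indexes can be checked mechanically. The sketch below compares column lists only and does not handle column prefixes such as KEY(a(10)); the index names and the `redundant_indexes` helper are hypothetical.

```python
def redundant_indexes(indexes):
    """Return names of indexes whose column list is a leading prefix of another index."""
    redundant = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            is_prefix = len(cols) < len(other_cols) and other_cols[:len(cols)] == cols
            if other != name and is_prefix:
                redundant.append(name)
                break
    return redundant


# Hypothetical index definitions: name -> ordered tuple of columns.
indexes = {"idx_a": ("a",), "idx_a_b": ("a", "b"), "idx_b": ("b",)}
print(redundant_indexes(indexes))
```

Here idx_a is reported because (a) is the leading part of (a, b), while idx_b is not, since b is not the leftmost column of any other index. Whether to actually drop a reported index remains a judgment call, for the reasons given on the slide.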
32. Key Caches (MyISAM)
• For tables that are used more often, a key cache can be used to optimize reads of those tables:
hot_cache.key_buffer_size = 128K
• Assign tables to caches:
CACHE INDEX table1 TO hot_cache;
CACHE INDEX table2 TO cold_cache;
33. • Preload your indexes for maximum efficiency:
• LOAD INDEX INTO CACHE table1;
• Use IGNORE LEAVES to preload only the non-leaf nodes of the index.
34. Case where Index will not be used
Functions on indexed fields:
WHERE TO_DAYS(NOW()) - TO_DAYS(dateofjoining) <= 7 (doesn't use index)
WHERE dateofjoining >= DATE_SUB(NOW(), INTERVAL 7 DAY) (uses index)
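The two forms of the date condition can be compared on a sorted list of dates, which stands in for an index on dateofjoining. The fixed `TODAY`, the sample dates and both helper functions are hypothetical, and the "entries read" count ignores the few steps of the binary search itself.

```python
from bisect import bisect_left
from datetime import date, timedelta

TODAY = date(2024, 1, 31)  # fixed, hypothetical "now" so the example is repeatable

# Joining dates in index (sorted) order.
JOIN_DATES = sorted([
    date(2023, 12, 1), date(2024, 1, 10), date(2024, 1, 25),
    date(2024, 1, 28), date(2024, 1, 30),
])


def recent_by_function():
    """Function applied to the column: it must be computed for every entry."""
    examined = 0
    hits = []
    for joined in JOIN_DATES:
        examined += 1
        if (TODAY - joined).days <= 7:
            hits.append(joined)
    return hits, examined


def recent_by_range():
    """Bare column compared with a constant: binary search finds the start."""
    cutoff = TODAY - timedelta(days=7)
    start = bisect_left(JOIN_DATES, cutoff)
    hits = JOIN_DATES[start:]
    return hits, len(hits)   # only the matching entries are read


print(recent_by_function())
print(recent_by_range())
```

Both return the same three dates, but the first form examines all five entries while the second reads only the three that match.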
35. A LIKE pattern with a leading wildcard:
SELECT * FROM employee WHERE name LIKE '%s';
When the LEFT() function is used on an indexed column.
36. Choosing Indexes
• Index columns that you use for searching, sorting or grouping, not columns you only display as output.
• Consider column selectivity.
• Index short values.
• Index prefixes of string values.
• Take advantage of leftmost prefixes.
• Don't over-index.
• Match index types to the type of comparisons you perform.
• Use the slow-query log to identify queries that may be performing badly.
37. Keep data types as small as possible for what you need. Don't use BIGINT unless required.
The smaller your data types, the more records will fit into the index blocks. The more records fit in each block, the fewer reads are needed to find your records.
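A back-of-the-envelope calculation makes the point concrete. It assumes a 16 KB block (InnoDB's default page size) and ignores row pointers and page overhead, so the real numbers are lower; `keys_per_block` is a hypothetical helper.

```python
PAGE_SIZE = 16 * 1024  # bytes; InnoDB's default page size, overhead ignored here

KEY_BYTES = {"INT": 4, "BIGINT": 8}  # storage size of these MySQL integer types


def keys_per_block(type_name, block_size=PAGE_SIZE):
    """Upper bound on how many keys of the given type fit in one block."""
    return block_size // KEY_BYTES[type_name]


for type_name in KEY_BYTES:
    print(type_name, keys_per_block(type_name))
```

Under these assumptions an INT key fits twice as many entries per block as a BIGINT key, so the same lookup needs fewer block reads.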
38. Common indexing mistakes
• Not using an index.
• Using CREATE INDEX.
• Misusing a composite index.
• Using an expression on a column.
• Appending the primary key to an index on an InnoDB table.
40. Thank you for your time and attention
www.osscube.com
For more information, please feel free to drop a line to
sonali@osscube.com or visit http://www.osscube.com