Presentation about image recognition applied to digitized specimens of the Van Groenendael-Krijger collection of Javanese papilionid butterflies. Occasion: BrainFood, 12 April 2017, Naturalis, Leiden, the Netherlands.
3. Automated identification
By extracting salient features from images and using these to train neural networks, automated identification may be possible
4. Project structure overview
• Open source, freely available at: github.com/naturalis
• Designed as loosely coupled, swappable modules
• Intended for re-use for multiple cases
5. Project structure: reference images
photos [table]
  id INTEGER NOT NULL
  md5sum VARCHAR(32) NOT NULL
  path VARCHAR(255)
  title VARCHAR(100)
  description VARCHAR(255)
photos_tags [table]
  photo_id INTEGER NOT NULL
  tag_id INTEGER NOT NULL
tags [table]
  id INTEGER NOT NULL
  name VARCHAR(50) NOT NULL
photos_taxa [table]
  photo_id INTEGER NOT NULL
  taxon_id INTEGER NOT NULL
taxa [table]
  id INTEGER NOT NULL
  rank_id INTEGER NOT NULL
  name VARCHAR(50) NOT NULL
  description VARCHAR(255)
ranks [table]
  id INTEGER NOT NULL
  name VARCHAR(50) NOT NULL
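The table layout above can be tried out with a minimal sketch using Python's built-in sqlite3 module. Column names, types and NOT NULL constraints are taken from the slide. The PRIMARY KEY and REFERENCES clauses, the helper names build_db and taxa_for_photo, and the sample rows (including the file path) are assumptions added for illustration only.

```python
import sqlite3

# Columns, types and NOT NULL follow the slide.
# PRIMARY KEY and REFERENCES clauses are assumptions, not from the slide.
SCHEMA = """
CREATE TABLE ranks (
    id INTEGER NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE taxa (
    id INTEGER NOT NULL PRIMARY KEY,
    rank_id INTEGER NOT NULL REFERENCES ranks(id),
    name VARCHAR(50) NOT NULL,
    description VARCHAR(255)
);
CREATE TABLE tags (
    id INTEGER NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE photos (
    id INTEGER NOT NULL PRIMARY KEY,
    md5sum VARCHAR(32) NOT NULL,
    path VARCHAR(255),
    title VARCHAR(100),
    description VARCHAR(255)
);
CREATE TABLE photos_taxa (
    photo_id INTEGER NOT NULL REFERENCES photos(id),
    taxon_id INTEGER NOT NULL REFERENCES taxa(id)
);
CREATE TABLE photos_tags (
    photo_id INTEGER NOT NULL REFERENCES photos(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id)
);
"""


def build_db():
    """Create an in-memory database with the reference-image tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def taxa_for_photo(conn, photo_id):
    """Return (rank name, taxon name) pairs linked to one photo."""
    rows = conn.execute(
        """
        SELECT ranks.name, taxa.name
        FROM photos_taxa
        JOIN taxa ON taxa.id = photos_taxa.taxon_id
        JOIN ranks ON ranks.id = taxa.rank_id
        WHERE photos_taxa.photo_id = ?
        ORDER BY ranks.id
        """,
        (photo_id,),
    )
    return rows.fetchall()


# Illustrative rows only; they are not taken from the project's data.
conn = build_db()
conn.executemany("INSERT INTO ranks (id, name) VALUES (?, ?)",
                 [(1, "genus"), (2, "species")])
conn.executemany("INSERT INTO taxa (id, rank_id, name) VALUES (?, ?, ?)",
                 [(1, 1, "Papilio"), (2, 2, "memnon")])
conn.execute("INSERT INTO photos (id, md5sum, path) VALUES (?, ?, ?)",
             (1, "0" * 32, "example/photo1.jpg"))
conn.executemany("INSERT INTO photos_taxa (photo_id, taxon_id) VALUES (?, ?)",
                 [(1, 1), (1, 2)])

print(taxa_for_photo(conn, 1))
```

Because a photo links to taxa through photos_taxa, one image can carry a label at every rank (here genus and species) without duplicating the photo record.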
10. Results: SURF features
• PCA plots of the “speeded up robust features” (SURF) show clustering at both the genus (top) and species (bottom) level
• Some species are so dimorphic that the sexes are treated as separate species (not shown)
• Some individuals are “gynandromorphic”, though they are likely over-represented because of positive collection bias
• Some taxa are much more variable than others
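The projection behind such PCA plots can be sketched as follows. This is an illustration under stated assumptions, not the project's code: SURF extraction itself is not shown, each image is assumed to be already summarised as one fixed-length feature vector (the slides do not say how descriptors are aggregated), and the data are synthetic. The function name pca_project is hypothetical.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their first principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


# Synthetic stand-in data (assumption): two groups of 20 "images",
# each described by 64 feature values.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 64))
group_b = rng.normal(loc=3.0, scale=1.0, size=(20, 64))
features = np.vstack([group_a, group_b])

scores, explained = pca_project(features)
print(scores.shape)  # (40, 2)
print("fraction of variance on PC1 and PC2:", explained)
```

Plotting the two columns of scores against each other, coloured by genus or species, would give the kind of scatter plot the slide describes. Groups that differ consistently in their features end up separated along the first component.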
11. Results: k-fold cross-validation
• Split the data into k partitions (k = 2, 5, 10)
• Train on k-1 partitions, use the remaining partition as “out-of-sample” data
• Count number of correct/incorrect/unknown identifications
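A minimal sketch of k-fold cross-validation in its conventional form, where each partition is held out once while the other k-1 are used for training, with the correct/incorrect/unknown tally from the last bullet. The nearest-centroid classifier with a rejection distance is a toy stand-in for the trained neural network, and all function names, the max_distance threshold and the data are assumptions made for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal partitions."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, k, train, predict):
    """Hold out each partition once; train on the remaining k-1 partitions."""
    counts = Counter()
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        model = train([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in held_out:
            guess = predict(model, samples[i])
            if guess is None:
                counts["unknown"] += 1
            elif guess == labels[i]:
                counts["correct"] += 1
            else:
                counts["incorrect"] += 1
    return counts


def train_centroids(xs, ys):
    """Toy stand-in for network training: one mean value per label."""
    sums, ns = Counter(), Counter()
    for x, y in zip(xs, ys):
        sums[y] += x
        ns[y] += 1
    return {y: sums[y] / ns[y] for y in ns}


def predict_centroid(model, x, max_distance=2.0):
    """Return the nearest label, or None ("unknown") if it is too far away."""
    label, centre = min(model.items(), key=lambda item: abs(item[1] - x))
    return label if abs(centre - x) <= max_distance else None


# Synthetic one-dimensional "features" (assumption), with two outliers.
samples = [0.0, 0.2, 0.4, 0.6, 0.8, 10.0, 10.2, 10.4, 10.6, 10.8, 5.0, 5.2]
labels = ["a"] * 5 + ["b"] * 5 + ["a", "b"]

for k in (2, 5, 10):
    result = cross_validate(samples, labels, k, train_centroids, predict_centroid)
    print(k, dict(result))
```

Every sample is scored exactly once per value of k, so the three counts always add up to the size of the data set.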
12. Next steps
• Application of trained neural networks to the entire VGKS collection (once that is fully digitized)
• Testing other classifiers in addition to ANNs
• Improvement of the end user interface, possibly as a native ‘app’ or on the web
• Extension of the platform to additional cases, such as shells (snails, bivalves)
• Do more with the image feature data: mimicry, character displacement, dimorphism
13. Acknowledgements
Naturalis sector Collection
• Max Caspers
• Luc Willemse
• Jan Moonen
• Digitization volunteers
Hogeschool Leiden
• Barbara Gravendeel
• Patrick Wijntjes
• Saskia de Vetter
LIACS
• Fons Verbeek
• Mengke Li
• Yuanhao Guo
IBL
• Wim van Tongeren
WUR
• Feia Matthijssen
Made possible by
• Naturalis internal grant for application-oriented research
• The Van Groenendael-Krijger Stichting
• Kind contributions of photos by numerous orchid breeders