3. We collect the world of fashion into a customisable shopping experience.
4. What makes us different?
All data is scraped from retailers
500 spiders (Scrapy), 9,000 designers
Almost everything is automated
SEO, recommendation, classification, sales
This architecture comes with a few problems
5. Why do we get duplicates?
There is no ISBN for fashion
[Diagram: Burberry and Selfridges listings, showing inter-retailer duplicates across the two sites and intra-retailer duplicates within each]
6. How We Used to Find Duplicates
Lucene fuzzy string matching
Doesn’t really work
Yoox.com
3,000 products called “dress”
7,000 products called “shirt”
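A minimal sketch of why title-based fuzzy matching breaks down on generic names. It uses Python's `difflib.SequenceMatcher` as a stand-in for Lucene's fuzzy matching (an assumption, not the pipeline described in the talk), and the catalogue rows and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two lower-cased titles."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical catalogue rows: (product id, retailer, title).
# Products 1-3 are different dresses that simply share a generic title.
products = [
    (1, "yoox", "Dress"),
    (2, "yoox", "Dress"),
    (3, "yoox", "Dress"),
    (4, "yoox", "Shirt"),
]


def candidate_duplicates(rows, threshold=0.9):
    """Pair up every two products whose titles look alike."""
    pairs = []
    for i, (id_a, _, title_a) in enumerate(rows):
        for id_b, _, title_b in rows[i + 1:]:
            if title_similarity(title_a, title_b) >= threshold:
                pairs.append((id_a, id_b))
    return pairs


pairs = candidate_duplicates(products)
```

Every pair of "Dress" rows is flagged even though nothing says they are the same garment. With thousands of identically named products, the number of false candidates grows quadratically.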
7. How We Detect Duplicates Now
BRISK image descriptors
Leutenegger, Chli and Siegwart. BRISK: Binary Robust Invariant Scalable Keypoints. ICCV 2011, pp. 2548-2555.
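BRISK descriptors are binary strings, so two descriptors are compared by Hamming distance (the number of differing bits). The sketch below shows only that matching step, in plain Python. It assumes the descriptors were already extracted elsewhere, for example with OpenCV's `cv2.BRISK_create().detectAndCompute(image, None)`. The toy 2-byte descriptors, the thresholds, and the function names are hypothetical, and this is not necessarily how the production system decides.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


def count_matches(desc_a, desc_b, max_distance=2):
    """Count descriptors in desc_a whose nearest neighbour in desc_b is close."""
    matches = 0
    for d in desc_a:
        best = min(hamming(d, other) for other in desc_b)
        if best <= max_distance:
            matches += 1
    return matches


def looks_like_duplicate(desc_a, desc_b, min_fraction=0.6, max_distance=2):
    """Flag two images as duplicates when enough keypoints agree.

    min_fraction and max_distance are illustrative values, not tuned ones.
    """
    if not desc_a or not desc_b:
        return False
    return count_matches(desc_a, desc_b, max_distance) / len(desc_a) >= min_fraction


# Toy 2-byte descriptors for illustration; real BRISK descriptors are 64 bytes.
image_a = [b"\x0f\xf0", b"\xaa\x55", b"\x00\xff"]
image_b = [b"\x0f\xf1", b"\xaa\x55", b"\x10\xff"]  # almost the same bits
image_c = [b"\xf0\x0f", b"\x55\xaa", b"\xff\x00"]  # every bit flipped

same = looks_like_duplicate(image_a, image_b)
different = looks_like_duplicate(image_a, image_c)
```

Because the comparison is a bitwise XOR followed by a bit count, it stays cheap even when each product image yields hundreds of keypoints.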
31. What’s Next
Reverse image search
This works! We tried it during a hackathon
Similar textual features
i.e. word embeddings
Dual image / text vector embeddings
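As a rough illustration of how a combined image and text embedding could drive a similarity lookup, here is a small sketch using cosine similarity. Concatenating the two vectors is only one simple way to combine them and is an assumption here, as are the product ids and the hand-written vectors. In practice the vectors would come from trained models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combine(image_vec, text_vec):
    """Build a dual embedding by concatenation (an illustrative choice)."""
    return list(image_vec) + list(text_vec)


def nearest(query, index, k=1):
    """Return the ids of the k indexed products most similar to the query."""
    ranked = sorted(index.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [product_id for product_id, _ in ranked[:k]]


# Hypothetical index: product id -> combined image + text vector.
index = {
    "trench-coat": combine([0.9, 0.1], [0.8, 0.2]),
    "silk-shirt": combine([0.1, 0.9], [0.2, 0.8]),
}

query = combine([0.85, 0.15], [0.7, 0.3])
best = nearest(query, index)
```

A linear scan like this is fine for a demo. A catalogue of millions of products would need an approximate nearest-neighbour index instead.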