A presentation by Dr. Jay Ven Eman, CEO of Access Innovations, Inc., on measuring the financial benefits of taxonomies. First presented at the 2009 Data Harmony Users Group meeting.
ROI & Impact Measures for Taxonomies
1. ROI & Impact: Quantitative & Qualitative Measures for Taxonomies
Wednesday, 11 February 2009
12:00 – 12:30 PM MST
Presented by Jay Ven Eman, Ph.D., CEO
Access Innovations, Inc. / Data Harmony
505.998.0800 / www.accessinn.com / www.dataharmony.com
j_ven_eman@accessinn.com
DHUG 2009
7. ROI - Segments
Cost of taxonomy system
Indexing costs
Cost of getting system ready
Ongoing maintenance
Increased efficiency
Increased quality of retrieval
Cost of legacy system maintenance
10. Hit, Miss, Noise
Hit – exactly the term a human indexer would use
Miss – a term the human indexer would use but the system did not assign
Noise – a term the system assigned but the human indexer did not
Relevant noise – could have been assigned
Irrelevant noise – just plain wrong
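As a minimal sketch of the hit, miss, and noise definitions above, the terms assigned to one document by a human indexer and by the system can be compared as sets. The function name `classify_terms` and the sample terms are hypothetical and are not taken from the presentation. Splitting noise into relevant and irrelevant still needs a human judgment, so the sketch does not attempt it.

```python
def classify_terms(human_terms, system_terms):
    """Split index terms for one document into hits, misses, and noise."""
    human = set(human_terms)
    system = set(system_terms)
    return {
        "hits": human & system,    # assigned by both human and system
        "misses": human - system,  # human would use, system did not assign
        "noise": system - human,   # system assigned, human did not
    }


# Hypothetical sample terms for illustration only.
result = classify_terms(
    human_terms=["taxonomy", "indexing", "thesaurus"],
    system_terms=["taxonomy", "indexing", "search engines"],
)
for label, terms in result.items():
    print(label, sorted(terms))
```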
11. Subjective
Relevance
Reflects how closely a result matches the user's request
Aboutness
Reflects the topical match between the document content and the term
How well the topic describes what the document is about
Varies with the level of conceptual terms vs. factual terms in the thesaurus
12. Subjective
“There is now a 92% accuracy rating on accounting and regulatory document search based on hit, miss and noise or relevance, precision and recall statistics…Access Innovations.” USGAO
“IEEE had their system up and running in three days, in full production in less than two weeks.” Institute of Electrical and Electronics Engineers (IEEE)
“The American Economic Association said its editors think using it is fun and makes time fly!” American Economic Association (AEA)
“ProQuest CSA have achieved a 7 fold increase in productivity – thus they have four licenses.” ProQuest CSA
“Weather Channel finds things 50% faster using Data Harmony. A significant saving in time.” The Weather Channel
13. Statistical
Precision
Correct retrieval / Total retrieval
Hits / (Hits + Noise)
Recall
Correct retrieval / Total correct in system
Hits / (Hits + Misses)
Level of effort
Hits / (Hits + Misses + Noise)
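The three ratios above can be computed directly from hit, miss, and noise counts. The following is a small sketch; the function name `indexing_metrics` and the sample counts are assumptions chosen for illustration, not figures from the presentation.

```python
def indexing_metrics(hits, misses, noise):
    """Return (precision, recall, level_of_effort) from raw counts."""
    if hits + noise == 0 or hits + misses == 0:
        raise ValueError("need at least one assigned and one expected term")
    precision = hits / (hits + noise)                # correct / total retrieved
    recall = hits / (hits + misses)                  # correct / total correct
    level_of_effort = hits / (hits + misses + noise)
    return precision, recall, level_of_effort


# Hypothetical counts from a test batch.
precision, recall, effort = indexing_metrics(hits=85, misses=15, noise=15)
print(f"precision={precision:.2%} recall={recall:.2%} effort={effort:.2%}")
```

Level of effort is always the lowest of the three, because its denominator counts both kinds of error.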
14. Cost Goals
Cost Savings
Software/hardware
More efficient delivery systems
Retirement of legacy systems
Cost Avoidance
Additional staff not needed to scale
Lower training costs
15. Productivity Goals
Productivity gains
Employee productivity – fourfold
Get up to speed faster
Learn vocabulary faster
Able to capture people's knowledge in the rule base
Staff savings / redeployment
Elimination of new hires
16. Additional Benefits
Revenue Generation
Higher hit rates
More purchases off the site
Competitive advantage
Shorter product / sales cycles
Faster implementation
Better search experience
Ability to meet regulatory requirements
17. Go – No Go
Reach 85% precision to launch for productivity (assisted indexing)
Reach 85% for filtering or categorization
Sorting for production
Level of effort to get to 85%
Integration into the workflow is efficient
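The 85% launch threshold above lends itself to a simple check against a test batch. This is only a sketch: the function name `go_no_go` and the sample counts are hypothetical, and a real decision would also weigh the level of effort and workflow integration listed on the slide.

```python
LAUNCH_PRECISION = 0.85  # threshold stated on the slide


def go_no_go(hits, noise, threshold=LAUNCH_PRECISION):
    """Return ("go" or "no go", precision) for a test batch."""
    precision = hits / (hits + noise)
    decision = "go" if precision >= threshold else "no go"
    return decision, precision


# Hypothetical test-batch counts.
print(go_no_go(hits=870, noise=130))
print(go_no_go(hits=800, noise=200))
```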
18. Benchmarks
15 – 20% irrelevant returns / noise
Amount of work needed to achieve 85% level
How good is good enough?
Satisfice = satisfaction + suffice
How much error can you put up with?
19. Example ROI Calculation
Assume a 5,000-term thesaurus
1.5 synonyms per term
7,500 terms total
Assume 85% accuracy
Use assisted for indexing
Use automatically for filtering
Assume $75 per hour for staff
Assume 10,000 records for test batch
20. Indexing costs with Data Harmony
80% of rules built automatically
7,500 x .8 = 6,000
20% require complex rules
7,500 x .2 = 1,500
Average rule takes 5 minutes
(Actually MUCH faster using M.A.I. GUI)
5 x 1,500 = 7,500 minutes / 60 = 125 hours
125 hours x $75 = $9,375
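The rule-building estimate above can be reproduced as a short calculation, which makes it easy to rerun with a different thesaurus size or hourly rate. The function and parameter names are hypothetical; the default values are the ones given on the slides.

```python
def rule_building_cost(total_terms=7_500, auto_share=0.8,
                       minutes_per_rule=5, hourly_rate=75):
    """Estimate the staff cost of hand-building the complex rules."""
    auto_rules = round(total_terms * auto_share)  # rules built automatically
    manual_rules = total_terms - auto_rules       # rules needing manual work
    hours = manual_rules * minutes_per_rule / 60
    return manual_rules, hours, hours * hourly_rate


manual_rules, hours, cost = rule_building_cost()
print(f"{manual_rules} manual rules, {hours:g} hours, ${cost:,.0f}")
```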
21. Indexing Costs
Base cost of MAIstro EE - $60,000
Cost of getting system ready
Programming support and integration
Estimated at 2 weeks (80 hours) of programming at $125 / hour = $10,000
Rule building
Estimated at 125 hours at $75 / hour = $9,375
Possible need to re-run training set several times
Ongoing maintenance
Estimated at 15% of purchase price for license = $9,000
Rule building for new terms: 50 terms per quarter (200 per year)
200 terms x .8 = 160 automatic
40 at 5 minutes per term = 200 minutes / 60 = 3.33 hours x $75 = $250
Targeted initial accuracy at 85%
22. Indexing costs
Year one
$60,000 + $10,000 + $9,375 = $79,375
Years thereafter
$9,000 + $250 = $9,250
85% accuracy
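The year-one and recurring figures above can be combined into a running total. The sketch below uses the dollar amounts from the slides; the multi-year projection assumes the recurring cost stays flat from year two onward, which is an assumption of this example and not something the presentation states. All names are hypothetical.

```python
LICENSE = 60_000         # base cost of MAIstro EE
INTEGRATION = 10_000     # programming support and integration
RULE_BUILDING = 9_375    # initial rule building
MAINTENANCE_RATE = 0.15  # share of license price per year
NEW_TERM_RULES = 250     # yearly rule building for new terms


def cumulative_cost(years):
    """Total indexing cost after the given number of years (>= 1)."""
    if years < 1:
        raise ValueError("years must be at least 1")
    year_one = LICENSE + INTEGRATION + RULE_BUILDING
    recurring = round(LICENSE * MAINTENANCE_RATE) + NEW_TERM_RULES
    # Assumption: recurring cost is unchanged in every later year.
    return year_one + recurring * (years - 1)


for n in (1, 2, 3):
    print(n, f"${cumulative_cost(n):,}")
```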
24. ROI & Impact: Quantitative & Qualitative Measures for Taxonomies
Thank you!