© 2019 Intel
Data Annotation at Scale:
Pitfalls and Solutions
Nikita Manovich
May 2019
Why is data important?
- Data is one of the key limiting factors for human-level AI
- More data beats a cleverer algorithm
- But data alone is not enough
Fei-Fei Li (Professor of Computer Science at Stanford University)
Getting data is hard
- Public datasets are limited for real use cases
- Restricted or unknown terms of use for public data
- Privacy laws in different countries (e.g. Germany, US, China)
- Personal information like faces and license plates
- Outsourcing to data companies
- Buying datasets
- Sign NDAs and consent forms with participants
- Crowdsourcing data collection and annotation
- Synthetic data and augmentation
Data outsourcing
- Expensive, price depends on volume
- Bureaucracy (e.g. register as a supplier, big tasks only)
- 3rd party proprietary tools and infrastructure
- Difficult to check results before data deployment
- Necessary to fix issues in data anyway
- Outcome based pricing vs Time and Materials
- Limited legal and privacy risks
- Growing competitive market
- Nearly full turnkey data solution
Data as a product
- Product life cycle for data
- Data tools responsibility
- Document
- Search and create
- Develop and convert
- Analyze and visualize
- Test and debug
- Publish and maintain
Diagram (product life cycle stages): Planning, Analysis, Design, Implementation, Testing and Integration, Maintenance
Data annotation workflow
Diagram (workflow elements): Raw data, Data spec, Issue tracker, Algo team, Data annotation team, Labeling tool, Automatic annotation, Semi-automatic annotation, Manual annotation, Manual verification, Annotated data
High quality data = higher cost + a lot of time
- Data specifications are not reliable
- Humans make mistakes
- Automatic and semi-automatic methods are not perfect
- Computer vision problems are ill-posed
- Golden test before a real annotation task
- Annotate the same data several times
- Reduce complexity, split an annotation task
- Reduce subjectivity of data (e.g. ignore label)
- Invest money into data infrastructure
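A golden test can be sketched as follows: an annotator labels a small set of images whose correct boxes are already known, and the result is scored by intersection over union (IoU). The function names, the 0.5 IoU threshold, the greedy matching, and the (xmin, ymin, xmax, ymax) box layout are assumptions made for illustration; the slides do not specify a scoring method.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def golden_score(gold_boxes, annotator_boxes, threshold=0.5):
    """Fraction of gold boxes matched by a distinct annotator box (greedy matching).

    Hypothetical scoring rule: a gold box counts as found when the best
    remaining annotator box overlaps it with IoU >= threshold.
    """
    unused = list(annotator_boxes)
    matched = 0
    for gold in gold_boxes:
        best = max(unused, key=lambda box: iou(gold, box), default=None)
        if best is not None and iou(gold, best) >= threshold:
            matched += 1
            unused.remove(best)  # each annotator box may match only one gold box
    return matched / len(gold_boxes) if gold_boxes else 1.0


gold = [(0, 0, 10, 10), (20, 20, 30, 30)]
drawn = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(golden_score(gold, drawn))
```

An annotator whose score falls below an agreed bar could be given more training before starting the real task; where to set that bar is a project decision.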
Data annotation workflow optimization
- Data specification is too strict
- Undefined annotation workflow
- Every object on an image is annotated
- Homegrown data annotation tools
- Use tight bounding boxes (up to 10x speedup)
- Specify how to annotate data (e.g. use shortcuts)
- Use ignore regions for very small objects
- Integrate automatic annotation (data in the loop)
- Manage performance of your data annotation team
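One way to apply the ignore-region advice is to relabel boxes below a minimum area rather than asking annotators to classify them precisely. The sketch below assumes boxes are dictionaries with a label and pixel coordinates; the field names, the function name, and the 100-pixel threshold are hypothetical.

```python
def mark_small_as_ignore(boxes, min_area=100):
    """Return copies of the boxes, relabeling those smaller than min_area as 'ignore'."""
    result = []
    for box in boxes:
        area = (box["xmax"] - box["xmin"]) * (box["ymax"] - box["ymin"])
        label = box["label"] if area >= min_area else "ignore"
        result.append({**box, "label": label})  # copy, so the input is not mutated
    return result


boxes = [
    {"label": "car", "xmin": 0, "ymin": 0, "xmax": 50, "ymax": 40},
    {"label": "person", "xmin": 0, "ymin": 0, "xmax": 5, "ymax": 8},
]
print([b["label"] for b in mark_small_as_ignore(boxes)])
```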
Computer Vision Annotation Tool (CVAT)
- Open Source (MIT License)
- Growing community
- Auto annotation using trained DL models
- Collaborative
- Easy to deploy and maintain
- Client-server architecture
- Web-based UI
- Django server (REST)
- Optimized for primary annotation workflows
GitHub: https://github.com/opencv/cvat
Gitter: https://gitter.im/opencv-cvat
Use case: object detection
- Shapes
- Bounding boxes
- Polylines
- Points
- Polygons
- Interpolation of bounding boxes between key frames
- Any labels (e.g. car, person, ignore)
- Any attributes (e.g. parked, color, model, etc.)
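Interpolation between key frames is commonly done linearly: the annotator draws a box on two key frames and the frames in between are filled in automatically. The sketch below shows plain linear interpolation under that assumption; it is an illustration, not CVAT's actual implementation, and the function name and data layout are hypothetical.

```python
def interpolate_box(frame, key_a, key_b):
    """Linearly interpolate a box at `frame` between two key frames.

    Each key is (frame_index, (xmin, ymin, xmax, ymax)).
    """
    frame_a, box_a = key_a
    frame_b, box_b = key_b
    if frame_a == frame_b or not frame_a <= frame <= frame_b:
        raise ValueError("frame must lie between two distinct key frames")
    t = (frame - frame_a) / (frame_b - frame_a)  # 0.0 at key_a, 1.0 at key_b
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


# A box moving 20 pixels to the right over 10 frames, sampled halfway.
print(interpolate_box(5, (0, (0, 0, 10, 10)), (10, (20, 0, 30, 10))))
```

With this scheme the annotator only corrects frames where the object's motion deviates from a straight line, which is where most of the time saving would come from.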
Use case: classification
- Keyboard shortcuts
- Flexible filtration of objects
- Optimized for efficiency
(>1000 tags/hour)
- Type of attributes: boolean,
choice, number, text
- Concentrate on one
attribute at a time
- Use undefined attribute by
default
- Annotate the same
attribute several times to
raise quality
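Annotating the same attribute several times only raises quality if the repeated answers are merged in some way. A minimal sketch, assuming a simple majority vote with an agreement threshold; the function name, the threshold, and the "undefined" fallback value are illustrative choices, not something the slides prescribe.

```python
from collections import Counter


def merge_votes(votes, min_agreement=0.5, fallback="undefined"):
    """Return the most common value if its share exceeds min_agreement, else fallback."""
    if not votes:
        return fallback
    value, count = Counter(votes).most_common(1)[0]
    return value if count / len(votes) > min_agreement else fallback


print(merge_votes(["red", "red", "blue"]))  # two of three annotators agree
print(merge_votes(["red", "blue"]))         # no majority, falls back
```

Items that fall back to the undefined value are natural candidates for another annotation pass or for review by a more experienced annotator.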
Use case: semantic segmentation
- Layers to avoid re-drawing
- Flexible filtration of objects
- Easy way to draw, resize, edit polygons
- Highlight of unannotated regions
- UI and UX tricks
- Transparency
- Emphasized boundaries
- Class view
- Semi-automatic methods (e.g. Deep Extreme Cut)
Use case: auto annotation
Data in the loop concept
Diagram (cycle): Data, Extract useful data, Annotate by DL model, Verify data, Build a dataset, Train DL model, Deploy DL model, and back to Data
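One step of this loop, handing model-generated annotations to human verification, can be sketched as a confidence-based triage. The 0.8 threshold, the prediction structure, and the function name are assumptions; the talk does not specify how verification work is prioritized.

```python
def triage(predictions, threshold=0.8):
    """Split model predictions into a quick-check queue and a manual-annotation queue.

    Hypothetical rule: confident predictions only need a quick human check,
    while the rest are sent to full manual annotation.
    """
    quick_check, manual = [], []
    for pred in predictions:
        target = quick_check if pred["score"] >= threshold else manual
        target.append(pred)
    return quick_check, manual


preds = [
    {"label": "car", "score": 0.95},
    {"label": "person", "score": 0.40},
]
quick, manual = triage(preds)
print(len(quick), len(manual))
```

Verified items from both queues would then feed the next dataset build and training round, closing the loop.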
Management of the data annotation workflow
- Available information
- Activity
- Actions
- Working hours
- Statistics
- Exceptions
- Data annotation flow reconstruction
- Choose any time period
- Triage annotation problems
- Flexible filtration (e.g. user, event)
- Custom visualizations
Plans
Diagram (components): UI, cvat.js, REST API
Diagram (annotation formats): CVAT XML, MS COCO, Pascal VOC, TF records, JSON, …
Diagram (DL frameworks around CVAT): TF, OpenVINO, PyTorch, Caffe2, MXNet, …
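Supporting several annotation formats largely comes down to coordinate conventions. As one example, Pascal VOC stores a bounding box as corner coordinates (xmin, ymin, xmax, ymax), while MS COCO stores it as (x, y, width, height). The helper names below are hypothetical, and the sketch deliberately ignores Pascal VOC's 1-based pixel indexing.

```python
def voc_to_coco(box):
    """Convert (xmin, ymin, xmax, ymax) to (x, y, width, height)."""
    xmin, ymin, xmax, ymax = box
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def coco_to_voc(box):
    """Convert (x, y, width, height) to (xmin, ymin, xmax, ymax)."""
    x, y, width, height = box
    return (x, y, x + width, y + height)


print(voc_to_coco((10, 20, 50, 80)))
print(coco_to_voc((10, 20, 40, 60)))
```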
Conclusion
- Data is a critical product in a company’s portfolio
- Follow legal and privacy laws when dealing with data
- Data providers are not a silver bullet
- Improve and optimize your data workflow
- Invest money into your own data infrastructure
- Use the right tools to develop a data product
Resources
Computer Vision Annotation Tool
GitHub
https://github.com/opencv/cvat
Gitter
https://gitter.im/opencv-cvat
Intel AI blog
https://www.intel.ai/introducing-cvat
Intel Developer Zone
Computer Vision Annotation Tool: A Universal Approach to Data Annotation
Contact information
Email
nikita.manovich@intel.com
Backup Material
Professional data annotator portrait
Age: 26-30 (56.3%), 21-25 (31.3%)
Gender: female (56.2%), male (43.8%)
Height: 1.61m – 1.80m (75%)
Weight: mostly equally distributed between 41kg – 100kg
Education: higher (59.4%), students (28.1%)
Profession: journalist, physician, teacher, economist, engineer, …
Hobby: learning foreign languages, sport, photography, reading, drawing, music, computer games, …
Standing: 13m – 24m (34.4%), 7m – 12m (28.1%)
20

Weitere ähnliche Inhalte

Was ist angesagt?

Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Yahoo Developer Network
 
Farmer Recommendation system
Farmer Recommendation systemFarmer Recommendation system
Farmer Recommendation systemSandeep Wakchaure
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
Dm from databases perspective u 1
Dm from databases perspective u 1Dm from databases perspective u 1
Dm from databases perspective u 1sakthyvel3
 
Bringing AI to Business Intelligence
Bringing AI to Business IntelligenceBringing AI to Business Intelligence
Bringing AI to Business IntelligenceSi Krishan
 
What is artificial intelligence? What are task domains in AI?
What is artificial intelligence? What are task domains in AI?What is artificial intelligence? What are task domains in AI?
What is artificial intelligence? What are task domains in AI?Cyber Infrastructure INC
 
Artificial intelligence in software engineering ppt.
Artificial intelligence in software engineering ppt.Artificial intelligence in software engineering ppt.
Artificial intelligence in software engineering ppt.Pradeep Vishwakarma
 
Artificial intelligence and knowledge representation
Artificial intelligence and knowledge representationArtificial intelligence and knowledge representation
Artificial intelligence and knowledge representationSajan Sahu
 
Explainable AI
Explainable AIExplainable AI
Explainable AIDinesh V
 
Semantic net in AI
Semantic net in AISemantic net in AI
Semantic net in AIShahDhruv21
 
Lectures 1,2,3
Lectures 1,2,3Lectures 1,2,3
Lectures 1,2,3alaa223
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning ExplainedMelanie Swan
 
Introduction to data mining technique
Introduction to data mining techniqueIntroduction to data mining technique
Introduction to data mining techniquePawneshwar Datt Rai
 

Was ist angesagt? (20)

Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
 
Farmer Recommendation system
Farmer Recommendation systemFarmer Recommendation system
Farmer Recommendation system
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
Big Data ppt
Big Data pptBig Data ppt
Big Data ppt
 
Dm from databases perspective u 1
Dm from databases perspective u 1Dm from databases perspective u 1
Dm from databases perspective u 1
 
Lec1,2
Lec1,2Lec1,2
Lec1,2
 
Bringing AI to Business Intelligence
Bringing AI to Business IntelligenceBringing AI to Business Intelligence
Bringing AI to Business Intelligence
 
What is artificial intelligence? What are task domains in AI?
What is artificial intelligence? What are task domains in AI?What is artificial intelligence? What are task domains in AI?
What is artificial intelligence? What are task domains in AI?
 
Artificial intelligence in software engineering ppt.
Artificial intelligence in software engineering ppt.Artificial intelligence in software engineering ppt.
Artificial intelligence in software engineering ppt.
 
Text MIning
Text MIningText MIning
Text MIning
 
Web search vs ir
Web search vs irWeb search vs ir
Web search vs ir
 
Artificial intelligence and knowledge representation
Artificial intelligence and knowledge representationArtificial intelligence and knowledge representation
Artificial intelligence and knowledge representation
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
AI Chatbot
AI ChatbotAI Chatbot
AI Chatbot
 
Semantic net in AI
Semantic net in AISemantic net in AI
Semantic net in AI
 
Lectures 1,2,3
Lectures 1,2,3Lectures 1,2,3
Lectures 1,2,3
 
Chapter 1 (final)
Chapter 1 (final)Chapter 1 (final)
Chapter 1 (final)
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
 
Introduction to data mining technique
Introduction to data mining techniqueIntroduction to data mining technique
Introduction to data mining technique
 

Ähnlich wie "Data Annotation at Scale: Pitfalls and Solutions," a Presentation from Intel

Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?Harald Erb
 
Teradata and Cisco integrated journey to IoT and Smart city
Teradata and Cisco integrated journey to IoT and Smart cityTeradata and Cisco integrated journey to IoT and Smart city
Teradata and Cisco integrated journey to IoT and Smart cityArtur Borycki
 
IT infrastructure for Big Data and Data Science at Statistics Netherlands
IT infrastructure for Big Data and Data Science at Statistics NetherlandsIT infrastructure for Big Data and Data Science at Statistics Netherlands
IT infrastructure for Big Data and Data Science at Statistics NetherlandsPiet J.H. Daas
 
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...MongoDB
 
Mobile Maintenance App Insight Mobile SAP PM
Mobile Maintenance App Insight Mobile SAP PMMobile Maintenance App Insight Mobile SAP PM
Mobile Maintenance App Insight Mobile SAP PMRODIAS GmbH
 
Bhadale group of companies data science project implementation catalogue
Bhadale group of companies data science project implementation catalogueBhadale group of companies data science project implementation catalogue
Bhadale group of companies data science project implementation catalogueVijayananda Mohire
 
AI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter BusinessAI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter BusinessTIBCO_Software
 
Introduction to Smart Data Models
Introduction to Smart Data ModelsIntroduction to Smart Data Models
Introduction to Smart Data ModelsFIWARE
 
Ibm db2update2019 machine learning and db2 ai
Ibm db2update2019 machine learning and db2 aiIbm db2update2019 machine learning and db2 ai
Ibm db2update2019 machine learning and db2 aiGustav Lundström
 
Data virtualization an introduction
Data virtualization an introductionData virtualization an introduction
Data virtualization an introductionDenodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Take a Look Under the Hood of BMC Remedy with Smart IT: An Architectural Review
Take a Look Under the Hood of BMC Remedy with Smart IT:  An Architectural ReviewTake a Look Under the Hood of BMC Remedy with Smart IT:  An Architectural Review
Take a Look Under the Hood of BMC Remedy with Smart IT: An Architectural ReviewBMC Software
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)Denodo
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMatillion
 
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinData Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinDenodo
 
TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015Panorama Software
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldRehgan Avon
 
Big Data overview
Big Data overviewBig Data overview
Big Data overviewalexisroos
 
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...NTT DATA Technology & Innovation
 

Ähnlich wie "Data Annotation at Scale: Pitfalls and Solutions," a Presentation from Intel (20)

Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?
 
Teradata and Cisco integrated journey to IoT and Smart city
Teradata and Cisco integrated journey to IoT and Smart cityTeradata and Cisco integrated journey to IoT and Smart city
Teradata and Cisco integrated journey to IoT and Smart city
 
20180115 Mobile AIoT Networking-ftsai
20180115 Mobile AIoT Networking-ftsai20180115 Mobile AIoT Networking-ftsai
20180115 Mobile AIoT Networking-ftsai
 
IT infrastructure for Big Data and Data Science at Statistics Netherlands
IT infrastructure for Big Data and Data Science at Statistics NetherlandsIT infrastructure for Big Data and Data Science at Statistics Netherlands
IT infrastructure for Big Data and Data Science at Statistics Netherlands
 
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...
MongoDB World 2019: Enabling Global Tire Design Leveraging MongoDB's Document...
 
Mobile Maintenance App Insight Mobile SAP PM
Mobile Maintenance App Insight Mobile SAP PMMobile Maintenance App Insight Mobile SAP PM
Mobile Maintenance App Insight Mobile SAP PM
 
Bhadale group of companies data science project implementation catalogue
Bhadale group of companies data science project implementation catalogueBhadale group of companies data science project implementation catalogue
Bhadale group of companies data science project implementation catalogue
 
AI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter BusinessAI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter Business
 
Introduction to Smart Data Models
Introduction to Smart Data ModelsIntroduction to Smart Data Models
Introduction to Smart Data Models
 
Ibm db2update2019 machine learning and db2 ai
Ibm db2update2019 machine learning and db2 aiIbm db2update2019 machine learning and db2 ai
Ibm db2update2019 machine learning and db2 ai
 
Data virtualization an introduction
Data virtualization an introductionData virtualization an introduction
Data virtualization an introduction
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Take a Look Under the Hood of BMC Remedy with Smart IT: An Architectural Review
Take a Look Under the Hood of BMC Remedy with Smart IT:  An Architectural ReviewTake a Look Under the Hood of BMC Remedy with Smart IT:  An Architectural Review
Take a Look Under the Hood of BMC Remedy with Smart IT: An Architectural Review
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
 
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinData Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry Devlin
 
TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015TOP Business Intelligence Predictions for 2015
TOP Business Intelligence Predictions for 2015
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial World
 
Big Data overview
Big Data overviewBig Data overview
Big Data overview
 
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...
HashiTalks2020: Making Automatically Compliant Design Documents With Infrastr...
 

Mehr von Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...Edge AI and Vision Alliance
 
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic LeapEdge AI and Vision Alliance
 
“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from InstrumentalEdge AI and Vision Alliance
 
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...Edge AI and Vision Alliance
 
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...Edge AI and Vision Alliance
 
May 2023 Embedded Vision Summit Opening Remarks (May 23)
May 2023 Embedded Vision Summit Opening Remarks (May 23)May 2023 Embedded Vision Summit Opening Remarks (May 23)
May 2023 Embedded Vision Summit Opening Remarks (May 23)Edge AI and Vision Alliance
 
“Frontiers in Perceptual AI: First-person Video and Multimodal Perception,” a...
“Frontiers in Perceptual AI: First-person Video and Multimodal Perception,” a...“Frontiers in Perceptual AI: First-person Video and Multimodal Perception,” a...
“Frontiers in Perceptual AI: First-person Video and Multimodal Perception,” a...Edge AI and Vision Alliance
 
“3D Sensing: Market and Industry Update,” a Presentation from the Yole Group
“3D Sensing: Market and Industry Update,” a Presentation from the Yole Group“3D Sensing: Market and Industry Update,” a Presentation from the Yole Group
“3D Sensing: Market and Industry Update,” a Presentation from the Yole GroupEdge AI and Vision Alliance
 
“Open Standards Unleash Hardware Acceleration for Embedded Vision,” a Present...
“Open Standards Unleash Hardware Acceleration for Embedded Vision,” a Present...“Open Standards Unleash Hardware Acceleration for Embedded Vision,” a Present...
“Open Standards Unleash Hardware Acceleration for Embedded Vision,” a Present...Edge AI and Vision Alliance
 
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...Edge AI and Vision Alliance
 
“Next-generation Computer Vision Methods for Automated Navigation of Unmanned...
“Next-generation Computer Vision Methods for Automated Navigation of Unmanned...“Next-generation Computer Vision Methods for Automated Navigation of Unmanned...
“Next-generation Computer Vision Methods for Automated Navigation of Unmanned...Edge AI and Vision Alliance
 
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...Edge AI and Vision Alliance
 
“Modernizing the Development of AI-based IoT Devices with Wedge,” a Presentat...
“Modernizing the Development of AI-based IoT Devices with Wedge,” a Presentat...“Modernizing the Development of AI-based IoT Devices with Wedge,” a Presentat...
“Modernizing the Development of AI-based IoT Devices with Wedge,” a Presentat...Edge AI and Vision Alliance
 

Mehr von Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
 
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
 
“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental
 
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
 
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...
“Efficient Neuromorphic Computing with Dynamic Vision Sensor, Spiking Neural ...
 
May 2023 Embedded Vision Summit Opening Remarks (May 23)
May 2023 Embedded Vision Summit Opening Remarks (May 23)May 2023 Embedded Vision Summit Opening Remarks (May 23)
May 2023 Embedded Vision Summit Opening Remarks (May 23)
 
"Data Annotation at Scale: Pitfalls and Solutions," a Presentation from Intel

  • 1. © 2019 Intel Data Annotation at Scale: Pitfalls and Solutions Nikita Manovich May 2019
  • 2. Why is data important?
    - Data is one of the key limiting factors for human-level AI
    - More data beats a cleverer algorithm
    - But data alone is not enough
    Fei-Fei Li (Professor of Computer Science at Stanford University)
  • 3. Getting data is hard
    - Public datasets are limited for real use cases
    - Restricted or unknown terms of use for public data
    - Privacy laws in different countries (e.g. Germany, US, China)
    - Personal information like faces and license plates

    - Outsourcing to data companies
    - Buying datasets
    - Signing NDAs and consent forms with participants
    - Crowdsourcing data collection and annotation
    - Synthetic data and augmentation
  • 4. Data outsourcing
    - Expensive, price depends on volume
    - Nearly full turnkey data solution
    - Bureaucracy (e.g. register as a supplier, big tasks only)
    - 3rd party proprietary tools and infrastructure
    - Difficult to check results before data deployment
    - Necessary to fix issues in data anyway
    - Outcome-based pricing vs. Time and Materials

    - Limited legal and privacy risks
    - Growing competitive market
    - Nearly full turnkey data solution
  • 5. Data as a product
    - Product life cycle for data
    - Data tools responsibility
      - Document
      - Search and create
      - Develop and convert
      - Analyze and visualize
      - Test and debug
      - Publish and maintain
    (diagram: Planning, Analysis, Design, Implementation, Maintenance, Testing and Integration)
  • 6. Data annotation workflow
    (diagram: Raw data, Issue tracker, Data spec, Annotated data, Algo team, Data annotation team, Labeling tool, Automatic annotation, Semi-automatic annotation, Manual annotation, Manual verification)
  • 7. High quality data = higher cost + a lot of time
    - Data specifications are not reliable
    - Humans make mistakes
    - Automatic and semi-automatic methods are not perfect
    - Computer vision problems are ill-posed

    - Golden test before a real annotation task
    - Annotate the same data several times
    - Reduce complexity, split an annotation task
    - Reduce subjectivity of data (e.g. ignore label)
    - Invest money into data infrastructure
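    The "annotate the same data several times" advice on slide 7 can be illustrated with a simple majority vote. This is only a minimal sketch, not a method from the talk: the function name `consolidate_labels` and the `min_agreement` threshold are hypothetical, and falling back to the `ignore` label on disagreement is an assumption inspired by the "ignore label" bullet.

    ```python
    from collections import Counter


    def consolidate_labels(votes, min_agreement=0.5):
        """Return the majority label, or "ignore" when annotators disagree.

        votes: labels given to the same object by different annotators.
        min_agreement: hypothetical threshold; the winning label must be
        chosen by strictly more than this share of the votes.
        """
        if not votes:
            return "ignore"
        label, count = Counter(votes).most_common(1)[0]
        # Accept the label only when a clear majority agrees on it.
        return label if count / len(votes) > min_agreement else "ignore"


    if __name__ == "__main__":
        print(consolidate_labels(["car", "car", "person"]))
        print(consolidate_labels(["car", "person"]))
    ```

    With three annotators, two matching votes are enough to accept a label; an even split is treated as subjective data and marked `ignore` instead of being forced into a class.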
  • 8. Data annotation workflow optimization
    - Data specification is too strict
    - Undefined annotation workflow
    - Every object on an image is annotated
    - Homegrown data annotation tools

    - Use tight bounding boxes (up to 10x speedup)
    - Specify how to annotate data (e.g. use shortcuts)
    - Use ignore regions for very small objects
    - Integrate automatic annotation (data in the loop)
    - Manage performance of your data annotation team
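    One possible reading of "use ignore regions for very small objects" on slide 8 is a post-processing step that relabels boxes below a size threshold. The sketch below is an illustration under assumptions: the dictionary keys (`xtl`, `ytl`, `xbr`, `ybr`), the function name, and the 16-pixel `min_side` default are all hypothetical and not taken from the talk.

    ```python
    def mark_small_boxes_as_ignore(boxes, min_side=16):
        """Relabel boxes whose shorter side is below min_side pixels.

        boxes: list of dicts with "label" and corner coordinates
        "xtl", "ytl", "xbr", "ybr" (an assumed layout).
        Returns new dicts; the input list is left unchanged.
        """
        result = []
        for box in boxes:
            width = box["xbr"] - box["xtl"]
            height = box["ybr"] - box["ytl"]
            label = "ignore" if min(width, height) < min_side else box["label"]
            result.append({**box, "label": label})
        return result


    if __name__ == "__main__":
        sample = [
            {"label": "person", "xtl": 0, "ytl": 0, "xbr": 5, "ybr": 40},
            {"label": "car", "xtl": 0, "ytl": 0, "xbr": 100, "ybr": 50},
        ]
        print([b["label"] for b in mark_small_boxes_as_ignore(sample)])
    ```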
  • 9. Computer Vision Annotation Tool (CVAT)
    - Open Source (MIT License)
    - Growing community
    - Auto annotation using trained DL models
    - Collaborative
    - Easy to deploy and maintain
    - Client-server architecture
      - Web-based UI
      - Django server (REST)
    - Optimized for primary annotation workflows
  • 10. Use case: object detection
    - Shapes
      - Bounding boxes
      - Polylines
      - Points
      - Polygons
    - Interpolation of bounding boxes between key frames
    - Any labels (e.g. car, person, ignore)
    - Any attributes (e.g. parked, color, model, etc.)
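    The "interpolation of bounding boxes between key frames" bullet on slide 10 can be illustrated with plain linear interpolation. The slides do not state which interpolation scheme the tool uses, so linearity is an assumption here, and `interpolate_box` with its `(frame, box)` tuples is a hypothetical helper, not part of any CVAT API.

    ```python
    def interpolate_box(key_a, key_b, frame):
        """Linearly interpolate a box between two annotated key frames.

        key_a, key_b: (frame_number, (xtl, ytl, xbr, ybr)) pairs, with
        key_a earlier than key_b.
        frame: a frame number between the two key frames (inclusive).
        """
        frame_a, box_a = key_a
        frame_b, box_b = key_b
        if frame_a >= frame_b or not frame_a <= frame <= frame_b:
            raise ValueError("frame must lie between two distinct key frames")
        # Fraction of the way from the first key frame to the second.
        t = (frame - frame_a) / (frame_b - frame_a)
        return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


    if __name__ == "__main__":
        start = (0, (0, 0, 10, 10))
        end = (10, (10, 20, 30, 40))
        print(interpolate_box(start, end, 5))
    ```

    The annotator then draws a box only on key frames, and every frame in between gets a box for free, which is where the time saving on video comes from.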
  • 11. Use case: classification
    - Keyboard shortcuts
    - Flexible filtration of objects
    - Optimized for efficiency (>1000 tags/hour)
    - Types of attributes: boolean, choice, number, text
    - Concentrate on one attribute at a time
    - Use undefined attribute by default
    - Annotate the same attribute several times to raise quality
  • 12. Use case: semantic segmentation
    - Layers to avoid re-drawing
    - Flexible filtration of objects
    - Easy way to draw, resize, edit polygons
    - Highlight of unannotated regions
    - UI and UX tricks
      - Transparency
      - Emphasized boundaries
      - Class view
    - Semi-automatic methods (e.g. Deep Extreme Cut)
  • 13. Use case: auto annotation
  • 14. Data in the loop concept
    (diagram: Data, Extract useful data, Annotate by DL model, Verify data, Build a dataset, Train DL model, Deploy DL model)
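    The data in the loop cycle on slide 14 can be sketched as one iteration over plain callables. This is a schematic illustration only: `run_iteration` and its parameters (`is_useful`, `model`, `verify`, `train`) are hypothetical stand-ins for the stages named in the diagram, and deployment of the retrained model is left to the caller.

    ```python
    def run_iteration(raw_items, is_useful, model, verify, train, dataset):
        """Run one pass of the data-in-the-loop cycle.

        is_useful: filter that extracts useful data from the raw items.
        model: current DL model, returns a proposed annotation for an item.
        verify: manual verification; returns the accepted (possibly
            corrected) annotation, or None to reject the item.
        train: builds a new model from the accumulated dataset.
        dataset: list of (item, annotation) pairs, extended in place.
        """
        selected = [item for item in raw_items if is_useful(item)]
        for item in selected:
            proposal = model(item)          # annotate by DL model
            label = verify(item, proposal)  # verify data
            if label is not None:
                dataset.append((item, label))  # build a dataset
        return train(dataset)               # train DL model


    if __name__ == "__main__":
        dataset = []
        new_model = run_iteration(
            raw_items=[1, 2, 3, 4],
            is_useful=lambda x: x % 2 == 0,
            model=lambda x: "even",
            verify=lambda item, proposal: proposal if item != 4 else None,
            train=lambda ds: len(ds),  # placeholder "model": dataset size
            dataset=dataset,
        )
        print(dataset, new_model)
    ```

    Each pass feeds the newly trained model back in as `model`, so annotators increasingly verify proposals instead of drawing from scratch.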
  • 15. Management of the data annotation workflow
    - Available information
      - Activity
      - Actions
      - Working hours
      - Statistics
      - Exceptions
    - Data annotation flow reconstruction
    - Choose any time period
    - Triage annotation problems
    - Flexible filtration (e.g. user, event)
    - Custom visualizations
  • 16. Plans
    (diagram: UI, cvat.js, REST API; CVAT XML, MS COCO, Pascal VOC, TF records, JSON, …; CVAT; TF, OpenVINO, PyTorch, Caffe2, MXNet, …)
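    Supporting several annotation formats, as slide 16 plans, largely comes down to coordinate conventions. As a small illustration, MS COCO stores a box as `[x, y, width, height]`, while Pascal VOC stores corner coordinates `xmin, ymin, xmax, ymax`. The helper names below are hypothetical, and the sketch deliberately ignores pixel-indexing offsets between the two formats, which a real converter would need to handle.

    ```python
    def voc_to_coco_bbox(xmin, ymin, xmax, ymax):
        """Convert corner coordinates to COCO-style [x, y, width, height]."""
        return [xmin, ymin, xmax - xmin, ymax - ymin]


    def coco_to_voc_bbox(x, y, width, height):
        """Convert COCO-style [x, y, width, height] to corner coordinates."""
        return [x, y, x + width, y + height]


    if __name__ == "__main__":
        coco = voc_to_coco_bbox(10, 20, 110, 70)
        print(coco)
        print(coco_to_voc_bbox(*coco))
    ```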
  • 17. Conclusion
    - Data is a critical product in a company’s portfolio
    - Follow legal and privacy laws when dealing with data
    - Data providers are not a silver bullet
    - Improve and optimize your data workflow
    - Invest money into your own data infrastructure
    - Use the right tools to develop a data product
  • 18. Resources
    Computer Vision Annotation Tool
    - GitHub: https://github.com/opencv/cvat
    - Gitter: https://gitter.im/opencv-cvat
    - Intel AI blog: https://www.intel.ai/introducing-cvat
    - Intel Developer Zone: Computer Vision Annotation Tool: A Universal Approach to Data Annotation
    Contact information
    - Email: nikita.manovich@intel.com
  • 19. Backup Material
  • 20. Professional data annotator portrait
    - Age: 26-30 (56.3%), 21-25 (31.3%)
    - Gender: 56.2% (female), 43.8% (male)
    - Height: 1.61m – 1.80m (75%)
    - Weight: mostly equally distributed between 41kg – 100kg
    - Education: higher (59.4%), students (28.1%)
    - Profession: journalist, physician, teacher, economist, engineer, …
    - Hobby: learning foreign languages, sport, photography, reading, drawing, music, computer games, …
    - Standing: 13m – 24m (34.4%), 7m - 12m (28.1%)