© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Training Chatbots and Conversational
Intelligence Agents with Amazon
Mechanical Turk and Facebook’s ParlAI
Jack Urbanek – Facebook
November 2017
MCL349
Session preview
• What is ParlAI and what is it trying to solve?
• Brief intro to Amazon Mechanical Turk (MTurk)
• How we collect conversational data with MTurk
• Optimizing for the human element
• How to leverage ParlAI for your problem
Who am I?
• Research Engineer at Facebook AI Research (FAIR)
• Engineer on the ParlAI team
• Primary contributor to ParlAI’s
MTurk implementation
• User of ParlAI-MTurk for data collection
Why ParlAI?
Quick NLP primer
Issues in current dialogue agent creation efforts and tasks
Motives for a dialogue research platform
ParlAI and its features
Introductory materials
NLP primer
NLP is difficult because language is imprecise.
One fundamental goal:
• Enable human ⟷ computer dialogue
Dialogue is broken into thousands of tasks with:
• Different skill requirements
• A shared input/output format
Most NLP research attempts are siloed:
• They focus on only a subset of tasks
The issue of siloed research
Take two dialogue tasks:
• Question Answering (QA) and chit-chat
One popular QA dataset is the Stanford Question Answering Dataset (SQuAD)
• It maps a question and Wikipedia paragraph pair to the answer’s start/end
indices in that paragraph
A model trained to perform well on SQuAD will not generalize to chit-chat,
even though both tasks require contextual language understanding.
Figure: a mock SQuAD-like interaction. Question: “Who won the 2017 Super Bowl?” Answer span: (94, 114)
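To make the span format concrete, here is a minimal sketch of an answer expressed as start/end character indices into a paragraph. The paragraph, question, and the helper name `find_answer_span` are invented for illustration and are not taken from SQuAD or ParlAI.

```python
def find_answer_span(paragraph, answer):
    """Return (start, end) character indices of `answer` in `paragraph`.

    `end` is exclusive, so paragraph[start:end] == answer.
    Raises ValueError if the answer text does not occur in the paragraph.
    """
    start = paragraph.index(answer)
    return start, start + len(answer)


# Hypothetical example data, invented for this sketch.
paragraph = "The museum opened in 1901. It was designed by Ada Example."
question = "Who designed the museum?"
answer = "Ada Example"

start, end = find_answer_span(paragraph, answer)
print(question, "->", (start, end), repr(paragraph[start:end]))
```

A model trained only to emit such index pairs has no way to produce free-form text, which is one way to see why it does not transfer to chit-chat.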
Why a dialogue research platform?
• Testing on multiple tasks can expose model weaknesses
• Multi-task training may enable a broader sense of learning
• Standardized method for training and data collection encourages
sharing of compatible datasets
• Better allow the NLP community to share, test, and iterate on models
ParlAI features
• Unified framework/API for training and evaluation of dialogue models
• Many easy-to-access tasks to train and evaluate on
• Multi-task training over any tasks
• Supports both supervised and interactive (online and reinforcement
learning) tasks
• Supports other media including images
• Existing models to work from
• Data collection and model evaluation through
Mechanical Turk
• Open Source
ParlAI features: Tasks
QA datasets
• SQuAD
• bAbI tasks
• MCTest
• SimpleQuestions
• WikiQA, WebQuestions, WikiMovies, MTurkWikiMovies
• MovieDD (Movie-Recommendations)
• MS MARCO
• TriviaQA
• InsuranceQA
Dialogue Goal-Oriented
• bAbI Dialog tasks
• Dialog-based Language Learning bAbI
• Dialog-based Language Learning Movie
• MovieDD-QARecs dialogue
• personalized dialog, bAbI+
Visual QA / Visual Dialogue
• VQAv1, VQAv2
• VisDial, FVQA
• CLEVR
Sentence Completion
• QACNN
• QADailyMail
• CBT
• BookTest
Dialogue Chit-Chat
• Ubuntu
• Movies SubReddit
• Cornell Movie
• OpenSubtitles
Negotiation
• Deal or No Deal?
Machine Translation
• WMT EnDe (in progress)
ParlAI features: Basic implementation
Main classes:
• world – Defines the environment and drives interaction between agents
• agent – A communicator in the world
• teacher – An agent that talks to learning agents, implementing a task
• action – A Python dict that passes text, labels, and rewards between agents
Main code to train an agent and print results of each example:

    teacher = SquadTeacher(opt)
    agent = MyAgent(opt)
    world = World(opt, [teacher, agent])
    for i in range(num_exs):
        world.parley()
        print(world.display())

Implementation of world.parley in which each agent acts in turn while others observe:

    def parley(self):
        for agent in self.agents:
            act = agent.act()
            for other_agent in self.agents:
                if other_agent != agent:
                    other_agent.observe(act)
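The world/agent/teacher/action pattern can be tried without installing anything. The sketch below uses toy stand-in classes (`ToyTeacher`, `RepeatLabelAgent`, `ToyWorld`), all hypothetical names; they mimic the act/observe loop described above and are not ParlAI's actual classes.

```python
class ToyTeacher:
    """Stand-in for a teacher: serves (question, label) pairs one at a time."""

    def __init__(self, examples):
        self.examples = examples
        self.index = 0
        self.last_reply = None

    def act(self):
        text, label = self.examples[self.index % len(self.examples)]
        self.index += 1
        # An action is a plain dict carrying text and labels.
        return {"id": "teacher", "text": text, "labels": [label]}

    def observe(self, act):
        self.last_reply = act


class RepeatLabelAgent:
    """Stand-in agent that answers with the first label it was shown."""

    def __init__(self):
        self.observation = None

    def observe(self, act):
        self.observation = act

    def act(self):
        labels = (self.observation or {}).get("labels", [])
        return {"id": "agent", "text": labels[0] if labels else "I don't know"}


class ToyWorld:
    """Each agent acts in turn while every other agent observes."""

    def __init__(self, agents):
        self.agents = agents
        self.acts = []

    def parley(self):
        self.acts = []
        for agent in self.agents:
            act = agent.act()
            self.acts.append(act)
            for other in self.agents:
                if other is not agent:
                    other.observe(act)

    def display(self):
        return "\n".join(f"[{a['id']}]: {a['text']}" for a in self.acts)


teacher = ToyTeacher([("What is 2 + 2?", "4"), ("Capital of France?", "Paris")])
agent = RepeatLabelAgent()
world = ToyWorld([teacher, agent])
for _ in range(2):
    world.parley()
    print(world.display())
```

Because every participant exposes only `act` and `observe`, a teacher, a model, or a human can be swapped in without changing the world.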
ParlAI features: Agents
drqa: an attentive LSTM model DrQA (Chen et al., 2017) implemented in PyTorch
that has competitive results on SQuAD amongst other datasets.
memnn: code for an end-to-end memory network (Sukhbaatar et al., 2015) in Lua
Torch.
seq2seq: basic sequence to sequence model (Sutskever et al., 2014).
ir_baseline: information retrieval baseline that scores responses with TF-IDF
matching.
remote_agent: basic class for any agent connecting over ZeroMQ.
local_human: keyboard input replaces an ML agent.
repeat_label: basic class for merely repeating all data sent to it
mturk_agent: human worker on MTurk is able to act in a ParlAI world
More details and overall use instructions at parl.ai.
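As an illustration of what TF-IDF matching means for an information retrieval baseline, here is a minimal, generic sketch in plain Python. It is not the code behind ParlAI's ir_baseline; the function names and candidate sentences are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency for every token in `documents`."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def tfidf_vector(text, idf):
    # Tokens never seen in the candidate set get weight 0.
    counts = Counter(tokenize(text))
    return {tok: count * idf.get(tok, 0.0) for tok, count in counts.items()}


def cosine(u, v):
    dot = sum(weight * v.get(tok, 0.0) for tok, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_response(query, candidates):
    """Return the candidate whose TF-IDF vector is closest to the query's."""
    idf = build_idf(candidates)
    query_vec = tfidf_vector(query, idf)
    return max(candidates, key=lambda c: cosine(query_vec, tfidf_vector(c, idf)))


# Hypothetical candidate responses, invented for this sketch.
candidates = [
    "the movie was directed by a famous director",
    "my favorite food is pizza",
    "the weather is sunny today",
]
print(best_response("what food do you like", candidates))
```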
Mechanical Turk and How We Use It
Intro to Mechanical Turk
Summary of our MTurk use
ParlAI’s MTurk operational flow
Introductory materials
Simple intro to Mechanical Turk
Crowdsourcing internet marketplace
for tasks computers currently can’t do.
Requesters pay people to handle bulk
work.
Workers complete this work in the
form of human intelligence tasks
(HITs) and you get the results.
Intro to Mechanical Turk
• HITs are created through a simple templated workflow.
• When workers complete a HIT, you review their work to accept/reject it.
• If you reject the work, you are refusing to pay. Keep in mind that
these are people and this is their work.
Figures: creating an MTurk project; reviewing work for an image tagging task
How ParlAI uses MTurk
MTurk workers act remotely within a ParlAI
world we can collect data from.
We are able to have workers interact with
models, then rate the model.
We support automated review where
appropriate.
Interactions with MTurk are almost entirely
programmatic.
ParlAI MTurk functionality
Engineering Goals and Challenges
Completely programmatic interactions with external services
Ability to enable easy creation of arbitrary conversational tasks
Support for multiple actors or trained models
Method for preparing workers for a task
Options for automated work approval
Building ParlAI MTurk
Arbitrary conversational tasks
Problem: Need complete control over what we can show workers in order
to support arbitrary chats.
Supporting arbitrary content
Solution: Use MTurk’s programmatic interface and
support for external endpoints to be able to connect
to its workers while retaining control of our content.
1. Set up an external server
2. Host the HIT details there externally from MTurk
3. Create an “ExternalQuestion” HIT pointing to the
server
4. Collect data from the server
This is all done programmatically whenever a ParlAI
user wants to collect data.
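Step 3 above can be sketched as follows: an ExternalQuestion is a small XML document that tells MTurk which URL to load in the worker's frame. The helper name, the example URL, and the frame height are assumptions for illustration; this is not ParlAI's own implementation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(task_url, frame_height=650):
    """Return the XML that tells MTurk to show `task_url` in a frame."""
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">'
        f"<ExternalURL>{escape(task_url)}</ExternalURL>"
        f"<FrameHeight>{int(frame_height)}</FrameHeight>"
        "</ExternalQuestion>"
    )


# Hypothetical server address; a real task would use its own HTTPS endpoint.
question_xml = build_external_question("https://example.com/chat?task=demo&v=1")

# Parse it back to confirm the document is well-formed.
root = ET.fromstring(question_xml)
print(root.find(f"{{{EXTERNAL_QUESTION_XMLNS}}}ExternalURL").text)
```

The resulting string would then be supplied as the question when creating the HIT, for instance through the `Question` argument of the boto3 MTurk client's `create_hit` call (not executed here).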
Supporting arbitrary content
Implementation:
• HIT details – simple Python dictionary
• Frontend – templated HTML and
JavaScript
• Server – initialized on per-task basis
Users can set up a task with no additional
MTurk or server knowledge required
Creating complex tasks requires writing
only additional task-related code using
templating
HIT content as delivered by the external server
Handling multiple responsive actors
Problem: The normal MTurk flow doesn’t natively line up with our use
case.
Solution: Link multiple HITs together within our server.
Single worker per task
instance
Handling multiple actors
Implementation:
• The server acts as a pass-through between
workers
• A worker’s messages are handled as acts in
ParlAI
• Workers receive ParlAI observations
Easy to swap other agents like pre-trained
models in for workers, allowing workers to test
your models.
Preparing workers for tasks
Problem: Conversational tasks can be complicated or unclear, and
qualification tests don’t always provide the context to prepare a worker for
a task.
Solution: Onboard workers within a task.
It can be unclear how to prepare a worker
Preparing workers for tasks
Implementation:
ParlAI has this functionality through
onboarding worlds that provide:
• Specific turn-based steps
• Mocks of the real task
• Filtering of workers who cannot
complete the task
• Option to only onboard workers the first time they take your HIT
Onboarding worlds can quiz workers before they are added to the available
worker pool
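The onboarding idea can be sketched as a small quiz that filters workers and remembers who has already passed. `OnboardingQuiz` and the `answer_fn` callback are hypothetical names used only for this example; ParlAI's onboarding worlds are richer than this.

```python
class OnboardingQuiz:
    """Quizzes a worker once and remembers who has already passed."""

    def __init__(self, questions):
        # questions: list of (prompt, expected_answer) pairs
        self.questions = questions
        self.passed_workers = set()

    def run(self, worker_id, answer_fn):
        """Return True if the worker may enter the task pool.

        `answer_fn(prompt)` stands in for showing a prompt to the worker
        and waiting for their reply.
        """
        if worker_id in self.passed_workers:
            return True  # only onboard the first time
        for prompt, expected in self.questions:
            reply = answer_fn(prompt)
            if reply.strip().lower() != expected.lower():
                return False  # filtered out; never joins the pool
        self.passed_workers.add(worker_id)
        return True


quiz = OnboardingQuiz([("Type READY to begin.", "ready")])
print(quiz.run("worker_a", lambda prompt: "Ready "))  # answers correctly
print(quiz.run("worker_b", lambda prompt: "what?"))   # answers incorrectly
```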
Automated approval of work
Problem:
• Models may require a lot of data to produce good
results
• Conversations can be hard to judge as properly fitting
into the dataset you were trying to create
• It can take nearly as much time to verify the examples
manually as it did to collect them in the first place
Solution: Strive for automated approval of work.
Implementation: Rule-based verification of data.
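A minimal sketch of what rule-based verification might look like for a collected conversation. The specific rules and thresholds (`min_turns`, `min_words`, no repeated messages) are assumptions chosen for illustration, not the rules used by the ParlAI team.

```python
def approve_conversation(messages, min_turns=4, min_words=3):
    """Return (approved, reason) for a list of {'worker_id', 'text'} messages."""
    if len(messages) < min_turns:
        return False, "too few turns"
    seen = set()
    for msg in messages:
        text = msg["text"].strip()
        if len(text.split()) < min_words:
            return False, "message too short"
        key = (msg["worker_id"], text.lower())
        if key in seen:
            return False, "repeated message"
        seen.add(key)
    return True, "ok"


# Hypothetical conversation, invented for this sketch.
good = [
    {"worker_id": "customer", "text": "My router keeps dropping the connection."},
    {"worker_id": "rep", "text": "Have you tried restarting it?"},
    {"worker_id": "customer", "text": "Yes, that fixed the problem."},
    {"worker_id": "rep", "text": "Great, glad that helped you."},
]
print(approve_conversation(good))
print(approve_conversation(good[:2]))
```

Returning a reason alongside the decision makes it easier to explain a rejection to the worker and to audit the rules later.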
The Human Element – Lessons Learned
Understanding worker interaction with tasks
Handling disconnects and abandoned work
Improving results by improving tasks
Managing unintended task abuse
Building ParlAI MTurk
Understanding the MTurk workflow
Problem: MTurk workers don’t necessarily take one task at a time. They often try to
optimize their work output, which can lead to unexpected behaviors.
Solution: Be aware of how workers interact with and claim tasks, and the generally
asynchronous nature of the MTurk interface. Set reasonable task expiration times.
Initial test was stalled when a worker
quickly queued all 8 of the test HITs and
nobody else was able to claim them
Initial test left a worker waiting in
pool for 30 minutes after another
worker abandoned a HIT without
returning it
Handling disconnects and abandons
Problem: Workers may disconnect or leave the other person or people
hanging.
Always remember that these are people: it doesn’t feel good to lose
one’s work because of someone else’s actions.
Worker interaction isn’t always perfect
Handling disconnects and abandons
Solution:
ParlAI MTurk implements functionality to
improve these situations.
• Optional paying out to abandoned
workers
• Allow tasks to set a maximum act time
before the worker is considered inactive
and disconnected
• Support reconnecting within a timeframe
• Explain all failure states to the user when
they happen
Text displayed when a partner
disconnects
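The maximum act time and the reconnect window listed above can be sketched with a small tracker. `ActivityTracker`, its method names, and the injectable `clock` are assumptions made for this example rather than ParlAI's implementation; the fake clock simply makes the behavior easy to follow.

```python
import time


class ActivityTracker:
    """Marks a worker as disconnected after `max_act_time` seconds of silence."""

    def __init__(self, max_act_time, reconnect_window, clock=time.monotonic):
        self.max_act_time = max_act_time
        self.reconnect_window = reconnect_window
        self.clock = clock
        self.turn_started = {}
        self.disconnected_at = {}

    def start_turn(self, worker_id):
        self.turn_started[worker_id] = self.clock()

    def check(self, worker_id):
        """Return 'active' or 'disconnected' for a worker whose turn it is."""
        if worker_id in self.disconnected_at:
            return "disconnected"
        elapsed = self.clock() - self.turn_started[worker_id]
        if elapsed > self.max_act_time:
            self.disconnected_at[worker_id] = self.clock()
            return "disconnected"
        return "active"

    def try_reconnect(self, worker_id):
        """Allow a return only inside the reconnect window."""
        lost_at = self.disconnected_at.get(worker_id)
        if lost_at is None:
            return True
        if self.clock() - lost_at <= self.reconnect_window:
            del self.disconnected_at[worker_id]
            self.start_turn(worker_id)
            return True
        return False


# Fake clock so the example runs instantly; values are in seconds.
now = [0.0]
tracker = ActivityTracker(max_act_time=60, reconnect_window=30, clock=lambda: now[0])
tracker.start_turn("w1")
now[0] = 45.0
print(tracker.check("w1"))
now[0] = 61.0
print(tracker.check("w1"))
now[0] = 80.0
print(tracker.try_reconnect("w1"))
```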
Improving results by improving tasks
Work on balancing task length and pay
Workers are less likely to take the time you may need for your dataset if their time
spent isn’t well compensated.
Engaging tasks keep people’s interest
Workers aren’t robots – if you make the tasks fun or somehow rewarding, it is a
better outcome for everyone involved. Improving their experience improves your
data and encourages more people to work on your tasks.
Clear tasks lead to proper output
Ensuring that workers fully understand your task and intention is a shortcut to
quality data.
Preventing task abuse
Some workers aren’t going to produce the kind of data you want.
Oftentimes they may optimize an unclear or tedious problem in an
unintended way that makes the data produced invalid or otherwise
unwanted.
While rare, these can be mitigated by a combination of:
• Clarifying the problem and setting clearer restrictions of expected
behavior
• Checking and filtering out specific bad behavior from your results
• Blocking workers who continue to abuse your HITs
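The last two mitigations can be sketched together: filter flagged messages out of the results and block a worker only after repeated violations. The heuristic `looks_low_effort`, the class `AbuseMonitor`, and the strike threshold are all hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict


def looks_low_effort(text):
    """Hypothetical heuristic: flag messages with fewer than two distinct words."""
    return len(set(text.lower().split())) < 2


class AbuseMonitor:
    """Filters flagged messages and blocks workers after repeated strikes."""

    def __init__(self, max_strikes=3):
        self.max_strikes = max_strikes
        self.strikes = defaultdict(int)
        self.blocked = set()

    def record(self, worker_id, messages):
        """Drop flagged messages and return the ones worth keeping."""
        kept = []
        for text in messages:
            if looks_low_effort(text):
                self.strikes[worker_id] += 1
            else:
                kept.append(text)
        if self.strikes[worker_id] >= self.max_strikes:
            self.blocked.add(worker_id)
        return kept


monitor = AbuseMonitor(max_strikes=2)
kept = monitor.record("w1", ["ok", "ok ok", "I restarted the modem"])
print(kept, "w1" in monitor.blocked)
```

Counting strikes before blocking leaves room for honest mistakes, in line with the reminder that workers are people.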
How to Use ParlAI MTurk
Setting up HIT details
Creating and running your HIT
Extended use cases
Examples
Accomplishing your goals
Setting up HIT details
Starting a ParlAI MTurk task begins
with creating a task config file to
customize MTurk display information:
• hit_title
• hit_description
• hit_keywords
• task_description
Use this file to catch workers’ attention
and give them an overview of what to
expect.
Example HIT setup file:

    task_config = {}
    task_config['hit_title'] = \
        'Simulating a Customer Service Interaction'
    task_config['hit_description'] = \
    '''Play the role of either Customer Service or a
    customer with a problem and attempt to solve the
    problem through dialog with another MTurk worker'''
    task_config['hit_keywords'] = \
        'chat,dialog,customer service'
    task_config['task_description'] = \
    '''In this task, you will be assigned the role of a
    customer or a customer service rep. As a customer, you
    will be given a problem and have to communicate it to
    the rep, then confirm the solution they suggest
    trying. As the rep, you must offer a solution to the
    customer and ensure that their problem is solved.'''
Setting up ParlAI World
Much of ParlAI’s MTurk functionality can be customized by implementing a
few functions. Most functionality can be altered within just parley.
Stubs and examples are available on our GitHub.
Example ParlAI parley code:

    class MTurkCustomerServiceWorld(MTurkTaskWorld):
        def parley(self):
            if not self.is_init:
                # First call: show each worker the task for their role.
                self.workers[0].observe(self.cust_task)
                self.workers[1].observe(self.rep_task)
                self.is_init = True
            else:
                # Later calls: the customer acts first, then the rep.
                customer_act = self.workers[0].act()
                self.process_customer_act(customer_act)
                rep_act = self.workers[1].act()
                self.process_rep_act(rep_act)

        def process_customer_act(self, act):
            if act['type'] == 'action':
                if act['action'] == self.cust_task.req_action:
                    self.problem_resolved = True
            else:  # action type is message
                self.workers[1].observe(act)

        def process_rep_act(self, act):
            if act['type'] == 'action':
                if act['action'] == 'resolve':
                    self.episode_done = self.problem_resolved
            else:  # action type is message
                self.workers[0].observe(act)
Creating and running the HIT
Running a ParlAI MTurk HIT is as
simple as calling the run file for
your task with a few flags:
• -nc – Number of conversations
• -r – payout reward per conversation
• --unique – only allows each worker to
complete this task once
• --count-complete – only count
finished conversations towards the
number requested
• --sandbox/--live – run the HIT on the
MTurk sandbox server or push it live to
workers
Detailed explanations for running a
HIT are available on our GitHub
>> python3 run.py -nc 15 -r 0.1 --sandbox --count-complete
[ optional arguments: ]
[ datapath: /Users/jju/ParlAI/data ]
[ Mechanical Turk: ]
[ mturk_log_path: /Users/jju/ParlAI/logs/mturk ]
[ num_conversations: 15 ]
[ unique_worker: False ]
[ reward: 0.1 ]
[ is_sandbox: True ]
[ hard_block: False ]
[ count_complete: True ]
You are going to allow workers from Amazon Mechanical Turk to
be an agent in ParlAI.
During this process, Internet connection is required, and you
should turn off your computer's auto-sleep feature.
Please press Enter to continue...
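To show how the flags map onto the option names printed in the console output above, here is a stand-in parser built with Python's standard `argparse`. It is a sketch, not ParlAI's own argument parser; the default values are assumptions, and only the flag names and option names come from the slide.

```python
import argparse


def build_parser():
    """Stand-in for a ParlAI MTurk run script's command-line interface."""
    parser = argparse.ArgumentParser(description="Toy MTurk run script")
    parser.add_argument("-nc", "--num-conversations", type=int, default=1,
                        help="number of conversations to collect")
    parser.add_argument("-r", "--reward", type=float, default=0.05,
                        help="payout reward per conversation")
    parser.add_argument("--unique", dest="unique_worker", action="store_true",
                        help="each worker may complete the task only once")
    parser.add_argument("--count-complete", action="store_true",
                        help="only count finished conversations")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--sandbox", dest="is_sandbox", action="store_true",
                      help="run on the MTurk sandbox server")
    mode.add_argument("--live", dest="is_sandbox", action="store_false",
                      help="push the HIT live to workers")
    # Assumed default for this sketch: stay in the sandbox unless --live is given.
    parser.set_defaults(is_sandbox=True)
    return parser


opt = build_parser().parse_args(
    ["-nc", "15", "-r", "0.1", "--sandbox", "--count-complete"]
)
for name, value in sorted(vars(opt).items()):
    print(f"[ {name}: {value} ]")
```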
Custom HIT pages
Grounded dialogue often requires additional UI
elements. For this we provide the ability to use
custom HTML in the task.
Additional JavaScript can also be used to allow
for interactions with buttons and additional UI
elements to be sent through to the ParlAI
world as well.
Information can be sent from the ParlAI world
to be rendered on the frontend, allowing
conversations to be grounded on something
determined by the ParlAI world.
The finished experience: Customer
The finished experience: Rep
Extended use cases
ParlAI MTurk supports much more than can be explained here.
• Filtering workers through requirements: Allow successful
workers to continue working on your specific HITs without
damaging the reputation of unsuccessful workers with blocks
• Repeat worker role assignments: Ensure that workers are
only given a specific role in a conversation in cases where
experiencing more than one role would disturb the task
results
• Task experimentation: Run one task with multiple worlds or
options, randomly assigning workers to different variants in
order to collect experimental data within one HIT
• Hands-free iteration: Use MTurk in an evaluation loop,
combined with task experimentation, to iterate on a model
that has no concrete automated evaluation metric, without
additional manual interaction
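Task experimentation relies on assigning workers to variants. One possible way to do this, sketched below, is to hash the worker ID so that the assignment looks random across workers but stays stable for each worker. The function name, the salt, and the variant labels are assumptions for illustration; the source does not say how ParlAI performs the assignment.

```python
import hashlib


def assign_variant(worker_id, variants, salt="experiment-1"):
    """Deterministically map a worker to one of `variants`.

    The same worker always gets the same variant for a given salt;
    changing the salt reshuffles the assignment for a new experiment.
    """
    digest = hashlib.sha256(f"{salt}:{worker_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical variant labels and worker IDs.
variants = ["baseline_world", "new_model_world"]
for worker_id in ["W001", "W002", "W003", "W004"]:
    print(worker_id, "->", assign_variant(worker_id, variants))
```

A stable assignment also supports the repeat-role use case above, since a returning worker lands in the same variant every time.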
Major Takeaways
Major takeaways
• Lots of unexpected tasks can be done through MTurk if you’re willing to experiment
• Workers are humans, so better and clearer experiences drive better data
• ParlAI MTurk can enable both data collection and model evaluation for your dialogue needs
• Simple conversation tasks can be created with almost no new code
• Grounded conversation tasks are easily enabled by existing ParlAI MTurk
frameworks
• Bonus: ParlAI MTurk is open source and still growing. Pull requests are always welcome, and
ideas for features or improvements may be addressed if they can improve the way that
ParlAI supports research.
Questions?
Thanks for attending!

Weitere ähnliche Inhalte

Was ist angesagt?

ChatGPTは思ったほど賢くない
ChatGPTは思ったほど賢くないChatGPTは思ったほど賢くない
ChatGPTは思ったほど賢くないCarnot Inc.
 
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphingDeep Learning JP
 
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところ
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところ
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところY Watanabe
 
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)Ayako_Hasegawa
 
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)パターン・ランゲージ入門講座(Pattern Language Innovators Summit)
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)Takashi Iba
 
Microsoft Teamsを使ったメッセージ通知開発
Microsoft Teamsを使ったメッセージ通知開発Microsoft Teamsを使ったメッセージ通知開発
Microsoft Teamsを使ったメッセージ通知開発miekobari
 
Python 3.9からの新定番zoneinfoを使いこなそう
Python 3.9からの新定番zoneinfoを使いこなそうPython 3.9からの新定番zoneinfoを使いこなそう
Python 3.9からの新定番zoneinfoを使いこなそうRyuji Tsutsui
 
暗号技術の実装と数学
暗号技術の実装と数学暗号技術の実装と数学
暗号技術の実装と数学MITSUNARI Shigeo
 
Rustに触れて私のPythonはどう変わったか
Rustに触れて私のPythonはどう変わったかRustに触れて私のPythonはどう変わったか
Rustに触れて私のPythonはどう変わったかShunsukeNakamura17
 
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision Tree
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision TreeNIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision Tree
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision TreeTakami Sato
 
transformer解説~Chat-GPTの源流~
transformer解説~Chat-GPTの源流~transformer解説~Chat-GPTの源流~
transformer解説~Chat-GPTの源流~MasayoshiTsutsui
 
最適輸送入門
最適輸送入門最適輸送入門
最適輸送入門joisino
 
100%Kotlin ORM Ktormを試してみた
100%Kotlin ORM Ktormを試してみた100%Kotlin ORM Ktormを試してみた
100%Kotlin ORM Ktormを試してみたKeita Tsukamoto
 
Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Soft Rasterizer: A Differentiable Renderer for Image-based 3D ReasoningSoft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Soft Rasterizer: A Differentiable Renderer for Image-based 3D ReasoningKohei Nishimura
 
Java ORマッパー選定のポイント #jsug
Java ORマッパー選定のポイント #jsugJava ORマッパー選定のポイント #jsug
Java ORマッパー選定のポイント #jsugMasatoshi Tada
 
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3tech
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3techConsumer Driven Contractsで REST API/マイクロサービスをテスト #m3tech
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3techToshiaki Maki
 
ITコミュニティと情報発信に共通する成長と貢献の要素
ITコミュニティと情報発信に共通する成長と貢献の要素ITコミュニティと情報発信に共通する成長と貢献の要素
ITコミュニティと情報発信に共通する成長と貢献の要素NISHIHARA Shota
 

Was ist angesagt? (20)

ChatGPTは思ったほど賢くない
ChatGPTは思ったほど賢くないChatGPTは思ったほど賢くない
ChatGPTは思ったほど賢くない
 
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
 
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところ
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところ
ツール比較しながら語る O/RマッパーとDBマイグレーションの実際のところ
 
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)
実践 Amazon Mechanical Turk ※下記の注意点をご覧ください(回答の質の悪化・報酬額の相場の変化・仕様変更)
 
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)パターン・ランゲージ入門講座(Pattern Language Innovators Summit)
パターン・ランゲージ入門講座(Pattern Language Innovators Summit)
 
Microsoft Teamsを使ったメッセージ通知開発
Microsoft Teamsを使ったメッセージ通知開発Microsoft Teamsを使ったメッセージ通知開発
Microsoft Teamsを使ったメッセージ通知開発
 
固有表現抽出と適用例のご紹介
固有表現抽出と適用例のご紹介固有表現抽出と適用例のご紹介
固有表現抽出と適用例のご紹介
 
Python 3.9からの新定番zoneinfoを使いこなそう
Python 3.9からの新定番zoneinfoを使いこなそうPython 3.9からの新定番zoneinfoを使いこなそう
Python 3.9からの新定番zoneinfoを使いこなそう
 
GPT
GPTGPT
GPT
 
暗号技術の実装と数学
暗号技術の実装と数学暗号技術の実装と数学
暗号技術の実装と数学
 
GPT解説
GPT解説GPT解説
GPT解説
 
Rustに触れて私のPythonはどう変わったか
Rustに触れて私のPythonはどう変わったかRustに触れて私のPythonはどう変わったか
Rustに触れて私のPythonはどう変わったか
 
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision Tree
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision TreeNIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision Tree
NIPS2017読み会 LightGBM: A Highly Efficient Gradient Boosting Decision Tree
 
transformer解説~Chat-GPTの源流~
transformer解説~Chat-GPTの源流~transformer解説~Chat-GPTの源流~
transformer解説~Chat-GPTの源流~
 
最適輸送入門
最適輸送入門最適輸送入門
最適輸送入門
 
100%Kotlin ORM Ktormを試してみた
100%Kotlin ORM Ktormを試してみた100%Kotlin ORM Ktormを試してみた
100%Kotlin ORM Ktormを試してみた
 
Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Soft Rasterizer: A Differentiable Renderer for Image-based 3D ReasoningSoft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
 
Java ORマッパー選定のポイント #jsug
Java ORマッパー選定のポイント #jsugJava ORマッパー選定のポイント #jsug
Java ORマッパー選定のポイント #jsug
 
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3tech
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3techConsumer Driven Contractsで REST API/マイクロサービスをテスト #m3tech
Consumer Driven Contractsで REST API/マイクロサービスをテスト #m3tech
 
ITコミュニティと情報発信に共通する成長と貢献の要素
ITコミュニティと情報発信に共通する成長と貢献の要素ITコミュニティと情報発信に共通する成長と貢献の要素
ITコミュニティと情報発信に共通する成長と貢献の要素
 

Ähnlich wie Training Chatbots and Conversational Artificial Intelligence Agents with Amazon Mechanical Turk and Facebook’s ParlAI - MCL349 - re:Invent 2017

Build an Intelligent Multi-Modal User Agent with Voice and NLU (AIM340) - AWS...
Build an Intelligent Multi-Modal User Agent with Voice and NLU (AIM340) - AWS...Build an Intelligent Multi-Modal User Agent with Voice and NLU (AIM340) - AWS...
Build an Intelligent Multi-Modal User Agent with Voice and NLU (AIM340) - AWS...Amazon Web Services
 
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...Amazon Web Services
 
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...Amazon Web Services
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenAWS Germany
 
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018Hagay Lupesko
 
DEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 ServiceDEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 ServiceAmazon Web Services
 
Machine Learning and Python For Marketing Automation | MKGO October 2019 | Ru...
Machine Learning and Python For Marketing Automation | MKGO October 2019 | Ru...Machine Learning and Python For Marketing Automation | MKGO October 2019 | Ru...
Machine Learning and Python For Marketing Automation | MKGO October 2019 | Ru...Ruth Everett
 
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersAmazon Web Services
 
Moving to DevOps the Amazon Way (DEV210-R1) - AWS re:Invent 2018
Moving to DevOps the Amazon Way (DEV210-R1) - AWS re:Invent 2018Moving to DevOps the Amazon Way (DEV210-R1) - AWS re:Invent 2018
Moving to DevOps the Amazon Way (DEV210-R1) - AWS re:Invent 2018Amazon Web Services
 
Introduction to GluonCV
Introduction to GluonCVIntroduction to GluonCV
Introduction to GluonCVApache MXNet
 
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...Amazon Web Services
 
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...Amazon Web Services
 
Serverless Machine Learning
Serverless Machine LearningServerless Machine Learning
Serverless Machine LearningAsavari Tayal
 
Meaningful UI Test Automation
Meaningful UI Test AutomationMeaningful UI Test Automation
Meaningful UI Test AutomationRahul Verma
 
MLops workshop AWS
MLops workshop AWSMLops workshop AWS
MLops workshop AWSGili Nachum
 
From Notebook to production with Amazon SageMaker
From Notebook to production with Amazon SageMakerFrom Notebook to production with Amazon SageMaker
From Notebook to production with Amazon SageMakerAmazon Web Services
 
From Monolith to Microservices (And All the Bumps along the Way) (CON360-R1) ...
From Monolith to Microservices (And All the Bumps along the Way) (CON360-R1) ...From Monolith to Microservices (And All the Bumps along the Way) (CON360-R1) ...
From Monolith to Microservices (And All the Bumps along the Way) (CON360-R1) ...Amazon Web Services
 
雲端推動的人工智能革命
雲端推動的人工智能革命雲端推動的人工智能革命
雲端推動的人工智能革命Amazon Web Services
 
Meaningful UI Test Automation
Meaningful UI Test AutomationMeaningful UI Test Automation
Meaningful UI Test AutomationRahul Verma
 

Training Chatbots and Conversational Artificial Intelligence Agents with Amazon Mechanical Turk and Facebook’s ParlAI - MCL349 - re:Invent 2017

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Training Chatbots and Conversational Intelligence Agents with Amazon Mechanical Turk and Facebook’s ParlAI. Jack Urbanek, Facebook. November 2017. MCL349
  • 2. Session preview • What is ParlAI and what is it trying to solve? • Brief intro to Amazon Mechanical Turk (MTurk) • How we collect conversational data with MTurk • Optimizing for the human element • How to leverage ParlAI for your problem
  • 3. Who am I? • Research Engineer at Facebook AI Research (FAIR) • Engineer on the ParlAI team • Primary contributor to ParlAI’s MTurk implementation • User of ParlAI-MTurk for data collection
  • 4. Why ParlAI? Introductory materials: • Quick NLP primer • Issues in current dialogue agent creation efforts and tasks • Motives for a dialogue research platform • ParlAI and its features
  • 5. NLP primer. NLP is difficult because language is imprecise. One fundamental goal: • Enable human ⟷ computer dialogue. Dialogue is broken into thousands of tasks with: • Different skill requirements • A shared input/output format. Most NLP research attempts are siloed: • They focus on only a subset of tasks
  • 6. The issue of siloed research. Take two dialogue tasks: • Question Answering (QA) and chit-chat. One popular QA dataset is Stanford’s SQuAD: • It maps a question and Wikipedia paragraph pair to the answer’s start/end indices in that paragraph. A model trained to perform really well on SQuAD will not generalize to chit-chat, even though both tasks share the same core requirement of contextual language understanding. A mock SQuAD-like interaction: “Who won the 2017 Super Bowl?” → (94,114)
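To make the start/end-index format on slide 6 concrete, here is a minimal sketch. The paragraph text and the helper `answer_span` are made up for illustration and do not come from SQuAD or ParlAI; the sketch assumes character offsets with an exclusive end index, and it does not try to reproduce the mock (94,114) pair from the slide.

```python
from typing import Tuple


def answer_span(paragraph: str, answer: str) -> Tuple[int, int]:
    """Return (start, end) character offsets of `answer` in `paragraph`.

    The end index is exclusive, so paragraph[start:end] == answer.
    """
    start = paragraph.find(answer)
    if start == -1:
        raise ValueError("answer text does not occur in the paragraph")
    return start, start + len(answer)


# Hypothetical paragraph written for this sketch (not a real SQuAD entry).
paragraph = "The 2017 Super Bowl was won by the New England Patriots."
question = "Who won the 2017 Super Bowl?"

start, end = answer_span(paragraph, "New England Patriots")
predicted = paragraph[start:end]  # slicing with the span recovers the answer
```

A model trained on this format only ever learns to point at a span of the given paragraph, which is one way to see why it does not transfer to open-ended chit-chat.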
  • 7. Why a dialogue research platform? • Testing on multiple tasks can expose model weaknesses • Multi-task training may enable a broader sense of learning • A standardized method for training and data collection encourages sharing of compatible datasets • It lets the NLP community better share, test, and iterate on models
  • 8. ParlAI features • Unified framework/API for training and evaluation of dialogue models • Many easy-to-access tasks to train and evaluate on • Multi-task training over any tasks • Supports both supervised and interactive (online and reinforcement learning) tasks • Supports other media including images • Existing models to work from • Data collection and model evaluation through Mechanical Turk • Open source
  • 9. ParlAI features: Tasks
      QA datasets: SQuAD, bAbI tasks, MCTest, SimpleQuestions, WikiQA, WebQuestions, WikiMovies, MTurkWikiMovies, MovieDD (Movie-Recommendations), MS MARCO, TriviaQA, InsuranceQA
      Dialogue Goal-Oriented: bAbI Dialog tasks, Dialog-based Language Learning bAbI, Dialog-based Language Learning Movie, MovieDD-QARecs dialogue, personalized dialog, bAbI+
      Visual QA / Visual Dialogue: VQAv1, VQAv2, VisDial, FVQA, CLEVR
      Sentence Completion: QACNN, QADailyMail, CBT, BookTest
      Dialogue Chit-Chat: Ubuntu, Movies SubReddit, Cornell Movie, OpenSubtitles
      Negotiation: Deal or No Deal?
      Machine Translation: WMT EnDe (in progress)
  • 10. ParlAI features: Basic implementation. Main classes: • world – Defines the environment and drives interaction between agents • agent – A communicator in the world • teacher – An agent that talks to learning agents, implementing a task • action – A Python dict that passes text, labels, and rewards between agents
      Main code to train an agent and print results of each example:
          teacher = SquadTeacher(opt)
          agent = MyAgent(opt)
          world = World(opt, [teacher, agent])
          for i in range(num_exs):
              world.parley()
              print(world.display())
      Implementation of world.parley, in which each agent acts in turn while the others observe:
          def parley(self):
              for agent in self.agents:
                  act = agent.act()
                  for other_agent in self.agents:
                      if other_agent != agent:
                          other_agent.observe(act)
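The slide's snippet depends on ParlAI classes (`SquadTeacher`, `MyAgent`, `World`) and an `opt` object, so it cannot run on its own. The following self-contained sketch mimics the same act/observe loop with toy stand-ins; `ToyTeacher`, `RepeatLabelAgent`, and this `World` are hypothetical classes written for illustration, not ParlAI's implementations.

```python
class ToyTeacher:
    """Toy teacher: asks fixed questions and remembers the last reply it saw."""

    def __init__(self, examples):
        self.examples = list(examples)  # (question, label) pairs
        self.index = 0
        self.last_reply = None

    def act(self):
        question, label = self.examples[self.index % len(self.examples)]
        self.index += 1
        # An action is a plain dict carrying text and labels.
        return {"id": "teacher", "text": question, "labels": [label]}

    def observe(self, act):
        self.last_reply = act


class RepeatLabelAgent:
    """Toy learner: answers with the first label it was shown."""

    def __init__(self):
        self.observation = None

    def observe(self, act):
        self.observation = act

    def act(self):
        labels = (self.observation or {}).get("labels", [])
        return {"id": "agent", "text": labels[0] if labels else "I don't know"}


class World:
    def __init__(self, agents):
        self.agents = agents
        self.acts = []

    def parley(self):
        # Each agent acts in turn; every other agent observes that act.
        self.acts = []
        for agent in self.agents:
            act = agent.act()
            self.acts.append(act)
            for other_agent in self.agents:
                if other_agent is not agent:
                    other_agent.observe(act)

    def display(self):
        return "\n".join(f"[{a['id']}]: {a['text']}" for a in self.acts)


teacher = ToyTeacher([("What is 2 + 2?", "4"), ("Capital of France?", "Paris")])
agent = RepeatLabelAgent()
world = World([teacher, agent])

transcript = []
for _ in range(2):
    world.parley()
    transcript.append(world.display())
```

Because the teacher and the learner share one interface (`act` and `observe`), either side could be swapped for a human or a trained model without changing the loop.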
  • 11. ParlAI features: Agents • drqa: an attentive LSTM model, DrQA (Chen et al., 2017), implemented in PyTorch, with competitive results on SQuAD among other datasets • memnn: code for an end-to-end memory network (Sukhbaatar et al., 2015) in Lua Torch • seq2seq: basic sequence-to-sequence model (Sutskever et al., 2014) • ir_baseline: information retrieval baseline that scores responses with TF-IDF matching • remote_agent: basic class for any agent connecting over ZeroMQ • local_human: keyboard input replaces an ML agent • repeat_label: basic class that merely repeats all data sent to it • mturk_agent: a human worker on MTurk is able to act in a ParlAI world. More details and overall use instructions at parl.ai.
  • 12. Mechanical Turk and How We Use It. Introductory materials: • Intro to Mechanical Turk • Summary of our MTurk use • ParlAI’s MTurk operational flow
  • 13. Simple intro to Mechanical Turk. A crowdsourcing internet marketplace for tasks computers currently can’t do. Requesters pay people to handle bulk work. Workers complete this work in the form of human intelligence tasks (HITs) and you get the results.
  • 14. Intro to Mechanical Turk • HITs are created through a simple templated workflow. • When workers complete a HIT, you review their work to accept or reject it. • If you reject the work, you are refusing to pay. Keep in mind that these are people and this is their work. Image captions: Creating MTurk Project; Reviewing work for an image tagging task
  • 15. How ParlAI uses MTurk. MTurk workers act remotely within a ParlAI world that we can collect data from. We are able to have workers interact with models, then rate the model. We support automated review where appropriate. Interactions with MTurk are almost entirely programmatic.
  • 16-24. ParlAI MTurk functionality [diagram slides]
  • 25. Engineering Goals and Challenges. Building ParlAI MTurk: • Completely programmatic interactions with external services • Ability to enable easy creation of arbitrary conversational tasks • Support for multiple actors or trained models • Method for preparing workers for a task • Options for automated work approval
  • 26. Arbitrary conversational tasks. Problem: We need complete control over what we can show workers in order to support arbitrary chats.
  • 27. Supporting arbitrary content. Solution: Use MTurk’s programmatic interface and its support for external endpoints to connect to its workers while retaining control of our content. 1. Set up an external server 2. Host the HIT details there, externally from MTurk 3. Create an “ExternalQuestion” HIT pointing to the server 4. Collect data from the server. This is all done programmatically whenever a ParlAI user wants to collect data.
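Step 3 above can be sketched as follows. This is not ParlAI's own code: it is a minimal illustration that assumes the `boto3` MTurk client and its `create_hit` call, and the helper names, URL, frame height, and time limits are placeholders chosen for the example. Only the XML builder is executed; `create_chat_hit` is left uncalled because it needs AWS credentials and network access.

```python
from xml.sax.saxutils import escape

# Namespace of MTurk's ExternalQuestion schema.
EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(url: str, frame_height: int = 650) -> str:
    """Build the ExternalQuestion XML that tells MTurk to load `url` in a frame."""
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">'
        f"<ExternalURL>{escape(url)}</ExternalURL>"
        f"<FrameHeight>{frame_height}</FrameHeight>"
        "</ExternalQuestion>"
    )


def create_chat_hit(client, task_config: dict, server_url: str, reward: str):
    """Create one HIT that points at our own server.

    `client` is assumed to be boto3.client("mturk", ...), already configured
    for the sandbox or live endpoint. The numeric limits are placeholders.
    """
    return client.create_hit(
        Title=task_config["hit_title"],
        Description=task_config["hit_description"],
        Keywords=task_config["hit_keywords"],
        Reward=reward,  # dollars as a string, e.g. "0.10"
        MaxAssignments=1,
        LifetimeInSeconds=3600,
        AssignmentDurationInSeconds=1800,
        Question=build_external_question(server_url),
    )


# Placeholder URL; a real task would use the address of its own server.
question_xml = build_external_question("https://example.com/chat?task=demo&v=1")
```

Because the HIT only carries a URL, everything the worker sees is served by the external server, which is what gives the task author full control over the content.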
  • 28. Supporting arbitrary content. Implementation: • HIT details – simple Python dictionary • Frontend – templated HTML and JavaScript • Server – initialized on a per-task basis. Users can set up a task with no additional MTurk or server knowledge required. Creating complex tasks requires writing only additional task-related code using templating. Image caption: HIT content as delivered by the external server
  • 29. Handling multiple responsive actors. Problem: The normal MTurk flow (a single worker per task instance) doesn’t natively line up with our use case. Solution: Link multiple HITs together within our server.
  • 30. Handling multiple actors. Implementation: • The server acts as a pass-through between workers • A worker’s messages are handled as acts in ParlAI • Workers receive ParlAI observations. It is easy to swap other agents, such as pre-trained models, in for workers, allowing workers to test your models.
  • 31. Preparing workers for tasks. Problem: Conversational tasks can be complicated or unclear, and qualification tests don’t always provide the context needed to prepare a worker for a task. Solution: Onboard workers within a task. Image caption: It can be unclear how to prepare a worker
  • 32. Preparing workers for tasks. Implementation: ParlAI has this functionality through onboarding worlds that provide: • Specific turn-based steps • Mocks of the real task • Filtering of workers who cannot complete the task • Option to onboard workers only the first time they take your HIT. Image caption: Onboarding worlds can quiz workers before they are added to the available worker pool
  • 33. Automated approval of work. Problem: • Models may require a lot of data to produce good results • It can be hard to judge whether a conversation properly fits the dataset you were trying to create • It can take nearly as much time to verify the examples manually as it did to collect them in the first place. Solution: Strive for automated approval of work. Implementation: Rule-based verification of data.
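The slides do not say which rules are used, so the following is only a sketch of what rule-based verification could look like. The function `auto_review`, its thresholds, and the three rules (minimum turn count, minimum words per turn, no repeated messages) are assumptions made for illustration.

```python
def auto_review(messages, min_turns=4, min_words_per_turn=3):
    """Return (approved, reasons) for a list of {"speaker", "text"} messages.

    The conversation is approved only when no rule is violated.
    """
    reasons = []
    if len(messages) < min_turns:
        reasons.append("too few turns")

    texts = [m["text"].strip().lower() for m in messages]
    if any(len(t.split()) < min_words_per_turn for t in texts):
        reasons.append("turn too short")
    if len(set(texts)) < len(texts):
        reasons.append("repeated message")

    return (not reasons, reasons)


good = [
    {"speaker": "customer", "text": "My internet connection keeps dropping."},
    {"speaker": "rep", "text": "Have you tried restarting the router?"},
    {"speaker": "customer", "text": "Yes, that fixed it, thanks."},
    {"speaker": "rep", "text": "Glad to hear that it works."},
]
bad = [
    {"speaker": "customer", "text": "ok"},
    {"speaker": "rep", "text": "ok"},
]

good_result = auto_review(good)
bad_result = auto_review(bad)
```

Returning the reasons alongside the decision makes it possible to tell a worker why their work was not approved, which fits the talk's emphasis on treating workers fairly.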
  • 34. The Human Element – Lessons Learned. Building ParlAI MTurk: • Understanding worker interaction with tasks • Handling disconnects and abandoned work • Improving results by improving tasks • Managing unintended task abuse
  • 35. Understanding the MTurk workflow. Problem: MTurk workers don’t necessarily take one task at a time. They often try to optimize their work output, which can lead to unexpected behaviors. Solution: Be aware of how workers interact with and claim tasks, and of the generally asynchronous nature of the MTurk interface. Set reasonable task expiration times. Examples: • An initial test stalled when one worker quickly queued all 8 of the test HITs and nobody else was able to claim them • An initial test left a worker waiting in the pool for 30 minutes after another worker abandoned a HIT without returning it
  • 36. Handling disconnects and abandons. Problem: Workers may disconnect or leave the other person or people hanging. Always remember that these are people: it does not feel good to have your work taken away because of someone else’s actions. Image caption: Worker interaction isn’t always perfect
  • 37. Handling disconnects and abandons. Solution: ParlAI MTurk implements functionality to improve these situations. • Optional payout to abandoned workers • Tasks can set a maximum act time, after which the worker is considered inactive and disconnected • Support for reconnecting within a timeframe • All failure states are explained to the user when they happen. Image caption: Text displayed when a partner disconnects
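The maximum act time can be pictured with a small sketch. `TimedWorker` is a hypothetical class, not ParlAI's MTurk agent; it assumes incoming messages arrive on a thread-safe queue and treats an empty queue after the deadline as a disconnect.

```python
import queue


class TimedWorker:
    """Toy worker whose act() gives up after `max_act_time` seconds."""

    def __init__(self, worker_id, max_act_time):
        self.worker_id = worker_id
        self.max_act_time = max_act_time  # seconds
        self.inbox = queue.Queue()
        self.disconnected = False

    def receive(self, text):
        # Called by the server when the worker sends a message.
        self.inbox.put(text)

    def act(self):
        try:
            text = self.inbox.get(timeout=self.max_act_time)
        except queue.Empty:
            # No message within the limit: mark the worker as disconnected
            # and end the episode so the partner is not left waiting.
            self.disconnected = True
            return {"id": self.worker_id, "text": "[DISCONNECT]", "episode_done": True}
        return {"id": self.worker_id, "text": text, "episode_done": False}


active = TimedWorker("worker_a", max_act_time=0.05)
active.receive("Hello, how can I help?")
active_act = active.act()

idle = TimedWorker("worker_b", max_act_time=0.05)
idle_act = idle.act()  # nothing was received, so this times out
```

A world that sees `episode_done` with a disconnect marker can then notify the remaining worker and, if the task allows it, still pay them.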
  • 38. Improving results by improving tasks • Work on balancing task length and pay: workers are less likely to take the time your dataset needs if that time isn’t well compensated. • Engaging tasks keep people’s interest: workers aren’t robots. If you make the tasks fun or otherwise rewarding, it is a better outcome for everyone involved. Improving their experience improves your data and encourages more people to work on your tasks. • Clear tasks lead to proper output: ensuring that workers fully understand your task and intention is a shortcut to quality data.
  • 39. Preventing task abuse. Some workers aren’t going to produce the kind of data you want. Often they optimize an unclear or tedious problem in an unintended way that makes the resulting data invalid or otherwise unwanted. While rare, these cases can be mitigated by a combination of: • Clarifying the problem and setting clearer restrictions on expected behavior • Checking for and filtering out specific bad behavior from your results • Blocking workers who continue to abuse your HITs
  • 40. How to Use ParlAI MTurk. Accomplishing your goals: • Setting up HIT details • Creating and running your HIT • Extended use cases • Examples
  • 41. Setting up HIT details. Starting a ParlAI MTurk task begins with creating a task config file to customize MTurk display information: • hit_title • hit_description • hit_keywords • task_description. Use this file to catch workers’ attention and give them an overview of what to expect.
      Example HIT setup file:
          task_config = {}
          task_config['hit_title'] = 'Simulating a Customer Service Interaction'
          task_config['hit_description'] = '''Play the role of either Customer Service or a customer with a problem and attempt to solve the problem through dialog with another MTurk worker'''
          task_config['hit_keywords'] = 'chat,dialog,customer service'
          task_config['task_description'] = '''In this task, you will be assigned the role of a customer or a customer service rep. As a customer, you will be given a problem and have to communicate it to the rep, then confirm the solution they suggest trying. As the rep, you must offer a solution to the customer and ensure that their problem is solved.'''
  • 42. Setting up ParlAI World. Much of ParlAI’s MTurk functionality can be customized by implementing a few functions. Most functionality can be altered within just parley. Stubs and examples are available on our GitHub.
      Example ParlAI parley code:
          class MTurkCustomerServiceWorld(MTurkTaskWorld):
              def parley(self):
                  if not self.is_init:
                      # First call: show each worker their role description.
                      self.workers[0].observe(self.cust_task)
                      self.workers[1].observe(self.rep_task)
                      self.is_init = True
                  else:
                      customer_act = self.workers[0].act()
                      self.process_customer_act(customer_act)
                      rep_act = self.workers[1].act()
                      self.process_rep_act(rep_act)

              def process_customer_act(self, act):
                  if act['type'] == 'action':
                      if act['action'] == self.cust_task.req_action:
                          self.problem_resolved = True
                  else:  # action type is message
                      self.workers[1].observe(act)

              def process_rep_act(self, act):
                  if act['type'] == 'action':
                      if act['action'] == 'resolve':
                          self.episode_done = self.problem_resolved
                  else:  # action type is message
                      self.workers[0].observe(act)
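The world on slide 42 needs live MTurk workers and ParlAI's `MTurkTaskWorld` base class. To see the control flow without either, the sketch below replays the same logic with scripted stand-ins. `ScriptedWorker`, `CustomerServiceWorld`, the `message`/`action` helpers, and the `restart_router` action are all hypothetical, and the role-assignment step from the slide is omitted for brevity.

```python
class ScriptedWorker:
    """Stand-in for an MTurk worker that replays a fixed list of acts."""

    def __init__(self, script):
        self.script = list(script)
        self.seen = []

    def observe(self, act):
        self.seen.append(act)

    def act(self):
        return self.script.pop(0)


def message(text):
    return {"type": "message", "text": text}


def action(name):
    return {"type": "action", "action": name}


class CustomerServiceWorld:
    def __init__(self, workers, required_action):
        self.workers = workers  # [customer, rep]
        self.required_action = required_action
        self.problem_resolved = False
        self.episode_done = False

    def parley(self):
        customer, rep = self.workers

        customer_act = customer.act()
        if customer_act["type"] == "action":
            # The customer confirms the fix by taking the required action.
            if customer_act["action"] == self.required_action:
                self.problem_resolved = True
        else:
            rep.observe(customer_act)

        rep_act = rep.act()
        if rep_act["type"] == "action":
            # The rep may only close the episode once the fix was confirmed.
            if rep_act["action"] == "resolve":
                self.episode_done = self.problem_resolved
        else:
            customer.observe(rep_act)


customer = ScriptedWorker([message("My router keeps dropping."), action("restart_router")])
rep = ScriptedWorker([message("Please restart the router."), action("resolve")])
world = CustomerServiceWorld([customer, rep], required_action="restart_router")

turns = 0
while not world.episode_done:
    world.parley()
    turns += 1
```

Replacing one `ScriptedWorker` with a trained model that exposes the same `act` and `observe` methods is the swap described on slide 30.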
  • 43. Creating and running the HIT. Running a ParlAI MTurk HIT is as simple as calling the run file for your task with a few flags: • -nc – number of conversations • -r – payout reward per conversation • --unique – only allows each worker to complete this task once • --count-complete – only count finished conversations towards the number requested • --sandbox/--live – run the HIT on the MTurk sandbox server or push it live to workers. Detailed explanations for running a HIT are available on our GitHub.
      Example run:
          >> python3 run.py -nc 15 -r 0.1 --sandbox --count-complete
          [ optional arguments: ]
          [ datapath: /Users/jju/ParlAI/data ]
          [ Mechanical Turk: ]
          [ mturk_log_path: /Users/jju/ParlAI/logs/mturk ]
          [ num_conversations: 15 ]
          [ unique_worker: False ]
          [ reward: 0.1 ]
          [ is_sandbox: True ]
          [ hard_block: False ]
          [ count_complete: True ]
          You are going to allow workers from Amazon Mechanical Turk to be an agent in ParlAI. During this process, Internet connection is required, and you should turn off your computer's auto-sleep feature. Please press Enter to continue...
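To show how the flags on slide 43 map onto the option names printed in the log, here is a small `argparse` sketch. It is not ParlAI's argument parser: the short flags and the option names (`num_conversations`, `unique_worker`, `reward`, `is_sandbox`, `count_complete`) come from the slide, while the long spellings `--num-conversations` and `--reward` and all default values are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Toy mirror of the run flags")
    parser.add_argument("-nc", "--num-conversations", type=int, default=1,
                        help="number of conversations to collect")
    parser.add_argument("-r", "--reward", type=float, default=0.05,
                        help="payout per conversation, in dollars")
    parser.add_argument("--unique", dest="unique_worker", action="store_true",
                        help="allow each worker to complete the task only once")
    parser.add_argument("--count-complete", action="store_true",
                        help="count only finished conversations")
    # --sandbox and --live write to the same option and cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--sandbox", dest="is_sandbox", action="store_true")
    mode.add_argument("--live", dest="is_sandbox", action="store_false")
    parser.set_defaults(is_sandbox=True)  # assumption: default to the sandbox
    return parser


opt = vars(build_parser().parse_args(
    ["-nc", "15", "-r", "0.1", "--sandbox", "--count-complete"]
))
```

Defaulting to the sandbox is a deliberate choice in this sketch: a forgotten flag then costs nothing, whereas a default of live could spend real money.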
  • 44. Custom HIT pages. Grounded dialogue often requires additional UI elements. For this we provide the ability to use custom HTML in the task. Additional JavaScript can also be used so that interactions with buttons and other UI elements are sent through to the ParlAI world. Information can be sent from the ParlAI world to be rendered on the frontend, allowing conversations to be grounded on something determined by the ParlAI world.
  • 45. The finished experience: Customer
  • 46. The finished experience: Rep
  • 47. Extended use cases. ParlAI MTurk supports much more than can be explained here. • Filtering workers through requirements: allow successful workers to continue working on your specific HITs without damaging the reputation of unsuccessful workers with blocks • Repeat worker role assignments: ensure that workers are only ever given one specific role in a conversation, in cases where experiencing more than one role would disturb the task results • Task experimentation: run one task with multiple worlds or options, randomly assigning workers to different variants in order to collect experimental data within one HIT • Hands-free iteration: use MTurk in an evaluation loop combined with task experimentation to iterate on a model that has no concrete automated evaluation metric, without further manual interaction
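One way to picture task experimentation is a function that picks a variant for each worker. The slide describes random assignment; the sketch below swaps in hash-based assignment instead, so that the same worker always lands in the same variant across HITs. `assign_variant`, the salt, and the variant names are hypothetical and not part of ParlAI.

```python
import hashlib


def assign_variant(worker_id, variants, salt="experiment-1"):
    """Deterministically map a worker to one of `variants`.

    Hashing the salted worker id spreads workers roughly evenly while
    keeping each worker's assignment stable between sessions.
    """
    digest = hashlib.sha256(f"{salt}:{worker_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


variants = ["baseline_model", "new_model"]
first = assign_variant("WORKER123", variants)
second = assign_variant("WORKER123", variants)  # same worker, same variant
```

Changing the salt reshuffles every worker, which is useful when a new experiment should not inherit the previous grouping.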
  • 48. Major Takeaways
  • 49. Major takeaways • Lots of unexpected tasks can be done through MTurk if you’re willing to experiment • Workers are humans, so better and clearer experiences drive better data • ParlAI MTurk can enable both data collection and model evaluation for your dialogue needs • Simple conversation tasks can be created with almost no new code • Grounded conversation tasks are easily enabled by existing ParlAI MTurk frameworks • Bonus: ParlAI MTurk is open source and still growing. Pull requests are always welcome, and ideas for features or improvements may be addressed if they can improve the way that ParlAI supports research.
  • 50. Questions?
  • 51. Thanks for attending!