Interaction design is language design
14 February 2006
New York chapter, Interaction Design Association
Fit Associates, LLC
These slides were presented at the February
meeting of the New York City chapter of the
Interaction Design Association (www.ixda.org).
You’re free to circulate them and to use the ideas in
your own work, so long as you credit the author.
These slides are more or less exactly the ones I
showed, except they have these little notes on the
side. I make slides to support the presentation, then
they don’t make any sense for people who didn’t
attend the talk. So I’ll try to “talk to you” by
scribbling notes right over the slides….
Since this is the “Interaction Designers’
Association,” and not, say, a workshop on
“How to Make a Killer Web Site,” I
thought I could get away with being non-
prescriptive. Basically this talk is me
saying, “I came across some interesting
ideas, I thought they seemed important.
I’m sharing what’s on my mind in case
you want to think about it too.”
Might be important. This
stuff is only getting
harder, and we just
keep winging it. We need
foundations for our work.
When we’re honest with one another, we admit that
we make design decisions from the gut, from the
seat of our pants, way more often than we would
like. The work of interaction design lacks good
theoretical foundations. No blame – it’s early, we’re
just getting started.
But “winging it” is going to be less and less tenable
as a way to work. Because of technical and social
trends, the distance between control and response is
widening (inputs are becoming separated from
outputs). The context of use is becoming less
predictable for a lot of design challenges. Our
designs go out into an ecosystem of other devices,
people, and protocols – they don’t stand alone the
way we sometimes like to think.
Wouldn’t it be nice if there were some foundations
for our work that could guide decisions about
choices of symbols, prioritization of features,
relative prominence of one thing over another?
Wouldn’t it be nice if we had an underlying theory
to help us make sure our designs were going to fit
people’s expectations for controls, feedback, pace,
error-handling, and so on and on?
I do not claim to offer such a foundation in this talk.
This is much more humble. I’m just saying I looked
through this door, and it looked kind of
foundational on the other side. I sniffed it, and it
smelled like it had rubbed up against something
fundamental. Pick your metaphor.
1. Descriptive linguistics 0.101
2. Design languages
3. Interaction languages
4. Speech acts and dialog models
Let’s open the box and see what’s inside.
By god, it’s an outline of this talk!
Descriptive Linguistics 0.101
The first thing in this box of clues is a designer’s
introduction to a few concepts in linguistics. The
terminology I use sits about halfway between the way
linguists talk about language and the way I imagine we
might teach interface design if we were treating
“interface linguistics” as one of the fundamental ways
of understanding our work.
So to warn you, if you say “deep structure and surface
structure” to a linguist, it will make his or her eyebrows
go up and they’ll have a lot to say. If you say “elements,
constructs and compositions” to a linguist, it will
probably sound intriguing but unfamiliar. I don’t know
whether I’ve helped or hampered your preparations for
linguist’s cocktail parties.
Layers of language

Surface structure: what we actually do when we use
language: text, voice utterances, gestures, symbols, …

Syntax and lexicon: in our minds, in our culture,
in the world.

Deep structure: in your mind, in my mind.
Same for you as for me?
Here is one way to look at how
language works. This framing got
its start with Noam Chomsky and
his Transformational Grammar.
Though he and nearly everybody
else have since moved on, I want
to set this up for its value in
explaining why I think the same
thing is going on with languages
for interaction between people
and products.
key concept: surface structure
Surface structure: language as spoken, written, or
signed; the result of a language in use — what people
see, hear, or do when they use a language to express
a particular meaning.
I want cake.
key concept: deep structure
Deep structure: what is in the mind when people use
language; the meaning that underlies surface structure.

self [singular], possess (cake [edible object], …)
One deep structure, many surface structures

self [singular], possess (cake [edible object], …)
• /aɪ wɑnt kʰeɪk/
• I want cake
• Cake is wanted by me
• Cake, please
• Do I see a piece of cake over there?
• Ich wünsche Kuchen
• < point at cake, point at mouth, smile,
wiggle eyebrows >
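The mapping from one deep structure to many surface forms can be sketched in code. This is a toy illustration of the concept, not anything from linguistics proper: the deep structure is a small dict of semantic features I made up, and each renderer stands in for a different surface grammar.

```python
# Toy sketch: one deep structure, several surface renderings.
# The feature names below are illustrative assumptions.

deep = {"speaker": "self", "relation": "possess-desire", "object": "cake"}

def render_plain(d):
    # ordinary declarative surface form
    return f"I want {d['object']}"

def render_passive(d):
    # passive-voice surface form of the same deep structure
    return f"{d['object'].capitalize()} is wanted by me"

def render_polite(d):
    # elliptical, polite surface form
    return f"{d['object'].capitalize()}, please"

surface_forms = [r(deep) for r in (render_plain, render_passive, render_polite)]
for s in surface_forms:
    print(s)
```

Each renderer is a different way of "saying" the same underlying meaning, which is the point of the slide: the deep structure stays fixed while the surface varies.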
He passed the bar.
Time flies like an arrow.
I love you.
A proposition: we don’t want our languages for
interaction to have the same power of expression as
human languages. Ambiguity is just one example of
the high cost of that power.
A sentence so famous it has
its own Wikipedia entry!
Key concepts: Syntax and Lexicon
Syntax: You know, grammar. A set of rules by which
language elements are assembled into valid
constructs; by “valid,” I mean “sensible to people
literate in the language.”
Lexicon: All the stuff a dictionary tries to explain, and
more. The set of symbols shared by speakers of a
language, which map to a set of shared meanings.
To communicate, you gotta have:
• shared meanings
• shared symbols
An over-simplified way to think about syntax
A language gives us:
• A lexicon of atomic symbols – call them Elements.
A word is an element.
• Rules for assembling elements into larger Constructs.
• Rules and conventions for assembling constructs into
even larger Compositions: essays, letters to mom,
declarations of independence, fart jokes….
Ode to a Potato-Head
I lie awake
while he doth bake;
Oh melting polystyrene visage!
A language defines a set of elements which tie a
symbol in surface structure to a meaning in deep
structure. Elements are assembled into valid and
meaningful constructs according to the rules of a
grammar. In turn, constructs may be assembled
into compositions, which make up a complete communication.
And that concludes Linguistics 0.101.
With that little set of terms and ideas behind us, let’s
look at the idea of design languages. We’re sneaking
up on our title topic one step at a time.
John Rheinfrank and Shelley Evenson wrote about
design languages in their chapter in the book,
Bringing Design to Software. Industrial designers
work with “form language” all the time, and branding
folks develop “brand languages.” Some of that work
really does live up to the name. (And the term is much
more common now than it was when I first gave this
talk in 2006.)
a static design language: highway signs
I first understood design languages through the example of highway signs. (First encountered through
Clement Mok’s book, Designing Business.) Maybe this same example will help you. Let's talk about
highway signs for a moment, and discover the well-formed structures used in their design.
A well-defined language
Richard Moeur: www.richardcmoeur.com
If you haven't stopped to think about this or look at the specification, you might be surprised at how thoroughly the
language of highway signs is documented. These are a few of the kinds of "sentences" in the language, from a site
that documents it in detail. Each of these is a category of sign, a kind of expression in the language.
…with a detailed lexicon and grammar
As you can see, there is a lot of detail. Here we've clicked through to the details of the "W8" category
of warning signs: bumps, dips, and pavement condition.
And here is another category, the category of Guide Signs. Behind each of these links is a detailed list like
the one we just saw. By the way, you'd be amazed at how many sites there are about highway signs. It's a
little like trainspotting or something. There are fan sites. "Here are photos my brother and I took down
highway 12—we got all the state highway markers along the way." "I have all the state highway markers,
with variations since 1965."
the language allows for variation
Like "natural languages" (the term linguists and computer scientists use
for human languages, to distinguish them from "artificial languages" such
as Esperanto, and formal languages like those used for computer
programming), the language of highway signs allows for some variation
in how things are said.
Here's an example: state highway markers. In the language, they must be a
black rectangle with a white field, displaying the number of the state
highway. But states often deviate from the recommended standard,
usually by somehow representing their state on the sign: the shape of the
state (Arkansas, Arizona, etc.) or something symbolic (Pennsylvania's
keystone, for example).
ELEMENTS of highway sign language
shape, color, symbol, text, position relative to road
Syntax for Highway Signish
• There are rules for making “sentences” in the language of highway signs
• Shape and color are used as redundant cues (octagons
always red, triangles always yellow)
• Some use of layers (the “no” symbol)
• There are rules for relative position of text & symbol
• This grammar is formally codified
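The "redundant cues" rule above can be sketched as a tiny validity check. The rule table here is paraphrased from the bullets on this slide, not taken from the official sign specification, so treat it as an illustration of a codified grammar rather than the grammar itself.

```python
# Hedged sketch of one Highway Signish syntax rule: certain shapes
# must co-occur with certain colors (octagons always red, triangles
# always yellow). Shapes without a rule are unconstrained here.

REQUIRED_COLOR = {"octagon": "red", "triangle": "yellow"}

def valid_sign(shape, color):
    """A sign construct is well-formed only if its color matches its shape's rule."""
    required = REQUIRED_COLOR.get(shape)
    return required is None or color == required

assert valid_sign("octagon", "red")        # STOP sign: well-formed
assert not valid_sign("octagon", "green")  # violates the grammar
```

A literate driver relies on exactly this redundancy: the shape alone carries the meaning even when the color is unreadable.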
Writing and reading
Deep structure Surface structure
In one mile the road splits;
fork left for U.S. interstate 10,
fork right for U.S. interstate 17
You are driving on U.S. interstate 17;
In one mile, this highway enters
the Phoenix city limits
When the highway department wants to say something, they translate a “deep”
semantic structure or intent into surface structure – a construct in Highway
Signish. If you’re literate in the language, seeing a sign evokes deep structures in
you that are hopefully what the highway department intended.
Applying this language – using it to create clear, accurate, usable constructs
and compositions that millions of people will use every day while guiding tons of
family-bearing metal down the road at high speed – is serious craft.
These are from the Missouri
Department of Transportation
Engineering Policy Guide, “903.8
Freeway and Expressway Guide
Another source for the curious is
the U.S. Department of
Transportation Federal Highway
Freeway Management and
Eye candy: ask Google Images about “highway interchange”
(and imagine you were hired as an “interface designer” in this field; could be kinda fun!)
highway signs: compositions
• There are rules and conventions for creating good
compositions; for example, the spacing and positioning of
signs relative to the road
• Many layers of concern are woven into a single composition:
guidance, warning, regulatory advisories, etc.
• There is a grammar for compositions as well as constructs, but
it is looser, allows more room for in-context innovation. There is
art to creating a great composition (there is an annual award
for highway interchange design).
Design languages are common
We are surrounded by “design languages.” Most of
them suffer from never having been designed as
languages. They are full of irregularities that make
them difficult to “read,” to learn and remember.
Still, they fit today’s definition of real languages. They
map a lexicon of symbols to a set of underlying
meanings. They use symbols in combination
(constructs) to communicate complex meanings.
Now we’ve learned a little about languages in
general, and we’ve had an introduction to design
languages. What might we mean by “interface
languages” or “interaction languages”?
Interaction as conversation
When you design an interactive product, you are creating
the setting for thousands of conversations. You are creating
the language which will be spoken between the product
and the person.
By this point in the presentation, you’re a freshly-minted interaction linguist. What can you say about the interplay between
deep and surface structures here? How about the elements and constructs of the design language?
Some of the things my remote control lets me say
CH + CH -
A-B Repeat 10
interaction languages: good & bad news
Good news: because we design them consciously,
and because we can spend time up front
understanding the conversations we seek to enable,
interaction languages can be far less messy than natural languages:
• vocabulary just the right size
• less ambiguity
• designed for quick path to literacy
• build on people’s knowledge of other languages
More good news: information technology gives us all
sorts of ways to build explicit representations of the
underlying semantics, and use them to drive the
behavior of the interface. Woohoo!
interaction languages: good & bad news
“Watch the Simpsons on seven.”
turn on TV tune to channel 7
lexicon and (hidden, invisible) syntax
When I get down in this
layer, I tend to think in terms
of object models.
I want cake.
yes, you can have some now
yes, you can have some later
no, the cake is all gone
no, you can’t afford cake.
Bad news: interaction languages are more
complicated than static design languages.
We have to account for both sides of the conversation!
interaction languages: good & bad news
Want to hear a little
[ ] Yes [ ] No
Brian Herzfeldt and I did an
interaction design project for a
medical software products
company named Vassol. A full
case study of this work was
published in the DUX 2003
Proceedings, and you can find
that article on my site:
At this point in the original talk, I
used that work as an example of
the practice of creating an
interaction language. I’ve left
those slides out because they
won’t do you much good
without me explaining them.
meanings to be shared between
the users and the underlying
system – both nouns and verbs.
And we inventoried the
essential speech acts that
needed to happen between
different roles who use the tool.
And we built out the interface
and interaction from there.
Speech acts and dialog models
We could stop there and go for beer. What we’ve
covered so far is about all I can say I really have tried
to apply in practice. But there’s lots more to explore,
and I think there’s a lot more useful foundation that
could be worked out, with very general usefulness
and applicability. So let’s have a look at some other
ideas from linguistics….
another promising concept: speech acts
A speech act is a construct (i.e., a single assembly of
elements) that effects some change in the world, or
communicates something about the state of the world.
“I now pronounce you man and wife.”
“Save this file.”
Their importance for interaction design: they are the
building blocks of interaction, because they bundle
subject and verb, or subject, verb and object. If you get
the deep and surface structures of the essential speech
acts right, and you have a good framework for
generating compositions, you’re on the path to glory.
Five things you can do with an utterance:
• Assert: Commit the speaker (in varying degrees) to
something being the case -- to the truth of the
expressed proposition.
• Direct (request): Attempt (in varying degrees) to get
the hearer to do something. These include both
questions (which can direct the hearer to make an
assertive speech act in response) and commands
(which direct the hearer to carry out some linguistic or
non-linguistic act).
• Commit (promise): Commit the speaker (again in
varying degrees) to some future course of action.
• Declare: Bring about the correspondence between
the propositional content of the speech act and
reality (e.g., pronouncing a couple married).
• Express: Express a psychological state about a state of
affairs (e.g., apologizing and praising).
This slide requires quite a bit of explanation,
and it’s not completely baked. At the time I
wrote these slides I was playing with slot-
and-filler frameworks for the linguistics of
interaction.
The insight here is that the speech act “Meet
me at three” is an instance of a particular
genre of request. The idea is that we could
sort out the speech acts for the particular
interactions we’re designing for. My gut says
there probably isn’t a very large number of
them for most interfaces. Maybe in this
example if we’re trying to help two people
get together, there is meet-request,
negotiation-turn, commitment, confirmation,
and denial. And by the way, those same
things are likely to show up in lots of
different products and services. We should be
able to get pretty far with a relatively small
number of speech acts!
Then for each of those speech acts we could
work out slots and possible fillers, and we’d
be on the trail of a syntax.
The diagram here is a work-in-progress
example of what a set of slots and possible
fillers might look like for a “meet me” speech act:

<recipient> meet me <where> <when>
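One way to make the slot-and-filler idea concrete is a small sketch in code. The class name, slot names, and surface rendering below are my own hypothetical illustration of the pattern above, not something from the talk.

```python
# Sketch of a slot-and-filler "meet me" speech act: the slots hold
# the deep structure, and surface() emits one possible surface form.
from dataclasses import dataclass

@dataclass
class MeetRequest:
    recipient: str   # fills <recipient>
    where: str       # fills <where>
    when: str        # fills <when>

    def surface(self):
        # one of many possible surface structures for this deep structure
        return f"{self.recipient}, meet me {self.where} {self.when}"

print(MeetRequest("Pat", "at the cafe", "at three").surface())
```

Swapping in a different `surface()` method would change how the request is "said" without touching the slots, which is the deep/surface split again.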
One habit I’ve formed because of this thinking,
one design implication for me: I always ask,
What does this product have to say to people?
What is it willing to respond to?
Tea kettle speech acts
“I have water in me, and it is <this temperature>”
If you think you’ve identified a speech act, and you know something
about its underlying semantics, then you can play with the vocabulary it
might use to express the range of meanings. Is the actual numeric
temperature the useful thing? It’s not that clear to most people. Is 180F
too hot to touch? What’s a good temperature for my child’s chocolate?
Maybe exterior temperature and liquid temperature are two different
things to talk about. How does it say, “Watch out! I’m too hot to touch!”?
Quick note about a big topic: doing this well involves an aesthetic I might call
“alignment” – our work can produce either practical and aesthetic coherence or a
mismatch between the surface expression and the underlying meanings. When what
we see, hear and feel is well aligned with the meaning, the result is beautiful.
Originally speech act theory talked mostly about
these acts as little independent units of action.
Since then it has gotten a lot more interesting,
and I think a lot more useful, as people married
speech act theory with discourse analysis to
make models of dialog.
Conversation for action, Winograd and Flores
Winograd, T., & Flores, F. (1986). Understanding
computers and cognition. Norwood/NJ: Ablex
The first half of this book is somewhat philosophical, and may or may not be your kettle
of tea. The second half is where, if this stuff interests you, you might find some tasty cakes.
Conversation for action,
Winograd and Flores
Winograd and Flores' "Conversation for Action" model
Winograd and Flores proposed a theoretical foundation for conversational analysis which combines a hermeneutic orientation with concepts of the philosophy
of language. They motivate their emphasis on the pragmatic aspects of interpersonal communication with their basic conception of language and cognition: The meaning of
utterances is construed during the course of social communication; knowledge is not built up via transfer of information (representations of objects in a world), but it is
the result of an interpretation in context. Thus, the social dimension is seen as essential for conversational analysis. Winograd and Flores regard the theory of speech
acts (put forward by Austin and Searle, and Habermas' theory of action [Habermas 1981]) to be initial steps towards an adequate theory of meaning, as
these theories emphasize "language as action" (in contrast to the representational function of language). They state that in human-human conversation talking and
listening are vehicles for the expression of behavioral expectations, building up a complex web of mutual commitments to determine the course of a conversation.
(§9) According to Winograd and Flores it is only on this level that the structure of conversations can be formally described. In their view, other levels are - on principle -
inaccessible to formalization. We do not adhere to such a strong point of view, but prefer to regard their approach as an attempt to describe the pragmatic aspects
of a conversation, while other aspects/levels should be taken into consideration as well.
(§10) As a prototypical example of cooperative dialogue Winograd and Flores present a so-called "Conversation for Action". They propose a model (here referred to
as the "CfA model"), which describes possible sequences of dialogue acts and their interplay in progressive dialogue states. The dialogue genre is a two-party
negotiation of one partner's intended - extra-dialogic - action and the other partner's evaluation of the result. The CfA model is the basis for the implementation of the
"Coordinator", which is a mail system for the support of cooperative work in groups [cf. Winograd, 1988].
(§11) The CfA model is represented as the traversal of a state-transition network (Figure 1) with arcs representing speech acts and nodes representing dialogue states.
The dialogue is initiated by partner A with a `request', which may be followed by B's `promise' to comply; B's proposal of a different action (`counter'); B's `reject' to
comply; or A's own `withdraw' of his previous request, etc.
Figure 1: The basic "Conversation for Action" [Winograd and Flores, 1986, p. 65]
(§12) In this way, each of A's or B's actions gives rise to a new state, which is defined by its history and by its action space (the set of possible follow-up actions). The
circles printed in boldface represent terminal states with no further action space. They differ from the non-terminal states only by the path that led to them. Even
transitions with no corresponding utterance in the dialogue are allowed and can be interpreted as acts, i.e., the dialogue is continued as if the speech act had been
uttered. For example, consent can often be inferred without an explicit "I agree" or "I'm contented". Winograd and Flores call such acts "implicit dialogue acts". On the
level of representation, these are `jumps', which are entered into the dialogue history as regular (empty) transitions.
(§13) If neither participant quits the dialogue prematurely, at some time the state of mutual acceptance of the requested extra-dialogic action is achieved (state
<3>). This state can be followed by B's `assert' (transition <3-4>) to express that his commitment has been met, but it is also possible for B to `renege' or for A to
`withdraw' his directive. In state <4>, only A can respond, either by an evaluation (one of the `declare' acts), or by a `withdraw' act.
(§14) Winograd and Flores had straightforward and simply structured conversations in mind. More complex paths or cycles are possible at two positions only:
exchange of `counter' acts (transitions <2-6>), or A's non-acceptance of B's report of execution (transition <4-3>). Embedded clarification dialogues or meta-level
dialogues are not addressed by the CfA model.
Here. Read all about it. “Figure 1” is the diagram on the previous slide.
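The network in Figure 1 can be sketched as a state-transition table. This is a simplified subset that encodes only the transitions named in the excerpt above; the state numbers for the rejection and withdrawal endpoints (7, 8) are my assumption, and the terminal-state set is abbreviated.

```python
# Minimal sketch of the Conversation-for-Action network: arcs are
# speech acts, nodes are dialogue states, per the description above.

TRANSITIONS = {
    (1, "A:request"):  2,
    (2, "B:promise"):  3,
    (2, "B:counter"):  6,
    (2, "B:reject"):   8,
    (2, "A:withdraw"): 7,
    (6, "A:counter"):  2,   # counters may cycle between <2> and <6>
    (3, "B:assert"):   4,   # B reports that the commitment has been met
    (3, "B:renege"):   7,
    (3, "A:withdraw"): 7,
    (4, "A:declare"):  5,   # A accepts the report: mutual completion
    (4, "A:reject"):   3,   # A's non-acceptance of B's report (<4-3>)
    (4, "A:withdraw"): 7,
}
TERMINAL = {5, 7, 8}

def run_dialog(acts, start=1):
    """Traverse the network; a KeyError means the grammar forbids that act."""
    state = start
    for act in acts:
        state = TRANSITIONS[(state, act)]
    return state

# A request, a promise, a report, and an accepting evaluation:
final = run_dialog(["A:request", "B:promise", "B:assert", "A:declare"])
assert final == 5 and final in TERMINAL
```

Encoding the schema this explicitly is what would let a system notice breakdowns: a dialog stuck in a non-terminal state is an unanswered request or an unfulfilled promise.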
Conversation for action, Winograd and Flores
Interesting thing 1
People are at least subconsciously aware of this “script.” It’s
not “against the rules” to answer a request with a rejection or a
counter. There are acceptable ways to do either.
Why is it so rare for our systems to answer with a counter?
“I can’t give you that, but I could give you B instead.” Instead,
they tend to say “I can’t help you,” which is rude if they really
could offer B, or serve up B anyhow as though it were what you
asked for, which amounts to a broken promise.
Interesting thing 2
You can use this diagram (and ones like it) to identify points
of potential breakdown: “communication disorders.”
Unanswered request, unfulfilled promise, etc.
Conversation for action, Winograd and Flores
My friend Tom Morgan once did this exercise for the service transactions at a gas company. He found a
place where, at the end of a phone conversation with a service representative, customers believed they
had just heard a promise but the system had no memory of having made that promise.
Interface built on a dialog model
A Conversational Model of Multimodal Interaction
Adelheit Stein, Ulrich Thiel
Sitter and Stein, Conversational Roles (COR) Model
Dialog models can get a lot more
hairy than the Conversation for
Action model. People have been
out there cooking up notations
for them. I’m not sure this is
worth digging into and
understanding if we’re seeking
foundations for interaction
design. But it might possibly be.
• The notion of a schema for certain types of
dialog, such as a “request for information dialog”,
seems very powerful.
• What if our devices, or… dare I say it… even our
operating systems, knew the dialog schema, and
could detect breakdowns such as unfulfilled
promises or requests going without response?
Time to wrap up
My main points
• At the very least, it can be useful for interaction designers to
look at design challenges as an exercise in language design,
and to look at their designs in use as dialogs in that language.
• I suspect that a little hard work might tease out methods we
would all find useful. To me the dialog models are promising.
• We must tie our “surface structures” to the underlying
meanings/deep structures in the software. But a direct
mapping is shallow, confusing, sucky. Working in this way
could lead to interesting collaborations with developers.
• Basically, I think there’s a pony in there. No one has dug it out yet.
• And hey, this isn’t only for software, and it isn’t only about
verbal and visual language. It’s for physical forms, motion,
gesture,… anything that expresses meaning.
Well, that’s done.
Hey thanks. It’s nice to be
able to talk about this stuff.
Since I’m putting this up on the
web (finally), I thought I would
tack some scraps, parts, and
such onto the end. It’s going to
get less coherent from here on
out. You’re mostly on your own.
We are increasingly mixing elements of interaction design
language with human language. For example:
• Underlining just about any word to show it is a control for
following a “link” (an entry in the lexicon of languages for
interaction which is making its way into the dictionaries of
human languages).
• Making photographs and images controls for following links.
• But there are many kinds of links, for which we are missing
shared meanings and conventional design language
vocabulary. For example, this page treats links to categories
exactly like links to detail pages. Different meaning, same
symbol. Is that good or bad?
• the playing grid
• the status bar
• the menus
• the behavior of a square in
response to mouse clicks
!! Hey… It seems useful to think of a
package of behaviors as a
linguistic construct—a fragment
of a conversation. A series of
animations is a “sentence.” An
animated state change is a
“sentence.” The ripple of state
changes across the
Minesweeper board is practically
a whole paragraph.
the power of language
• the same underlying structures can generate many
surface structures
• that is, you could build many games from the identical
underlying abstractions. Simply swap lexicons, and you
can “converse” about something new.
minesweeper / bunny hunt

Size
  Minesweeper: size of the window
  Minesweeper 3D: size of the 3D landscape
  Bunny Hunt: size of the window
Squares
  Minesweeper: tiles in the window
  Minesweeper 3D: faint grid on the landscape
  Bunny Hunt: faint grid in the “grass”
Mines
  Minesweeper: little mine pictures
  Minesweeper 3D: buzz of sound from your mine-…
  Bunny Hunt: cute little bunny pictures
A square
  Minesweeper: a clickable tile
  Minesweeper 3D: a 3D rectangle of realistic landscape
  Bunny Hunt: a cute little plot of grass
Mark a square
  Minesweeper: put a “flag” or question mark on a square
  Minesweeper 3D: pull a stake from your belt with yellow or red ribbon; stick it in…
  Bunny Hunt: put a cute little carrot or a cute little stuffed bunny in a square
Hit a mine
  Minesweeper: mine “explodes” – turns…
  Minesweeper 3D: loud explosion effect, body parts fly all over
  Bunny Hunt: a cute little bunny hops away, ever…
Win game
  Minesweeper: timer stops, smiley face gets…
  Minesweeper 3D: you reach the far side of the field, your drill instructor shouts praises in your face
  Bunny Hunt: all the cute little bunnies come out to eat the cute little carrots; bells jingle
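The lexicon-swap idea can be sketched directly in code: keep one stream of underlying game events (the deep structure) and render it through interchangeable lexicons. The entries below are paraphrased from the comparison above; the function and key names are my own illustration.

```python
# Same underlying abstractions, two surface lexicons: swap the
# lexicon and you "converse" about something new.

MINESWEEPER = {
    "mark_cell": 'put a "flag" or question mark on a square',
    "hit_hazard": 'a mine "explodes"',
    "win": "timer stops, smiley face celebrates",
}
BUNNY_HUNT = {
    "mark_cell": "put a cute little carrot in a square",
    "hit_hazard": "a cute little bunny hops away",
    "win": "the bunnies come out to eat the carrots; bells jingle",
}

def narrate(events, lexicon):
    # The event stream is the deep structure; the surface depends
    # entirely on which lexicon renders it.
    return [lexicon[e] for e in events]

game = ["mark_cell", "hit_hazard"]
print(narrate(game, MINESWEEPER))
print(narrate(game, BUNNY_HUNT))
```

Nothing about the game logic changes between the two columns; only the mapping from meanings to symbols does.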
For a while I was gathering photographs of as many ATM interfaces as I could, taking
pictures of each step of the same transaction. It seems like a good idea to gather a
bunch of data like this, because it’s a way to look at many different surface structures
(generated from different “lexicons” and “syntaxes”) for essentially the same
underlying semantics and social context. This effort stopped when I left CMU.
Dad’s phone Son’s phone
Yes, this is what my phone
looked like in 2003.
what people see and do in real life: adult heats water for own tea at home
Working toward alignment: annotating task steps for later, or
zooming in on one task, just for example
required information Awareness that water is heating. Awareness that water has reached desired
temperature. Question: typical kettles issue alert only for boiling temperature.
This is different than, say, the temperature for a young child’s cocoa. Do people
require information about current water temperature? If so, in what units (since
many people don’t know their desired tea temperature in degrees)?
required knowledge or skills Appropriate temperature for making tea
people, relationships Others in the room and the house. Note the conflict of interest between
sufficient notification and reluctance to disturb others.
characteristics of success Wait time < max threshold (TBD). No one is injured. Tea kettle and stove are
undamaged. Tea-maker is aware that water is hot enough for tea. Others in
house are not annoyed, perhaps even pleased.
barriers to success No water in kettle. Water is boiling, but tea-maker is unaware. In the worst case this
also leads to a flame under a kettle with no liquid. Stove heat set too low,
extending time or making desired temperature impossible. If people (especially
children) are unaware that the kettle is getting hot, they could burn themselves.
Impatience could cause someone to stop the process before the desired
temperature is reached.
cognitive task Monitoring. Peripheral awareness. Time estimation. For cooking tasks, possible
coordination with other processes.
ongoing concerns Safety. No damage to appliances, utensils, or surfaces. Adequately fast heating.
Wait until water is boiling
what people see and do in real life: heat water for tea
technology, materials, information, capabilities
Working toward alignment: align product qualities with steps
(Diagram of product qualities aligned with task steps. Legible fragments: fits standard stove
top, with room to fit other pots on the stove; range of sizes; fits standard burner; indication
that heating is in progress; good grip in hand; indication it is hot; hard to tip over; does not spill.)
what people see and do in real life:
heat water for tea -- wait until boiling
Now we can design the language of the product, aligning its deep qualities
and capabilities with symbols that communicate those meanings well.
Minimize burns. Indication
heating is in progress.
Good conductivity. Pleasant
alert. Adjustable volume on
alert. Also consider a “heat is
turned on” indication.