ISBN-10: 0321185765
ISBN-13: 9780321185761
Pub. Date: 02/16/2004
Publisher: Addison-Wesley
Voice User Interface Design / Edition 1

by Michael H. Cohen, James P. Giangola, Jennifer Balogh

Paperback

Original price: $64.99

Overview

This book is a comprehensive and authoritative guide to voice user interface (VUI) design. The VUI is perhaps the most critical factor in the success of any automated speech recognition (ASR) system, determining whether the user experience will be satisfying or frustrating, or even whether the customer will remain one. This book describes a practical methodology for creating an effective VUI design. The methodology is scientifically based on principles in linguistics, psychology, and language technology, and is illustrated here by examples drawn from the authors' work at Nuance Communications, the market leader in ASR development and deployment.

The book begins with an overview of VUI design issues and a description of the technology. The authors then introduce the major phases of their methodology. They first show how to specify requirements and make high-level design decisions during the definition phase. They next cover, in great detail, the design phase, with clear explanations and demonstrations of each design principle and its real-world applications. Finally, they examine problems unique to VUI design in system development, testing, and tuning. Key principles are illustrated with a running sample application.

A companion Web site provides audio clips for each example: www.VUIDesign.org

The cover photograph depicts the first ASR system, Radio Rex: a toy dog who sits in his house until the sound of his name calls him out. Produced in 1911, Rex was among the few commercial successes in earlier days of speech recognition. Voice User Interface Design reveals the design principles and practices that produce commercial success in an era when effective ASRs are not toys but competitive necessities.



Product Details

ISBN-13: 9780321185761
Publisher: Addison-Wesley
Publication date: 02/16/2004
Edition description: New Edition
Pages: 368
Sales rank: 422,265
Product dimensions: 7.00(w) x 9.00(h) x 0.80(d)

About the Authors

Michael Cohen is the cofounder of Nuance Communications. He has played a variety of roles at Nuance, including creation of the Professional Services organization and the Dialog Research and Development group. Michael is a popular speaker and a consulting professor at Stanford University. He has published more than seventy papers and holds eight patents.

James Giangola is an industrial linguist who designs, researches, and mentors others in creating VUIs that reflect the linguistic features and principles that shape everyday human-to-human conversations. An innovator in prompt-writing and dialog design, James has ten years of experience teaching languages and linguistics, and maintains a consulting practice.

Jennifer Balogh is a speech consultant at Nuance Communications, where she designs and evaluates interfaces for spoken language systems. She also conducts research on dialog design techniques and holds several patents. Jennifer is a university lecturer and frequent contributor to conferences and journals.



Read an Excerpt

In the past decade, there has been an explosion in the creation and commercial deployment of voice user interfaces, especially for use over the telephone. Voice user interfaces (VUIs) use speech technology to provide callers with access to information, allow them to perform transactions, and support communications.

This proliferation is driven by a number of factors: customer dissatisfaction with touchtone interactions, a growing desire for mobile access, the need for enterprises to more effectively and inexpensively meet their customers' needs, and the development of speech technology that is finally robust and reliable enough to deliver effective and reliable interaction in well-defined domains.

At the beginning of this decade of growth (starting around 1994), the biggest obstacle that needed to be overcome was skepticism about the capabilities of the technology. Speech technology had been promised for decades and had disappointed many times. The enterprises that could potentially improve their customer service and save money, as well as the venture capital firms that could fund the early start-ups, needed proof points. Within a few years, many such proof points existed: Millions of phone calls every day were being handled successfully by speech technology. Although technology improvement will continue to play a key role in providing better experiences for end users, increased business value to enterprises, and new capabilities enabling new types of applications, it is no longer the key bottleneck to growth of the speech industry.

The biggest challenge now is the design of the user interface. There are too few practitioners who have the knowledge and skills to create all the systems needed and to advance our understanding as new technology enables new capabilities. Current practitioners come from a wide variety of backgrounds: speech technology, user interface design, cognitive psychology, linguistics, and software development. All these fields have contributed to our current level of understanding of voice user interface design. In fact, the field has benefited substantially from the diversity of influences. However, the need to pull together information from diverse fields has made it difficult to codify and teach the rationale for design.

In this book we aim to offer in one place much of the background information needed for practitioners to design specific applications and to contribute to the advance of the field. We try to take a principled approach to deriving best practices, with the hope that designers will then have a basis for approaching new design situations and new technologies.

Organization of the Book

Given that our primary focus is to teach VUI design, we chose to organize it according to the design methodology we recommend. The parts are as follows:

Part I, introduction: Chapters 1-3 provide the necessary introductory material, including an overview of voice user interfaces and design issues, a description of the technology, and a high-level view of the design methodology we detail throughout the remainder of the book.

Part II, definition phase: Chapters 4-7 cover the definition phase of a project: discovering the requirements and making high-level design decisions that will guide the detailed design.

Part III, design phase: Chapters 8-14 cover the detailed design phase. Design principles are covered in detail, with many examples of how to apply them to real applications.

Part IV, realization phase: Chapters 15-18 cover the realization phase: development, testing, and tuning. A number of issues that are unique to voice user interface design, such as grammar development, are covered in detail.

Each part begins with a chapter covering the methodological details for that phase of design. Following that are a few chapters describing the design principles and approaches relevant to that phase. The final chapter of each section presents a design example.

Audience

This book is meant to address a variety of audiences:

  • Practitioners: The primary audience is practitioners or those intending to become practitioners. We try to lay the groundwork so that beginners can understand all the material. The book should provide value to both experienced and inexperienced designers. Practitioners can benefit from reading all the chapters.
  • Students of human-computer interfaces: Students studying human-computer interface design will find that VUIs have many things in common with other types of user interfaces. On the other hand, a number of issues and design approaches are unique to voice user interfaces. The entire book is useful to students, although Chapters 1-4, 6, 8-13, and 15-16 stand out as most important.
  • Business managers: Business managers making decisions about how speech technology can best meet the needs of their organizations will benefit most from Chapters 1-4 and 6.
  • Project managers: Project managers who must understand the steps in designing and deploying an application will benefit most from Chapters 1-4, 6-8, 14-15, and 18.

Web Site

Throughout the book we stress the value of listening to your interface rather than only looking at a written specification. We would be remiss if we did not provide a means by which readers could listen to the many examples presented. We have created a Web site (http://www.VUIDesign.org) that tracks the book and provides audio versions of the examples. We recommend following the Web site as you read.

Table of Contents



    About the Authors and Radio Rex.


    Preface.

    I. INTRODUCTION.


    1. Introduction to Voice User Interfaces.

    What Is a Voice User Interface?

    Why Speech?

    Where Do We Go from Here?


    2. Overview of Spoken Language Technology.

    Architecture of a Spoken Language System.

    The Impact of Speech Technology on Design Decisions.

    Conclusion.


    3. Overview of the Methodology.

    Methodological Principles.

    Steps of the Methodology.

    Applying the Methodology to Real-World Applications.

    Conclusion.

    II. DEFINITION PHASE: REQUIREMENTS GATHERING AND HIGH-LEVEL DESIGN.


    4. Requirements and High-Level Design Methodology.

    Requirements Definition.

    High-Level Design.

    Conclusion.


    5. High-Level Design Elements.

    Dialog Strategy and Grammar Type.

    Pervasive Dialog Elements.

    Conclusion.


    6. Creating Persona, by Design.

    What Is Persona?

    Where Does Persona Come From?

    A Checklist for Persona Design.

    Persona Definition.

    Conclusion.


    7. Sample Application: Requirements and High-Level Design.

    Lexington Brokerage.

    Requirements Definition.

    High-Level Design.

    Conclusion.

    III. DESIGN PHASE: DETAILED DESIGN.


    8. Detailed Design Methodology.

    Anatomy of a Dialog State.

    Call Flow Design.

    Prompt Design.

    User Testing.

    Design Principles.

    Conclusion.


    9. Minimizing Cognitive Load.

    Conceptual Complexity.

    Memory Load.

    Attention.

    Conclusion.


    10. Designing Prompts.

    Conversation as Discourse.

    Cohesion.

    Information Structure.

    Spoken Versus Written English.

    Register and Consistency.

    Jargon.

    The Cooperative Principle.

    Conclusion.


    11. Planning Prosody.

    What Is Prosody?

    Functions of Prosody.

    Stress.

    Intonation.

    Concatenating Phone Numbers.

    Minimizing Concatenation Splices.

    Pauses.

    TTS Guidelines.

    Conclusion.


    12. Maximizing Efficiency and Clarity.

    Efficiency.

    Clarity.

    Balancing Efficiency and Clarity.

    Conclusion.


    13. Optimizing Accuracy and Recovering from Errors.

    Measuring Accuracy.

    Dialog Design Guidelines for Maximizing Accuracy.

    Recovering from Errors.

    Conclusion.


    14. Sample Application: Detailed Design.

    Call Flow Design.

    Prompt Design.

    User Testing.

    Conclusion.

    IV. REALIZATION PHASE: DEVELOPMENT, TESTING, AND TUNING.


    15. Development, Testing, and Tuning Methodology.

    Development.

    Testing.

    Tuning.

    Conclusion.


    16. Creating Grammars.

    Grammar Development.

    Grammar Testing.

    Grammar Tuning.

    Conclusion.


    17. Working with Voice Actors.

    Scripting for Success.

    Choosing Your Voice Actor.

    Running a Recording Session.

    Conclusion.


    18. Sample Application: Development, Testing, and Tuning.

    Development.

    Testing.

    Tuning.


    19. Conclusion.

    Appendix.

    Bibliography.

    Index.

