Voice Activated Phone System - Human Computer Interaction Report

Applying HCI Design Principles for a Voice Activated Home Cordless Phone System

Abstract

This report applies HCI design principles to develop a method for controlling the operation of a telephone, specifically a cordless telephone. The method concerns the implementation of the user interface of a cordless telephone. We have decided that a voice-controlled interface is more advantageous than the regular keypad interface. Cordless phones allow free movement around the home, and we would like to build on this advantage by introducing a voice activated phone system. The method developed will hereinafter be called VAPS.

Course Name: Computer Science (BSc)

Module: Human-Computer Interaction

Module Code: COMP2006

Academic Year: 2005/06

Table of Contents

Design

Introduction

Key HCI Issues

Design Options

Environment

Placing and answering calls

Performing directory activities

Accessing voice mail

Alternative 1: Remote Control with menu interface

Alternative 2: LED Display

Alternative 3: Voice / Sound Feedback

Feasibility of the alternatives

Alternative 1: Remote Control with menu interface

Alternative 2: LED Display

Alternative 3: Voice / Sound feedback

Justification of Alternative 3 as the interaction Style

Accessibility

Requirements Elicitation

Evaluation

Usability Testing

Conclusion

References

Table of Figures

Figure 1 Class Diagram depicting main components

Figure 2 State Chart Diagram depicting calling and answering phone calls

Figure 3 Remote Control Example

Figure 4 Use Case for Graphical-based LED Display

Figure 5 Example showing how Use Cases can produce Requirements

Figure 6 Guidelines for Usability Testing

Figure 7 Introductory Questions for Questionnaire

Figure 8 Questions about each system for Questionnaire

Figure 9 Questions comparing the three systems for Questionnaire

Design

Introduction

The aim is to design an interactive system for maintaining and using a cordless phone at home. Assuming a cordless phone is already in place, a module shall be added to its existing program to provide a voice-enabling utility. It is geared towards people with disabilities (for example, people who are quadriplegic or visually impaired) and those who simply prefer the convenience of a hands-free phone. The intended user is someone who is familiar with using a basic telephone, and those selected to evaluate the prototypes will fall under this category.

Key HCI Issues

Originality: Generally speaking, most telephones do not offer this functionality, and they lack features that make them easy to use for a person with a disability. If one has voice mail, for example, the phone must be picked up within a certain number of rings before the call goes to voice mail. This might prove difficult for a person with limited mobility.

Analysis: The actors are:

The user of this system - residents at home.
The interface - provides feedback to the user and should be convenient. It has three components:

Voice Recognition Engine (houses the word recognizer, the dictionary database)
Controller (the software program)
The Hardware (the actual phone and its components including the microphone)

The user actions consist of placing and answering calls, entering names in their phone directory, and obtaining voice messages via voice commands.

Usability: (Attributes) This is an add-in to an existing phone, and thus requires learning. The additional software will focus on usability, and HCI knowledge, principles and methods will guide all design decisions. For most users, understandability, familiarity, ease of performing simple tasks such as making a call, and a usable interface are the most necessary features, and achieving them may sometimes require tradeoffs in uniformity, simplicity and efficiency. However, HCI ...

Consistency
Speaking the user’s language
Avoiding cognitive overload
Preventing errors
Helping the user get started

The system should be extremely easy to use for experienced users and intuitive for new users. The issues involved are the accessibility of the system, the performance of the system and the intuitiveness of the interface. Our goal is for this module to integrate easily and to scale well enough that more elaborate programs can be built on it in the future. This new system has been named the Voice Activated Phone System (VAPS).

Design Options

The major design options concentrate on the user interface, which is responsible for receiving input from the user and producing output for the user; that is, it communicates with the user. How it responds is the question. Before discussing the alternatives, however, some background on the design is needed:

The Voice Recognition Engine (VRE): This consists of the Word Recognizer, which has a dictionary database. It should be flexible, and the built-in commands should make “common sense” so that they are easy to learn. It shall exist on both the base and the cordless handset. The VRE evaluates the commands and returns them to the Controller. Analysis time should not exceed 5 seconds.

Controller: The software program that dispatches the commands so that they are carried out by the correct object/class.

Hardware: The hardware device will be attached to the base of the cordless phone via an audio cable connector. Microphones will exist on both the base and the handset, and both can communicate with the hardware device. They pick up the voice commands and send them to the VRE. Furthermore, the hardware device is detachable and can thus be used on other telephones.

Figure 1 Class Diagram depicting main components
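
The relationship between the VRE and the Controller can be sketched in code. The following is a minimal illustration in Python, not the actual VAPS implementation: the class and method names (`VoiceRecognitionEngine`, `Controller`, `recognize`, `register`, `dispatch`) are hypothetical, and recognition is simulated on text rather than on audio from the microphones.

```python
class VoiceRecognitionEngine:
    """Hypothetical stand-in for the VRE: matches text against a command dictionary."""

    def __init__(self, commands):
        # The "dictionary database": the finite set of command phrases.
        self.commands = [c.lower() for c in commands]

    def recognize(self, utterance):
        """Return (command, argument) if the utterance starts with a known command, else None."""
        text = " ".join(utterance.lower().split())
        for command in self.commands:
            if text == command or text.startswith(command + " "):
                return command, text[len(command):].strip()
        return None


class Controller:
    """Hypothetical controller: routes each recognized command to its handler."""

    def __init__(self, engine):
        self.engine = engine
        self.handlers = {}

    def register(self, command, handler):
        self.handlers[command.lower()] = handler

    def dispatch(self, utterance):
        result = self.engine.recognize(utterance)
        if result is None:
            return None  # not a command: ignore it
        command, argument = result
        handler = self.handlers.get(command)
        return handler(argument) if handler else None


engine = VoiceRecognitionEngine(["Call", "End Call", "Hello"])
controller = Controller(engine)
controller.register("Call", lambda who: "dialing " + who)
print(controller.dispatch("Call Hospital"))
```

In a real device the handlers would drive the phone hardware; here they only return strings so the flow from spoken command to action is visible.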

Environment

The environment of the system relates to the application that the user interacts with in order to perform the following functions:

Placing and answering calls

The user should be able to place a call via voice command. The voice command shall consist of two words: “Call X”, where X is the name of a person in the directory or a location in memory such as the Hospital. X can also be a telephone number: “Call 5554321”. On hearing the command, the phone should route the call.

When answering a call, the phone should announce who is calling: “Call by X”, where X is the name of the individual on the caller ID. If the caller is unknown, it shall say “Call by Unknown”. Once the user says “Hello”, the phone shall pick up. To end a call, the user should state “End Call”.
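
As an illustration of how a “Call X” command might be resolved, the sketch below treats X as a number if it consists only of digits (optionally separated by hyphens or spaces) and otherwise looks it up in the directory. The function name `resolve_call_target`, the dictionary-based directory and the sample number for the Hospital are assumptions made for this example only.

```python
def resolve_call_target(command, directory):
    """Return the number to dial for a 'Call X' command, or None if it cannot be resolved."""
    words = command.strip().split(maxsplit=1)
    if len(words) != 2 or words[0].lower() != "call":
        return None  # not a "Call X" command
    target = words[1].strip()
    digits = target.replace("-", "").replace(" ", "")
    if digits.isdigit():
        return digits  # X was spoken as a telephone number
    # Otherwise X is a name or stored location; names are keyed in lower case.
    return directory.get(target.lower())


# Hypothetical directory contents, for illustration only.
directory = {"hospital": "5550000"}
print(resolve_call_target("Call Hospital", directory))
print(resolve_call_target("Call 5554321", directory))
```

A name that is not in the directory resolves to `None`, which the system could report back to the user as an error.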
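
The calling and answering behavior can be expressed as a small state machine. This is a sketch under assumptions: the three state names and the `CallStateMachine` class are introduced here for illustration, and dialing is assumed to connect immediately.

```python
from enum import Enum


class PhoneState(Enum):
    IDLE = "idle"
    RINGING = "ringing"
    IN_CALL = "in call"


class CallStateMachine:
    """Hypothetical model of placing, answering and ending calls by voice."""

    def __init__(self):
        self.state = PhoneState.IDLE

    def incoming_call(self, caller_name=None):
        """Start ringing and return the announcement the phone speaks."""
        if self.state is not PhoneState.IDLE:
            return None
        self.state = PhoneState.RINGING
        return "Call by " + (caller_name or "Unknown")

    def hear(self, command):
        """Apply a spoken command; return True only if it was valid in the current state."""
        text = " ".join(command.lower().split())
        if self.state is PhoneState.RINGING and text == "hello":
            self.state = PhoneState.IN_CALL
            return True
        if self.state is PhoneState.IN_CALL and text == "end call":
            self.state = PhoneState.IDLE
            return True
        if self.state is PhoneState.IDLE and text.startswith("call "):
            self.state = PhoneState.IN_CALL  # assumption: the call connects at once
            return True
        return False


phone = CallStateMachine()
print(phone.incoming_call("Jane"))
print(phone.hear("Hello"), phone.state)
```

Because each command is only accepted in the state where it makes sense, saying “Hello” while the phone is idle has no effect.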

Figure 2 State Chart Diagram depicting calling and answering phone calls

Performing directory activities

This is limited to storing a name and number; deleting or editing an entry must be done manually. The user should be able to start a directory entry by stating “New Entry”. The name shall then be spelt out: “J – A – N – E”, followed by “Jane” so that the system recognizes the pronunciation; this also signals the end of name entry. Then the number (as one would dial it) shall be stated: “5-5-5-4-3-2-1”, followed by “Save” to let VAPS know the entry is complete. If the user makes a mistake during the procedure, they should state “Cancel” and the procedure shall restart.
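
The directory entry procedure can be sketched as a function that consumes the spoken tokens in order. The function name `record_directory_entry` is hypothetical, the input is assumed to be already-recognized words, and “Cancel” is assumed to clear the entry and restart it without requiring “New Entry” to be spoken again.

```python
def record_directory_entry(utterances):
    """Process the tokens of one 'New Entry' session; return (name, number) or None."""
    letters, name, digits = [], None, []
    started = False
    for raw in utterances:
        word = raw.strip().lower()
        if word == "new entry":
            started = True
            letters, name, digits = [], None, []
        elif not started:
            continue  # ignore everything until "New Entry" is heard
        elif word == "cancel":
            # Assumption: Cancel discards the input so far and restarts the entry.
            letters, name, digits = [], None, []
        elif word == "save":
            if name is not None and digits:
                return name, "".join(digits)
            return None  # nothing complete to save
        elif name is None and len(word) == 1 and word.isalpha():
            letters.append(word)  # one spelt-out letter of the name
        elif name is None and letters and word.isalpha():
            # The full spoken name gives the pronunciation and ends name entry.
            name = "".join(letters).capitalize()
        elif name is not None and len(word) == 1 and word.isdigit():
            digits.append(word)  # one digit of the number
    return None


tokens = ["New Entry", "J", "A", "N", "E", "Jane",
          "5", "5", "5", "4", "3", "2", "1", "Save"]
print(record_directory_entry(tokens))
```

A real system would also store the recorded pronunciation of “Jane” for later matching; this sketch keeps only the spelling and the number.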

Accessing voice mail

To access voice mail, the user states “Messages”. The most recent message will play first. After each message there shall be a pause during which the user can “Delete” or “Save” the message. During message playback, the user may state “Skip” or “Stop”.

Our alternatives concentrate on the voice-controlled interface: how will the user communicate with, and receive feedback from, the system?

Alternative 1: Remote Control with menu interface

This is a large remote control with a single menu command-line interface. The user has the benefit of using both voice prompts and the remote control itself to control the system. The remote control responds to voice commands by highlighting the command spoken (as seen in Figure 3 with End Call). However, commands such as Skip, Stop, Save and Cancel will not be highlighted, as these are command buttons. A joystick can be used to scroll through the commands, which also highlights them. Pressing the selector confirms the choice. If the user use

Alternative 2: LED Display

The LED Display consists of three sets of lights (red, yellow, green) that give feedback that a command was recognized and is being carried out. Red indicates the “OFF” state, green (“GO”) indicates that a command is being carried out, and yellow indicates that the system is on standby, waiting for a command. There shall also be a series of numbers 1-9 on the LED Display, each representing a built-in command. For example, “1” may represent Answer Call, so if “Green 1” is showing, the user will know that the call is being answered.
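
The color and number scheme can be illustrated with a small lookup. This is only a sketch: the state labels (`off`, `standby`, `executing`) and the function names are assumptions, and only slot 1 is taken from the description above, so slots 2-9 are left unassigned.

```python
# Colors are from the description; the state labels used as keys are assumed.
LED_COLORS = {"off": "Red", "standby": "Yellow", "executing": "Green"}

# Only slot 1 is given in the description; slots 2-9 are not assigned here.
COMMAND_SLOTS = {1: "Answer Call"}


def led_display(state, slot=None):
    """Return what the display shows for a system state and optional command slot."""
    color = LED_COLORS[state]
    if state == "executing" and slot in COMMAND_SLOTS:
        return f"{color} {slot}"
    return color


def read_display(shown):
    """Return the command a user would infer from the display, or None."""
    parts = shown.split()
    if len(parts) == 2 and parts[0] == "Green" and parts[1].isdigit():
        return COMMAND_SLOTS.get(int(parts[1]))
    return None


print(led_display("executing", 1))
print(read_display("Green 1"))
```

The `read_display` step is the part the user must perform mentally, which is why this alternative requires memorizing the number key.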

Figure 3 Remote Control Example

Figure 4 Use Case for Graphical-based LED Display

Alternative 3: Voice / Sound Feedback

This is more of a virtual interface, in that no graphical display is provided. Instead, the system shall respond via voice or sound feedback. For example, if the user states “Hello” whilst the phone is ringing, the system may respond with a beep. If the user states “Call Hospital”, the system will echo the command to show that it understood: it too will state “Call Hospital”. If there is an error, the system will place “Error” before the last command spoken by the user. So if the user stated “Call 555-1234” and there was an error in dialing, the system will reply “Error Call 555-1234”.
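
The feedback rules above can be summarized in a few lines. This is a sketch, not the VAPS implementation: the function name `feedback` is hypothetical, and the string `<beep>` is a placeholder standing in for an actual tone.

```python
def feedback(command, succeeded=True):
    """Return what the system says (or sounds) in response to a spoken command."""
    spoken = " ".join(command.split())
    if not succeeded:
        return "Error " + spoken  # the error marker is placed with the last command
    if spoken.lower() == "hello":
        return "<beep>"  # placeholder for the beep that acknowledges answering
    return spoken  # echo the command back to show it was understood


print(feedback("Call Hospital"))
print(feedback("Call 555-1234", succeeded=False))
```

Echoing the command in the user's own words means the output vocabulary is the same as the input vocabulary, so there is nothing extra to learn.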

Feasibility of the alternatives

This section outlines the feasibility of the three alternatives described above.

Alternative 1: Remote Control with menu interface

This alternative is feasible and can be built for this project. Being able to use both voice and a keypad is a bonus, but the user already has the option of using the telephone keypad itself. The user will also have to keep track of the remote control’s location. Moreover, our system is geared towards people with disabilities, and the joystick will be of no use to users with limited mobility.

Alternative 2: LED Display

This alternative is simple, consistent and uniform. However, it requires users to familiarize themselves with the key for the stored commands; for example, they will need to know that “1” means answering a call. Moreover, this interface is of little use to the visually impaired.

Alternative 3: Voice / Sound feedback

This alternative may prove difficult to implement, but appears to be the most beneficial. In choosing, the first decision I faced was whether to focus on visual or auditory feedback. I decided on the latter, as this was in line with the purpose of VAPS. When reviewing how VAPS will work, I placed myself in the user’s role, and I tended to use audio inputs and outputs to determine the behaviors and actions of VAPS.

Justification of Alternative 3 as the interaction Style

The user enters a command by speaking to the system, and the system performs it, providing feedback via a beep or an echo. This provides both speed and flexibility. The user can turn voice/sound feedback on or off, or limit feedback to beeps only.

As with the other alternatives, commands have to be remembered. Nevertheless, they observe the HCI principle of speaking the user’s language: commands are in natural language, mimicking everyday vocabulary. For example, stating “Hello” when the phone rings answers a call, and “End Call” ends a phone call. This eliminates the use of menus and submenus. Commands are finite, and input from the user is thus constrained.

There is simplicity and consistency between input and output with the voice echo feedback. The user does not have to decipher what the system means.

Accessibility

VAPS falls under the category of assistive technology, as it caters for those who have limited mobility or are visually impaired. However, it is universal and can accommodate any user with busy hands. It is easy to understand, simple and intuitive, and requires little physical effort. As the commands are finite, language is not a barrier, since they can easily be translated. In addition, finite commands act as a means of error control. The microphone may pick up sounds from its surrounding environment, such as television, radio, kitchen devices and conversation; however, the system will ignore extra audio unless it matches a command. The command also has to be relevant to an action taking place: if the phone is not ringing and the user states “Hello”, nothing occurs.

However, the interface we chose does exclude a certain group of people: those with limited hearing and/or speech impairments, as voice input control is required. Perhaps in the future, I will combine the Voice/Sound Feedback with the LED Display.
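
One way to picture this error control is a table of which commands are relevant in which context, applied to everything the microphone overhears. The grouping of commands by context and the names `VALID_COMMANDS` and `filter_audio` are assumptions made for this sketch; the overheard phrases are given as text.

```python
# Assumed grouping of the finite command set by the context in which each is relevant.
VALID_COMMANDS = {
    "idle": {"new entry", "messages"},
    "ringing": {"hello"},
    "in call": {"end call"},
}


def filter_audio(phrases, context):
    """Keep only the overheard phrases that are commands relevant to the context."""
    allowed = VALID_COMMANDS[context]
    accepted = []
    for phrase in phrases:
        text = " ".join(phrase.lower().split())
        if text in allowed:
            accepted.append(text)
        elif context == "idle" and text.startswith("call "):
            accepted.append(text)  # "Call X" carries an argument, so match the prefix
    return accepted


overheard = ["Hello", "and now the weather", "End Call"]
print(filter_audio(overheard, "idle"))
print(filter_audio(overheard, "ringing"))
```

A limitation of prefix matching is that background conversation beginning with “call” would still be picked up while the phone is idle, which a real system would need to handle, for example by confirming the echoed command.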

Requirements Elicitation

We are designing an update to an existing product, so we need to align it with the present functionality. Our non-functional requirements will be based on the constraints the existing system imposes. For example, we cannot add a new function without changing the existing system. We shall study the manual of a telephone to learn how to perform the procedures VAPS is to carry out, such as answering a call.

To gather requirements, I will interview potential users at their homes to ascertain how they go about performing tasks such as placing or answering calls, and what they would like improved. I shall also use questionnaires so that I can reach a wide range of individuals. All interviewees must own a home phone and use it frequently. At least 50% of the people interviewed should have one or more of the following characteristics:

A physical disability / limited mobility
A visual impairment

A user problem statement will then be written, and from this I shall develop use cases. Use cases model the system’s functional behavior. They will put us in the driver’s seat and allow us to examine different scenarios. Scenarios will act as storyboards that define the system’s functionalities. Furthermore, we will be able to discover error-prone situations early (what may occur and what we can do about it). As we discover the functionalities, we can create the user and functional requirements and track them with the use cases. Figure 5 is an example.

I believe the drawbacks to this method, if any, are limited. It is crucial for the programmers who receive the system’s functionalities, and it allows the designers to be thorough.

Evaluation

Usability Testing

We have decided to use the informal HCI method of direct observation for testing. The three alternative systems will be tested by a group of potential users in order to determine how usable (easy to learn and use) they are. The aim of usability testing is for test participants to identify flaws in the design, confusing sections and navigation problems. The users selected will be those who are familiar with basic cordless telephones.

It will involve being in the room and observing the participants as they carry out specific assigned tasks. It is a cheap and effective method. However, it may create a Hawthorne effect, where users modify their behavior in response to my presence and/or facial expression. Furthermore, it assumes that my attention is on the user’s activity 100% of the time, and this may not always be the case. To limit these negative points, I have set up some guidelines, which can be seen in Figure 6.

In addition to the above guidelines, a questionnaire is used instead of an interview because there will be one participant at every session and only one person available to conduct interviews. Ideally, the questionnaire will have no more than 20 questions and use even-point scales (forcing commitment, with no midpoint). Participants will be asked to answer questions after using each system to prevent confusion. The questionnaire shall ask specific questions about each system and then comparison questions relating to all three systems. See Figure 7, Figure 8 and Figure 9.

Conclusion

Using HCI, we have attempted to create an intuitive design that is simple to learn and understand. Walkthroughs during our evaluation, coupled with a questionnaire that uses different rating scales, will essentially tell us whether we are right. However, it is apparent that we exclude certain users by relying on voice / sound feedback, and our design should be expanded to include a graphical display that presents the audio cues visually.

Figure 5 Example showing how Use Cases can produce Requirements

Figure 6 Guidelines for Usability Testing

Figure 7 Introductory Questions for Questionnaire

Figure 8 Questions about each system for Questionnaire

Figure 9 Questions comparing the three systems for Questionnaire

References

Able Phone. (n.d.) AblePhone. Retrieved December 4th from <http://www.ablephone.com/>
Dr. Gary Wills et al. (2004), COMP2006 Human Computer Interaction (HCI) Class notes. Retrieved December 4th, 2005 from <>
Dix et al. (2003), Human-Computer Interaction, 3rd Edition. Prentice Hall.
Dynamic Living. (2005). Voice Activated Telephone. Retrieved December 3rd, 2005 from < >
Google Images. (n.d.) “Cordless Telephone”. Retrieved December 6th, 2005 from
Havisto et al. (Jan 26, 1999). Method and Apparatus for Controlling a Telephone with Voice Commands. Background of the Invention. Retrieved December 5th, 2005 from <>
Lowgren, Jonas. (n.d.) HCI Techniques and Wisdom. Retrieved December 5th, 2005 from <>