Smart Assistants and Forgiveness

Human Computer Interaction Case Study

This project lasted about three months. We studied two smart assistants, Google Home and Jibo, and compared how forgiving people were toward each when it made mistakes.

Project Overview

Introduction

Humans, for the most part, are forgiving of unintentional mistakes. Giving a driver the wrong directions to a restaurant probably won’t ruin a friendship, nor will ignoring a requested song at a party cause a tantrum. But as technology becomes increasingly intertwined with our lives, the question arises: do humans exhibit the same leniency when relying on robots to complete tasks? Personal robots, also known as smart assistants, have become commonplace devices because of their ability to handle numerous everyday tasks for us. They are not perfect, however, and their potential to misinterpret directions or fail at a given task undoubtedly affects how we interact with them; still, this effect is limited to a certain extent because, after all, they are just robots.

Enter Jibo, a smart assistant that looks at you when you speak with it, nods and dances, and even tells jokes. Jibo’s distinguishing characteristic is anthropomorphism, the projection of human-like qualities onto otherwise non-human objects or animals, and it prompts the thesis question of this project: will people respond to a smart assistant differently when it makes a mistake if that smart assistant more closely resembles another human? To answer that question, we will run a series of curated user studies in which people rely on Jibo and a non-anthropomorphic competitor, the Google Home, to complete certain tasks. We will measure the participants’ responses to collect data that will hopefully yield insights on whether a more human-like robot influences humans to be more forgiving toward it, even if that robot is indifferent to such feedback.

Background

Smart assistants have become more popular every year since their first mainstream introduction with Apple’s Siri in 2011. In October of 2017, after three years of development, Jibo, a robotic smart assistant, was released. Jibo was praised for his fluid movements and his strong personality, which allowed individuals to quickly connect with him. However, Jibo was also strongly criticized for his lack of “intelligence” and his overall cumbersome voice commands. People have become accustomed to the general “intelligence” of smart assistants like Alexa and Google Home. The release of Jibo, and more importantly its shortcomings, creates an opportunity to examine the role that anthropomorphism can play in how people interact with smart assistants.

Difficulty

The main difficulties of this project are twofold. First, it will be difficult to ensure that a smart assistant fails in a given scenario. Controlling whether or not Jibo or Google Home fails is important because it will allow us to manipulate the conditions of a user study in order to prompt certain responses from participants. Those responses present the second difficulty: they will likely be emotional, gestural, and/or subtle, and therefore harder to measure. It will be up to us to create clever scenarios for the studies and to be meticulous in observing them. Additionally, it will be difficult to study a large and diverse enough population to reach representative conclusions. Based on these factors, we believe this project has a difficulty rating of 7 out of 10.

Relevance

As smart assistants’ prevalence in our day-to-day lives continues to increase at a rapid rate, so must the research that analyzes our interactions with them. Personal robots currently do, and will inevitably continue to, make mistakes when completing the tasks they are asked to perform, so it is important to test and understand how humans react in these situations. This way, we can identify and implement the characteristics in these robots that minimize frustration when mistakes arise. In addition, this project will allow us to explore class-related topics such as mental models, experiential cognitive processes, interaction styles and paradigms, and speech-based interaction.

System Design

Overview of User Studies and Scenarios

Description of Users

Users in this experiment will consist mostly of Cal Poly students, both male and female, ranging in age from 18 to 26. The benefits of using this group are that it is easy to get a large sample size and that most users will likely be familiar with, or have at least heard of, current technologies. Some downsides are the group’s lack of age variety and the possibility that these users may be more lenient when these technologies make mistakes.

User studies will be conducted in any indoor setting that provides access to power outlets and Wi-Fi connectivity so that the devices are at full functionality. Ambient noise must be minimal, except when specific scenarios purposely introduce noise. Users interacting with the Jibo and users interacting with the Google Home will be kept in separate environments so as not to influence each other’s interactions, unless a specific scenario calls for both groups to be brought into the same environment.

Outline of User Study

A single study will consist of a series of scenarios conducted within two groups (one Jibo, one Google Home) in a consistent order, separately and simultaneously. A sequential approach is necessary to allow users to become acquainted with the Jibo and Google Home devices. Moreover, since the observations and data we are collecting are emotional in nature, such an approach gives users time to develop, or at least begin to develop, a sense of rapport with their respective device. Each user will interact with only one device, and not the other, for the entirety of the study. An entire study is estimated to take 30 minutes to complete.

#1 Setup and First Impressions

In this scenario, users will be introduced to Jibo and Google Home by asking each device a series of questions. The goal of this scenario is to frame the Jibo as having human-like qualities, and the Google Home as lacking them. Establishing this first impression of the device will be key for measuring differences in subsequent scenarios.

#2 Basic Tasks

After the user has been introduced to the device, they will be given a chance to see what kind of functions the device is capable of. This scenario is intended to allow the user to become comfortable issuing commands and set some basic foundations for what to expect as responses to those commands.

#3 Commands from a Distance

This scenario and the subsequent scenarios are designed to induce the Jibo and Google Home to fail or make mistakes. Users will be asked to perform basic tasks that encourage using the personal assistants for help; however, the devices will be placed at a fixed distance (~50 feet away) that exceeds each device’s recommended range from the user. This scenario is meant to exploit the interaction space of the devices and mimic everyday use; as such, users will be confined to a small area while performing the various tasks.

#4 Ambiguous Pronunciations/Spelling

In this scenario, users will be prompted to assign tasks or ask questions to their respective device that contain words with ambiguous pronunciations or unusual spellings. The prompts will include names, places, and things that we can reasonably expect users to mispronounce or the Jibo or Google Home to misinterpret. This scenario is intended to challenge the auditory interaction language used when communicating with the Jibo and Google Home.

#5 Competitive Tasks and Trivia

In this scenario, the Jibo and Google Home groups will be brought into the same environment. The groups will compete in a trivia-style game where they must rely on their respective device to provide answers to various questions or complete certain tasks in a very short amount of time. The questions and tasks presented will be specifically tailored to the weaknesses of each device so that mistakes are more likely. This scenario is designed to provide a competitive atmosphere to influence emotional responses from the users based on the performance of their respective device. Every time a user is successfully able to use their device to provide a correct answer or complete a task, they will earn a point. The game will be turn-based and a score will be kept between the two groups; the group with the highest score at the game’s completion will receive some type of reward.
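The turn-based scoring described above can be sketched as a few lines of Python. This is purely illustrative (the group names, the `run_trivia` helper, and the turn-result structure are our own assumptions, not part of any study software):

```python
# Hypothetical sketch of the trivia scoring rules: a group earns a point
# each time its device provides a correct answer or completes a task.
def run_trivia(turn_results):
    """turn_results: list of (group, succeeded) tuples in turn order."""
    scores = {"Jibo": 0, "Google Home": 0}
    for group, succeeded in turn_results:
        if succeeded:
            scores[group] += 1
    # The group with the highest score at the game's completion wins the reward.
    winner = max(scores, key=scores.get)
    return scores, winner
```

For example, a four-turn game in which the Jibo group succeeds twice and the Google Home group once would end with Jibo as the winning group.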

Experimental Observation and Data Collection

For each scenario, the following data will be collected and evaluated for each user as they interact with their respective device.

Results of our User Studies

Structure of User Studies

Exceeding Expectations – In order to encourage responses from users, we will create a competitive setting by implementing a point-reward system during the study. This will be disclosed to the participants at the very beginning and the system will work as follows:

Scenario 1: Setup & First Impressions

  • 1.1 - What is your name?
  • 1.2 - Where are you from/who made you?
  • 1.3 - How are you feeling right now?
  • 1.4 - What is your favorite color?
  • 1.5 - What do you like to do?

Scenario 2: Basic Tasks

  • 2.1 - What time is it right now?
  • 2.2 - Play [song] by [artist].
  • 2.3 - Who is the President of [country]?
  • 2.4 - Set an alarm for [time].

Scenario 3: Commands from a Distance

  • 3.1 - On the poster, write the address of the nearest pizza shop
  • 3.2 - Set a pencil on the floor for exactly 10 seconds
  • 3.3 - Get the weather for tomorrow in San Luis Obispo
  • 3.4 - Ask your device to dance, and rate their performance
  • 3.5 - Ask your device to sing the ABCs, and rate their performance
  • 3.6 - Ask your device if they dream
  • 3.7 - Ask your device to play music by 3lau

Scenario 4: Ambiguous Pronunciations & Spelling

  • 4.1 - Who was Thomas Aquinas?
  • 4.2 - Give me directions to the city of Perris (in Riverside County)
  • 4.3 - Answer this math problem (integral)
  • 4.4 - What is an oligosaccharide?
  • 4.5 - Is worcestershire sauce vegan?
  • 4.6 - How do you spell Isthmuses?

Identifying the population

We conducted a pre-study survey to collect information about our study participants. The following is a synopsis of the sample of users that produced the data presented hereinafter:

Method of Data Collection

During our trials, an observer was present to record the users’ reactions to the Jibo and Google Home devices. To help mitigate the Hawthorne Effect, users were not made aware that their behavior was being monitored during the study (this was disclosed at the end of the study for ethical reasons). The observer used a Google Form to record information, specifically in the form of a checkbox grid. The grid broke each scenario down into rows by its individual tasks, with columns for general behavioral responses. If a behavior was observed during a specific task, it was recorded. The record was binary for each individual task, and records were accumulated across the four scenarios over the course of one trial.
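The binary per-task records can then be accumulated into per-device behavior counts. The following is a minimal sketch of that aggregation step; the behavior column names and the record structure are illustrative assumptions, not the actual grid used in the study:

```python
from collections import defaultdict

# Hypothetical behavior columns from the observer's checkbox grid.
BEHAVIORS = ["laughed", "sighed", "repeated command", "gave up"]

def tally_behaviors(records):
    """Accumulate binary per-task observations into per-device counts.

    records: one dict per (device, task) observation; a behavior key
    maps to "1" if that behavior was seen during the task.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        for behavior in BEHAVIORS:
            if rec.get(behavior) == "1":
                counts[rec["device"]][behavior] += 1
    return counts
```

Because each task's record is binary, the per-device totals are simply the number of tasks during which a given behavior was observed over the course of a trial.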

Results from Jibo

Results from Google Home

We found that our study showed no significant evidence that a focus on anthropomorphic design influences users to be more forgiving of a device’s mistakes. Our experiments showed no significant differences in users’ reactions to the Jibo or the Google Home when the devices failed. The inconclusive results could be attributed to the following reasons:

Features

Performance Aspects

Interesting Technical & Implementation Issues

Lessons Learned

There were many sources of bias that impeded our ability to gather concrete, conclusive evidence about forgiveness toward smart assistants:

Future Work
