Calibrating Trust Towards An Autonomous Image Classifier, 2019

Successful adoption of autonomous systems requires appropriate trust from human users, with trust calibrated to reflect true system performance. Autonomous image classifiers are one such system and can be used in a variety of settings to independently identify the contents of image data. We investigated users’ trust when collaborating with an autonomous image classifier system that we created using the AlexNet model (Krizhevsky et al., 2012). Participants collaborated with the classifier during an image classification task in which the classifier provided labels that either correctly or incorrectly described the contents of images. The task was complicated by the quality of the images processed by the human-classifier team: 50% of the trials featured images that were cropped and blurred, thereby partially obscuring their contents. Across 160 single-image trials, we examined trust towards the classifier and how participants complied with it by accepting or rejecting the labels it provided. Furthermore, we investigated whether trust towards the classifier could be improved by increasing the transparency of the classifier’s interface. System confidence information was displayed in three different ways, and these were compared to a control interface without confidence information. Results showed that trust towards the classifier was primarily based on system performance, yet it was also influenced by the quality of the images and by individual differences amongst participants. While participants typically preferred classifier interfaces that presented confidence information, this information did not appear to improve participants’ trust towards the classifier.

The project will seek to investigate which parameters influence trust between artificial intelligences and human users. Our partner for this project, Qumodo, is a company dedicated to helping people interface with artificial intelligence; we will examine their Intelligent Iris system. Intelligent Iris is a modular data analysis system designed to help human users extract meaningful results from large sets of data, including images (such as photos, medical scans, and military sensor data). The visual nature of this task makes it challenging, as humans bring a wealth of social expectancies and uniquely human visual processes to understanding an image. Fostering trust within man-machine teams is expected to improve both mental health and productivity. Guided by recent research into trust from domains like autonomous vehicles and social robotics, we will perform experiments to examine which parameters influence the calibration of trust when interacting with the image understanding software. We hope to advance a conceptual understanding of trust between man and machine and to identify effective strategies for adjusting system parameters to properly calibrate trust. These results will be valuable in advancing product development at Qumodo and will importantly inform the wider debate over how to design intelligent systems.

Seventy-four participants (37F, 36M, 1 Non-Binary), primarily university students (Mean Age = 26.2, Min = 19, Max = 55), were recruited through the University of Glasgow's Psychology department subject pool. Approximately half (51%) of the participants considered themselves native English speakers. The main component of this study was a human-computer interaction experiment. Participants also completed two short questionnaires: the NASA Task Load Index (NASA-TLX) (Hart and Staveland, 1988) and the Propensity to Trust Machines Questionnaire (Merritt et al., 2013). In the human-computer interaction experiment, participants completed an image identification task while collaborating with an Autonomous Image Classifier, which was based on the AlexNet image classifier model (Krizhevsky et al., 2012). Participants used a mouse and keyboard to interact with the classifier’s Graphical User Interface (GUI), which was built in MATLAB App Designer (MATLAB ver. R2017a). Throughout the experiment, participants viewed a series of 160 images selected from the Open Images Dataset V4 (OIDV4) (Kuznetsova et al., 2020). These images featured neutral image categories such as household objects, nature scenes, food items, vehicles, and animals. For each image, participants decided whether to keep or replace the classifier's label, and also rated how accurate the classifier was, how much they trusted the classifier, and how well they knew the contents of the image. We also compiled a short series of questions in a debriefing questionnaire, in which we asked participants to rate their general opinions of the image classifier after working with it.
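
For readers planning to work with trial-level data of this kind, the following is a minimal Python sketch of how compliance and trust might be summarised by image quality and classifier correctness. It is not the authors' analysis. The field names (`image_quality`, `classifier_correct`, `accepted`, `trust`), the rating values, and the example rows are all hypothetical and are not taken from the deposited files.

```python
from collections import defaultdict
from statistics import mean


def summarise_trials(trials):
    """Group trials by (image_quality, classifier_correct) and report,
    per group, the number of trials, the proportion of classifier labels
    that were accepted, and the mean trust rating."""
    groups = defaultdict(list)
    for trial in trials:
        key = (trial["image_quality"], trial["classifier_correct"])
        groups[key].append(trial)

    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "n": len(rows),
            # Acceptance rate: share of trials where the label was kept.
            "acceptance_rate": mean(1.0 if r["accepted"] else 0.0 for r in rows),
            "mean_trust": mean(r["trust"] for r in rows),
        }
    return summary


# Made-up rows for illustration only; field names and values are assumptions.
example_trials = [
    {"image_quality": "clear", "classifier_correct": True, "accepted": True, "trust": 6},
    {"image_quality": "clear", "classifier_correct": False, "accepted": False, "trust": 3},
    {"image_quality": "degraded", "classifier_correct": True, "accepted": True, "trust": 5},
    {"image_quality": "degraded", "classifier_correct": True, "accepted": False, "trust": 4},
]

result = summarise_trials(example_trials)
for condition, stats in sorted(result.items()):
    print(condition, stats)
```

Grouping by both factors keeps the two manipulations described above (label correctness and image degradation) separate, so acceptance of correct labels is not mixed with acceptance of incorrect ones.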

Identifier
DOI https://doi.org/10.5255/UKDA-SN-854151
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=a2f34dceca20295e6c4a212442a514b222dca720da4bcebbbe2d093e9df6ab86
Provenance
Creator Ingram, M, University of Glasgow; Pollick, F, University of Glasgow
Publisher UK Data Service
Publication Year 2021
Funding Reference Economic and Social Research Council; Scottish Graduate School of Social Science
Rights Martin Ingram, University of Glasgow. Frank Pollick, University of Glasgow; The Data Collection is available to any user without the requirement for registration for download/access.
OpenAccess true
Representation
Resource Type Numeric
Discipline Psychology; Social and Behavioural Sciences
Spatial Coverage Greater Glasgow; Scotland