




Ask not what AI can do, but what AI should do: Towards a framework of task delegability

Brian Lubars, University of Colorado Boulder
Chenhao Tan, University of Colorado Boulder

Abstract

While artificial intelligence (AI) holds promise for addressing societal challenges, issues of exactly which tasks to automate and to what extent to do so remain understudied. We approach this problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to AI. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life, and administer a survey based on our proposed framework. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Among the four factors, trust is the most correlated with human preferences of optimal human-machine delegation. This framework represents a first step towards characterizing human preferences of AI automation across tasks. We hope this work encourages future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than dictating any direction in technology development.
1 Introduction

Recent developments in machine learning have led to significant excitement about the promise of artificial intelligence. Ng [35] claims that “artificial intelligence is the new electricity.” Artificial intelligence indeed approaches or even outperforms human-level intelligence in critical domains such as hiring, medical diagnosis, and judicial systems [7, 10, 12, 23, 29]. However, we also observe growing concerns about which problems are appropriate applications of AI. For instance, a recent study used deep learning to predict sexual orientation from images [45]. This study has caused controversy [32, 34]: GLAAD and the Human Rights Campaign denounced the study as “junk science” that “threatens the safety and privacy of LGBTQ and non-LGBTQ people alike” [2]. In general, researchers also worry about the impact on jobs and the future of employment [14, 42, 44].

Such excitement and concern begs a fundamental question at the interface of artificial intelligence and human society: which tasks should be delegated to AI, and in what way? To answer this question, we need to at least consider two dimensions. The first dimension is capability. Machines may excel at some tasks, but struggle at others; this area has been widely explored since Fitts first tackled function allocation in 1951 [13, 36, 38]. The goal of AI research has also typically focused on pushing the boundaries of machine ability and exploring what AI can do. The second dimension is human preferences, i.e., what role humans would like AI to play. The automation of some tasks is celebrated, while others should arguably not be automated for reasons beyond capability alone. For instance, automated civil surveillance may be disastrous from ethical, privacy, and legal standpoints. Motivation is another reason: no matter the quality of machine text generation, it is unlikely that an aspiring writer will derive the same satisfaction or value from delegating their writing to a fully automated system. Despite the clear importance of understanding human preferences, the question of what AI should do remains understudied in AI research.

In this work, we present the first empirical study to understand how different factors relate to human preferences of delegability, i.e., to what extent AI should be involved. Our contributions are three-fold. First, building on prior literature on function allocation, mixed-initiative systems, and trust and reliance on machines [20, 27, 36, inter alia], we develop a framework of four factors (motivation, difficulty, risk, and trust) to explain task delegability. Second, we construct a dataset of diverse tasks ranging from those found in academic research to ones that people routinely perform in daily life. Third, we conduct a survey to solicit human evaluations of the four factors and delegation preferences, and validate the effectiveness of our framework. We find that our survey participants seldom prefer full automation, but value AI assistance. Among the four factors, trust is the most correlated with human preferences of delegability. However, the need for interpretability, a component in trust, does not show a strong correlation with delegability.
Our study contributes towards a framework of task delegability and an evolving database of tasks and their associated human preferences.

2 Related Work

As machine learning grows further embedded in society, human preferences of AI automation gain relevance. We believe surveying, tracking, and understanding such preferences is important. However, apart from human-machine integration studies on specific tasks, the primary mode of thinking in AI research is towards automation. We summarize related work in three main areas.

Task allocation and delegation. Several studies have proposed theories of delegation in the context of general automation [6, 33, 36, 38]. Function allocation examines how to best divide tasks based on human and machine abilities [13, 36, 38]. Castelfranchi and Falcone [6] emphasize the role of risk, uncertainty, and trust in delegation, which we build on in developing our framework. Milewski and Lewis [33] suggest that people may not want to delegate to machines in tasks characterized by low trust or low confidence, where automation is unnecessary, or where automation does not add to utility. In the context of jobs and their susceptibility to automation, Frey and Osborne [14] find social, creative, and perception-manipulation requirements to be good prediction criteria for machine ability. Parasuraman et al.’s Levels of Automation is the closest to our work [36]. However, their work is primarily concerned with performance-based criteria (e.g., capability, reliability, cost), while our interest involves human preferences.

Shared-control design paradigms. Many tasks are amenable to a mix of human and machine control. Mixed-initiative systems and collaborative control have gained traction over function allocation, driven by the need for flexibility and adaptability, and the importance of a user’s goals in optimizing value-added automation [4, 20, 21]. We broadly split work on shared-control systems into two categories. We find this split more flexible and practical for our application than the Levels of Automation. On one side, we have human-in-the-loop machine learning designs, wherein humans assist machines. The human role is to oversee, handle edge cases, augment training sets, and refine system outputs. Such designs enjoy prevalence in applications from vision and recognition to translation [5, 11, 18]. Alternatively, a machine-in-the-loop paradigm places the human in the primary position of action and control while the machine assists. Examples of this include a creative-writing assistance system that generates contextual suggestions [8, 41], and predicting situations in which people are likely to make judgmental errors in decision-making [1]. Even tasks which should not be automated may still benefit from machine assistance [16, 17, 24], especially if human performance is not the upper bound, as Kleinberg et al. found in judge bail decisions [23].

Trust and reliance on machines. Finally, we consider the community’s interest in trust. As automation grows in complexity, complete understanding becomes impossible; trust serves as a proxy for rational decisions in the face of uncertainty, and appropriate use of technology becomes critical [27]. As such, calibration of trust continues to be a popular avenue of research [15, 28]. Lee and See [27] identify three bases of trust in automation: performance, process, and purpose. Performance describes the automation’s ability to reliably achieve the operator’s goals. Process describes the inner workings of the automation; examples include dependability, integrity, and interpretability (in particular, interpretable ML has received significant interest [9, 22, 39, 40]). Finally, purpose refers to the intent behind the automation and its alignment with the user’s goals.

Factors      Components
Motivation   Intrinsic motivation, goals, utility
Difficulty   Social skills, creativity, effort required, expertise required, human ability
Risk         Accountability, uncertainty, impact
Trust        Machine ability, interpretability, value alignment

Table 1: An overview of the four factors in our AI task delegability framework.

Figure 1: Factors behind task delegability (diagram: motivation, difficulty, risk, and trust feeding into the delegation decision).
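Table 1’s factor-component structure maps naturally onto a small data structure for grouping survey items during analysis. Below is a minimal sketch in Python, assuming a plain dictionary keyed by factor; the variable name and layout are our own illustration, not part of the paper.

# Illustrative only: the factor -> components mapping of Table 1, written as a
# dictionary so survey items can be grouped by factor during analysis.
FRAMEWORK = {
    "motivation": ["intrinsic_motivation", "goals", "utility"],
    "difficulty": ["social_skills", "creativity", "effort", "expertise", "human_ability"],
    "risk": ["accountability", "uncertainty", "impact"],
    "trust": ["machine_ability", "interpretability", "value_alignment"],
}

# Example: list the components measured for a given factor.
print(FRAMEWORK["trust"])  # ['machine_ability', 'interpretability', 'value_alignment']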
3 A Framework for Task Delegability

To explain human preferences of task delegation to AI, we develop a framework with four factors: a person’s motivation in undertaking the task, their perception of the task’s difficulty, their perception of the risk associated with accomplishing the task, and finally their trust in the AI agent. We choose these factors because motivation, difficulty, and risk respectively cover why a person chooses to perform a task, the process of performing a task, and the outcome, while trust captures the interaction between the person and the AI. We now explain the four factors, situate them in literature, and present the statements used in the surveys to capture each component. Table 1 presents an overview.

Motivation. Motivation is an energizing factor that helps initiate, sustain, and regulate task-related actions by directing our attention towards goals or values [25, 30]. Affective (emotional) and cognitive processes are thought to be collectively responsible for driving action, so we consider intrinsic motivation and goals as two components in motivation [30]. We also distinguish between learning goals and performance goals, as indicated by Goal Setting Theory [31]. Finally, the expected utility of a task captures its value from a rational cost-benefit analysis perspective [20]. Note that a task may be of high intrinsic motivation yet low utility, e.g., reading a novel. Specifically, we use the following statements to measure these motivation components in our surveys:

1. Intrinsic Motivation: “I would feel motivated to perform this task, even without needing to; for example, it is fun, interesting, or meaningful to me.”
2. Goals: “I am interested in learning how to master this task, not just in completing the task.”
3. Utility: “I consider this task especially valuable or important; I would feel committed to completing this task because of the value it adds to my life or the lives of others.”

Difficulty. Difficulty is a subjective measure reflecting the cost of performing a task. For delegation, we frame difficulty as the interplay between task requirements and the ability of a person to meet those requirements. Some tasks are difficult because they are time-consuming or laborious; others, because of the required training or expertise. To differentiate the two, we include effort required and expertise required as components in difficulty. The third component, belief about abilities possessed, can also be thought of as task-specific self-confidence (also called self-efficacy [3]) and has been empirically shown to predict allocation strategies between people and automation [26]. Additionally, we contextualize our difficulty measures with two specific skill requirements: the amount of creativity and social skills required. We choose these because they are considered more difficult for machines than for humans [14].

1. Social skills: “This task requires social skills to complete.”
2. Creativity: “This task requires creativity to complete.”
3. Effort: “This task requires a great deal of time or effort to complete.”
4. Expertise: “It takes significant training or expertise to be qualified for this task.”
5. (Perceived) human ability: “I am confident in my own/a qualified person’s ability to complete this task.” [1]

[1] We flip this component in coherence analysis (Fig. 2c) so that higher lack of confidence in human ability indicates higher difficulty.
Risk. Real-world tasks involve uncertainty and risk in accomplishing the task, so a rational decision on delegation involves more than just cost and benefit. Delegation amounts to a bet: a choice considering the probabilities of accomplishing the goal against the risks and costs of each agent [6]. Perkins et al. [37] define risk practically as a “probability of harm or loss,” finding that people rely on automation less as the probability of mortality increases. Responsibility or accountability may play a role if delegation is seen as a way to share blame [28, 43]. We thus decompose risk into three components: personal accountability for the task outcome; the uncertainty, or the probability of errors; and the scope of impact, or cost or magnitude of those errors.

1. Accountability: “In the case of mistakes or failure on this task, someone needs to be held accountable.”
2. Uncertainty: “A complex or unpredictable environment/situation is likely to cause this task to fail.”
3. Impact: “Failure would result in a substantial negative impact on my life or the lives of others.”
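The “delegation amounts to a bet” framing above can be made concrete with a small expected-value comparison. The sketch below is purely illustrative of the cited decision-theoretic view, not an analysis from the paper; the function name, the probabilities, and the costs are hypothetical assumptions chosen only to show how a small gap in success probability interacts with a large cost of failure.

# Illustrative only: delegation viewed as a bet over outcomes (cf. [6]).
# All numbers below are hypothetical.

def expected_value(p_success: float, benefit: float, cost_of_failure: float) -> float:
    """Expected value of assigning the task to one agent."""
    return p_success * benefit - (1.0 - p_success) * cost_of_failure

# A hypothetical high-stakes task: failure is much costlier than success is valuable.
benefit, cost_of_failure = 10.0, 50.0

ev_human = expected_value(p_success=0.95, benefit=benefit, cost_of_failure=cost_of_failure)
ev_ai = expected_value(p_success=0.90, benefit=benefit, cost_of_failure=cost_of_failure)

# With high failure costs, a small difference in success probability can dominate
# the decision, which is one reason risk is treated as its own factor.
print(f"keep the task: {ev_human:.1f}, delegate to AI: {ev_ai:.1f}")  # 7.0 vs. 4.0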
Trust. Trust captures how people deal with risk or uncertainty. We use the definition of trust as “the attitude that an agent will help achieve an individual’s goals in a situation characterized by uncertainty and vulnerability” [27]. Trust is generally regarded as the most salient factor in reliance on automation [27, 28]. Here, we operationalize trust as a combination of perceived ability of the AI agent, agent interpretability (ability to explain itself), and perceived value alignment. Each of these corresponds to a component of trust in Lee and See [27]: performance, process, and purpose.

1. (Perceived) machine ability: “I trust the AI agent’s ability to reliably complete the task.”
2. Interpretability: “Understanding the reasons behind the AI agent’s actions is important for me to trust the AI agent on this task (e.g., explanations are necessary).” [2]
3. Value alignment: “I trust the AI agent’s actions to protect my interests and align with my values for this task.”

[2] We flip this component in coherence analysis (Fig. 2c and 2d) so that higher lack of need for interpretability indicates higher trust.
Degree of delegation. We develop this framework of motivation, difficulty, risk, and trust to explain human preferences of delegation. To measure human preferences, we split the degree of delegation into the following four categories (refer to Supplementary Material for the wordings in the survey):

1. No AI assistance: the person does the task completely on their own (henceforth, “human only”).
2. The human leads and the AI assists: the person does the task mostly on their own, but the AI offers recommendations or help when appropriate (e.g., human gets stuck or AI sees possible mistakes) (henceforth, “machine in the loop”).
3. The AI leads and the human assists: the AI performs the task, but asks the person for suggestions/confirmation when appropriate (henceforth, “human in the loop”).
4. Full AI automation: decisions and actions are made automatically by the AI once the task is assigned; no human involvement (henceforth, “AI only”).
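Because these four categories are ordered from no automation to full automation, survey responses can be stored as an ordered categorical variable. The snippet below is a minimal, assumed encoding (the class name and numeric codes are ours, not the paper’s); treating the labels as ordered is what later allows talking about “more” or “less” delegation.

from enum import IntEnum

# Illustrative encoding of the four delegation categories from the survey;
# the numeric order (0-3) reflects increasing AI involvement.
class DelegationDegree(IntEnum):
    HUMAN_ONLY = 0           # no AI assistance
    MACHINE_IN_THE_LOOP = 1  # human leads, AI assists
    HUMAN_IN_THE_LOOP = 2    # AI leads, human assists
    AI_ONLY = 3              # full AI automation

# Example: a respondent who wants AI suggestions but keeps control of the task.
preference = DelegationDegree.MACHINE_IN_THE_LOOP
assert preference < DelegationDegree.AI_ONLY  # ordered comparisons work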
Fig. 1 presents our expectation of how motivation, difficulty, risk, and trust relate to delegability. Motivation describes how invested someone is in the task, including how much effort they are willing to expend, while difficulty determines the amount of effort the task requires. Therefore, we expect difficulty and motivation to relate to each other: we hypothesize that people are more likely to delegate tasks which they find difficult (or have low confidence in their abilities), and less likely to delegate tasks which they are highly invested in. Risk reflects uncertainty and vulnerability in performing a task, the situational conditions necessary for trust to be salient [27]. We thus expect risk to moderate trust. Finally, we hypothesize that the correlation between components within each factor should be greater than that across different factors, i.e., factors should show coherence in component measurements.
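This coherence hypothesis lends itself to a simple check: average the pairwise correlations of component ratings within each factor and compare them to the average pairwise correlation across factors. The sketch below is ours, not the paper’s pipeline; it assumes responses in a pandas DataFrame with one column per component, a 1-to-5 Likert scale (the paper’s exact scale may differ), and it repeats the FRAMEWORK mapping from the earlier sketch so it runs standalone. It also reverse-codes the two flipped components mentioned in the footnotes.

import itertools
import pandas as pd

# Repeated from the sketch after Table 1 so this block is self-contained.
FRAMEWORK = {
    "motivation": ["intrinsic_motivation", "goals", "utility"],
    "difficulty": ["social_skills", "creativity", "effort", "expertise", "human_ability"],
    "risk": ["accountability", "uncertainty", "impact"],
    "trust": ["machine_ability", "interpretability", "value_alignment"],
}
FLIPPED = ["human_ability", "interpretability"]  # reverse-coded per footnotes [1] and [2]

def coherence_check(responses: pd.DataFrame, scale_max: int = 5) -> tuple[float, float]:
    """Return (mean within-factor correlation, mean between-factor correlation)."""
    components = [c for comps in FRAMEWORK.values() for c in comps]
    df = responses[components].copy()
    # Reverse-code flipped items, assuming a 1..scale_max Likert scale.
    df[FLIPPED] = (scale_max + 1) - df[FLIPPED]
    corr = df.corr()

    factor_of = {c: f for f, comps in FRAMEWORK.items() for c in comps}
    within, between = [], []
    for a, b in itertools.combinations(components, 2):
        (within if factor_of[a] == factor_of[b] else between).append(corr.loc[a, b])
    return sum(within) / len(within), sum(between) / len(between)

If the hypothesis holds, the within-factor average should exceed the between-factor average; the paper’s own coherence analysis (referenced as Fig. 2c) is more detailed than this sketch.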
4 A Task Dataset and Survey Results

To evaluate our framework empirically, we build a database of diverse tasks covering settings ranging from academic research to daily life, and develop and administer a survey to gather perceptions of those tasks under our framework. We examine survey responses through both quantitative analyses and qualitative case studies.

4.1 A Dataset of Tasks

We choose 100 tasks drawn from academic conferences, popular discussions in the media, well-known occupations, and mundane tasks people encounter in their everyday lives. These tasks are generally relevant to current AI research and discussion, and present challenging delegability decisions with which to evaluate our framework. Example tasks can be found in Section 4.3. To additionally balance the variety of tasks chosen, we categorize them as art, creative, business, civic, entertainment, health, living, and social, and keep a minimum of 7 tasks of each (a task can belong to multiple categories; refer to Supplementary Material for details). Ideally, the tasks would cover the entire automation “task space”; our task set is intended as a reasonable starting point. Since some tasks, e.g., medical diagnosis, require expertise, and since motivation does not apply if the subject does not personally perform the task, we develop two survey versions.

Personal survey. We include all the four