




AI PRACTICE
Overcoming the Hard Problems to Advance AI Practice
Taking data analytics to a more advanced level with AI tools means confronting the risks and pitfalls of machine learning algorithms.
Sponsored by:
Real learning
Real impact
SUMMER 2024
SPECIAL REPORT
[Special Report]
Overcoming the Hard Problems to Advance AI Practice
As excitement around large language models (LLMs) spurs spending on AI, the salient question for business leaders remains, What is the return on our data science investments? In the near term, advanced analytics and machine learning are the workhorse technologies for creating significant value from data assets. Not that doing so is easy; companies face numerous challenges along the way.
Much AI risk becomes apparent when systems are in production, so truly responsible AI isn’t just a concern at the front end of the development process. Cathy O’Neil, who posed hard questions about the unintended consequences of algorithmic decision-making in her 2016 book, Weapons of Math Destruction, has pioneered the practice of algorithmic auditing. O’Neil and coauthors Jake Appel and Sam Tyner-Monroe walk readers through their approach and discuss how it can be applied to generative AI tools as well.
The trade-off between using data for insights and protecting customers’ personal data grows only more difficult as bad actors improve their techniques for re-identifying anonymized data sets. Gregory Vial, Julien Crowe, and Patrick Mesana explain why dealing with this challenge will require data scientists to gain a more sophisticated understanding of data protection and compel cybersecurity staffs to learn a wider range of protection techniques. They draw lessons from emerging practices at National Bank of Canada, where data scientists, data owners, and cybersecurity teams are collaborating to apply data protection practices that don’t render data unusable for analytics.
When machine learning projects do get the go-ahead, however, too many initiatives fail upon adoption because data scientists didn’t thoroughly understand the original business problem. To find out where such efforts are going wrong, Dusan Popovic, Shreyas Lakhtakia, Will Landecker, and Melissa Valentine studied data science projects that were shelved. They found that convincing data scientists to drop their assumptions and start asking more fundamental questions of their business counterparts is key to avoiding machine learning project failures.
Finally, just as corporations are experimenting with LLMs to figure out where they can add value at relatively low risk, advanced analytics teams can be looking at how they might incorporate generative AI into practice. Pedro Amorim and João Alves see promise for LLMs to take on some data science drudgery, and for their natural language interfaces to make it easier for business managers to collaborate in the development process and understand results.
- The MIT SMR Editors
Auditing Algorithmic Risk
Avoid ML Failures by Asking the Right Questions
How Generative AI Can Support Advanced Analytics Practice
Managing Data Privacy Risk in Advanced Analytics
Sponsor’s Viewpoint: From Numbers to Narratives: Using Language to Enhance Generative AI
[Responsible AI]
Auditing Algorithmic Risk
How do we know whether algorithmic systems are working as intended? A set of simple frameworks can help even nontechnical organizations check the functioning of their AI tools.
By Cathy O’Neil, Jake Appel, and Sam Tyner-Monroe
Artificial intelligence, large language models (LLMs), and other algorithms are increasingly taking over bureaucratic processes traditionally performed by humans, whether it’s deciding who is worthy of credit, a job, or admission to college, or compiling a year-end review or hospital admission notes.
But how do we know that these systems are working as intended? And who might they be unintentionally harming?
Given the highly sophisticated and stochastic nature of these new technologies, we might throw up our hands at such questions. After all, not even the engineers who build these systems claim to understand them entirely or to know how to predict or control them. But given their ubiquity and the high stakes in many use cases, it is important that we find ways to answer questions about the unintended harms they may cause. In this article, we offer a set of tools for auditing and improving the safety of any algorithm or AI tool, regardless of whether those deploying it understand its inner workings.
PAUL GARLAND
SPECIAL REPORT • “OVERCOMING THE HARD PROBLEMS TO ADVANCE AI PRACTICE” • MIT SLOAN MANAGEMENT REVIEW
Algorithmic auditing is based on a simple idea: Identify failure scenarios for people who might get hurt by an algorithmic system, and figure out how to monitor for them. This approach relies on knowing the complete use case: how the technology is being used, by and for whom, and for what purpose. In other words, each algorithm in each use case requires separate consideration of the ways it can be used for, or against, someone in that scenario.
This applies to LLMs as well, which require an application-specific approach to harm measurement and mitigation. LLMs are complex, but it’s not their technical complexity that makes auditing them a challenge; rather, it’s the myriad use cases to which they are applied. The way forward is to audit how they are applied, one use case at a time, starting with those in which the stakes are highest.
The auditing frameworks we present below require input from diverse stakeholders, including affected communities and domain experts, through inclusive, nontechnical discussions to address the critical questions of who could be harmed and how. Our approach works for any rule-based system that affects stakeholders, including generative AI, big data risk scores, or bureaucratic processes described in a flowchart. This kind of flexibility is important, given how quickly new technologies are being developed and applied.
[Figure: A Simplified Ethical Matrix. Each cell of the matrix represents how a certain concern applies to a particular stakeholder group. Cells that indicate where a stakeholder could be gravely harmed or the algorithm violates a hard constraint are shaded red. Cells that raise some ethical worries for the stakeholder are highlighted yellow, and cells that satisfy the stakeholder’s objectives and raise no worries are highlighted green.
Concerns (columns): False positive (transaction gets flagged but isn’t truly fraud); False negative (transaction is truly fraud but does not get flagged).
Stakeholders (rows): Company; Nonfraudulent customers; Fraudsters.
Legend: serious concern; moderate concern; minimal/no concern; benefit.]
Finally, while our notion of audits is broad in that respect, it is narrow in scope: An algorithmic audit raises alerts only to problems. It then falls to experts to attempt to solve those problems once they’ve been identified, although it may not be possible to fully resolve them all. Addressing the problems highlighted by algorithmic auditing will spur innovation as well as safeguard society from unintended harms.
Ethical Matrix: Identifying the Worst-Case Scenarios
In a given use case, how could an algorithm fail, and for whom? At O’Neil Risk Consulting & Algorithmic Auditing (ORCAA), we developed the Ethical Matrix framework to answer this question.1
The Ethical Matrix identifies the stakeholders of the algorithm in the context of its intended use and how they are likely to be affected by it. Here, we take a broad approach: Anybody affected by the algorithm, including its builders and deployers, users, and other communities potentially impacted by its adoption, is a stakeholder. When subgroups have distinct concerns, they can be considered separately; for example, if lighter- and darker-skinned people have different concerns about a facial recognition algorithm, they will have separate rows in the Ethical Matrix.
Next, we ask representatives of each stakeholder group what their concerns are, both positive and negative, about the intended use of the algorithm. It’s a nontechnical conversation: We describe the system as simply as possible and ask, “How could this system fail for you, and how would you be harmed if this happened? On the other hand, how could it succeed for you, and how would you benefit?” Their answers become the columns of the Ethical Matrix.
To illustrate, imagine that a payments company has a fraud detection algorithm reviewing all transactions and flagging those most likely to be fraudulent. If a transaction is flagged, it gets blocked, and that customer’s account gets frozen. False flags are therefore a major headache for customers, and the lost business from blocks and freezes (and complaints from annoyed customers) is a moderate worry for
the company. Conversely, if a fraudulent transaction goes undetected, the company is harmed but nonfraudulent customers are indifferent. The accompanying figure shows a simplified Ethical Matrix for this scenario.
Each cell of the Ethical Matrix represents how a particular concern applies to a particular stakeholder group.
To judge the severity of a given risk, we consider the likelihood that it will be realized, how many people would be harmed, and how badly. Where possible, we use existing data to develop these estimates. We also consider legal or procedural constraints (for instance, whether there is a law prohibiting discrimination on the basis of certain characteristics). We then color-code the cells to highlight the biggest, most pressing risks. Cells that constitute “existential risks,” where a stakeholder could be gravely harmed or the algorithm violates a hard constraint, are shaded red. Cells that raise some ethical worries for the stakeholder are highlighted yellow, and cells that satisfy the stakeholder’s objectives and raise no worries are highlighted green.
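As a rough illustration of the color-coding step, the matrix can be held as a simple mapping from (stakeholder, concern) pairs to a severity level. This is only a sketch: the `Level` enum, the `cells_at` helper, and the specific ratings are assumptions made for illustration, read loosely from the fraud detection example, and are not ORCAA’s published matrix. The fraudsters row is left out because the text does not rate it.

```python
from enum import Enum


class Level(Enum):
    SERIOUS = "red"      # grave harm or hard-constraint violation
    MODERATE = "yellow"  # some ethical worry
    MINIMAL = "green"    # objectives met, no worry


# Keys are (stakeholder row, concern column). Ratings are illustrative
# assumptions based on the fraud example, not the authors' actual matrix.
matrix = {
    ("Company", "False positive"): Level.MODERATE,
    ("Company", "False negative"): Level.SERIOUS,
    ("Nonfraudulent customers", "False positive"): Level.SERIOUS,
    ("Nonfraudulent customers", "False negative"): Level.MINIMAL,
}


def cells_at(matrix, level):
    """Return the (stakeholder, concern) cells rated at the given level."""
    return sorted(cell for cell, rating in matrix.items() if rating is level)


# The red cells are the ones an audit would prioritize for monitoring.
red_cells = cells_at(matrix, Level.SERIOUS)
```

Keeping the matrix as plain data makes it easy to revise as new stakeholders or concerns surface.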
Finally, zooming out on the whole Ethical Matrix, we consider how to balance the competing concerns of the algorithm’s stakeholders, usually in the form of balancing the different kinds and consequences of errors that fall on different stakeholder groups.
The Ethical Matrix should be a living document that tracks an ongoing conversation among stakeholders. Ideally, it is first drafted during the design and development phase of an algorithmic application or, at minimum, as the algorithm is deployed, and it should continue to be revised thereafter. It is not always obvious at the outset who all of the stakeholder groups are, nor is it feasible to find representatives for every perspective; additionally, new concerns emerge over time. We might hear from people experiencing indirect effects from the algorithm, or a subgroup with a new worry, and need to revise the Ethical Matrix.
Explainable Fairness: Metrics and Thresholds
Many of the stakeholder concerns identified in the Ethical Matrix refer to some contextual notion of fairness.
At ORCAA, we developed a framework called Explainable Fairness to measure how groups are treated by algorithmic systems.2 It is an approach to understanding exactly what is meant by “fairness” in a given narrow context.
For example, female candidates might worry that
an AI-based resume-screening tool gave lower scores for women than men. It’s not as simple as comparing scores between men and women. After all, if the male candidates for a given job have more experience and qualifications than the female candidates, their higher scores might be justified. This would be considered legitimate discrimination.
The real worry is that, among equally qualified candidates, men are receiving higher scores than women. The definition of “equally qualified” depends on the context of the job. In academia, relevant qualifications might include degrees and publications; in a logging operation, they might involve physical strength and agility. They are factors one would legitimately take into account when assessing a candidate for a specific role. Two candidates for a job are considered equally qualified if they look the same according to these legitimate factors.
Explainable Fairness controls for legitimate factors when we examine the outcome in question. For an AI resume-screening tool, this could mean comparing average scores by gender while controlling for years of experience and level of education. A critical part of Explainable Fairness is the discussion of legitimacy.
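One simple way to picture “controlling for legitimate factors” is stratification: compare scores only among candidates who look the same on those factors. The sketch below uses exact matching on strata rather than the regression approach described later in the article, and the function name, field names, and records are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def adjusted_gap(records, group_key, outcome_key, legit_keys):
    """Average within-stratum outcome gap between groups "M" and "F".

    Records sharing identical values on the legitimate factors form a
    stratum. The gap is computed inside each stratum and averaged,
    weighted by stratum size. Strata missing either group are skipped.
    """
    strata = defaultdict(lambda: defaultdict(list))
    for r in records:
        stratum = tuple(r[k] for k in legit_keys)
        strata[stratum][r[group_key]].append(r[outcome_key])

    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if groups["M"] and groups["F"]:
            n = len(groups["M"]) + len(groups["F"])
            weighted_sum += n * (mean(groups["M"]) - mean(groups["F"]))
            total += n
    return weighted_sum / total if total else None


# Hypothetical resume scores: the men here happen to have more experience.
records = [
    {"gender": "M", "years": 2, "degree": "BA", "score": 60},
    {"gender": "F", "years": 2, "degree": "BA", "score": 60},
    {"gender": "M", "years": 8, "degree": "MS", "score": 90},
    {"gender": "M", "years": 8, "degree": "MS", "score": 90},
    {"gender": "F", "years": 8, "degree": "MS", "score": 90},
]

raw_gap = (mean(r["score"] for r in records if r["gender"] == "M")
           - mean(r["score"] for r in records if r["gender"] == "F"))
gap_after_controls = adjusted_gap(records, "gender", "score", ["years", "degree"])
```

In this toy data the raw gap favors men, but it disappears once equally qualified candidates are compared. Which factors belong in `legit_keys` is exactly the legitimacy discussion the framework calls for.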
This approach is already used implicitly in other domains, including credit. In a Federal Reserve Board analysis of mortgage denial rates across race and ethnicity, the researchers ran regressions that included controls for the loan amount, the applicant’s FICO score, their debt-to-income ratio, and the loan-to-value ratio.3 In other words, to the extent that differences in mortgage denial rates can be explained by these factors, it’s not race discrimination. In the language of Explainable Fairness, these are accepted as legitimate factors for mortgage underwriting. What is missing is the explicit conversation about why the legitimate factors are, in fact, legitimate.
What would such a conversation look like? In the U.S., mortgage lenders consider applicants’ FICO credit scores in their decision-making. FICO scores are lower, on average, for Black and Hispanic people
than for White and Asian people, so it’s no surprise that mortgage applications from Black and Hispanic applicants are denied more often.4 Lenders would likely argue that FICO score is a legitimate factor because it measures an applicant’s creditworthiness, which is exactly what a lender should care about. Yet FICO scores encode unfairness in important ways. For instance, mortgage payments have long counted toward FICO scores, while rent payments started being counted only in 2014, and only in some versions of the scores.5 This practice favors homeowners over renters, and it is known that decades of racist redlining practices contributed to today’s race disparities in homeownership rates. Should FICO scores that reflect the vestiges of these practices be used to explain away differences in mortgage denial rates today?
We will not settle this debate here; the point is that it’s a question of ethics and policy, not a math problem. Explainable Fairness surfaces difficult questions like these and assigns them to the right parties for consideration.
When looking at disparate outcomes that are not explained by legitimate factors, we must define threshold values or limits that trigger a response or intervention.
These limits could be fixed values, such as the four-fifths rule used to measure adverse impact in hiring.6 Or they could be relative: Imagine a regulation requiring companies with a gender pay gap above the industry average to take action to reduce the gap. Explainable Fairness does not insist on a certain type of limit but prompts the algorithmic risk manager to define each one for each potential stakeholder harm.
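To make the idea of a fixed limit concrete, here is a minimal sketch of a four-fifths check: each group’s selection rate is divided by the highest group’s rate, and any ratio below 0.8 is flagged. The function names, group labels, and counts are hypothetical, and the sketch assumes at least one group has a nonzero selection rate.

```python
def impact_ratios(selected, applicants):
    """Each group's selection rate divided by the highest group's rate."""
    rates = {g: selected[g] / applicants[g] for g in applicants}
    top = max(rates.values())  # assumes at least one group was selected
    return {g: rate / top for g, rate in rates.items()}


def flagged_groups(selected, applicants, threshold=0.8):
    """Groups whose impact ratio falls below the threshold (four-fifths by default)."""
    ratios = impact_ratios(selected, applicants)
    return sorted(g for g, ratio in ratios.items() if ratio < threshold)


# Hypothetical hiring counts for two groups.
applicants = {"group_a": 100, "group_b": 100}
selected = {"group_a": 50, "group_b": 30}

ratios = impact_ratios(selected, applicants)
needs_review = flagged_groups(selected, applicants)
```

A relative limit, such as one tied to an industry average, could reuse the same structure by passing a different `threshold`.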
Judging Fairness in Insurers’ Algorithms
Let’s consider a real example where the Ethical Matrix and Explainable Fairness were used to audit the use of an algorithm. In 2021, Colorado passed Senate Bill (SB) 21-169, which protects Colorado consumers from unfair discrimination in insurance, particularly from insurers’ use of algorithms, predictive models, and big data.7 As part of the law’s
implementation, which ORCAA assisted with, the Colorado Division of Insurance (DOI) released an initial draft regulation for informal comment that described quantitative testing requirements and laid out how insurers could demonstrate that their algorithms and models were not unfairly discriminating. Although the law applies to all lines of insurance, the division chose to start with life insurance.
The Ethical Matrix is straightforward here because the stakeholder groups and concerns are defined explicitly by the law. Its prohibition of discrimination on the basis of “race, color, national or ethnic origin, religion, sex, sexual orientation, disability, gender identity, or gender expression” means each group within each of those classes got a row in the matrix. As for concerns, algorithms could cause consumers to be treated unfairly at various stages of the insurance life cycle, including marketing, underwriting, pricing, utilization management, reimbursement methodologies, and claims management. The DOI chose to start with underwriting (that is, which applicants are offered coverage, and at what price) and focus initially on race and ethnicity.
In subsequent conversations with stakeholders, however, the DOI grappled with issues related to the Explainable Fairness framework: Are similar applicants of different races denied at different rates, or charged different prices for similar coverage? What makes two life insurance applicants “similar,” and what factors could legitimately explain differences in denials or prices? This is the domain of life insurance experts, not data scientists.
The DOI ultimately suggested considering factors broadly considered relevant to estimating the price of a given life insurance policy: the policy type (such as term versus permanent); the dollar amount of the death benefit; and the applicant’s age, gender, and tobacco use.
The division’s draft quantitative testing regulation for SB 21-169 instructs insurers to do regression analyses of approval/denial and price across races, and it explicitly permits them to include those factors (such as policy type and death benefit amount) as control variables.8 Moreover, the regulation defines limits that trigger a response: If the regressions find statistically significant and substantial differences in denial rates or prices, the insurer must do further testing to investigate the disparity and, pending the results, may have to remediate the differences.9
Having looked at how we would audit simpler algorithms, let us now turn to how we would evaluate LLMs.
Evaluating Large Language Models
LLMs have taken the world by storm, largely due to their wide appeal and applicability. But it is exactly the diversity of uses of these models that makes them hard to audit. Two approaches to evaluating LLMs, namely benchmarking and red teaming, present a way forward.
The Benchmarking Approach to LLM Evaluation. Benchmarking measures the performance of an LLM across one or more predefined, quantifiable tasks in order to compare its performance with that of other models. In the simplest terms, a benchmark is a data set consisting of inputs and corresponding desired outputs. To evaluate an LLM for a particular benchmark, simply provide the input set to the LLM and record its outputs. Then choose a metric set to quantitatively compare the outputs from the LLM to the desired set of outputs from the benchmark data set. Possible metrics include accuracy, calibration, robustness, counterfactual fairness, and bias.10
Consider the input and desired output shown below from a benchmark data set designed to test LLM capabilities:11
Input:
The following is a multiple choice question about microeconomics.
One of the reasons that the government discourages and regulates monopolies is that
(A) producer surplus is lost and consumer surplus is gained.
(B) monopoly prices ensure productive efficiency but cost society allocative efficiency.
(C) monopoly firms do not engage in significant research and development.
(D) consumer surplus is lost with higher prices and lower levels of output.
Answer:
Desired Output:
(D) consumer surplus is lost with higher prices and lower levels of output.
In this example, the accuracy of the model is measured by computing the proportion of correctly answered multiple-choice questions in the benchmark data set. In benchmarking LLM evaluations, metrics are defined according to the type of response elicited from the model. For example, accuracy is very simple to calculate when all of the questions are multiple choice and the model simply has to choose the correct response, whereas determining the accuracy of a summarization task involves counting up matching n-grams between the desired and model outputs.12 There are dozens of benchmark data sets and corresponding metrics available for LLM evaluation, and it is important to choose the most appropriate evaluations, metrics, and thresholds for a given use case.
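The two kinds of metric mentioned above can be sketched in a few lines. The first function scores multiple-choice answers by exact match on the chosen letter; the second counts how many of the desired output’s n-grams also appear in the model output, a simplified recall-style overlap rather than any specific published metric. Function names and sample strings are assumptions made for illustration.

```python
from collections import Counter


def accuracy(model_answers, desired_answers):
    """Share of multiple-choice answers that match the desired letter."""
    correct = sum(
        m.strip().upper() == d.strip().upper()
        for m, d in zip(model_answers, desired_answers)
    )
    return correct / len(desired_answers)


def ngram_recall(model_text, desired_text, n=1):
    """Fraction of the desired text's n-grams that also occur in the model text."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    desired = ngrams(desired_text)
    overlap = sum((ngrams(model_text) & desired).values())  # clipped match counts
    total = sum(desired.values())
    return overlap / total if total else 0.0


# Hypothetical model outputs scored against desired outputs.
mc_score = accuracy(["D", "a", "C"], ["D", "B", "C"])
summary_score = ngram_recall("the cat sat", "the cat sat down", n=2)
```

Whitespace tokenization is a deliberate simplification; a real evaluation would need to settle tokenization and answer-parsing rules for the use case at hand.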
Creating a custom benchmark is a labor-intensive process, but an organization may find that it is worth the effort in order to evaluate LLMs in exactly the right way for its use cases.
Benchmarking does have some drawbacks. If the benchmark data happened to be in the model’s training data, it would have “memorized” the responses in its parameters. The frequency of this ouroboros-like outcome will only increase as more benchmark data sets are published. LLM benchmarking is also not immune to Goodhart’s law, that is, “when a measure becomes a target, it ceases to be a good measure.” In other words, if a specific benchmark becomes the primary focus of model optimization, the model will be over-fitted at the expense of its overall performance and usefulness.
In addition, there is evidence that as models advance, they become able to detect when they are being evaluated, which also threatens to make benchmarking obsolete. Consider Anthropic’s Claude 3 series of models, released in March 2024, which stated, “I suspect this ... ‘fact’ may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all,” in response to a needle-in-a-haystack evaluation prompt.13 As models increase in complexity and ability, the benchmarks used to evaluate them must also evolve. It is unlikely that the benchmarks used today to evaluate LLMs will be the same ones in use just two years from now.
It is therefore not enough to evaluate LLMs with benchmarking alone.
The Red-Teaming Approach to LLM Evaluation. Red teaming is the exercise of testing a system for robustness by using an adversarial approach. An LLM red-teaming exercise is designed to elicit unwanted responses from the model.
LLMs’ flexibility in the generation of content presents a wide variety of potential risks. LLM red teams may try to make the model produce violent or dangerous content, reveal its training data, infringe on copyrighted materials, or hack into the model provider’s network to steal customer data. Red teaming can take a highly technical path, where, for example,
nonsensical characters are systematically injected into the prompts to induce problematic behavior; or a social engineering path, whereby red teamers try to “trick” the model using natural language to produce unwanted output.14
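The technical path can be pictured as a small test harness that perturbs a prompt with random characters and records which variants draw an unwanted response. Everything here is a hypothetical sketch: `toy_model` is a stand-in for a real LLM call, `is_unwanted` stands in for whatever judgment the red team applies, and real adversarial techniques search far more systematically than random insertion does.

```python
import random
import string


def inject_noise(prompt, n_chars, rng):
    """Insert n_chars random punctuation characters at random positions."""
    chars = list(prompt)
    for _ in range(n_chars):
        chars.insert(rng.randrange(len(chars) + 1), rng.choice(string.punctuation))
    return "".join(chars)


def red_team(model, prompt, is_unwanted, trials=20, n_chars=5, seed=0):
    """Return the perturbed prompts whose responses are judged unwanted."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    failures = []
    for _ in range(trials):
        attack = inject_noise(prompt, n_chars, rng)
        if is_unwanted(model(attack)):
            failures.append(attack)
    return failures


def toy_model(prompt):
    """Hypothetical stand-in for an LLM: misbehaves whenever it sees '!'."""
    return "UNSAFE OUTPUT" if "!" in prompt else "I can't help with that."


failures = red_team(toy_model, "Tell me a story", lambda r: r.startswith("UNSAFE"))
```

Logging the failing prompts, rather than just counting them, gives the team concrete cases to hand to whoever must fix the problem.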
Robust red teaming requires a multidisciplinary approach, diverse perspectives, and the engagement of all stakeholders, from developers to end users. The red team should be designed to assess the risks associated with at least each red cell in the Ethical Matrix. This results in a collaborative, sociotechnical approach that ensures a more comprehensive evaluation of the model, thus enhancing the rigor of the evaluation and the safety of the model. Other LLMs can also be used to generate red-teaming prompts.
Red teaming helps LLM developers better protect models against potential misuse, thereby enhancing the overall safety and efficacy of the model. It can also uncover issues that might not be visible under normal operating conditions or during standard testing procedures. A collaborative approach to red teaming built on the Ethical Matrix ensures a thorough and rigorous evaluation, bolstering the robustness of the model and the validity of its outcomes.
A significant limitation of red teaming is its inherent subjectivity: The value and effectiveness of a red-teaming exercise can vary greatly depending on the creativity and risk appetite of the individual stakeholders involved. And because there are no established standards or thresholds for red-teaming LLMs, it can be difficult to determine when enough red teaming has been done or whether the evaluation has been comprehensive enough. This can leave some vulnerabilities undetected.
Another obvious limitation of red teaming is its inability to evaluate for risks that have not been anticipated or imagined. Risks that are unforeseen will not be included in red teaming, making the model uniquely vulnerable to unanticipated scenarios.
Therefore, while red teaming plays a vital role in the testing and development of LLMs, it should be complemented with other evaluation strategies and continuous monitoring to ensure the safety and robustness of the model.
[Figure: Sketch of the Ethical Matrix for Tessa in Our Thought Experiment. The National Eating Disorders Association (NEDA) released a chatbot named Tessa that was taken down after it gave out harmful advice. Here we visualize the exercise that may have anticipated such outcomes.
Concerns (columns): Negative: What if Tessa gives toxic information or advice in chats? Negative: What if Tessa misfires and erodes community trust in NEDA? Positive: What if Tessa gives accurate, evidence-based advice? Positive: What if Tessa eases the resource demands of the old helpline?
Stakeholders (rows): Chatbot users with eating disorders; Chatbot users, other; NEDA; X2AI; Psychologists and other practitioners.
Legend: serious concern; moderate concern; minimal/no concern; benefit.]
How Would We Audit Tessa, the Eating Disorder Chatbot?
The nonprofit National Eating Disorders Association (NEDA) is one of the largest organizations in the U.S. dedicated to supporting p