SUPPORT POOL OF EXPERTS PROGRAMME

AI - Complex Algorithms and effective Data Protection Supervision

Effective implementation of data subjects' rights

by Dr. Kris SHRISHAK


As part of the SPE programme, the EDPB may commission contractors to provide reports and tools on specific topics.

The views expressed in the deliverables are those of their authors and they do not necessarily reflect the official position of the EDPB. The EDPB does not guarantee the accuracy of the information included in the deliverables. Neither the EDPB nor any person acting on the EDPB's behalf may be held responsible for any use that may be made of the information contained in the deliverables.

Some excerpts may be redacted or removed from the deliverables as their publication would undermine the protection of legitimate interests, including, inter alia, the privacy and integrity of an individual regarding the protection of personal data in accordance with Regulation (EU) 2018/1725 and/or the commercial interests of a natural or legal person.


TABLE OF CONTENTS

Introduction

1 Challenges

2 How to delete and unlearn

3 What to unlearn

4 Approximate unlearning verification

5 Concerns with Machine Unlearning

6 Limiting personal data output from generative AI

Conclusion

Bibliography

Document submitted in March 2024


INTRODUCTION

The General Data Protection Regulation (GDPR) empowers data subjects through a range of rights. A data subject has the right to information (Articles 12-14), the right of access (Article 15), the right to rectification (Article 16), the right to erasure (Article 17), the right to restrict processing (Article 18), the right to data portability (Article 20), the right to object (Article 21) and the right not to be subject to a decision based solely on automated processing (Article 22).

This report covers techniques and methods that can be used for effective implementation of data subject rights, specifically the right to rectification and the right to erasure when AI systems have been developed with personal data. This report addresses these rights together because rectification involves erasure followed by the inclusion of new data. These techniques and methods are the result of early-stage research by the academic community. Improvements and alternative approaches are expected to be developed in the coming years.

1 CHALLENGES

AI systems are trained on data that is often memorised by the models (Carlini et al., 2021). Machine learning models behave like lossy compressors of training data, and the performance of models based on deep learning is further attributed to this behaviour (Schelter, 2020; Tishby & Zaslavsky, 2015). In other words, machine learning models are compressed versions of the training data. Additionally, AI models are susceptible to membership inference attacks that help to assess whether data about a person is in the training dataset (Shokri et al., 2017). Thus, implementing the rights to erasure and rectification requires reversing the memorisation of personal data by the model. This involves deletion of (1) the personal data used as input for training, and (2) the influence of the specific data points on the trained model.

There are several challenges to effectively implementing these rights (Bourtoule et al., 2021):

1. Limited understanding of how each data point impacts the model: This challenge is particularly prevalent with the use of deep neural networks. It is not known how specific input data points impact the parameters of a model. The best-known methods rely on "influence functions" involving expensive estimations (by computing second-order derivatives of the training algorithm) (Cook & Weisberg, 1980; Koh & Liang, 2017).

2. Stochasticity of training: Training AI models usually involves random sampling of batches of data from the dataset, random ordering of the batches in how and when they are processed, and parallelisation without time-synchronisation. All of these make the training process probabilistic. As a result, training with the same algorithm and the same dataset could produce different trained models (Jagielski et al., 2023).

3. Incremental training process: Models are trained incrementally, such that an update relying on a specific training data point will affect all subsequent updates. In other words, updates in the training process depend on all previous updates. In the distributed training setting of federated learning, multiple clients keep their data and train a model locally before sending the updates to a central server. In such a setting, even when a client sends its update and contributes to the global model at the central server only once, the data and the contribution of this client influence all future updates to the global model.


4. Stochasticity of learning: In addition to the training process, the learning algorithm is also probabilistic. The choice of the optimiser for neural networks, for example, can result in many different local minima (the result of the optimisation). This makes it difficult to correlate how a specific data point contributed to the "learning" in the model.
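The stochasticity described above can be illustrated in a toy setting. The sketch below (an illustration, not taken from the cited works) trains the same one-parameter model on the same data twice, varying only the random sample ordering, and obtains two different final weights:

```python
import random

def train(data, seed, epochs=5, lr=0.01):
    """Toy one-parameter model y = w*x trained by SGD.
    The sample order, and hence the final weight, depends on the seed."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        batch = list(data)
        rng.shuffle(batch)                 # random ordering of samples
        for x, y in batch:
            w -= lr * 2 * (w * x - y) * x  # gradient step on squared error
    return w

data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1), (4.0, 8.2)]
w_a = train(data, seed=0)
w_b = train(data, seed=1)
# Same algorithm, same data: the two runs yield (slightly) different models.
print(w_a, w_b)
```

Both runs converge near the least-squares slope, yet their final parameters differ, which is exactly why attributing a trained model to specific data points is hard.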

2 HOW TO DELETE AND UNLEARN

1. Data curation and provenance: Data curation and provenance are essential elements to implement the rights in Articles 15-17 of the GDPR. However, they are necessary but not sufficient for implementing these rights completely, as they do not include information related to how the data influenced the trained model. They are prerequisites for the other approaches in this report.

2. Retraining of models: Deleting the model, removing the personal data requested to be erased, and then retraining the model with the rest of the data is the method that implements the rights in Articles 16-17 of the GDPR effectively. For small models, this method works well. However, for larger models, the training cost is very expensive and alternative approaches might often be required, especially when numerous deletion requests are expected. Furthermore, this approach, like many of the other approaches, assumes that the model developer is in possession of the training datasets when the requirement to delete and retrain arises.
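As a minimal sketch of retraining-based deletion (with a trivial stand-in for training, since any real training loop follows the same pattern):

```python
def fit_mean(dataset):
    """Stand-in for 'training': the model is just the mean of its data."""
    return sum(dataset) / len(dataset)

dataset = [4.0, 8.0, 6.0, 2.0]
model = fit_mean(dataset)        # model trained on all the data

# Erasure request for the value 8.0: delete the data point itself,
# then retrain from scratch on what remains.
dataset.remove(8.0)
model = fit_mean(dataset)        # retrained model carries no trace of 8.0
print(model)  # 4.0
```

The pattern is simple precisely because the whole model is discarded; the cost objection in the paragraph above arises when `fit_mean` is replaced by days of GPU training.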

3. Exact unlearning: To avoid retraining the entire model, approaches to unlearn the data have been proposed. Despite the growing literature, there are very few unlearning methods that are currently likely to be effective.

a. Model-agnostic unlearning: This method is not dependent on the specific machine learning technique. It is the only approach which has been shown to work for deep neural networks. This approach either (1) relies on storing model gradients (Wu et al., 2020), (2) relies on measuring the sensitivity of model parameters to changes in datasets used in federated learning (Tao et al., 2024), or (3) modifies the learning process to be more conducive to unlearning (Bourtoule et al., 2021).

The latter, known as SISA (Sharded, Isolated, Sliced, and Aggregated), is currently the best-known approach. It involves modifying the training process, but is independent of specific learning algorithms (Bourtoule et al., 2021). This approach presets the order in which the learning algorithm is queried to ease the unlearning process. The approach can be described as follows:

i. The training dataset is divided into multiple "shards" such that each training data point is present in only one "shard". This creates a non-overlapping partition of the dataset. It is also possible to further "slice" the "shards" so that the training is more modular and deletion is eased further.

ii. The model is then trained on each of these shards or slices. This limits the influence of the data points to these specific shards or slices.

iii. When a request for erasure or rectification arrives, unlearning is performed not by retraining the entire model, but by retraining only the shard or slice that had included the "delete requested" data.


This method is flexible. For instance, the shards can be chosen such that the data most likely to be subject to deletion requests are in one shard. Then fewer shards will need to be retrained, assuming that personal data and non-personal data are separated as part of data curation.
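The three SISA steps above can be sketched as follows, again with a trivial per-shard "model" (the mean) standing in for a real constituent learner; the structure, not the learner, is the point:

```python
def train_constituent(shard):
    """Stand-in for training a constituent model on one shard."""
    return sum(shard) / len(shard)

def aggregate(models):
    """Combine the constituent models' outputs (here: simple averaging)."""
    return sum(models) / len(models)

# i. Partition the training data into non-overlapping shards.
shards = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# ii. Train one constituent model per shard; each data point only
#     influences the model of its own shard.
models = [train_constituent(s) for s in shards]

# iii. Erasure request for the point 3.0: retrain ONLY the affected shard,
#      leaving the other constituent models untouched.
shards[1].remove(3.0)
models[1] = train_constituent(shards[1])

print(aggregate(models))  # aggregate of the partially retrained ensemble
```

Only one of the three constituent models is retrained, which is where the cost saving over full retraining comes from.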

b. Model-intrinsic unlearning: These methods are developed for specific AI techniques. For instance, methods suitable for decision trees and random forests have been shown to be effective (Brophy & Lowd, 2021). They use a new approach to develop decision trees, relying on strategic thresholding at decision nodes for continuous attributes and on random nodes at the higher levels. The necessary statistics are then cached at all the nodes to facilitate removal of specific training instances without having to retrain the entire decision tree.

c. Application-specific unlearning: While exact unlearning is generally expensive in terms of computation and storage, some applications and their algorithms are more suitable to exact unlearning. Specifically, recommender systems based on k-nearest neighbour models are well suited due to their use of sparse interaction data. Such models are widely used in many techniques, including collaborative filtering and recent recommender system approaches such as next-basket recommendation. Using efficient data structures, sparse data and parallel updates, personal data can be removed from recommender systems (Schelter et al., 2023).
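Why k-nearest-neighbour recommenders are easy to unlearn can be seen in a sketch (this is an illustration of the general idea, not the actual data structures of Schelter et al.): the "model" is the sparse interaction index itself, so deleting a user's row removes their influence exactly.

```python
# Sparse user -> items interaction index, as used by k-NN recommenders.
interactions = {
    "u1": {"a", "b"},
    "u2": {"b", "c"},
    "u3": {"a", "c", "d"},
}

def recommend(user, k=2):
    """Recommend items liked by the k users with the largest overlap."""
    own = interactions[user]
    neighbours = sorted(
        (u for u in interactions if u != user),
        key=lambda u: -len(interactions[u] & own),
    )[:k]
    candidates = set().union(*(interactions[u] for u in neighbours))
    return candidates - own

# Erasure request from u3: dropping their row removes their influence
# entirely; no retraining step is needed for this kind of model.
del interactions["u3"]
print(recommend("u1"))
```

After the deletion, recommendations for "u1" can only draw on the remaining users, which is exactly the semantics the right to erasure requires.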

4. Approximate unlearning: A significant amount of the technical literature on machine unlearning focuses on approximate unlearning, where the data is not deleted; instead, the model is adjusted such that the probability of the influence of the data on the model, estimated based on proxy signals, is reduced. Approximate unlearning is less expensive in terms of computation and storage requirements.

a. Finetuning: Once a model is trained, it can be finetuned for many purposes, including the approximate removal of the effect of the data that has been requested to be deleted (Golatkar et al., 2020; Warnecke et al., 2023). When a deletion request along with the "removal dataset" (the data to be removed) is received, the model is trained again for a few epochs on this "removal dataset" such that the model "forgets" it.
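One common instantiation of "training to forget" is gradient ascent on the removal dataset; the sketch below shows this variant on a one-parameter model (an illustrative assumption, not necessarily the exact procedure of the cited papers):

```python
def grad(w, x, y):
    """Gradient of the squared error (w*x - y)**2 with respect to w."""
    return 2 * (w * x - y) * x

def finetune_to_forget(w, removal_set, epochs=3, lr=0.01):
    """Finetune by gradient ASCENT on the removal set, pushing the model
    away from fitting the deleted points (one possible instantiation)."""
    for _ in range(epochs):
        for x, y in removal_set:
            w += lr * grad(w, x, y)   # ascend the loss -> 'forget'
    return w

w = 2.0                               # model that currently fits y = 2x
w_after = finetune_to_forget(w, [(1.0, 3.0)])
print(w_after)  # moved away from fitting the removed point (1.0, 3.0)
```

The finetuning only needs the pre-trained parameters and the removal dataset, which is why it is cheap; the trade-off, discussed in section 4, is that the "forgetting" is approximate and hard to verify.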

b. Influence unlearning: Approximate unlearning approaches have been proposed that rely on estimating the influence of specific data on the model (Izzo et al., 2021; Koh & Liang, 2017). This estimate is then used to update the model for unlearning, which is akin to finetuning. Usually, these approaches also require additional model training. However, to reduce the computation, it is also possible to prune the model (or reduce the size) before the unlearning process (Jia et al., 2023).
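For simple models the influence estimate can be computed in closed form. The sketch below (a textbook-style illustration, not code from the cited works) removes a point from a one-dimensional least-squares model with a single Newton-like step using the gradient of the removed point and the Hessian of the total loss, then compares the result against exact retraining:

```python
def fit(data):
    """Least-squares slope through the origin: w = sum(xy) / sum(x^2)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

data = [(1.0, 2.1), (2.0, 3.8), (3.0, 6.3), (4.0, 7.9)]
w = fit(data)

# Influence-style unlearning of z = (2.0, 3.8): a single Newton-like step
# w + H^{-1} * grad_z, instead of retraining on the remaining data.
x0, y0 = 2.0, 3.8
hessian = 2 * sum(x * x for x, _ in data)  # second derivative of total loss
grad_z = 2 * (w * x0 - y0) * x0            # gradient of removed point's loss
w_unlearned = w + grad_z / hessian

w_exact = fit([p for p in data if p != (x0, y0)])
print(abs(w_unlearned - w_exact))  # small first-order approximation error
```

For deep networks the Hessian is what makes this expensive, which is the "expensive estimations" caveat raised under the challenges in section 1.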

c. Intentional misclassification: When a request to delete specific data about a person is received, the model owner intentionally misclassifies these data points. This can be achieved with access to the pre-trained model and the data points provided by the data subject with the deletion request, but does not require access to the rest of the training dataset (Cha et al., 2024). Another approach, saliency unlearning, tackles the problem of unlearning at the level of weights rather than data or model. It relies on estimating the weights that are most relevant (salient) for unlearning before deploying random labels for the data to be deleted (Fan et al., 2024). This approach has been proposed for image classification and generation.

d. Parameter deletion: Another approach to unlearn, without deleting the data from the model but removing its influence, involves storing a list of data and parameter updates during the training process. When a deletion request arrives, the parameter updates are undone (Graves et al., 2021). Due to the need to store the parameter updates, this approach has a high storage requirement, especially for large models, although less than that of exact unlearning.

5. Differential privacy and model retiring policy: Differential privacy gives a mathematical guarantee that there is a bound on the contribution of individual data points to the model and that this contribution is small. However, the contribution is not zero,[1] thus necessitating "unlearning" (Chandrasekaran et al., 2021). One approach is to combine differential privacy with a policy to periodically retire or delete the model and retrain a differentially private model, instead of retraining for every deletion request.

When a deletion request is received, if the relevant personal data is in the possession of the data controller, then the data should be deleted. Model deletion is not performed for every request because it is unclear how individual personal data points impact the differentially private model. However, once there is a sufficiently large number of requests, then, put together, these data points would affect the model (though exactly how remains unknown), and thus there is reason enough to delete the model and retrain it with differential privacy.
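The policy just described, delete the stored data immediately but retire the model only after enough requests accumulate, can be sketched as follows (threshold and return values are illustrative placeholders):

```python
class RetirementPolicy:
    """Delete personal data immediately; retire and retrain the
    differentially private model only after enough requests accumulate."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pending = 0

    def handle_request(self, dataset, record):
        if record in dataset:
            dataset.remove(record)          # erase the stored data now
        self.pending += 1
        if self.pending >= self.threshold:  # enough accumulated influence
            self.pending = 0
            return "retire-and-retrain-with-DP"
        return "keep-current-model"

policy = RetirementPolicy(threshold=3)
dataset = ["r1", "r2", "r3", "r4"]
actions = [policy.handle_request(dataset, r) for r in ["r1", "r2", "r3"]]
print(actions)
```

In a real deployment the threshold would be tied to the differential-privacy accounting, i.e. to how much accumulated influence the privacy budget can still bound.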

3 WHAT TO UNLEARN

1. Samples: A deletion request may concern a specific piece of information or sample about a person. The methods described in the previous section have been developed for this setting.

2. Features: In some applications, features and labels may hold certain personal characteristics that are to be deleted. An approximate unlearning method has been proposed for this purpose by estimating the influence of specific features on the model parameters (Warnecke et al., 2023). This method can be used to unlearn features in a trained model for thousands of data subjects. Another approach involves estimating the correlation between features that could represent the personal characteristics and then progressively unlearning these features (Guo et al., 2022). This method is most applicable for deep neural networks in the image domain, for example facial recognition systems, where the deeper layers of the neural networks are smaller (Nguyen et al., 2022).

3. Class: AI systems can be designed to classify outputs into one, two or many different classes. In certain applications, the data to be deleted is represented as a class in the trained model. In some facial recognition applications, all data points about a person, in the form of facial images, belong to a particular class; if a person requests that their personal data be deleted, then the classification should no longer work for this person's class. A couple of approximate unlearning methods introduce noise such that the classification error for the deletion class is maximised, and then the model is "repaired" to maintain the performance for the rest of the data (Chundawat et al., 2023; Tarun et al., 2024). These methods do not delete all the samples associated with the class, but instead manipulate the trained model for this class directly.

[1] It would be impossible for a model to learn from the training data if the contribution is zero (Bourtoule et al., 2021).


When image classification or facial recognition technology is developed by training Convolutional Neural Network (CNN) models with federated learning, the class can be selectively pruned based on extracting features in the images that contribute to different classes (Wang et al., 2022). The person making the deletion request locally extracts these features for their images and sends them to the central server, which then prunes the class from the global model.

4. Client: When AI systems are developed with federated learning that includes contributions from multiple clients, a client (or a person) might request that their entire contribution to the global model, due to their local dataset, be deleted. Due to the incremental training process, only deleting the updates to the global model made by this client is insufficient to remove the influence of this client's data. An approach known as FedEraser stores historical parameter updates at a central server to sanitise all updates that followed the updates of this client (Liu et al., 2021). The sanitisation process involves collaborative updates from the remaining clients whose contributions are still part of the global model.

4 APPROXIMATE UNLEARNING VERIFICATION

Approximate unlearning methods have been proposed with the claim that they are indistinguishable from retraining the model from scratch without the deleted data. The claims are usually based on metrics such as indistinguishability from a hypothetical model retrained from scratch, unlearning accuracy, remaining accuracy and membership inference attacks.

Unlearning accuracy is the accuracy of the unlearned model on the data expected to be forgotten. Remaining accuracy is the accuracy of the unlearned model on the remaining data. Membership inference attacks (MIAs) are used in an attempt to extract "deleted" data from the updated model. If the probability of such extraction is around 50%, then the "deletion" is treated as a success. However, MIA is a privacy attack, and relying on it for testing is unreliable. A well-developed model will not be susceptible to MIA, in which case MIA cannot be used as a proxy signal to test unlearning.
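The two accuracy metrics can be made concrete with a toy classifier (the model and data below are purely illustrative):

```python
def accuracy(model, samples):
    """Fraction of (input, label) pairs the model classifies correctly."""
    return sum(model(x) == y for x, y in samples) / len(samples)

model = lambda x: int(x >= 3)          # toy 'unlearned' threshold classifier

forget_set = [(1, 1), (2, 1)]          # labels the model should have forgotten
retain_set = [(0, 0), (1, 0), (4, 1), (5, 1)]

unlearning_accuracy = accuracy(model, forget_set)   # want this LOW
remaining_accuracy = accuracy(model, retain_set)    # want this HIGH
print(unlearning_accuracy, remaining_accuracy)  # 0.0 1.0
```

Even a perfect-looking pair of scores like this does not prove deletion, for the reasons given in the next paragraph.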

Furthermore, approximate unlearning lacks strong guarantees. These metrics do not address a very basic concern: it is possible to obtain two models with similar weights and parameters from non-overlapping training data (Thudi et al., 2022). That is, removing the influence of a particular data point on a parameter is not sufficient to have "deleted" the data, as the influence could have come from different data. Moreover, the assumption that an unlearned model should be indistinguishable from one retrained from scratch may itself not be the right approach. This is because a model retrained from scratch could have different model distributions due to the stochasticity of training (Goel et al., 2022; Yang & Shami, 2020).

5 CONCERNS WITH MACHINE UNLEARNING

1. Privacy: Just like machine learning, machine unlearning also introduces privacy concerns. Membership inference attacks (Shokri et al., 2017) that have been shown to attack machine learning can also be used against machine unlearning (Chen et al., 2021). The concern here is that when it is possible to query a model twice, once before unlearning and once after unlearning, the person querying could deduce which data was deleted.
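The two-snapshot risk can be shown in a few lines (the records and confidence scores below are hypothetical): an observer diffs the model's per-record outputs before and after an unlearning step and singles out the deleted record.

```python
# Hypothetical per-record confidence scores from querying a model twice.
before = {"alice": 0.91, "bob": 0.88, "carol": 0.90}  # before unlearning
after = {"alice": 0.91, "bob": 0.35, "carol": 0.90}   # after unlearning

# The record whose score dropped sharply is likely the one deleted.
suspected = [name for name in before
             if abs(before[name] - after[name]) > 0.3]
print(suspected)  # ['bob']
```

In other words, the act of honouring one person's erasure request can itself leak the fact that their data was in the training set.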

2. Bias: When deletion requests are made, minority classes are more adversely affected because real-world datasets are not balanced. When it comes to data deletion requests, not everyone is equally likely to make such requests. It has been shown that there is a correlation between the unlearning probability and class labels (Koch & Soll, 2023). Thus, it is imperative that the accuracy of models for sub-categories is assessed after unlearning to check for bias.
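The per-sub-category assessment recommended above amounts to breaking accuracy down by class label; a minimal sketch (with an illustrative toy model and test set):

```python
from collections import defaultdict

def per_class_accuracy(model, samples):
    """Accuracy broken down by class label, to surface disparate effects."""
    hits, totals = defaultdict(int), defaultdict(int)
    for x, y in samples:
        totals[y] += 1
        hits[y] += int(model(x) == y)
    return {label: hits[label] / totals[label] for label in totals}

model = lambda x: int(x >= 3)                  # toy 'unlearned' classifier
test_set = [(1, 0), (2, 0), (3, 1), (4, 1), (2, 1)]

print(per_class_accuracy(model, test_set))     # class 1 scores lower
```

A gap between the per-class scores after unlearning, as in this toy output, is the bias signal the paragraph above warns about.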

6 LIMITING PERSONAL DATA OUTPUT FROM GENERATIVE AI

The approaches discussed thus far address applications, including facial recognition technology, where personal data processing is concerned. AI systems are susceptible to privacy leakages and to adversarial attacks such as MIA. This is also true of generative AI systems, which could generate personal data as part of their output. Text generation AI based on large language models has been shown to be more susceptible to MIA than small models (Carlini et al., 2021).

In generative AI systems, personal data is output when explicitly prompted (e.g., "Give me the birth date of [person name]"). The same can take place with image and video generation tools as well. Personal data is also output when not explicitly prompted. These generative AI tools make things up or "hallucinate" (Maynez et al., 2020) and generate factually incorrect content that could reveal personal data about people, e.g., when information about one person is asked for and a large language model outputs information about another person (with their name) (D. Zhang et al., 2023).

The area of research on limiting the generation of personal data from generative AI is new, and much less mature than the field of machine unlearning, which is itself quite young.

1. Model finetuning: In the case of diffusion models (e.g., Stable Diffusion), a method has been proposed to finetune the model such that specific concepts are not output in the images (Gandikota et al., 2023). This method eliminates visual concepts such as specific artistic styles, nudity and certain objects. A similar approach can be used to prevent generation of images with specific personal characteristics (E. J. Zhang et al., 2023). Another approach, known as "selective amnesia", applies continual learning to forget concepts from generative models based on variational autoencoders and diffusion models (Heng & Soh, 2024).

2. Data redaction: A variant of model finetuning uses data and class redaction techniques to limit generation of specific outputs in generative adversarial networks (GANs). A set of data that should not be generated is selected as a redaction set, which is then used to generate a "fake distribution" such that outputs falling within the redaction set are penalised (Kong & Chaudhuri, 2023). This approach builds on similar approaches that retrain models to limit generation of specific outputs (Asokan & Seelamantula, 2020; Hanneke et al., 2018; Sinha et al., 2021).

3. Output modification: The output of image generators can be modified so that specific kinds of images are not generated. This can be achieved by training a machine learning classifier to modify outputs before they are revealed to the end users (Rando et al., 2022) or by incorporating additional information and guiding the inference process (Schramowski et al., 2023). Alternatively, reinforcement learning with human feedback can be used (Bai et al., 2022; Ouyang et al., 2022) to prevent generation of personal data. However, such methods have many shortcomings (Casper et al., 2023) and have been shown to be easy to circumvent, especially when the end user has access to the parameters, as is the case with fully open-source models.[2]
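For text generators, the simplest form of output modification is a post-hoc filter on the generated string. The sketch below is a hypothetical, deliberately crude filter (regex patterns for emails and date-of-birth-style strings), far weaker than the trained classifiers cited above, but it shows where such a filter sits in the pipeline:

```python
import re

# Hypothetical post-hoc filter: redact obvious personal-data patterns
# from generated text before it is shown to the end user.
PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),    # email addresses
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),  # d/m/yyyy-style dates
]

def filter_output(text, placeholder="[REDACTED]"):
    """Replace every match of a known personal-data pattern."""
    for pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

out = filter_output("Contact jane.doe@example.com, born 12/05/1980.")
print(out)  # Contact [REDACTED], born [REDACTED].
```

Because the filter runs outside the model, it inherits the circumvention weakness noted above: a user with access to the model parameters can simply run the model without it.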

[2] /r/StableDiffusion/comments/wv2nw0/tutorial_how_to_remove_the_safety_filter_in_5/


CONCLUSION

The GDPR offers data subjects many rights. This report covers techniques and methods to implement the right to rectification and the right to erasure when AI systems process personal data. Implementing these rights is challenging, but many technical approaches have been proposed. Data curation and provenance are prerequisites for these approaches. Some of the challenges, such as the stochasticity of training AI models, can be addressed by modifying the training process to make compliance with data erasure requests easier (Bourtoule et al., 2021). Such design choices might have performance trade-offs but are an aspect of data protection by design. Other important rights offered by the GDPR to data subjects are left to future projects.

As a strong recommendation regarding data protection, only the use of completely anonymised data for the development and deployment of AI models would avoid obligations related to the correction and deletion of personal data in AI models. If it is necessary to use personal data, including pseudonymised data, to develop an AI model, then the legal obligations to implement data subject rights apply. The updates and changes made to the AI model should be adequately logged and documented such that subsequent requests for rectification and erasure of personal data can be fulfilled.


BIBLIOGRAPHY

Asokan, S., & Seelamantula, C. (2020). Teaching a GAN what not to learn. Advances in Neural Information Processing Systems, 33, 3964–3975.

Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: Harmlessness from AI Feedback. CoRR, abs/2212.08073. https://doi.org/10.48550/ARXIV.2212.08073

Bourtoule, L., Chandrasekaran, V., Choquette-Choo, C. A., Jia, H., Travers, A., Zhang, B., Lie, D., & Papernot, N. (2021). Machine Unlearning. 2021 IEEE Symposium on Security and Privacy (SP), 141–159. https://doi.org/10.1109/SP40001.2021.00019

Brophy, J., & Lowd, D. (2021). Machine Unlearning for Random Forests. ICML, 139, 1092–1104.

Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T. B., Song, D., Erlingsson, Ú., Oprea, A., & Raffel, C. (2021). Extracting Training Data from Large Language Models. USENIX Security Symposium, 2633–2650.

Casper, S., Davies, X., Shi, C., Gilbert, T. K., Scheurer, J., Rando, J., Freedman, R., Korbak, T., Lindner, D., Freire, P., Wang, T., Marks, S., Segerie, C.-R., Carroll, M., Peng, A., Christoffersen, P., Damani, M., Slocum, S., Anwar, U., … Hadfield-Menell, D. (2023). Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. https://doi.org/10.48550/ARXIV.2307.15217

Cha, S., Cho, S., Hwang, D., Lee, H., Moon, T., & Lee, M. (2024). Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers (arXiv:2301.11578). arXiv. https://arxiv.org/abs/2301.11578

Chandrasekaran, V., Jia, H., Thudi, A., Travers, A., Ya
