Explainability & Common Robustness
姜育剛、馬興軍、吳祖煊

Recap: Week 1
1. What is Machine Learning
2. Machine Learning Paradigms
3. Loss Functions
4. Optimization Methods

Machine Learning Pipeline
- Set up the input
- Set up the loss
- Set up the optimiser
- Regularization makes the decision region smoother
- The landscape of a loss function varies w.r.t. the data and the function itself

Model? Deep Neural Networks
/neural-network-zoo/; /articles/cc-machine-learning-deep-learning-architectures/

Feed-Forward Neural Networks
- Feed-Forward Neural Networks (FNN), also known as Fully Connected Neural Networks (FCN) or Multilayer Perceptrons (MLP)
- The simplest neural network: fully connected between layers
- For data that has NO temporal or spatial order

Convolutional Neural Networks
- For images or other data with spatial order
- Can stack up to >100 layers
- Neurons arranged in 3 dimensions, rather than in one flat layer

Recurrent Neural Networks
/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
- Traditional RNN

Transformers
- Transformer: a new type of DNN based on attention
- Encoder and Decoder
- Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems 30 (2017).

Self-Attention Explained
/illustrated-self-attention-2d627e33b20a

CNN Explained
- Learns different levels of representations

A Brief History of CNNs
LeNet (1990s), AlexNet (2012), ZFNet (2013), GoogLeNet (2014), VGGNet (2014), ResNet (2015), Inception-V4 (2016), ResNeXt (2017), ViT (2021)
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021

Explainable AI
Explainability of deep learning spans:
- Learning mechanism: the learning process and the learning outcome
- Inference mechanism: the basis of decisions and the reasoning procedure
- Generalization mechanism: the causes of and conditions for generalization
- Cognitive mechanism: cognitive science and cognition-inspired intelligence
- Robustness: common robustness and adversarial robustness

We want to answer the following questions:
- How do DNNs learn, what do they learn, what do they rely on to generalize, and in which situations do they work or fail?
- Is deep learning genuine intelligence? Compared with human intelligence, which is more advanced, and what is its future?
- Is there a grand unified theory that can not only explain but also improve?

Methodological Principles
- Visualization / Ablation / Contrast
- Model / Component / Layer / Operation / Neuron
- Superclass / Class / Training-Test Set / Subset / Sample
- Training / Inference / Transfer / Reverse

How to Understand Machine Learning
- Learning is the process of empirical risk minimization (ERM)
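As a reminder, ERM (a standard formulation, stated here for completeness) selects the model that minimizes the average loss over the training set:

    \hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)

where \mathcal{F} is the hypothesis class and \ell the loss function from the Week 1 recap.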

Learning Mechanism
- Training/test error and accuracy; prediction confidence
- Explanation via observation: just plot!
- Wang et al. Symmetric Cross Entropy for Robust Learning with Noisy Labels, ICCV 2019.

Learning Mechanism
- Parameter dynamics; gradient dynamics
- Explanation via dynamics and information
- TRADI: Tracking Deep Neural Network Weight Distributions, ECCV 2020; Shwartz-Ziv, R. and Tishby, N. Opening the Black Box of Deep Neural Networks via Information. arXiv:1703.00810, 2017.

Learning Mechanism
- Decision boundary and learning-process visualization
- Explanation via dynamics and information
- https://distill.pub/2020/grand-tour/ (March 16, 2020)

Learning Mechanism
- Data influence/valuation: how does a training sample impact the learning outcome?
- Influence Function; Data Shapley
- Understanding Black-box Predictions via Influence Functions, ICML 2017; Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020; Data Shapley: Equitable Valuation of Data for Machine Learning, ICML 2019.

Influence Function
How would the model parameters change if a sample z were removed from the training set?
Understanding Black-box Predictions via Influence Functions, ICML 2017; Cook, R. D. and Weisberg, S. Residuals and Influence in Regression. New York: Chapman and Hall, 1982.

Goal (following Cook & Weisberg and Koh & Liang): upweight z by a small ε in the training objective,

    \hat{\theta}_{\epsilon,z} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \ell(z_i, \theta) + \epsilon \, \ell(z, \theta)

Therefore, the influence of upweighting z on the parameters is

    \mathcal{I}_{\text{up,params}}(z) = \frac{d \hat{\theta}_{\epsilon,z}}{d \epsilon}\Big|_{\epsilon=0} = -H_{\hat{\theta}}^{-1} \, \nabla_{\theta} \ell(z, \hat{\theta})

where H_{\hat{\theta}} is the Hessian of the training loss at \hat{\theta}; removing z corresponds to setting ε = -1/n.

Training Data Influence (TracIn)
How would the model loss on z' change if we take an update step on a sample z?
Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020.
- First-order approximation of the above (assuming one update step is small)
- Checkpoints store the interim updates

Therefore (the checkpoint approximation from the paper):

    \text{TracInCP}(z, z') = \sum_{i=1}^{k} \eta_i \, \nabla_{\theta} \ell(\theta_{t_i}, z) \cdot \nabla_{\theta} \ell(\theta_{t_i}, z')

where \theta_{t_1}, \dots, \theta_{t_k} are the saved checkpoints and \eta_i the corresponding learning rates.

Understanding the Learned Model
- Loss landscape
- Deep features: t-SNE plot
- Maaten et al. Visualizing Data Using t-SNE. JMLR, 2008. https://distill.pub/2016/misread-tsne/?_ga=2.135835192.888864733.1531353600-1779571267.1531353600
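A minimal sketch of how such a t-SNE plot of deep features is typically produced (scikit-learn; the random arrays stand in for penultimate-layer activations and labels you would extract from the trained model):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Stand-ins for penultimate-layer activations and class labels; in practice
# these come from a forward pass of the trained model over the test set.
features = np.random.randn(500, 64)
labels = np.random.randint(0, 10, size=500)

# Project the high-dimensional features to 2-D. Perplexity is the knob the
# misread-tsne article warns about: different values give different pictures.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10", s=4)
plt.title("t-SNE of deep features")
plt.show()
```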

Understanding the Learned Model
- Class-wise patterns
- Intermediate-layer activation map; activation/attention map
- Li et al. Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks, ICLR 2021; Zhao et al. What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. arXiv:2101.06898 (2021).
- One predictive pattern for each class

What Do Deep Nets Learn?
Zhao, Shihao, et al. "What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space." arXiv:2101.06898 (2021).
- Goal: understand the knowledge a model has learned about a particular class.
- Method: extract one single pattern for each class; what would this pattern look like?
- Other considerations: we need to do this in pixel space, as pixels are more interpretable.

How to Find the Class-wise Pattern: a Canvas Image
- Patterns extracted on different canvases (red rectangles)

Class-wise Patterns Revealed
- Patterns extracted on the original, non-robust, and robust CIFAR-10, and patterns of adversarially trained models
- Predictive power of patterns of different sizes

Inference Mechanism
- Class Activation Map (Grad-CAM); Guided Backpropagation
- Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. ICCV 2017; Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015.

Guided Backpropagation
Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- ReLU forward pass
- ReLU backward pass
- Deconvolution for ReLU
- Guided backpropagation
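A minimal PyTorch sketch of the guided-backprop rule (one common way to implement it, via backward hooks on every ReLU; the VGG-16 choice and variable names are illustrative, not from the slides):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights="IMAGENET1K_V1").eval()

def guided_relu_hook(module, grad_input, grad_output):
    # The usual ReLU backward already zeroes positions where the forward
    # input was negative; guided backprop additionally zeroes positions
    # where the incoming gradient is negative.
    return (torch.clamp(grad_input[0], min=0.0),)

for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.inplace = False  # in-place ops interfere with full backward hooks
        m.register_full_backward_hook(guided_relu_hook)

x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in input image
model(x)[0].max().backward()  # backprop from the top-class logit
saliency = x.grad  # guided-backprop saliency map, same shape as x
```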

Class Activation Mapping (CAM)
Zhou et al. Learning Deep Features for Discriminative Localization. CVPR, 2016. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- GAP: Global Average Pooling
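In CAM, the map for class c is the classifier-weighted sum of the last convolutional feature maps (the standard formulation from the CVPR 2016 paper):

    M_c(x, y) = \sum_k w_k^c \, f_k(x, y)

where f_k(x, y) is the k-th feature map and w_k^c is the weight connecting the global-average-pooled f_k to the class-c logit.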

Grad-CAM
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. CVPR, 2016; /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- Grad-CAM is a generalization of CAM
- Compute neuron importance, take a weighted combination of the activation maps, then interpolate to input resolution
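Concretely (the standard Grad-CAM equations from Selvaraju et al.): the importance of feature map k for class c is the spatially averaged gradient, and the map is a ReLU-ed weighted sum, upsampled to the input size:

    \alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}, \qquad
    L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\Big( \sum_k \alpha_k^c A^k \Big)

When the last conv layer is followed by GAP and a linear classifier, \alpha_k^c reduces to the CAM weight w_k^c, which is the sense in which Grad-CAM generalizes CAM.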

LIME
Local Interpretable Model-agnostic Explanations (LIME)
Ribeiro et al. "Why Should I Trust You?" Explaining the Predictions of Any Classifier. SIGKDD, 2016. /marcotcr/lime
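A minimal usage sketch with the authors' lime package from the marcotcr/lime repo (the dummy classifier and random image are placeholders; plug in a wrapper around your own model):

```python
import numpy as np
from lime import lime_image

def classifier_fn(images: np.ndarray) -> np.ndarray:
    # Placeholder: uniform probabilities over 10 classes. Replace with a
    # wrapper that runs your model on an (N, H, W, 3) batch and returns
    # (N, n_classes) class probabilities.
    return np.full((len(images), 10), 0.1)

image = np.random.rand(224, 224, 3)  # stand-in for a real input image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, classifier_fn,
    top_labels=5,       # explain the 5 highest-scoring classes
    num_samples=1000)   # perturbed copies used to fit the local linear model

# Superpixels that most support the top predicted class:
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5)
```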

Integrated Gradients
Sundararajan, M., Taly, A., Yan, Q. Axiomatic Attribution for Deep Networks. ICML, 2017. /TianhongDai/integrated-gradient-pytorch
- Integrate the gradients along the path from a baseline to the input
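The attribution for input dimension i, with baseline x' (the paper's definition):

    \mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\big(x' + \alpha (x - x')\big)}{\partial x_i} \, d\alpha

In practice the integral is approximated by a Riemann sum over a few dozen steps; a minimal PyTorch sketch (assuming a differentiable model that returns class logits; names are illustrative):

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Approximate IG for one input x of shape (1, C, H, W) and a class index."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # an all-black image is a common baseline
    total_grads = torch.zeros_like(x)
    for k in range(1, steps + 1):
        # Point on the straight-line path from the baseline to the input.
        point = baseline + (k / steps) * (x - baseline)
        point.requires_grad_(True)
        score = model(point)[0, target]
        total_grads += torch.autograd.grad(score, point)[0]
    # Riemann-sum approximation of the path integral.
    return (x - baseline) * total_grads / steps
```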

Cognitive Distillation
Huang et al. Distilling Cognitive Backdoor Patterns within an Image, ICLR 2023
- Mask extracted by cognitive distillation
- Useful and non-useful features (Ilyas, Andrew, et al. "Adversarial Examples Are Not Bugs, They Are Features." NeurIPS 2019):
  - Useful features: highly correlated with the true label in expectation; if removed, the prediction changes. A backdoor trigger is a useful feature.
  - Non-useful features: not correlated with the prediction; if removed, the prediction does not change.

Cognitive Distillation
- Objective: distill the minimal essence of the useful features
- Ingredients: model, total-variation loss, random noise vector, original image, mask, cognitive pattern
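A compact statement of the objective named on this slide (reconstructed from the components listed above; the exact weighting and constants follow the ICLR 2023 paper): find the smallest, smoothest mask m such that keeping only the masked part of x, with the rest filled by random noise n, preserves the model output:

    x_{cp} = x \odot m + (1 - m) \odot n

    \min_m \; \big\| f(x) - f(x_{cp}) \big\|_1 + \alpha \, \| m \|_1 + \beta \, \mathrm{TV}(m)

where f is the model, \|m\|_1 drives the mask to be sparse, and TV is the total-variation smoothness loss.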

Cognitive Distillation
- Distilled patterns on backdoored samples: x, m, x_cp

How to Verify Cognitive Patterns Are Essential
- Backdoored image, binarized mask {0, 1}, original image
- Construct simplified backdoor patterns

Backdoor Patterns Can Be Made Simpler
- x, m, x_cp, x_bd'
- Simplified backdoor patterns also work!
- L1-norm distribution of the distilled mask

Detect Backdoor Samples
- Attacks: 12 backdoor attacks
- Models: ResNet-18, Pre-Activation ResNet-101, MobileNetV2, VGG-16, Inception, EfficientNet-B0
- Datasets: CIFAR-10 / GTSRB / ImageNet subset
- Evaluation metric: area under the ROC curve (AUROC)
- Detection baselines: Anti-Backdoor Learning (ABL) [2], Activation Clustering (AC) [3], Frequency [4], STRIP [5], Spectral Signatures [6]
- Our variants: CD-L (logits layer) and CD-F (last activation layer)
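A minimal sketch of the AUROC evaluation used here (scikit-learn; the synthetic norms stand in for distilled-mask L1 norms, since backdoored samples tend to have much smaller masks):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Stand-ins for distilled-mask L1 norms: backdoored samples get small norms.
mask_l1 = np.concatenate([rng.normal(5, 1, 100),    # 100 backdoored samples
                          rng.normal(20, 5, 900)])  # 900 clean samples
is_backdoor = np.concatenate([np.ones(100), np.zeros(900)])

# Lower norm = more suspicious, so negate the norm to get a score that
# increases with suspicion, then measure ranking quality with AUROC.
print("AUROC:", roc_auc_score(is_backdoor, -mask_l1))
```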

Superb Detection Performance

Discover Biases in Facial Recognition Models
- CelebA dataset: 40 binary facial attributes (gender, bald, hair color, ...)
- Known bias between gender and blond hair
- Apply CD in the same way as for backdoor detection
- Select the subset of samples with low mask L1 norms
- Examine the attributes of the subset
- Calculate the distribution shift between the subset and the full dataset

Discover Biases in Facial Recognition Models
- Masks distilled for predicting each attribute

Generalization Mechanism
- Convergence; generalization; deep learning theory

Convergence
- Convex (linear model) vs. nonconvex (DNN); saddle points

Generalization
- Training time: 'Cat'. Test time: 'Cat'?
- Traditional theory: a simpler model is better; more data is better

Generalization Theory
/~ninamf/ML11/lect1117.pdf; /watch?v=zlqQ7VRba2Y

Components of Generalization Error Bounds
- Generalization error ≤ empirical error + hypothesis-class complexity term + confidence term (shrinking with sample size)
- RHS: for all terms, the lower the better: small training error, a simpler model class, more samples, a less demanding confidence level
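For a finite hypothesis class this takes the familiar textbook form (stated here as a reminder): with probability at least 1 - \delta over the draw of n training samples,

    R(h) \le \hat{R}(h) + \sqrt{\frac{\ln |\mathcal{H}| + \ln(1/\delta)}{2n}} \quad \text{for all } h \in \mathcal{H}

where R is the true risk and \hat{R} the empirical risk; richer classes replace \ln|\mathcal{H}| with a complexity measure such as the VC dimension or Rademacher complexity.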

Generalization Theory
Zhang et al. Understanding Deep Learning Requires Rethinking Generalization. ICLR 2017.
- Small training error ≠ low generalization error
- Zero training error was achieved on purely random labels (meaningless learning)
- 0 training error vs. 0.9 test error

List of Existing Theories
- Rademacher complexity bounds (Bartlett et al. 2017)
- PAC-Bayes bounds (Dziugaite and Roy 2017)
- Information bottleneck (Tishby and Zaslavsky 2015)
- Neural tangent kernel / lazy training (Jacot et al. 2018)
- Mean-field analysis (Chizat and Bach 2018)
- Double descent (Belkin et al. 2019)
- Entropy-SGD (Chaudhari et al. 2019)
/watch?v=zlqQ7VRba2Y

A few interesting questions:
- Should we consider the role of data in generalization analysis?
- Should representation quality appear in the generalization bound?
- Is generalization about math (the function computed by the model) or about knowledge?

How to Visualize Generalization? Existing Approaches
- Test error
- Visualization: loss landscape, prediction attribution, etc.
- Training -> test: distribution shift, out-of-distribution analysis
- Noisy labels in test data, raising questions about data quality and reliable evaluation

The remaining question: how does generalization happen?
- Math ≠ knowledge
- Is computation finding patterns, or understanding the underlying knowledge?
- What is the relation of computational generalization to human behavior?

Cognitive Mechanism
- OpenAI reveals the multimodal neurons in CLIP
- /blog/multimodal-neurons/; /blog/clip/

Cognitive Mechanism
Ritter et al. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study, ICML, 2017
- Cognitive-psychology-inspired evaluation of DNNs
- The probability of choosing the shape match measures the model's shape bias

Cognitive Mechanism
Geirhos, Robert, et al. "Shortcut Learning in Deep Neural Networks." Nature Machine Intelligence 2.11 (2020): 665-673.
- Deep neural networks solve problems by taking shortcuts

Cognitive Mechanism
Rajalingham, Rishi, et al. "Large-scale, High-resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-art Deep Artificial Neural Networks." Journal of Neuroscience 38.33 (2018): 7255-7269. Rajalingham, Rishi, Kailyn Schmidt, and James J. DiCarlo. "Comparison of Object Recognition Behavior in Human and Monkey." Journal of Neuroscience 35.35 (2015): 12127-121
