Explainability & Common Robustness
姜育剛、馬興軍、吳祖煊

Recap: Week 1
1. What is Machine Learning
2. Machine Learning Paradigms
3. Loss Functions
4. Optimization Methods

Machine Learning Pipeline
- Set up the input
- Set up the loss
- Set up the optimiser
- Regularization makes the decision region smoother
- The landscape of a loss function varies w.r.t. the data and the function itself

Model? Deep Neural Networks
/neural-network-zoo/; /articles/cc-machine-learning-deep-learning-architectures/

Feed-Forward Neural Networks
- Feed-Forward Neural Networks (FNN), also known as Fully Connected Neural Networks (FCN) or Multilayer Perceptrons (MLP)
- The simplest neural network: fully connected between layers
- For data that has NO temporal or spatial order

Convolutional Neural Networks
- For images or other data with spatial order
- Can stack up to >100 layers
- Neurons arranged in 3 dimensions, rather than in one flat layer

Recurrent Neural Networks
/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
- Traditional RNN

Transformers
- Transformer: a new type of DNN based on attention
- Encoder and Decoder
- Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems 30 (2017).

Self-Attention Explained
/illustrated-self-attention-2d627e33b20a

CNN Explained
- Learns different levels of representations

A Brief History of CNNs
LeNet (1990s), AlexNet (2012), ZFNet (2013), GoogLeNet (2014), VGGNet (2014), ResNet (2015), Inception-V4 (2016), ResNeXt (2017), ViT (2021)
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021

Explainable AI
Explainability of deep learning spans:
- Learning mechanism: the learning process and the learning outcome
- Inference mechanism: the basis of decisions and the reasoning procedure
- Generalization mechanism: the causes of and conditions for generalization
- Cognitive mechanism: cognitive science and cognition-inspired intelligence
- Robustness: common robustness and adversarial robustness

We want to answer the following questions:
- How do DNNs learn, what do they learn, what do they rely on to generalize, and in which situations do they work or fail?
- Is deep learning genuine intelligence? Compared with human intelligence, which is more advanced, and what is its future?
- Is there a grand unified theory that can not only explain but also improve?

Methodological Principles
- Visualization / Ablation / Contrast
- Model / Component / Layer / Operation / Neuron
- Superclass / Class / Training-Test Set / Subset / Sample
- Training / Inference / Transfer / Reverse

How to Understand Machine Learning
- Learning is the process of empirical risk minimization (ERM)
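As a reminder, ERM (a standard formulation, stated here for completeness) selects the model that minimizes the average loss over the training set:

    \hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)

where \mathcal{F} is the hypothesis class and \ell the loss function from the Week 1 recap.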

Learning Mechanism
- Training/test error and accuracy; prediction confidence
- Explanation via observation: just plot!
- Wang et al. Symmetric Cross Entropy for Robust Learning with Noisy Labels, ICCV 2019.

Learning Mechanism
- Parameter dynamics; gradient dynamics
- Explanation via dynamics and information
- TRADI: Tracking Deep Neural Network Weight Distributions, ECCV 2020; Shwartz-Ziv, R. and Tishby, N. Opening the Black Box of Deep Neural Networks via Information. arXiv:1703.00810, 2017.

Learning Mechanism
- Decision boundary and learning-process visualization
- Explanation via dynamics and information
- https://distill.pub/2020/grand-tour/ (March 16, 2020)

Learning Mechanism
- Data influence/valuation: how does a training sample impact the learning outcome?
- Influence Function; Data Shapley
- Understanding Black-box Predictions via Influence Functions, ICML 2017; Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020; Data Shapley: Equitable Valuation of Data for Machine Learning, ICML 2019.

Influence Function
How would the model parameters change if a sample z were removed from the training set?
Understanding Black-box Predictions via Influence Functions, ICML 2017; Cook, R. D. and Weisberg, S. Residuals and Influence in Regression. New York: Chapman and Hall, 1982.

Goal (following Cook & Weisberg and Koh & Liang): upweight z by a small ε in the training objective,

    \hat{\theta}_{\epsilon,z} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \ell(z_i, \theta) + \epsilon \, \ell(z, \theta)

Therefore, the influence of upweighting z on the parameters is

    \mathcal{I}_{\text{up,params}}(z) = \frac{d \hat{\theta}_{\epsilon,z}}{d \epsilon}\Big|_{\epsilon=0} = -H_{\hat{\theta}}^{-1} \, \nabla_{\theta} \ell(z, \hat{\theta})

where H_{\hat{\theta}} is the Hessian of the training loss at \hat{\theta}; removing z corresponds to setting ε = -1/n.

Training Data Influence (TracIn)
How would the model loss on z' change if we take an update step on a sample z?
Pruthi, G., Liu, F., Kale, S., et al. Estimating Training Data Influence by Tracing Gradient Descent. NeurIPS 2020.
- First-order approximation of the above (assuming one update step is small)
- Checkpoints store the interim updates

Therefore (the checkpoint approximation from the paper):

    \text{TracInCP}(z, z') = \sum_{i=1}^{k} \eta_i \, \nabla_{\theta} \ell(\theta_{t_i}, z) \cdot \nabla_{\theta} \ell(\theta_{t_i}, z')

where \theta_{t_1}, \dots, \theta_{t_k} are the saved checkpoints and \eta_i the corresponding learning rates.

Understanding the Learned Model
- Loss landscape
- Deep features: t-SNE plot
- Maaten et al. Visualizing Data Using t-SNE. JMLR, 2008. https://distill.pub/2016/misread-tsne/?_ga=2.135835192.888864733.1531353600-1779571267.1531353600
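A minimal sketch of how such a t-SNE plot of deep features is typically produced (scikit-learn; the random arrays stand in for penultimate-layer activations and labels you would extract from the trained model):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Stand-ins for penultimate-layer activations and class labels; in practice
# these come from a forward pass of the trained model over the test set.
features = np.random.randn(500, 64)
labels = np.random.randint(0, 10, size=500)

# Project the high-dimensional features to 2-D. Perplexity is the knob the
# misread-tsne article warns about: different values give different pictures.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10", s=4)
plt.title("t-SNE of deep features")
plt.show()
```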

Understanding the Learned Model
- Class-wise patterns
- Intermediate-layer activation map; activation/attention map
- Li et al. Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks, ICLR 2021; Zhao et al. What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. arXiv:2101.06898 (2021).
- One predictive pattern for each class

What Do Deep Nets Learn?
Zhao, Shihao, et al. "What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space." arXiv:2101.06898 (2021).
- Goal: understand the knowledge a model has learned about a particular class.
- Method: extract one single pattern for each class; what would this pattern look like?
- Other considerations: we need to do this in pixel space, as pixels are more interpretable.

How to Find the Class-wise Pattern: a Canvas Image
- Patterns extracted on different canvases (red rectangles)

Class-wise Patterns Revealed
- Patterns extracted on the original, non-robust, and robust CIFAR-10, and patterns of adversarially trained models
- Predictive power of patterns of different sizes

Inference Mechanism
- Class Activation Map (Grad-CAM); Guided Backpropagation
- Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. ICCV 2017; Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015.

Guided Backpropagation
Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- ReLU forward pass
- ReLU backward pass
- Deconvolution for ReLU
- Guided backpropagation
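A minimal PyTorch sketch of the guided-backprop rule (one common way to implement it, via backward hooks on every ReLU; the VGG-16 choice and variable names are illustrative, not from the slides):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights="IMAGENET1K_V1").eval()

def guided_relu_hook(module, grad_input, grad_output):
    # The usual ReLU backward already zeroes positions where the forward
    # input was negative; guided backprop additionally zeroes positions
    # where the incoming gradient is negative.
    return (torch.clamp(grad_input[0], min=0.0),)

for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.inplace = False  # in-place ops interfere with full backward hooks
        m.register_full_backward_hook(guided_relu_hook)

x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in input image
model(x)[0].max().backward()  # backprop from the top-class logit
saliency = x.grad  # guided-backprop saliency map, same shape as x
```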

Class Activation Mapping (CAM)
Zhou et al. Learning Deep Features for Discriminative Localization. CVPR, 2016. /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- GAP: Global Average Pooling
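In CAM, the map for class c is the classifier-weighted sum of the last convolutional feature maps (the standard formulation from the CVPR 2016 paper):

    M_c(x, y) = \sum_k w_k^c \, f_k(x, y)

where f_k(x, y) is the k-th feature map and w_k^c is the weight connecting the global-average-pooled f_k to the class-c logit.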

Grad-CAM
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. CVPR, 2016; /@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
- Grad-CAM is a generalization of CAM
- Compute neuron importance, take a weighted combination of the activation maps, then interpolate to input resolution
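Concretely (the standard Grad-CAM equations from Selvaraju et al.): the importance of feature map k for class c is the spatially averaged gradient, and the map is a ReLU-ed weighted sum, upsampled to the input size:

    \alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}, \qquad
    L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\Big( \sum_k \alpha_k^c A^k \Big)

When the last conv layer is followed by GAP and a linear classifier, \alpha_k^c reduces to the CAM weight w_k^c, which is the sense in which Grad-CAM generalizes CAM.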

LIME
Local Interpretable Model-agnostic Explanations (LIME)
Ribeiro et al. "Why Should I Trust You?" Explaining the Predictions of Any Classifier. SIGKDD, 2016. /marcotcr/lime
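A minimal usage sketch with the authors' lime package from the marcotcr/lime repo (the dummy classifier and random image are placeholders; plug in a wrapper around your own model):

```python
import numpy as np
from lime import lime_image

def classifier_fn(images: np.ndarray) -> np.ndarray:
    # Placeholder: uniform probabilities over 10 classes. Replace with a
    # wrapper that runs your model on an (N, H, W, 3) batch and returns
    # (N, n_classes) class probabilities.
    return np.full((len(images), 10), 0.1)

image = np.random.rand(224, 224, 3)  # stand-in for a real input image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, classifier_fn,
    top_labels=5,       # explain the 5 highest-scoring classes
    num_samples=1000)   # perturbed copies used to fit the local linear model

# Superpixels that most support the top predicted class:
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5)
```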

Integrated Gradients
Sundararajan, M., Taly, A., Yan, Q. Axiomatic Attribution for Deep Networks. ICML, 2017. /TianhongDai/integrated-gradient-pytorch
- Integrate the gradients along the path from a baseline to the input
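The attribution for input dimension i, with baseline x' (the paper's definition):

    \mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\big(x' + \alpha (x - x')\big)}{\partial x_i} \, d\alpha

In practice the integral is approximated by a Riemann sum over a few dozen steps; a minimal PyTorch sketch (assuming a differentiable model that returns class logits; names are illustrative):

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Approximate IG for one input x of shape (1, C, H, W) and a class index."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # an all-black image is a common baseline
    total_grads = torch.zeros_like(x)
    for k in range(1, steps + 1):
        # Point on the straight-line path from the baseline to the input.
        point = baseline + (k / steps) * (x - baseline)
        point.requires_grad_(True)
        score = model(point)[0, target]
        total_grads += torch.autograd.grad(score, point)[0]
    # Riemann-sum approximation of the path integral.
    return (x - baseline) * total_grads / steps
```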

Cognitive Distillation
Huang et al. Distilling Cognitive Backdoor Patterns within an Image, ICLR 2023
- Mask extracted by cognitive distillation
- Useful and non-useful features (Ilyas, Andrew, et al. "Adversarial Examples Are Not Bugs, They Are Features." NeurIPS 2019):
  - Useful features: highly correlated with the true label in expectation; if removed, the prediction changes. A backdoor trigger is a useful feature.
  - Non-useful features: not correlated with the prediction; if removed, the prediction does not change.

Cognitive Distillation
- Objective: distill the minimal essence of the useful features
- Ingredients: model, total-variation loss, random noise vector, original image, mask, cognitive pattern
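A compact statement of the objective named on this slide (reconstructed from the components listed above; the exact weighting and constants follow the ICLR 2023 paper): find the smallest, smoothest mask m such that keeping only the masked part of x, with the rest filled by random noise n, preserves the model output:

    x_{cp} = x \odot m + (1 - m) \odot n

    \min_m \; \big\| f(x) - f(x_{cp}) \big\|_1 + \alpha \, \| m \|_1 + \beta \, \mathrm{TV}(m)

where f is the model, \|m\|_1 drives the mask to be sparse, and TV is the total-variation smoothness loss.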

Cognitive Distillation
- Distilled patterns on backdoored samples: x, m, x_cp

How to Verify Cognitive Patterns Are Essential
- Backdoored image, binarized mask {0, 1}, original image
- Construct simplified backdoor patterns

Backdoor Patterns Can Be Made Simpler
- x, m, x_cp, x_bd'
- Simplified backdoor patterns also work!
- L1-norm distribution of the distilled mask

Detect Backdoor Samples
- Attacks: 12 backdoor attacks
- Models: ResNet-18, Pre-Activation ResNet-101, MobileNetV2, VGG-16, Inception, EfficientNet-B0
- Datasets: CIFAR-10 / GTSRB / ImageNet subset
- Evaluation metric: area under the ROC curve (AUROC)
- Detection baselines: Anti-Backdoor Learning (ABL) [2], Activation Clustering (AC) [3], Frequency [4], STRIP [5], Spectral Signatures [6]
- Our variants: CD-L (logits layer) and CD-F (last activation layer)
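A minimal sketch of the AUROC evaluation used here (scikit-learn; the synthetic norms stand in for distilled-mask L1 norms, since backdoored samples tend to have much smaller masks):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Stand-ins for distilled-mask L1 norms: backdoored samples get small norms.
mask_l1 = np.concatenate([rng.normal(5, 1, 100),    # 100 backdoored samples
                          rng.normal(20, 5, 900)])  # 900 clean samples
is_backdoor = np.concatenate([np.ones(100), np.zeros(900)])

# Lower norm = more suspicious, so negate the norm to get a score that
# increases with suspicion, then measure ranking quality with AUROC.
print("AUROC:", roc_auc_score(is_backdoor, -mask_l1))
```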

Superb Detection Performance

Discover Biases in Facial Recognition Models
- CelebA dataset: 40 binary facial attributes (gender, bald, hair color, ...)
- Known bias between gender and blond hair
- Apply CD in the same way as for backdoor detection
- Select the subset of samples with low mask L1 norms
- Examine the attributes of the subset
- Calculate the distribution shift between the subset and the full dataset

Discover Biases in Facial Recognition Models
- Masks distilled for predicting each attribute

Generalization Mechanism
- Convergence; generalization; deep learning theory

Convergence
- Convex (linear model) vs. nonconvex (DNN); saddle points

Generalization
- Training time: 'Cat'. Test time: 'Cat'?
- Traditional theory: a simpler model is better; more data is better

Generalization Theory
/~ninamf/ML11/lect1117.pdf; /watch?v=zlqQ7VRba2Y

Components of Generalization Error Bounds
- Generalization error ≤ empirical error + hypothesis-class complexity term + confidence term (shrinking with sample size)
- RHS: for all terms, the lower the better: small training error, a simpler model class, more samples, a less demanding confidence level
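For a finite hypothesis class this takes the familiar textbook form (stated here as a reminder): with probability at least 1 - \delta over the draw of n training samples,

    R(h) \le \hat{R}(h) + \sqrt{\frac{\ln |\mathcal{H}| + \ln(1/\delta)}{2n}} \quad \text{for all } h \in \mathcal{H}

where R is the true risk and \hat{R} the empirical risk; richer classes replace \ln|\mathcal{H}| with a complexity measure such as the VC dimension or Rademacher complexity.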

Generalization Theory
Zhang et al. Understanding Deep Learning Requires Rethinking Generalization. ICLR 2017.
- Small training error ≠ low generalization error
- Zero training error was achieved on purely random labels (meaningless learning)
- 0 training error vs. 0.9 test error

List of Existing Theories
- Rademacher complexity bounds (Bartlett et al. 2017)
- PAC-Bayes bounds (Dziugaite and Roy 2017)
- Information bottleneck (Tishby and Zaslavsky 2015)
- Neural tangent kernel / lazy training (Jacot et al. 2018)
- Mean-field analysis (Chizat and Bach 2018)
- Double descent (Belkin et al. 2019)
- Entropy-SGD (Chaudhari et al. 2019)
/watch?v=zlqQ7VRba2Y

A few interesting questions:
- Should we consider the role of data in generalization analysis?
- Should representation quality appear in the generalization bound?
- Is generalization about math (the function computed by the model) or about knowledge?

How to Visualize Generalization? Existing Approaches
- Test error
- Visualization: loss landscape, prediction attribution, etc.
- Training -> test: distribution shift, out-of-distribution analysis
- Noisy labels in test data, raising questions about data quality and reliable evaluation

The remaining question: how does generalization happen?
- Math ≠ knowledge
- Is computation finding patterns, or understanding the underlying knowledge?
- What is the relation of computational generalization to human behavior?

Cognitive Mechanism
- OpenAI reveals the multimodal neurons in CLIP
- /blog/multimodal-neurons/; /blog/clip/

Cognitive Mechanism
Ritter et al. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study, ICML, 2017
- Cognitive-psychology-inspired evaluation of DNNs
- The probability of choosing the shape match measures the model's shape bias

Cognitive Mechanism
Geirhos, Robert, et al. "Shortcut Learning in Deep Neural Networks." Nature Machine Intelligence 2.11 (2020): 665-673.
- Deep neural networks solve problems by taking shortcuts

Cognitive Mechanism
Rajalingham, Rishi, et al. "Large-scale, High-resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-art Deep Artificial Neural Networks." Journal of Neuroscience 38.33 (2018): 7255-7269. Rajalingham, Rishi, Kailyn Schmidt, and James J. DiCarlo. "Comparison of Object Recognition Behavior in Human and Monkey." Journal of Neuroscience 35.35 (2015): 12127-121
