
Machine Learning Question Bank

Maximum Likelihood

1、ML estimation of exponential model (10 points)
A Gaussian distribution is often used to model data on the real line, but it is sometimes inappropriate when the data are often close to zero yet constrained to be nonnegative. In such cases one can fit an exponential distribution, whose probability density function is given by

p(x) = (1/b) e^{-x/b}

Given N observations x_i drawn from such a distribution:
(a) Write down the likelihood as a function of the scale parameter b.
(b) Write down the derivative of the log likelihood.
(c) Give a simple expression for the ML estimate of b.

Solution:
(a) L(b) = ∏_{i=1}^{N} (1/b) e^{-x_i/b} = b^{-N} exp(-(1/b) ∑_{i=1}^{N} x_i)
(b) log L(b) = -N log b - (1/b) ∑_{i=1}^{N} x_i, so d log L(b)/db = -N/b + (1/b²) ∑_{i=1}^{N} x_i
(c) Setting the derivative to zero gives b̂ = (1/N) ∑_{i=1}^{N} x_i, the sample mean.

2、The same estimation with the Poisson distribution p(x|λ) = λ^x e^{-λ}/x!, x = 0, 1, 2, ...:
l(λ) = ∑_{i=1}^{N} log p(x_i|λ) = (∑_{i=1}^{N} x_i) log λ - Nλ - ∑_{i=1}^{N} log x_i!
Setting dl/dλ = (∑_i x_i)/λ - N = 0 gives λ̂ = (1/N) ∑_{i=1}^{N} x_i.

Bayes

1、Applying Bayes' rule
On a multiple-choice exam, a student knows the correct answer with probability p and guesses with probability 1 - p. Assume that a student who knows the answer gets the question right with probability 1, while a student who guesses gets it right with probability 1/m, where m is the number of choices. Given that the student answered the question correctly, find the probability that the student actually knew the answer.

Solution:
P(known | correct) = P(correct | known) P(known) / [P(correct | known) P(known) + P(correct | guess) P(guess)] = p / (p + (1 - p)/m)

2、Conjugate priors
Given a likelihood p(x|θ) for a class of models with parameters θ, a conjugate prior is a distribution p(θ|γ) with hyperparameters γ such that the posterior distribution p(θ|X, γ) ∝ p(X|θ) p(θ|γ) belongs to the same family as the prior.

(a) Suppose that the likelihood is given by the exponential distribution with rate parameter λ: p(x|λ) = λ e^{-λx}. Show that the gamma distribution Gamma(λ|α, β) ∝ λ^{α-1} e^{-βλ} is a conjugate prior for the exponential. Derive the parameter update given observations x_1, ..., x_N and the prediction distribution p(x_{N+1}|x_1, ..., x_N).

Solution: The likelihood is p(X|λ) = ∏_{i=1}^{N} λ e^{-λ x_i} = λ^N exp(-λ ∑_i x_i) and the prior is p(λ|α, β) ∝ λ^{α-1} e^{-βλ}, so the posterior is
p(λ|X) ∝ λ^{N+α-1} exp(-λ(β + ∑_i x_i)),
again a gamma distribution. Therefore the parameter updates are α' = α + N and β' = β + ∑_{i=1}^{N} x_i. For the prediction distribution we compute the following integral:
p(x_{N+1}|x_1, ..., x_N) = ∫ p(x_{N+1}|λ) p(λ|α', β') dλ = (β'^{α'} / Γ(α')) ∫ λ^{α'} e^{-λ(β' + x_{N+1})} dλ = α' β'^{α'} / (β' + x_{N+1})^{α'+1}.

(i) Suppose instead that the likelihood is the Bernoulli distribution p(x|θ) = θ^x (1-θ)^{1-x} and the prior is a Beta distribution, p(θ|a, b) ∝ θ^{a-1} (1-θ)^{b-1}. If k of the N observations equal 1, the posterior is proportional to θ^{a+k-1} (1-θ)^{b+N-k-1}, so the updates are a' = a + k and b' = b + N - k. For the prediction distribution we compute the following integral:
p(x_{N+1} = 1 | x_1, ..., x_N) = ∫ θ p(θ|a', b') dθ = [Γ(a'+b') Γ(a'+1) Γ(b')] / [Γ(a') Γ(b') Γ(a'+b'+1)] = a' / (a' + b').
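The two derivations above are easy to sanity-check numerically. The following is a minimal sketch (not part of the original question bank): it draws samples from an exponential distribution, confirms that the ML estimate of the scale b is the sample mean, and applies the Gamma conjugate update for the rate λ. The prior hyperparameters α = β = 1 are arbitrary illustrative choices.

```python
# Sanity check (not from the exam): the ML estimate of the exponential scale b
# is the sample mean, and an exponential likelihood with a Gamma(alpha, beta)
# prior on the rate lambda gives the posterior Gamma(alpha + N, beta + sum(x)).
import numpy as np

rng = np.random.default_rng(0)
b_true = 2.0                                  # true scale parameter
x = rng.exponential(scale=b_true, size=10_000)

b_hat = x.mean()                              # part (c): ML estimate of b
print(f"b_hat = {b_hat:.3f}  (true b = {b_true})")

alpha, beta = 1.0, 1.0                        # illustrative prior hyperparameters
alpha_post = alpha + x.size                   # alpha' = alpha + N
beta_post = beta + x.sum()                    # beta' = beta + sum_i x_i
print(f"posterior mean of the rate: {alpha_post / beta_post:.3f}  (true rate = {1 / b_true})")
```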
(j) (For extra credit) A statistic T(X) is said to be sufficient for a parameter θ if the conditional distribution of X given T(X) does not depend on θ; in other words, given T(X), the data are independent of θ. Show that for a random variable X drawn from an exponential-family density p(x|η), T(x) is a sufficient statistic for η. (Show that a factorization p(x; θ) = f(x) g(T(x); θ) is necessary and sufficient for T(x) to be a sufficient statistic for θ.)

(k) (For extra credit) Suppose X_1, ..., X_n are drawn i.i.d. from an exponential-family density. What is now the sufficient statistic T(x_1, ..., x_n) for η?

三、True/False
(1) Given n data points, if half are used for training and the other half for testing, the difference between the training error and the test error decreases as n increases.
(2) The maximum likelihood estimator is unbiased and has the smallest variance among all unbiased estimators, so the maximum likelihood estimator has the smallest risk.
(3) For two regression functions A and B, if A is simpler than B, then A will almost certainly perform better than B on the test set.
(4) Global linear regression needs all of the training points to predict the output for a new input, whereas locally weighted linear regression only needs the training points near the query point; therefore global linear regression is computationally more expensive than locally weighted linear regression.
(5) Boosting and Bagging both combine multiple classifiers by voting, and both assign each individual classifier a weight according to its accuracy.
(6) In the boosting iterations, the training error of each new decision stump and the training error of the combined classifier vary roughly in concert. (F) While the training error of the combined classifier typically decreases as a function of boosting iterations, the error of the individual decision stumps typically increases, since the example weights become concentrated at the most difficult examples.
(7) One advantage of Boosting is that it does not overfit. (F)
(8) Support vector machines are resistant to outliers, i.e., very noisy examples drawn from a different distribution. (F)
(9) In regression analysis, best-subset selection can perform feature selection, but its computational cost is large when the number of features is large; ridge regression and the Lasso are computationally cheaper, and the Lasso can also perform feature selection.
(10) Overfitting is more likely to occur when the amount of training data is small.
(12) In kernel regression, the parameter that most controls the balance between overfitting and underfitting is the width of the kernel.
(13) In the AdaBoost algorithm, the weights on all the misclassified points will go up by the same multiplicative factor. (T) (They are all multiplied by the same factor exp(α_t).)

7. (2 points) True/False: In AdaBoost, the weighted training error ε_t of the t-th weak classifier on the training data with weights D_t tends to increase as a function of t.
SOLUTION: True. In the course of the boosting iterations the weak classifiers are forced to try to classify more difficult examples. The weights will increase for examples that are repeatedly misclassified by the weak classifiers. The weighted training error ε_t of the t-th weak classifier on the training data therefore tends to increase.
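The two AdaBoost statements above can be seen in a tiny simulation. The sketch below is an illustration, not the original solution: it runs a few rounds of AdaBoost with 1-D threshold stumps on a made-up dataset and prints the weighted error ε_t of each stump and the common factor exp(α_t) applied to the misclassified points.

```python
# Minimal AdaBoost with 1-D threshold stumps on a made-up dataset, illustrating
# that every misclassified point is up-weighted by the same factor exp(alpha_t)
# and that the weighted error of later stumps tends to grow.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([+1, +1, -1, -1, +1, +1, -1, -1])   # not separable by a single stump
w = np.full(len(x), 1.0 / len(x))                # example weights D_t

def best_stump(x, y, w):
    """Return (weighted error, threshold, sign, predictions) of the best stump."""
    best = None
    thresholds = np.concatenate(([x.min() - 1], (x[:-1] + x[1:]) / 2, [x.max() + 1]))
    for thr in thresholds:
        for sign in (+1, -1):
            pred = np.where(x > thr, sign, -sign)
            err = np.sum(w * (pred != y))
            if best is None or err < best[0]:
                best = (err, thr, sign, pred)
    return best

for t in range(1, 4):
    err, _thr, _sign, pred = best_stump(x, y, w)
    alpha = 0.5 * np.log((1 - err) / err)
    print(f"round {t}: weighted error = {err:.3f}, up-weight factor exp(alpha) = {np.exp(alpha):.3f}")
    # every misclassified point is multiplied by exp(alpha), every correct one by exp(-alpha)
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()
```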
9. (2 points) Consider a point that is correctly classified and distant from the decision boundary. Why would the SVM's decision boundary be unaffected by this point, but the boundary learned by logistic regression be affected?
SOLUTION: The hinge loss used by SVMs gives zero weight to these points, while the log-loss used by logistic regression gives a little bit of weight to these points.

(14) True/False: In a least-squares linear regression problem, adding an L2 regularization penalty cannot decrease the L2 error of the solution ŵ on the training data. (F)
(15) True/False: In a least-squares linear regression problem, adding an L2 regularization penalty always decreases the expected L2 error of the solution ŵ on unseen test data. (F)
(16) Besides the EM algorithm, gradient descent can also be used to estimate the parameters of a Gaussian mixture model. (T)
(20) Any decision boundary that we get from a generative model with class-conditional Gaussian distributions could in principle be reproduced with an SVM and a polynomial kernel. True! In fact, since class-conditional Gaussians always yield quadratic decision boundaries, they can be reproduced with an SVM with a kernel of degree less than or equal to two.
(21) AdaBoost will eventually reach zero training error, regardless of the type of weak classifier it uses, provided enough weak classifiers have been combined. False! If the data is not separable by a linear combination of the weak classifiers, AdaBoost cannot achieve zero training error.
(22) The L2 penalty in a ridge regression is equivalent to a Laplace prior on the weights. (F)
(23) The log-likelihood of the data will always increase through successive iterations of the expectation maximization algorithm. (F)
(24) In training a logistic regression model by maximizing the likelihood of the labels given the inputs we have multiple locally optimal solutions. (F)

四、Regression
1、Consider a regularized regression problem. The figure below (Figure 2) shows the log likelihood (mean log-probability) on the training set and the test set for different values of the regularization parameter C, with a quadratic regularization penalty. (10 points)
(1) Is the statement "as C increases, the training-set log likelihood in Figure 2 never increases" correct? Explain why or why not.
(2) Explain why the test-set log likelihood in Figure 2 decreases when C takes large values.
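Part (1) of the question above can be checked directly: with a quadratic penalty, the training fit can only get worse, never better, as the penalty weight grows. The sketch below is illustrative only, with made-up data; here C multiplies the L2 penalty in the same role as in the question.

```python
# Illustration for question 1: as the quadratic-regularization weight C grows,
# the training error of the fitted linear model is non-decreasing.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=50)

for C in [0.0, 0.1, 1.0, 10.0, 100.0]:
    # ridge solution: w = (X^T X + C I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + C * np.eye(3), X.T @ y)
    mse = np.mean((y - X @ w) ** 2)
    print(f"C = {C:6.1f}  training MSE = {mse:.4f}")
```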
2、Consider the linear regression model y ∼ N(w0 + w1 x, σ²); the training data are shown in the figure. (10 points)
(1) Estimate the parameters by maximum likelihood and draw the resulting model in figure (a). (3 points)
(2) Estimate the parameters by regularized maximum likelihood, i.e., add a quadratic regularization penalty to the log-likelihood objective, and draw in figure (b) the model obtained when the parameter C takes a very large value. (3 points)
(3) After regularization, does the variance σ² of the Gaussian become larger, smaller, or stay the same? (4 points)

3、Consider a regression problem on two-dimensional inputs x = (x1, x2)^T with xj ∈ [-1, 1], j = 1, 2. Training and test samples are uniformly distributed over the unit square, the outputs are generated according to y ∼ N(g(x1, x2), σ²) for a fixed polynomial g(x1, x2), and the loss is the squared error. We use polynomial features of degree 1 to 10 with a linear regression model to learn the relationship between x and y (a higher-degree model includes all lower-degree features).
(1) Training models with degree-1, degree-2, degree-8 and degree-10 features on n = 20 samples and then testing on a large independent test set, mark the appropriate model(s) in each of the three columns below (there may be more than one choice), and explain why the model you chose in the third column has the smallest test error. (10 points)

| Model | Smallest training error | Largest training error | Smallest test error |
| Linear model with degree-1 features | | | |
| Linear model with degree-2 features | | | |
| Linear model with degree-8 features | | | |
| Linear model with degree-10 features | | | |

(2) Now training the same four models on n = 10^6 samples and testing on a large independent test set, again mark the appropriate model(s) in each of the three columns, and explain why the model you chose in the third column has the smallest test error. (10 points)

| Model | Smallest training error | Largest training error | Smallest test error |
| Linear model with degree-1 features | | | |
| Linear model with degree-2 features | | | |
| Linear model with degree-8 features | | | |
| Linear model with degree-10 features | | | |

(3) The approximation error of a polynomial regression model depends on the number of training points. (T)
(4) The structural error of a polynomial regression model depends on the number of training points. (F)
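The pattern the two tables ask about can be reproduced with a small experiment. The following sketch is illustrative only: the data-generating polynomial and noise level are made up and not taken from the question. It fits degree-1, 2, 8 and 10 polynomials to n = 20 and n = 10000 samples and prints training and test mean squared errors.

```python
# Illustration for question 3: low-degree models underfit, high-degree models
# overfit when n is small, and with lots of data the high-degree models do well.
import numpy as np

rng = np.random.default_rng(2)

def make_data(n):
    x = rng.uniform(-1, 1, size=n)
    y = 2 * x - 3 * x**2 + x**5 + rng.normal(scale=0.1, size=n)   # made-up target
    return x, y

x_test, y_test = make_data(10_000)
for n in (20, 10_000):
    x_train, y_train = make_data(n)
    for degree in (1, 2, 8, 10):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        print(f"n = {n:5d}  degree {degree:2d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```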
4、We are trying to learn regression parameters for a dataset which we know was generated from a polynomial of a certain degree, but we do not know what this degree is. Assume the data was actually generated from a polynomial of degree 5 with some added Gaussian noise (that is, y = w0 + w1 x + w2 x² + w3 x³ + w4 x⁴ + w5 x⁵ + ε, ε ∼ N(0, σ²)). For training we have 100 (x, y) pairs and for testing we are using an additional set of 100 (x, y) pairs. Since we do not know the degree of the polynomial, we learn two models from the data. Model A learns parameters for a polynomial of degree 4 and model B learns parameters for a polynomial of degree 6. Which of these two models is likely to fit the test data better?

Answer: The degree-6 polynomial. Since the true model is a degree-5 polynomial and we have enough training data, the model we learn for a degree-6 polynomial will likely fit a very small coefficient for x⁶. Thus, even though it is a degree-6 polynomial, it will actually behave in a very similar way to a degree-5 polynomial, which is the correct model, leading to a better fit to the data.
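A short experiment along the lines of the answer above; the specific degree-5 coefficients and noise level are invented for illustration.

```python
# Model A (degree 4) cannot represent the degree-5 target; model B (degree 6)
# can, and with 100 training points its x^6 coefficient stays near zero.
import numpy as np

rng = np.random.default_rng(3)
w_true = np.array([0.5, -1.0, 2.0, 0.3, -0.7, 1.2])      # w0..w5, made up

def sample(n):
    x = rng.uniform(-1, 1, size=n)
    y = sum(w * x**k for k, w in enumerate(w_true)) + rng.normal(scale=0.1, size=n)
    return x, y

x_tr, y_tr = sample(100)
x_te, y_te = sample(100)
for degree in (4, 6):
    c = np.polyfit(x_tr, y_tr, degree)
    test_mse = np.mean((np.polyval(c, x_te) - y_te) ** 2)
    print(f"degree {degree}: test MSE = {test_mse:.4f}, leading coefficient = {c[0]:+.4f}")
```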
5、Input-dependent noise in regression
Ordinary least-squares regression is equivalent to assuming that each data point is generated according to a linear function of the input plus zero-mean, constant-variance Gaussian noise. In many systems, however, the noise variance is itself a positive linear function of the input (which is assumed to be non-negative, i.e., x ≥ 0).

a) Which of the following families of probability models correctly describes this situation in the univariate case? (Hint: only one of them does.)
(iii) is correct. In a Gaussian distribution over y, the variance is determined by the coefficient of y²; by replacing σ² with σ²x, we get a variance that increases linearly with x. (Note also the change to the normalization "constant".) (i) has a quadratic dependence on x, while (ii) does not change the variance at all, it just renames w1.

b) Circle the plots in Figure 1 that could plausibly have been generated by some instance of the model family (or families) you chose.
Plots (ii) and (iii). Plot (i) exhibits a large variance at x = 0, and its variance appears independent of x.

c) True/False: Regression with input-dependent noise gives the same solution as ordinary regression for an infinite data set generated according to the corresponding model.
True. In both cases the algorithm will recover the true underlying model.

d) For the model you chose in part (a), write down the derivative of the negative log likelihood with respect to w1.
The negative log likelihood is
L(w) = ∑_i [ (y_i - (w0 + w1 x_i))² / (2σ²x_i) + (1/2) log(2πσ²x_i) ],
and its derivative with respect to w1 is
∂L/∂w1 = -∑_i (y_i - w0 - w1 x_i) / σ².
Note that for lines through the origin (w0 = 0), the optimal solution has the particularly simple form ŵ1 = ∑_i y_i / ∑_i x_i. It is easier to differentiate the log likelihood than the likelihood itself: the exponential disappears, and the product of probabilities over multiple data points becomes a sum of log probabilities.
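The closed form ŵ1 = ∑_i y_i / ∑_i x_i for the origin-constrained, input-dependent-noise model is easy to verify numerically; the sketch below uses made-up data and compares it with the ordinary least-squares estimate ∑_i x_i y_i / ∑_i x_i².

```python
# Check the ML estimate for y ~ N(w1*x, sigma^2 * x) with w0 = 0:
# minimizing sum_i (y_i - w1 x_i)^2 / (2 sigma^2 x_i) gives w1_hat = sum(y)/sum(x).
import numpy as np

rng = np.random.default_rng(4)
w1_true, sigma2 = 1.8, 0.5
x = rng.uniform(0.1, 5.0, size=5_000)                     # inputs assumed non-negative
y = w1_true * x + rng.normal(scale=np.sqrt(sigma2 * x))   # noise variance grows with x

w1_wls = y.sum() / x.sum()                 # ML estimate under input-dependent noise
w1_ols = (x * y).sum() / (x * x).sum()     # ordinary least squares through the origin
print(f"input-dependent-noise ML estimate: {w1_wls:.4f}")
print(f"ordinary least squares estimate:   {w1_ols:.4f}  (true w1 = {w1_true})")
```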
五、Classification

1、Generative vs. discriminative models
(a) Your billionaire friend needs your help. She needs to classify job applications into good/bad categories, and also to detect job applicants who lie in their applications, using density estimation to detect outliers. To meet these needs, do you recommend using a discriminative or generative classifier? Why?
A generative model, because we need to estimate the density p(x|y).
(b) Your billionaire friend also wants to classify software applications to detect bug-prone applications using features of the source code. This pilot project only has a few applications to be used as training data, though. To create the most accurate classifier, do you recommend using a discriminative or generative classifier? Why?
A discriminative model. With only a few training samples, a discriminative model that classifies directly usually works better.
(d) Finally, your billionaire friend also wants to classify companies to decide which one to acquire. This project has lots of training data based on several decades of research. To create the most accurate classifier, do you recommend using a discriminative or generative classifier? Why?
A generative model. With many samples, the correct generative model can be learned.

2、Logistic regression
Figure 2: Log-probability of labels as a function of the regularization parameter C (average log-probability of training and test labels plotted against C).
Here we use a logistic regression model to solve a classification problem. In Figure 2, we have plotted the mean log-probability of labels in the training and test sets after having trained the classifier with a quadratic regularization penalty and different values of the regularization parameter C.

1、In training a logistic regression model by maximizing the likelihood of the labels given the inputs we have multiple locally optimal solutions. (F)
Answer: The log-probability of labels given examples implied by the logistic regression model is a concave (convex down) function with respect to the weights. The (only) locally optimal solution is also globally optimal.
2、A stochastic gradient algorithm for training logistic regression models with a fixed learning rate will find the optimal setting of the weights exactly. (F)
Answer: A fixed learning rate means that we are always taking a finite step towards improving the log-probability of some single training example in the update equation. Unless the examples are somehow "aligned", we continue jumping from side to side of the optimal solution, and will not be able to get arbitrarily close to it. The learning rate has to approach zero in the course of the updates for the weights to converge.
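The point about fixed learning rates can be seen in a few lines: with a constant step size, stochastic gradient ascent on the logistic log-likelihood keeps hopping around the optimum, while a decaying step size settles down. Everything below (the data and the step-size schedules) is an illustrative sketch, not part of the original solution.

```python
# Stochastic gradient ascent for 1-D logistic regression P(y=1|x) = sigmoid(w*x):
# a fixed learning rate keeps oscillating, a 1/t rate settles down.
import numpy as np

rng = np.random.default_rng(5)
n = 200
x = rng.normal(size=n)
w_true = 2.0
y = (rng.random(n) < 1 / (1 + np.exp(-w_true * x))).astype(float)   # labels in {0, 1}

def sgd(step):
    w, tail = 0.0, []
    for t in range(1, 20_001):
        i = rng.integers(n)
        p = 1 / (1 + np.exp(-w * x[i]))
        w += step(t) * (y[i] - p) * x[i]   # d/dw log P(y_i|x_i,w) = (y_i - p) x_i
        if t > 19_000:
            tail.append(w)                 # track the last 1000 iterates
    return np.mean(tail), np.std(tail)

for name, sched in [("fixed 0.5", lambda t: 0.5), ("0.5 / t", lambda t: 0.5 / t)]:
    m, s = sgd(sched)
    print(f"step = {name:9s}: w over last 1000 iterations = {m:.3f} +/- {s:.3f}")
```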
3、The average log-probability of training labels as in Figure 2 can never increase as we increase C. (T)
Stronger regularization means more constraints on the solution, and thus the (average) log-probability of the training examples can only get worse.
4、Explain why in Figure 2 the test log-probability of labels decreases for large values of C.
As C increases, we give more weight to constraining the predictor, and thus give less flexibility to fitting the training set. The increased regularization guarantees that the test performance gets closer to the training performance, but as we over-constrain our allowed predictors, we are not able to fit the training set at all, and although the test performance is now very close to the training performance, both are low.
5、The log-probability of labels in the test set would decrease for large values of C even if we had a large number of training examples. (T)
The above argument still holds, but the value of C for which we will observe such a decrease will scale up with the number of examples.
6、Adding a quadratic regularization penalty for the parameters when estimating a logistic regression model ensures that some of the parameters (weights associated with the components of the input vectors) vanish. (F)
A regularization penalty for feature selection must have a non-zero derivative at zero. Otherwise, the regularization has no effect at zero, and the weights will tend to be slightly non-zero, even when this does not improve the log-probabilities by much.
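The contrast drawn in question 6 (a quadratic penalty shrinks weights but does not zero them, while an absolute-value penalty does perform feature selection) can be illustrated with scikit-learn. This is an illustration only, on synthetic data; note that scikit-learn's C is the inverse of the penalty weight used in this question.

```python
# L2-penalized logistic regression shrinks weights toward zero; L1 sets many
# of them exactly to zero. The data here is synthetic and purely illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=20, n_informative=3,
                           n_redundant=0, random_state=0)

for penalty in ("l2", "l1"):
    clf = LogisticRegression(penalty=penalty, C=0.1, solver="liblinear")
    clf.fit(X, y)
    n_zero = np.sum(clf.coef_ == 0)
    print(f"{penalty}: {n_zero} of {clf.coef_.size} weights are exactly zero")
```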
3、Regularized logistic regression
This problem refers to the binary classification task depicted in Figure 1(a), which we attempt to solve with the simple linear logistic regression model
P(y = 1 | x, w1, w2) = g(w1 x1 + w2 x2) = 1 / (1 + exp(-w1 x1 - w2 x2))
(for simplicity we do not use the bias parameter w0). The training data can be separated with zero training error (see line L1 in Figure 1(b) for instance).
Figure 1: (a) The two-dimensional dataset used in this problem. (b) The points can be separated by L1 (solid line); possible other decision boundaries are shown by L2, L3, L4.

(1) Consider a regularization approach where we try to maximize
∑_{i=1}^{n} log P(y_i | x_i, w1, w2) - C w2²
for large C. Note that only w2 is penalized. We would like to know which of the lines in Figure 1(b) could arise as a result of such regularization. For each potential line L2, L3 or L4, determine whether it can result from regularizing w2. If not, explain very briefly why not.
L2: No. When we regularize w2, the resulting boundary can rely less on the value of x2 and therefore becomes more vertical. L2 here seems to be more horizontal than the unregularized solution, so it cannot come as a result of penalizing w2.
L3: Yes. Here w2² is small relative to w1² (as evidenced by the high slope), and even though it would assign a rather low log-probability to the observed labels, it could be forced by a large regularization parameter C.
L4: No. For very large C we get a boundary that is entirely vertical (the line x1 = 0, i.e., the x2 axis). L4 here is reflected across the x2 axis and represents a poorer solution than its counterpart on the other side. For moderate regularization we have to get the best solution that we can construct while keeping w2 small. L4 is not the best and thus cannot come as a result of regularizing w2.

(2) If we change the form of regularization to one-norm (absolute value) and also regularize w1, we get the following penalized log-likelihood:
∑_{i=1}^{n} log P(y_i | x_i, w1, w2) - C (|w1| + |w2|).
Consider again the problem in Figure 1(a) and the same linear logistic regression model. As we increase the regularization parameter C, which of the following scenarios do you expect to observe (choose only one)?
(x) First w1 will become 0, then w2.
( ) w1 and w2 will become zero simultaneously.
( ) First w2 will become 0, then w1.
( ) None of the weights will become exactly zero, only smaller as C increases.
The data can be classified with zero training error, and therefore also with high log-probability, by looking at the value of x2 alone, i.e., by making w1 = 0. Initially we might prefer to have a non-zero value for w1, but it will go to zero rather quickly as we increase regularization. Note that we pay a regularization penalty for a non-zero value of w1, and if it does not help classification, why would we pay the penalty? The absolute-value regularization ensures that w1 will indeed go to exactly zero. As C increases further, even w2 will eventually become zero. We pay a higher and higher cost for setting w2 to a non-zero value. Eventually this cost overwhelms the gain from the log-probability of labels that we can achieve with a non-zero w2. Note that when w1 = w2 = 0, the log-probability of labels is the finite value n log(0.5).
1、SVM
Figure 4: Training set, maximum margin linear separator, and the support vectors (in bold).

(1) What is the leave-one-out cross-validation error estimate for maximum margin separation in Figure 4? (We are asking for a number.) (0)
Based on the figure we can see that removing any single point would not change the resulting maximum margin separator. Since all the points are initially classified correctly, the leave-one-out error is zero.
(2) We would expect the support vectors to remain the same in general as we move from a linear kernel to higher order polynomial kernels. (F)
There are no guarantees that the support vectors remain the same. The feature vectors corresponding to polynomial kernels are non-linear functions of the original input vectors, and thus the support points for maximum margin separation in the feature space can be quite different.
(3) Structural risk minimization is guaranteed to find the model (among those considered) with the lowest expected loss. (F)
We are guaranteed to find only the model with the lowest upper bound on the expected loss.
(4) What is the VC-dimension of a mixture of two Gaussians model in the plane with equal covariance matrices? Why?
A mixture of two Gaussians with equal covariance matrices has a linear decision boundary. Linear separators in the plane have VC-dimension exactly 3.

4、SVM
Classify the following data points:
(a) Plot these six training points. Are the classes {+, -} linearly separable? Yes.
(b) Construct the weight vector of the maximum margin hyperplane by inspection and identify the support vectors.
The maximum margin hyperplane should have a slope of -1 and should pass through the point (3/2, 0). Therefore its equation is x1 + x2 = 3/2, and the weight vector is (1, 1)^T.
(c) If you remove one of the support vectors, does the size of the optimal margin decrease, stay the same, or increase?
In this specific dataset the optimal margin increases when we remove the support vector (1, 0) or (1, 1), and stays the same when we remove the other two.
(d) (Extra Credit) Is your answer to (c) also true for any dataset? Provide a counterexample or give a short proof.
When we drop some constraints in a constrained maximization problem, we get an optimal value which is at least as good as the previous one. This is because the set of candidates satisfying the original (larger, stronger) set of constraints is a subset of the candidates satisfying the new (smaller, weaker) set of constraints. So, for the weaker constraints, the old optimal solution is still available, and there may be additional solutions that are even better. In mathematical form, max_{x in A} f(x) <= max_{x in B} f(x) whenever A is a subset of B.
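The six data points themselves are not reproduced above, so the following sketch uses a hypothetical set of points chosen to be consistent with the stated answers (assumed positives (1,1), (2,2), (2,0) and negatives (0,0), (1,0), (0,1)). With a hard-margin linear SVM it recovers the boundary x1 + x2 = 3/2 and the support vectors discussed in parts (b) and (c).

```python
# Hard-margin linear SVM on a hypothetical dataset consistent with the answer:
# the separator is x1 + x2 = 3/2, i.e. weight direction (1, 1).
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [2, 0],      # assumed positive class
              [0, 0], [1, 0], [0, 1]])     # assumed negative class
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # very large C approximates a hard margin
w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w.round(3), " b =", round(b, 3))           # ~ (2, 2) and -3, i.e. x1 + x2 = 3/2
print("margin = 1/||w|| =", round(1 / np.linalg.norm(w), 3))
print("support vectors:\n", clf.support_vectors_)
```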
5、Consider the feature mapping φ(x) = (1, √2·x, x²). Are the classes linearly separable in the original input space? No. Using the method of Lagrange multipliers, show that the solution of
min (1/2)‖w‖²   s.t.  y_i (w·φ(x_i) + b) ≥ 1,  i = 1, 2, 3
is w* = (0, 0, 2), b* = -1, and compute the margin.

For optimization problems with inequality constraints such as the above, we should apply the KKT conditions, which are a generalization of Lagrange multipliers. However, this problem can be solved more easily by noting that we have three vectors in the 3-dimensional feature space and all of them are support vectors; hence all three constraints hold with equality. Therefore we can apply the method of Lagrange multipliers to
min (1/2)‖w‖²   s.t.  y_i (w·φ(x_i) + b) = 1,  i = 1, 2, 3.
We have 3 constraints, and should have 3 Lagrange multipliers λ = (λ1, λ2, λ3). We first form the Lagrangian
L(w, b, λ) = (1/2)‖w‖² + ∑_{i=1}^{3} λ_i (1 - y_i (w·φ(x_i) + b)),
differentiate it with respect to the optimization variables w and b, and set the derivatives to zero:
w = ∑_{i=1}^{3} λ_i y_i φ(x_i),   ∑_{i=1}^{3} λ_i y_i = 0.
Plugging in the data points φ(x_i) and substituting the resulting w back into the equality constraints gives w2 = 0 and w3 = 2. Therefore the optimal weights are w* = (0, 0, 2) with b* = -1, and the margin is 1/‖w*‖ = 1/2.

(e) Show that the solution remains the same if the constraints are changed to ...
