Regression
Hung-yi Lee 李宏毅

Regression: Output a scalar
• Stock market forecast: f(past market data) = Dow Jones Industrial Average at tomorrow
• Self-driving car: f(sensor readings) = steering wheel angle
• Recommendation: f(user A, product B) = probability that user A buys product B

Example Application
Estimating the Combat Power (CP) of a Pokémon after evolution: f(x) = y, where x_cp is the CP before evolution and y is the CP after evolution.

Step 1: Model
y = b + w·x_cp, where w and b are parameters (they can take any value).
• f1: y = 10.0 + 9.0·x_cp
• f2: y = 9.8 + 9.2·x_cp
• f3: y = -0.8 - 1.2·x_cp
• ... infinitely many such functions
A set of functions f1, f2, ... is the model. In general, a linear model has the form y = b + Σ_i w_i·x_i, where each x_i is a feature, each w_i is a weight, and b is the bias.
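A minimal Python sketch of Step 1: a model is a set of functions, one per choice of (w, b); the functions f1 and f3 are taken from the slide, and the input value 100.0 is just an illustrative number.

```python
# A minimal sketch of Step 1: a "model" is a set of functions y = b + w * x_cp,
# one function per choice of the parameters (w, b).

def f(x_cp, w, b):
    """One member of the linear model: predicted CP after evolution."""
    return b + w * x_cp

# Example functions from the slide:
f1 = lambda x_cp: f(x_cp, w=9.0, b=10.0)    # f1: y = 10.0 + 9.0 * x_cp
f3 = lambda x_cp: f(x_cp, w=-1.2, b=-0.8)   # f3: y = -0.8 - 1.2 * x_cp

print(f1(100.0))   # 910.0
print(f3(100.0))   # -120.8
```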

Step 2: Goodness of Function
For y = b + w·x_cp, the function's input is x and its output is the scalar y.
Training data: 10 Pokémon, (x^1, ŷ^1), (x^2, ŷ^2), ..., (x^10, ŷ^10). This is real data.
Loss function L: its input is a function, and its output is how bad that function is.
L(f) = L(w, b) = Σ_{n=1..10} (ŷ^n − (b + w·x_cp^n))²
Each term is the estimation error: the true value ŷ^n minus the value estimated from the input function, squared and then summed over the examples.
Every point in the (w, b) plane is one function; the color of the loss surface represents L(w, b). A function such as y = −180 − 2·x_cp has a very large loss, while the smallest loss lies in the darkest region.

Step 3: Best Function
f* = arg min_f L(f)
w*, b* = arg min_{w,b} L(w, b) = arg min_{w,b} Σ_{n=1..10} (ŷ^n − (b + w·x_cp^n))²
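A small sketch of Step 2 in Python; the training pairs below are placeholder numbers, not the lecture's real 10 Pokémon, so the loss values are only illustrative.

```python
# Sketch of Step 2: the loss L(w, b) sums the squared estimation error over
# the training examples. The data below are placeholder values, NOT the real
# 10 Pokémon used in the lecture.
training_data = [(14.0, 127.0), (20.0, 223.0), (30.0, 290.0), (40.0, 410.0)]  # (x_cp, ŷ)

def loss(w, b, data):
    """L(w, b) = sum over examples of (ŷ - (b + w * x_cp))^2."""
    return sum((y_hat - (b + w * x_cp)) ** 2 for x_cp, y_hat in data)

print(loss(w=9.0, b=10.0, data=training_data))   # how bad is f1 on this data?
print(loss(w=-1.2, b=-0.8, data=training_data))  # f3 should be much worse
```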

Gradient Descent: Pick the "Best" Function
L(w, b) = Σ_{n=1..10} (ŷ^n − (b + w·x_cp^n))²

Step 3: Gradient Descent
Consider a loss function L(w) with one parameter w, and solve w* = arg min_w L(w):
• (Randomly) pick an initial value w^0.
• Compute dL/dw at w = w^0. If the slope is negative, increase w; if it is positive, decrease w.
• Update: w^1 ← w^0 − η·dL/dw |_{w=w^0}, where η is called the "learning rate".
• Compute dL/dw at w = w^1 and update again: w^2 ← w^1 − η·dL/dw |_{w=w^1}.
• After many iterations w^0, w^1, w^2, ..., w^T, the procedure may stop at a local minimum rather than the global minimum.

How about two parameters? w*, b* = arg min_{w,b} L(w, b):
• (Randomly) pick initial values w^0, b^0.
• Compute ∂L/∂w and ∂L/∂b at (w^0, b^0), then update
  w^1 ← w^0 − η·∂L/∂w |_{w=w^0, b=b^0},  b^1 ← b^0 − η·∂L/∂b |_{w=w^0, b=b^0}.
• Compute the partial derivatives at (w^1, b^1) and update again to obtain (w^2, b^2), and so on.
The vector of partial derivatives, ∇L = (∂L/∂w, ∂L/∂b), is called the gradient. On the loss surface (color = value of L(w, b)), each update moves against the gradient.
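A toy sketch of the one-parameter update rule; the loss L(w) = (w − 3)² is an invented example, not the Pokémon loss, so only the mechanics of the update carry over.

```python
# Toy illustration of the update rule w <- w - eta * dL/dw.
# L(w) = (w - 3)^2 is an invented example loss, so dL/dw = 2 * (w - 3)
# and the minimum is at w = 3.

def dL_dw(w):
    return 2.0 * (w - 3.0)

w = -4.0        # (randomly) picked initial value w^0
eta = 0.1       # learning rate
for t in range(50):
    w = w - eta * dL_dw(w)

print(w)        # close to 3.0 after many iterations
```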

Step 3: Gradient Descent
When solving w*, b* = arg min_{w,b} L(w, b) by gradient descent, is the following statement correct? "Each time we update the parameters, we obtain a θ that makes L(θ) smaller: L(θ^0) > L(θ^1) > L(θ^2) > ..." Not necessarily: gradient descent can be very slow at a plateau, can get stuck at a saddle point (where the derivative is zero), and can get stuck at a local minimum.

Step 3: Gradient Descent
Formulation of ∂L/∂w and ∂L/∂b for L(w, b) = Σ_{n=1..10} (ŷ^n − (b + w·x_cp^n))²:
∂L/∂w = Σ_{n=1..10} 2·(ŷ^n − (b + w·x_cp^n))·(−x_cp^n)
∂L/∂b = Σ_{n=1..10} 2·(ŷ^n − (b + w·x_cp^n))·(−1)

How's the result?
On the training data, gradient descent finds b = −188.4 and w = 2.7 for y = b + w·x_cp, with an average error of 31.9 over the 10 training examples.
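A sketch of batch gradient descent for y = b + w·x_cp using the two partial derivatives above; the data are placeholders rather than the lecture's 10 Pokémon, so it will not reproduce b = −188.4, w = 2.7, and the learning rate is an arbitrary choice.

```python
# Sketch of gradient descent for y = b + w * x_cp using the analytic partials
#   dL/dw = sum 2 * (ŷ - (b + w*x)) * (-x)
#   dL/db = sum 2 * (ŷ - (b + w*x)) * (-1)
# Placeholder data; the real lecture data would give b = -188.4, w = 2.7.
training_data = [(14.0, 127.0), (20.0, 223.0), (30.0, 290.0), (40.0, 410.0)]

w, b = 0.0, 0.0          # initial values w^0, b^0
eta = 1e-4               # learning rate (would need tuning in practice)
for _ in range(100000):
    grad_w = sum(2.0 * (y - (b + w * x)) * (-x) for x, y in training_data)
    grad_b = sum(2.0 * (y - (b + w * x)) * (-1.0) for x, y in training_data)
    w -= eta * grad_w
    b -= eta * grad_b

avg_error = sum(abs(y - (b + w * x)) for x, y in training_data) / len(training_data)
print(w, b, avg_error)
```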

How's the result? — Generalization
What we really care about is the error on new data (testing data). Using another 10 Pokémon as testing data, the model with b = −188.4 and w = 2.7 has an average error of 35.0 on the testing data, larger than the 31.9 on the training data. How can we do better?

Selecting another model: y = b + w1·x_cp + w2·(x_cp)²
Best function: b = −10.3, w1 = 1.0, w2 = 2.7 × 10^-3. Average error = 15.4 on training, 18.4 on testing. Better! Could it be even better?

Selecting another model: y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³
Best function: b = 6.4, w1 = 0.66, w2 = 4.3 × 10^-3, w3 = −1.8 × 10^-6. Average error = 15.3 on training, 18.1 on testing. Slightly better. How about more complex models?

Selecting another model: y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³ + w4·(x_cp)⁴
Average error = 14.9 on training, 28.8 on testing. The results become worse.

Selecting another model: y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³ + w4·(x_cp)⁴ + w5·(x_cp)⁵
Average error = 12.8 on training, 232.1 on testing. The results are very bad.

Model Selection
The five candidate models are nested:
1. y = b + w1·x_cp
2. y = b + w1·x_cp + w2·(x_cp)²
3. y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³
4. y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³ + w4·(x_cp)⁴
5. y = b + w1·x_cp + w2·(x_cp)² + w3·(x_cp)³ + w4·(x_cp)⁴ + w5·(x_cp)⁵
If we can truly find the best function in each model, a more complex model yields lower error on the training data. However, a more complex model does not always lead to better performance on the testing data; we have to select a suitable model. This is overfitting.

Model   Training error   Testing error
1       31.9             35.0
2       15.4             18.4
3       15.3             18.1
4       14.9             28.2
5       12.8             232.1

Let's collect more data. There are some hidden factors not considered in the previous model.
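A sketch of this model-selection sweep; it fits each polynomial by least squares (NumPy's lstsq rather than the lecture's gradient descent), and the training/testing data are placeholders, so it will not reproduce the numbers in the table above.

```python
# Sketch of the model-selection sweep: fit polynomials of degree 1..5 by least
# squares and compare average absolute error on training vs. testing data.
# Placeholder data; this will not reproduce the errors reported in the lecture.
import numpy as np

train_x = np.array([14.0, 20.0, 30.0, 40.0, 55.0, 70.0])
train_y = np.array([127.0, 223.0, 290.0, 410.0, 535.0, 660.0])
test_x = np.array([18.0, 35.0, 60.0])
test_y = np.array([190.0, 350.0, 580.0])

def design_matrix(x, degree):
    """Columns [1, x, x^2, ..., x^degree] for y = b + w1*x + ... + wd*x^d."""
    return np.vstack([x ** d for d in range(degree + 1)]).T

for degree in range(1, 6):
    X = design_matrix(train_x, degree)
    params, *_ = np.linalg.lstsq(X, train_y, rcond=None)   # [b, w1, ..., wd]
    train_err = np.mean(np.abs(train_y - X @ params))
    test_err = np.mean(np.abs(test_y - design_matrix(test_x, degree) @ params))
    print(degree, round(train_err, 1), round(test_err, 1))
```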

What are the hidden factors? The species: Eevee, Pidgey, Weedle, Caterpie.

Back to Step 1: Redesign the Model
y = b_s + w_s·x_cp, where s = x_s is the species of x:
• If x_s = Pidgey:   y = b1 + w1·x_cp
• If x_s = Weedle:   y = b2 + w2·x_cp
• If x_s = Caterpie: y = b3 + w3·x_cp
• If x_s = Eevee:    y = b4 + w4·x_cp
Is this still a linear model? Yes. Using the indicator feature δ(x_s = Pidgey), which is 1 if x_s = Pidgey and 0 otherwise (and similarly for the other species), the model can be rewritten as
y = b1·δ(x_s = Pidgey) + w1·δ(x_s = Pidgey)·x_cp + b2·δ(x_s = Weedle) + w2·δ(x_s = Weedle)·x_cp + ...,
which is linear in the parameters (see the sketch below).
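A sketch of the indicator-feature encoding; the species names are from the slides, while the helper functions and the parameter values are illustrative assumptions.

```python
# Sketch of the per-species linear model written as ONE linear model
# via indicator features: y = sum_s [ b_s * δ(x_s = s) + w_s * δ(x_s = s) * x_cp ].
SPECIES = ["Pidgey", "Weedle", "Caterpie", "Eevee"]

def features(x_cp, x_s):
    """[δ(s1), δ(s1)*x_cp, δ(s2), δ(s2)*x_cp, ...] — linear in the parameters."""
    phi = []
    for s in SPECIES:
        indicator = 1.0 if x_s == s else 0.0
        phi.extend([indicator, indicator * x_cp])
    return phi

def predict(x_cp, x_s, params):
    """params = [b1, w1, b2, w2, b3, w3, b4, w4]."""
    return sum(p * f for p, f in zip(params, features(x_cp, x_s)))

# With params = [b1, w1, 0, 0, ...], a Pidgey's prediction is b1 + w1 * x_cp.
print(predict(30.0, "Pidgey", [10.0, 9.0, 0, 0, 0, 0, 0, 0]))   # 280.0
```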

With this per-species model: average error = 3.8 on the training data and 14.3 on the testing data (the plots of CP after evolution against CP before evolution are now fitted per species). Are there any other hidden factors? Weight, height, HP?

Back to Step 1: Redesign the Model Again
First a per-species quadratic part y':
• If x_s = Pidgey:   y' = b1 + w1·x_cp + w5·(x_cp)²
• If x_s = Weedle:   y' = b2 + w2·x_cp + w6·(x_cp)²
• If x_s = Caterpie: y' = b3 + w3·x_cp + w7·(x_cp)²
• If x_s = Eevee:    y' = b4 + w4·x_cp + w8·(x_cp)²
then add HP, height, and weight terms:
y = y' + w9·x_hp + w10·(x_hp)² + w11·x_h + w12·(x_h)² + w13·x_w + w14·(x_w)²
Training error = 1.9, testing error = 102.3. Overfitting!

Back to Step 2: Regularization
L = Σ_n (ŷ^n − (b + Σ_i w_i·x_i))² + λ·Σ_i (w_i)²
The functions with smaller w_i are better: smaller w_i means a smoother function, because when an input changes by Δx_i the output changes only by Σ_i w_i·Δx_i (y + Σ_i w_i·Δx_i = b + Σ_i w_i·(x_i + Δx_i)). We believe a smoother function is more likely to be correct. Do you have to apply regularization to the bias? No — the bias only shifts the function up or down and does not affect its smoothness.
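A sketch of the regularized loss and its gradient; the feature matrix, λ, and the parameter values are placeholders, and note that the bias b carries no λ term.

```python
# Sketch of Step 2 with regularization:
#   L = sum_n (ŷ^n - (b + sum_i w_i * x_i^n))^2 + λ * sum_i w_i^2
# The bias b is NOT regularized. Data and λ are placeholders.
import numpy as np

X = np.array([[14.0, 1.2], [20.0, 2.0], [30.0, 3.5]])   # rows: examples, cols: features x_i
y = np.array([127.0, 223.0, 290.0])

def regularized_loss(w, b, lam):
    residual = y - (b + X @ w)
    return np.sum(residual ** 2) + lam * np.sum(w ** 2)

def gradients(w, b, lam):
    residual = y - (b + X @ w)
    grad_w = -2.0 * X.T @ residual + 2.0 * lam * w   # regularization pulls w toward 0
    grad_b = -2.0 * np.sum(residual)                 # no λ term for the bias
    return grad_w, grad_b

print(regularized_loss(np.array([9.0, 0.0]), 10.0, lam=0.1))
```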

Regularization: How smooth?
Select the λ that obtains the best model. A larger λ means we prefer smoother functions and weight the training error less, so the training error grows with λ, while the testing error first drops and then rises. We prefer smooth functions, but don't be too smooth.

λ        Training error   Testing error
0        1.9              102.3
1        2.3              68.7
10       3.5              25.7
100      4.1              11.1
1000     5.6              12.8
10000    6.3              18.7
100000   8.5              26.8

Conclusion
• Pokémon: the original CP and the species almost decide the CP after evolution; there are probably other hidden factors.
• Gradient descent: more theory and tips in the following lectures.
• We finally get an average error of 11.1 on the testing data.
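A sketch of the λ sweep, solving each regularized fit in closed form (normal equations with the bias left unregularized) instead of gradient descent; the data and λ values are placeholders, so the table above will not be reproduced.

```python
# Sketch of the λ sweep: for each λ, solve
#   minimize  ||y - Φ p||^2 + λ * Σ_i w_i^2   (bias not regularized)
# in closed form, then compare training and testing error. Placeholder data.
import numpy as np

train_x = np.array([14.0, 20.0, 30.0, 40.0, 55.0, 70.0])
train_y = np.array([127.0, 223.0, 290.0, 410.0, 535.0, 660.0])
test_x = np.array([18.0, 35.0, 60.0])
test_y = np.array([190.0, 350.0, 580.0])

def design(x, degree=3):
    return np.vstack([x ** d for d in range(degree + 1)]).T   # [1, x, x^2, x^3]

Phi = design(train_x)
D = np.eye(Phi.shape[1])
D[0, 0] = 0.0                                  # do not regularize the bias term

for lam in [0, 1, 10, 100, 1000]:
    p = np.linalg.solve(Phi.T @ Phi + lam * D, Phi.T @ train_y)
    train_err = np.mean(np.abs(train_y - Phi @ p))
    test_err = np.mean(np.abs(test_y - design(test_x) @ p))
    print(lam, round(train_err, 1), round(test_err, 1))
```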
