




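Slide 1: BEO2255 Applied Statistics for Business
Week Six. Analysing categorical data: chi-squared tests

Slide 2: This week's lecture will cover
- Analysing categorical (nominal) data
- Chi-square test of differences between proportions
- Chi-square test of independence

Slide 3: SPSS one-sample nonparametric tests: chi-square test of the population distribution
(1) Purpose: use sample data to infer whether the distribution of the population differs significantly from a known distribution (a goodness-of-fit test). The test is suited to statistical inference on categorical data.

Slide 4: SPSS one-sample nonparametric tests: chi-square test of the population distribution (cont.)
(2) Hypothesis: $H_0$: the population distribution does not differ significantly from the theoretical distribution.
(3) Method: from the known population proportions, compute the expected frequency of each category in the sample, then measure the gap between the observed and expected frequencies by computing the chi-square value. A small chi-square value means the observed and expected frequencies are close. If the p-value is greater than $\alpha$, $H_0$ cannot be rejected and the population distribution is taken not to differ significantly from the known distribution; otherwise $H_0$ is rejected.

Slide 5: SPSS one-sample chi-square test: chi-square test of the population distribution (cont.)
(4) Basic steps:
- Menu: Analyze > Nonparametric Tests > Chi-Square.
- Move the variable to be tested into the Test Variable List box.
- Set the range of values of the cases to be tested (Expected Range). Get from data: use the whole sample. Use specified range: user-defined range of cases.
- Specify the expected frequencies (Expected Values). All categories equal: every category has the same proportion. Values: user-defined proportions.

Slide 6: Categorical variable
- Variables that describe categories of entities.
- We deal with them all the time in statistics.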
- Making comparisons among variables. For example, whether consumers prefer a particular brand of a product over competing brands.
- Checking whether there is a relationship between two categorical variables. For example, gender and preference for a product: is the preference for a product independent of gender?

Slide 7: Chi-square test for differences between proportions
- This test deals with nominal data produced by a multinomial experiment, which is a generalisation of a binomial experiment.
- It tests the null hypothesis that data in the target population follow a particular probability distribution.
Example 1: We might test whether consumers are indifferent to which of four materials (glass, plastic, steel or aluminium) is used to make soft drink containers. The null hypothesis is that they are indifferent (that is, equal numbers prefer glass, plastic, steel and aluminium).

Slide 8: Example 1
Data: Let $p_G$ be the probability that an individual selected at random will nominate glass as his/her preference if required to make a choice. Define $p_P$ (plastic), $p_S$ (steel) and $p_A$ (aluminium) similarly.
Hypotheses:
$H_0: p_G = p_P = p_S = p_A = 0.25$
$H_A$: at least one $p_i \neq 0.25$
The alternative is that at least one material is more preferred (or less preferred) than the others.

Slide 9: Example 1 cont.
Procedure: Select a random sample of, say, 100 consumers and determine their preferences. Under the null hypothesis we expect 25 consumers to nominate glass, 25 plastic, 25 steel and 25 aluminium. These are the expected frequencies, $E_i = n p_i$. We compare the expected frequencies with the sample results, the observed frequencies $O_i$. If they are approximately the same, the data are consistent with the null hypothesis: $O_i \approx E_i \Rightarrow H_0$ is probably true.

Slide 10: Example 1 cont.: chi square
We require a test statistic to decide whether the difference is large enough to reject the null hypothesis. We use chi square with $G - 1$ degrees of freedom, where $G$ is the number of groups:
$\chi^2_{G-1} = \sum_{i=1}^{G} \frac{(O_i - E_i)^2}{E_i}$
Suppose in our example 39 prefer glass, 16 prefer plastic, 20 prefer steel and 25 prefer aluminium. Recall that the expected frequencies were all 25, so
$\chi^2_3 = \frac{(39-25)^2}{25} + \frac{(16-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(25-25)^2}{25} = 7.84 + 3.24 + 1.00 + 0 = 12.08$

Slide 11: Obtain the critical value of chi square
Obtain the critical value at the 5% significance level with 3 d.f. (Table E4, page 742, Berenson et al. 2013): critical $\chi^2_3 = 7.82$, i.e. there is only a 5 percent chance or less that $\chi^2_3 > 7.82$ if $H_0$ is true.
Comparison of chi square values: $\chi^2_3 = 12.08 > 7.82$, so reject $H_0$.
Conclusion: at the 5% significance level there is sufficient evidence to reject the null hypothesis. At least one of the probabilities $p_i$ differs from 0.25. The sample results indicate that the materials are not equally preferred by consumers in the target population. Thus preferences for at least two materials are different.

Slide 12: Chi square test using SPSS
Example: Suppose that we want to test whether or not customers have a colour preference for packaging. Three different colours, Blue, Green [...]

[...] different layers give the product of the numbers of levels.
(5) Whether to display a bar chart for each group (Display clustered bar charts).

Slide 21: Producing a cross-tabulation: further calculations
Cells option: choose which percentages to output in the frequency table.
- Row: row percentages (Row pct)
- Column: column percentages (Col pct)
- Total: total percentages (Tot pct)

Slide 22: Analysing the relationship between variables in a contingency table
Purpose: use contingency-table analysis to test whether the row and column variables are independent.
Method: chi-square test, which measures association in qualitative (categorical) data.

Slide 23: Analysing the relationship between variables in a contingency table (cont.)
Chi-square test: two cross-tabulations of age by wage income

Age      Low  Medium  High
Young    400       0     0
Middle     0     500     0
Old        0       0   600

Age      Low  Medium  High
Young      0       0   500
Middle     0     600     0
Old      400       0     0

Slide 24: Analysing the relationship between variables in a contingency table (cont.)
Basic steps of the chi-square test:
(1) $H_0$: the row and column variables are not associated, i.e. they are independent.
(2) Construct the chi-square statistic. The statistic follows a chi-square distribution with $(r-1)(c-1)$ degrees of freedom, where $r$ is the number of rows and $c$ the number of columns.
- Count: observed (actual) frequency.
- Expected count: expected frequency (it reflects how the data would be distributed if $H_0$ held).
- Residual: observed frequency minus expected frequency.

Slide 25: Judging visually from charts whether two categorical variables are related
1. Contingency table.
2. Three-dimensional column chart: clearly shows the relative sizes of the frequencies.
3. Two-dimensional bar chart: shows that, among smokers, the proportion with lung cancer is higher than the proportion without lung cancer.

Slide 26: Tests of independence cont.
Example 2: Suppose we interviewed 400 people and asked them which of three age groups they are in (under 25, 25 to 60, and over 60). We also asked for their response to the statement "All imports of automobiles should be banned in order to protect the local industry" (agree, no view either way, disagree).

Attitudes towards banning imports, by age group (observed counts)

Age group   Agree  No view  Disagree  Total
Under 25       19       53        25     97
25 - 60        46       94        47    187
Over 60        30       56        30    116
Total          95      203       102    400

Slide 27: Tests of independence cont.
Example 2 cont.
Null hypothesis: the answers to the two questions are independent.
Under the null, by the multiplication rule for independent events:
$P(\text{over 60 and agree}) = P(\text{over 60}) \times P(\text{agree})$
Expected frequency $= P(\text{over 60}) \times P(\text{agree}) \times$ sample size.

Slide 28: Procedure
- We set up a cross-tabulation showing the observed frequencies of answers to the two questions.
- We calculate the expected frequencies.
Test: our test is based on a comparison of the observed and expected frequencies.
Short-cut for expected frequencies: $E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n}$

Slide 29: Age * attitude to banning imports cross-tabulation

Age group                  Agree  No view  Disagree  Total
Under 25  Count             19.0     53.0      25.0   97.0
          Expected Count    23.0     49.2      24.7   96.9
25-60     Count             46.0     94.0      47.0  187.0
          Expected Count    44.4     94.9      47.7  187.0
Over 60   Count             30.0     56.0      30.0  116.0
          Expected Count    27.6     58.9      29.6  116.1
Total     Count             95.0    203.0     102.0  400.0
          Expected Count    95.0    203.0     102.0  400.0

Calculation of the expected frequency for "agree" and "over 60": $95 \times 116 / 400 = 27.55$, shown as 27.6 in the table.

Slide 30: Age * attitude to banning imports cross-tabulation (same table as slide 29)
The observed counts and the expected counts are different, but are they different enough to reject the null?

Slide 31: Chi-squared test for independence
Rationale: $O_{ij} \approx E_{ij} \Rightarrow H_0$ is probably true.
Test statistic: we require a test statistic to decide whether the difference is large enough to reject the null hypothesis:
$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, with $(r-1)(c-1)$ degrees of freedom.

Slide 32: Chi-Square Tests (SPSS output)

                              Value     df  Sig.
Pearson Chi-Square            1.438(a)   4  .837
Likelihood Ratio              1.517      4  .805
Linear-by-Linear Association  1.307      1  .758
N of Valid Cases              400
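
Both worked examples in this lecture use the same statistic, the sum of (observed - expected)^2 / expected, and differ only in how the expected counts are obtained. The following is a minimal Python sketch of the hand calculations, not the SPSS procedure itself. The function names (`chi_square_stat`, `expected_counts`, `chi2_sf_even_df`) are illustrative assumptions, and the p-value helper uses a closed form that holds only for an even number of degrees of freedom.

```python
from math import exp, factorial


def chi_square_stat(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for EVEN df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


# Example 1 (goodness of fit): glass, plastic, steel, aluminium.
observed_1 = [39, 16, 20, 25]
n_1 = sum(observed_1)
expected_1 = [n_1 * 0.25] * len(observed_1)   # E_i = n * p_i with p_i = 0.25
stat_1 = chi_square_stat(observed_1, expected_1)
df_1 = len(observed_1) - 1                    # G - 1
reject_1 = stat_1 > 7.82                      # 5% critical value quoted on slide 11

# Example 2 (independence): rows are age groups, columns are attitudes.
table_2 = [
    [19, 53, 25],   # under 25
    [46, 94, 47],   # 25 to 60
    [30, 56, 30],   # over 60
]
expected_2 = expected_counts(table_2)
stat_2 = chi_square_stat(
    [o for row in table_2 for o in row],
    [e for row in expected_2 for e in row],
)
df_2 = (len(table_2) - 1) * (len(table_2[0]) - 1)   # (r - 1)(c - 1)
p_2 = chi2_sf_even_df(stat_2, df_2)

print(f"Example 1: chi-square = {stat_1:.2f}, df = {df_1}, reject H0: {reject_1}")
print(f"Example 2: chi-square = {stat_2:.3f}, df = {df_2}, p = {p_2:.3f}")
```

Run as written, the sketch should agree with the lecture's figures: 12.08 for Example 1, and roughly 1.438 with a p-value near .837 for Example 2, so the null hypothesis of independence is not rejected at the 5% level. For odd degrees of freedom, a statistics library or a printed table would be needed in place of `chi2_sf_even_df`.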