




版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡介
1、實(shí)用教程(Teradata)陸世潮2008年9月第1頁,共81頁。問題總結(jié)常見問題分類:表屬性不對: Set / Multiset問題:INSERT操作慢主索引(PI)設(shè)置不合理問題1:數(shù)據(jù)傾斜度大,空間爆滿。問題2:JOIN操作,數(shù)據(jù)需要重分布。 分區(qū)索引(PPI)設(shè)置不合理問題:全表掃描連接條件過于復(fù)雜問題:系統(tǒng)無法優(yōu)化執(zhí)行計(jì)劃缺乏統(tǒng)計(jì)信息問題:系統(tǒng)無法找到最優(yōu)化的執(zhí)行計(jì)劃SQL跑得慢哈!第2頁,共81頁。提綱Teradata架構(gòu)常見問題,及解決方法Teradata工具實(shí)用小技巧JOIN的實(shí)現(xiàn)機(jī)制JOIN的優(yōu)化第3頁,共81頁。Teradata 體系架構(gòu)Teradata and MPP S
2、ystemsRDBMS ARCH第4頁,共81頁。Logical Example of NPPI versus PPI4 AMPs with Orders Table defined with PPI on O_Date.RHO_#O_DateRHO_#O_DateRHO_#O_DateRHO_#O_Date35100706/0126100206/0124100406/0120100506/0139101106/0136101206/0132100306/0143101006/0114100106/0106100906/0104100806/0108100606/0103101606/0217
3、101306/0248102306/0207101706/0216102106/0245101506/0209101806/0227101406/0244102206/0202102406/0211101906/0222102006/0201102806/0312103106/0328103206/0310103406/0329103306/0334102906/0319102506/0340103506/0347102706/0325103606/0331102606/0346103006/0323104006/0430103806/0442104706/0413103706/0421104
4、506/0436104306/0405104806/0415104206/0433103906/0418104106/0438104606/0441104406/044 AMPs with Orders Table defined with NPPI.01102806/0312103106/0328103206/0310103406/0329103306/0334102906/0319102506/0340103506/0347102706/0325103606/0331102606/0346103006/0303101606/0217101306/0248102306/0207101706/
5、0216102106/0245101506/0209101806/0227101406/0244102206/0202102406/0211101906/0222102006/0214100106/0135100706/0139101106/0106100906/0126100206/0136101206/0104100806/0124100406/0132100306/0108100606/0120100506/0143101006/0123104006/0430103806/0442104706/0413103706/0421104506/0436104306/0405104806/041
6、5104206/0433103906/0418104106/0438104606/0441104406/04RHO_#O_DateRHO_#O_DateRHO_#O_DateRHO_#O_Date第5頁,共81頁。提綱Teradata架構(gòu)常見問題,及解決方法Teradata工具實(shí)用小技巧JOIN的實(shí)現(xiàn)機(jī)制JOIN的優(yōu)化第6頁,共81頁。表屬性:Set & MultiSetSet Table不允許記錄重復(fù)MultiSet Table允許記錄重復(fù)默認(rèn)值:Set TableCreate Table. AS . 生成的目標(biāo)表屬性默認(rèn)為Set Table對SET Table進(jìn)行INSERT操作,需要檢查
7、是否存在重復(fù)記錄相當(dāng)?shù)暮馁Y源若真要限定唯一性,可以通過UPI或USI實(shí)現(xiàn)CREATE SET TABLE pmart.RPT_NM_GRP_PRE_WARN_MON,(CAL_Month INTEGER TITLE 統(tǒng)計(jì)月份 ,ORG_NUM CHAR(12) TITLE 集團(tuán)編號(hào) ,City_ID CHAR(3) TITLE 地市標(biāo)識(shí) ,ORG_SUBS_GRP_NUM CHAR(10) TITLE 集團(tuán)用戶群編號(hào) ,ORG_Title VARCHAR(200) TITLE 集團(tuán)名稱 ,ORG_Level CHAR(2) TITLE 集團(tuán)級別 ,STAT_Item_Code CHAR(2)
8、TITLE 統(tǒng)計(jì)項(xiàng) ,STAT_Value DECIMAL(18,2) TITLE 統(tǒng)計(jì)值 ) PRIMARY INDEX (ORG_NUM);例子:pmart.RPT_NM_GRP_PRE_WARN_MON 內(nèi)蒙移動(dòng)集團(tuán)客戶預(yù)警指標(biāo)月報(bào)表假設(shè)原有1286449條記錄插入:152853條記錄耗時(shí):15秒第7頁,共81頁。表屬性:Set & MultiSet (cont.)CREATE MULTISET TABLE pmart.RPT_NM_GRP_PRE_WARN_MON(CAL_Month INTEGER TITLE 統(tǒng)計(jì)月份 ,ORG_NUM CHAR(12) TITLE 集團(tuán)編號(hào) ,Ci
9、ty_ID CHAR(3) TITLE 地市標(biāo)識(shí) ,ORG_SUBS_GRP_NUM CHAR(10) TITLE 集團(tuán)用戶群編號(hào) ,ORG_Title VARCHAR(200) TITLE 集團(tuán)名稱 ,ORG_Level CHAR(2) TITLE 集團(tuán)級別 ,STAT_Item_Code CHAR(2) TITLE 統(tǒng)計(jì)項(xiàng) ,STAT_Value DECIMAL(18,2) TITLE 統(tǒng)計(jì)值 ) PRIMARY INDEX (ORG_NUM);例子:pmart.RPT_NM_GRP_PRE_WARN_MON 內(nèi)蒙移動(dòng)集團(tuán)客戶預(yù)警指標(biāo)月報(bào)表建議:Teradata中都用 MultiSet假設(shè)
10、原有1286449條記錄插入:152853條記錄耗時(shí):1秒例子: CREATE MULTISET TABLE tttemp.VT_SUBS_VIOC_QUAN as ( SELECT * FROM tttemp.MID_SUBS_VIOC_QUAN WHERE CAL_MONTH = 200802 AND * )WITH DATA PRIMARY INDEX ( subs_id);臨時(shí)表, 默認(rèn)為: Set需要指定為: Multiset字段越多,記錄越多差別越明顯第8頁,共81頁。PI(Primary Index 主索引)的選擇PI影響數(shù)據(jù)的存儲(chǔ)與訪問,其選擇標(biāo)準(zhǔn):不同值盡量多的字段(More
11、 Unique Values)使用頻繁的字段:包括值訪問和連接訪問少更新PI字段不宜太多最好是手動(dòng)指定PI 例子:用戶語音業(yè)務(wù)量中間表CREATE MULTISET TABLE tttemp.MID_SUBS_VIOC_QUAN ( CAL_Month INTEGER TITLE 統(tǒng)計(jì)月份, City_ID CHAR(4) TITLE 地市標(biāo)識(shí), Channel_ID CHAR(8) TITLE 渠道標(biāo)識(shí), Subs_id CHAR(12) TITLE 用戶標(biāo)識(shí), 。) PRIMARY INDEX ( subs_id);例子:用戶語音業(yè)務(wù)量臨時(shí)表CREATE MULTISET TABLE tt
12、temp.VT_SUBS_VIOC_QUAN as ( SELECT * FROM tttemp.MID_SUBS_VIOC_QUAN WHERE CAL_MONTH = 200802 AND * )WITH DATA PRIMARY INDEX ( subs_id);Subs_ID: 頻繁使用 Unique Value多如果不指定PI,系統(tǒng)默認(rèn)為:Cal_Month第9頁,共81頁。PI(Primary Index 主索引)的選擇(cont.)例子:夢網(wǎng)客戶活躍客戶分析CREATE MULTISET TABLE PMART.FCT_DATA_MONNET_ACTIVE_MON ( CAL_M
13、onth INTEGER TITLE 統(tǒng)計(jì)月份, City_ID CHAR(4) TITLE 地市標(biāo)識(shí), Channel_ID CHAR(8) TITLE 渠道標(biāo)識(shí), Mont_SVC_Type_Cod CHAR(3) TITLE 夢網(wǎng)業(yè)務(wù)類型編碼, Mont_SVC_CAT_MicroCls_Cod CHAR(3) TITLE 夢網(wǎng)業(yè)務(wù)分類小類編碼, Mont_SVC_CHRG_Type_Cod CHAR(2) TITLE 夢網(wǎng)業(yè)務(wù)計(jì)費(fèi)類型編碼, THR_Brand_Cod CHAR(1) TITLE 三大品牌編碼, Mont_Consume_Level_Cod CHAR(2) TITLE
14、夢網(wǎng)消費(fèi)層次編碼, Consume_Level_Cod CHAR(2) TITLE 消費(fèi)層次編碼, 。)PRIMARY INDEX ( CAL_Month ,City_ID ,Channel_ID ,Mont_SVC_Type_Cod ,Mont_SVC_CAT_MicroCls_Cod ,Mont_SVC_CHRG_Type_Cod,THR_Brand_Cod ,Mont_Consume_Level_Cod ,Consume_Level_Cod );PI:9字段 2字段: City_ID ,Channel_ID 調(diào)整PI后,在右邊的SQL中,PI是否起作用?以下SQL,PI是否起作用?:1.
15、值訪問Select *From FCT_DATA_MONNET_ACTIVE_MONWhere City_ID = 070010 and Channel_ID= 0100 and cal_month = 2007072.連接訪問Select *From FCT_DATA_MONNET_ACTIVE_MON ALEFT JOIN MID_CHANNEL_INFO_DAILY B ON A. Channel_ID = B. Channel_ID and A. City_ID = b. City_ID LEFT JOIN VW_CDE_REGION_TYPE C ON A. City_ID = C.
16、 City_ID 3、值訪問連接訪問Select *From FCT_DATA_MONNET_ACTIVE_MON A, VT_INFO BWHERE A. Channel_ID = B. Channel_ID AND A. City_ID = B. City_ID AND A.CAL_MONTH = 200707 AND A. Consume_Level_Cod=B. Consume_Level_Cod第10頁,共81頁。PPI的使用PPI(Partition Primary Index,分區(qū)索引),把具有相同分區(qū)值的數(shù)據(jù)聚簇存放在一起;類似于SQL Server的聚簇索引(Cluster
17、Index),Oracle的聚簇表(Cluster Table)。利用PPI,可以快速插入/訪問同一個(gè)Partition(分區(qū))的數(shù)據(jù)。CREATE MULTISET TABLE qdata.TB_DQC_KPI_CHECK_RESULT ( TX_DATE DATE FORMAT YYYYMMDD TITLE 數(shù)據(jù)日期 NOT NULL, KPI_CODE INTEGER TITLE 指標(biāo)代碼 NOT NULL, 。 )PRIMARY INDEX ( KPI_CODE )PARTITION BY RANGE_N(TX_DATE BETWEEN CAST(20030101) AS DATE F
18、ORMAT YYYYMMDD) AND CAST(20191231) AS DATE FORMAT YYYYMMDD) EACH INTERVAL 1 DAY , NO RANGE OR UNKNOWN);Select *From TB_DQC_KPI_CHECK_RESULTWhere tx_date = 20070701;或Where tx_date between 20070701 and 20070731;或Where tx_date 20070701;但Where tx_date like 200707%;不起作用第11頁,共81頁。PPI的使用(cont.)Partition上不要
19、使用表達(dá)式,否則Partition不能被正確使用。T1. tx_date/100=CAST(20070917AS DATE FORMAT YYYYMMDD)/100Substring(T1. tx_date from 1 for 6) =200709應(yīng)該修改為 T1. tx_date=CAST(20070901 AS DATE FORMAT YYYYMMDD)第12頁,共81頁。PPI的使用(cont.) 腳本:tb_030040270.pl/* 刪除當(dāng)月 */ 2小時(shí) del BASS1.tb_03004 where proc_dt = 200709;insert into BASS1.tb
20、_03004 7小時(shí)。 sel . from pview.vw_evt_cust_so cust where acpt_date=cast(200710|01 as date) cast(200710|01 as date)寫法錯(cuò)誤,PPI不起作用日期的正確寫法:Cast(20071001 as date format YYYYMMDD)在proc_dt建立PPIPPI字段從Load_Date調(diào)整為acpt_date第13頁,共81頁。創(chuàng)建可變臨時(shí)表它僅存活于同一個(gè)Session之內(nèi)注意指定可變臨時(shí)表為multiset(通常也要指定PI)可變臨時(shí)表不能帶有PPI例子1:create volat
21、ile multiset table vt_RETAIN_ANLY_MON as ( select col1,col2, from where group by . )with data PRIMARY INDEX (PI_Cols) ON COMMIT PRESERVE ROWS;例子2: create volatile multiset table vt_RETAIN_ANLY_MON ( col1 char(2), col2 varchar(12) NOT NULL)PRIMARY INDEX (PI_Cols) ON COMMIT PRESERVE ROWS;第14頁,共81頁。創(chuàng)建可
22、變臨時(shí)表(cont.) 例子3:create volatile multiset table vt_RETAIN_ANLY_MON as ( select col1, cast(adc as varchar(12) col2 from where )with no data PRIMARY INDEX (col1) ON COMMIT PRESERVE ROWS; 例子4:create volatile multiset table vt_net_gsm_nl as pdata.tb_net_gsm_nl with no data ON COMMIT PRESERVE ROWS;字段col2將
23、用unicode字符集;當(dāng)跟普通字段(latin字符集)join時(shí),需要進(jìn)行數(shù)據(jù)重新分布。不建議失?。阂?yàn)閜data.tb_net_gsm_nl 有PPI而可變臨時(shí)表不允許有PPI第15頁,共81頁。固化臨時(shí)表固化臨時(shí)表,就是把查詢結(jié)果存放到一張物理表。共下次分析或他人使用Session斷開之后,仍然可以使用。示例1: CREATE MULTISET TABLE tttemp.TMP_BOSS_VOIC as ( select * from pview.vw_net_gsm_nl) WITH no DATA PRIMARY INDEX (subs_id);INSERT INTO tttemp.
24、TMP_BOSS_VOICSELECT * FROM pview.vw_net_gsm_nl WHERE *;示例2:CREATE MULTISET TABLE tttemp.TMP_BOSS_VOIC as ( select * from pview.vw_net_gsm_nl WHERE *) WITH DATA PRIMARY INDEX (subs_id);示例3:(復(fù)制表,數(shù)據(jù)備份)CREATE MULTISET TABLE tttemp.TMP_BOSS_VOIC AS pdata.tb_net_gsm_nl WITH DATA ;第16頁,共81頁。數(shù)據(jù)類型注意非日期字段與日期字
25、段char & date的轉(zhuǎn)換與關(guān)聯(lián):如果數(shù)據(jù)類型一致可以直接使用;在CASE WHEN or COALESCE一定要使用顯式的類型轉(zhuǎn)換(CAST)CASE WHEN A = B THEN DATE1 ELSE 20061031 END應(yīng)寫成CASE WHEN A = B THEN DATE1 ELSE CAST(20061031 AS DATE) END數(shù)值運(yùn)算時(shí),確保運(yùn)算過程中不丟失計(jì)算精度。CAST(100/3 AS DEC(5,2)應(yīng)該寫成CAST(100/3.00 AS DEC(5,2)第17頁,共81頁。字符(串)與數(shù)字相比較比較規(guī)則:1) 比較兩個(gè)值(字段),它們的類型必須一樣!
26、2) 當(dāng)字符(串)與數(shù)字相比較時(shí),先把字符(串)轉(zhuǎn)換成數(shù)字,再進(jìn)行比較。3) 經(jīng)分系統(tǒng)中容易出錯(cuò)的,有Cal_Month字段Case 1Table 1CREATE TABLE Emp1 (Emp_noCHAR(6), Emp_nameCHAR(20)PRIMARY INDEX (Emp_no);Statement 1SELECT *FROMEmp1WHEREEmp_no = 1234;Statement 2SELECT *FROMEmp1WHEREEmp_no = 1234;Table 1CREATE TABLE Emp2 (Emp_noINTEGER, Emp_nameCHAR(20)PRI
27、MARY INDEX (Emp_no);Statement 1SELECT *FROMEmp2WHEREEmp_no = 1234;Statement 2SELECT *FROMEmp2WHEREEmp_no = 1234;Case 2Results in Full Table ScanResults in unnecessary conversion 第18頁,共81頁。目標(biāo)列的選擇減少目標(biāo)列,可以少消耗SPOOL空間,從而提高SQL的效率當(dāng)系統(tǒng)任務(wù)繁忙,系統(tǒng)內(nèi)存少的時(shí)候,效果尤為明顯。舉例:GSM語言話單表,PDATA.TB_NET_GSM_NL共有73字段,以下SQL供返回1.6億條記錄
28、左邊的SQL,記錄最長為:698字節(jié),平均399字節(jié)右邊的SQL,記錄最長為:59字節(jié), 平均30字節(jié)兩者相差400多GB的SPOOL空間,IO次數(shù)也隨著相差甚大!SPOOL空間估計(jì):497 GBSPOOL空間估計(jì):42 GBSELECT SUBS_ID ,MSISDN ,Begin_Date ,Begin_Time ,Call_DUR ,CHRG_DURFROM PDATA.TB_NET_GSM_NLWHERE PROC_DATE BETWEEN 20070701 AND 20070731 SELECT * FROM PDATA.TB_NET_GSM_NLWHERE PROC_DATE BE
29、TWEEN 20070701 AND 20070731第19頁,共81頁。Where條件的限定根據(jù)Where條件先進(jìn)行過濾數(shù)據(jù)集,再進(jìn)行連接(JOIN)等操作這樣,可以減少參與連接操作的數(shù)據(jù)集大小,從而提高效率好的查詢引擎,可以自動(dòng)優(yōu)化;但有些復(fù)雜SQL,查詢引擎優(yōu)化得并不好。注意:系統(tǒng)的SQL優(yōu)化,只是避免最差的,選擇相對優(yōu)的,未必能夠得到最好的優(yōu)化結(jié)果。SELECT A.TX_DATE, A.KPI_CODE ,B.SRC_NAME,A.KPI_VALUEFROM ( select * from qdata.tb_dqc_kpi_check_result where TX_DATE = 2
30、0070701 AND KPI_CODE = 65 ) ALEFT JOIN ( SELECT * FROM qdata.tb_dqc_kpi_def where KPI_CODE = 65 and N_TYPE = M) BON A.KPI_CODE = B.KPI_CODE SELECT A.TX_DATE, A.KPI_CODE ,coalesce(B.SRC_NAME, no name) ,A.KPI_VALUEFROM qdata.tb_dqc_kpi_check_result ALEFT JOIN qdata.tb_dqc_kpi_def BON A.KPI_CODE = B.KP
31、I_CODE WHERE A. TX_DATE = 20070701 AND A.KPI_CODE = 65 AND B.N_TYPE = M rewrite第20頁,共81頁。用Case When替代UNION sel city_id,channel_id,cust_brand_id,sum(stat_values) as stat_valuesfrom ( select t.city_id 語音雜志計(jì)費(fèi)量 ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum(case when SMS_SVC_Ty
32、pe_Level_SECND = 017 and Call_Type_Code in (00,10,01,11) then sms_quan else 0 END) as stat_values from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs_id=v.subs_id left join PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914 group by 1,2,3 union all sele
33、ct t.city_id 夢網(wǎng)短信計(jì)費(fèi)量 ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum(sms_quan) as stat_values from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs_id=v.subs_id left join PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914 and SMS_
34、SVC_Type_Level_SECND like 02% and SMS_SVC_Type_Level_SECND not in (021,022) group by 1,2,3 )tmpGroup by 1,2,3兩個(gè)子查詢的表連接部分完全一樣兩個(gè)子查詢除了取數(shù)據(jù)條件,其它都一樣。Union all是多余的,它需要重復(fù)掃描數(shù)據(jù),進(jìn)行重復(fù)的JOIN可以用Case when替代union 作業(yè):KPI_NWR_SMS_BILL_QUAN描述:點(diǎn)對點(diǎn)短信計(jì)費(fèi)量腳本: kpi_nwr_sms_bill_quan0600.pl第21頁,共81頁。用Case When替代UNION (cont.)se
35、l city_id,channel_id,cust_brand_id,sum(stat_values) as stat_valuesfrom ( select t.city_id ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum(CASE WHEN SMS_SVC_Type_Level_SECND = 017 and Call_Type_Code in (00,10,01,11) THEN sms_quan 語音雜志計(jì)費(fèi)量 WHEN SMS_SVC_Type_Level_SECND like 02%
36、 and SMS_SVC_Type_Level_SECND not in (021,022) THEN sms_quan 夢網(wǎng)短信計(jì)費(fèi)量 ELSE 0 END ) as stat_values from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs_id=v.subs_id left join PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914)tmpGroup by 1,2,3SQL優(yōu)化重寫第22頁,共
37、81頁。用OR替代UNION Select city_id , channel_id, cust_brand_id, sum(sms_quan ) stat_valuesfrom( select t.city_id 語音雜志計(jì)費(fèi)量 ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum(sms_quan ) stat_values from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs_id=v.subs_id left j
38、oin PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914 and SMS_SVC_Type_Level_SECND = 017 and Call_Type_Code in (00,10,01,11) group by 1,2,3 union all select t.city_id 夢網(wǎng)短信計(jì)費(fèi)量 ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum(sms_quan) as stat_val
39、ues from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs_id=v.subs_id left join PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914 and SMS_SVC_Type_Level_SECND like 02% and SMS_SVC_Type_Level_SECND not in (021,022) group by 1,2,3 )T Group by 1,2,3兩個(gè)子查詢的表
40、連接部分完全一樣兩個(gè)子查詢除了取數(shù)據(jù)條件,其它都一樣。Union all是多余的,它需要重復(fù)掃描數(shù)據(jù),進(jìn)行重復(fù)的JOIN可以用OR替代union 此類的問題,在腳本中經(jīng)常見到。第23頁,共81頁。用OR替代UNION (cont.)select t.city_id ,coalesce(v.channel_id,b.channel_id,-) as channel_id ,cust_brand_id ,sum( sms_quan) as stat_values from PVIEW.vw_mid_sms_svc_quan_daily t left join VT_SUBS v on t.subs
41、_id=v.subs_id left join PVIEW.vw_FCT_CDE_BUSN_CITY_TYPE b on t.city_id=b.City_ID where cal_date=20070914 and ( SMS_SVC_Type_Level_SECND = 017 語音雜志計(jì)費(fèi)量 and Call_Type_Code in (00,10,01,11) ) OR (SMS_SVC_Type_Level_SECND like 02% 夢網(wǎng)短信計(jì)費(fèi)量 and SMS_SVC_Type_Level_SECND not in (021,022) ) )Group by 1,2,3SQL
42、優(yōu)化重寫第24頁,共81頁。去掉多余的Distinct與Group by sel t.operator ,t.acpt_channel_id ,t.acpt_city_id ,t.subs_id ,t.acpt_date as evt_date From ( sel operator, ACPT_Channel_ID, acpt_city_id,subs_id, acpt_date from pview.vw_evt_cust_so cust where acpt_date =20071007 and so_meth_code in(0,1,2) and PROC_STS_Code =-1 g
43、roup by 1,2,3,4,5union all sel operator_num as operator, ACPT_Channel_ID, acpt_city_id, subs.subs_id, charge_date as acpt_date from pview.vw_fin_busi_rec bus join crmmart.subs_day_info_daily subs on subs.msisdn=bus.msisdn where charge_date =20071007 group by 1,2,3,4,5 )t group by 1,2,3,4,5;既然t查詢外層有g(shù)
44、roup by操作去重,那么子查詢內(nèi)的Group by去重是多余的。而且,兩個(gè)子查詢group by后再用union all,就可能再產(chǎn)生重復(fù)記錄,那么group by也失去意義了。解決方法: 把t查詢內(nèi)部的兩個(gè)group by去掉即可 類似的Distinct問題,可效仿解決。去重去重去重第25頁,共81頁。Group by vs. DistinctDistinct是去除重復(fù)的操作Group by是聚集操作某些情況下,兩者可以起到相同的作用。兩者的執(zhí)行計(jì)劃不一樣,效率也不一樣建議:使用Group byselect subs_id ,acct_idfrom PVIEW. VW_FIN_ACCT_
45、SUBS_HISwhere efct_date 20070701 group by 1,2select DISTINCT subs_id ,acct_idfrom PVIEW. VW_FIN_ACCT_SUBS_HISwhere efct_date 20070701 第26頁,共81頁。Union vs. Union all Union與Union all的作用是將多個(gè)SQL的結(jié)果進(jìn)行合并。Union將自動(dòng)剔除集合操作中的重復(fù)記錄;需要耗更多資源。Union all則保留重復(fù)記錄,一般建議使用Union all。第一個(gè)SELECT語句,決定輸出的字段名稱,標(biāo)題,格式等要求所有的SELECT語句
46、: 1) 必須要有同樣多的表達(dá)式數(shù)目; 2) 相關(guān)表達(dá)式的域必須兼容select *from (select a) T1(col1)unionselect *from (select bc)T2(col2)select *from (select bc)T3(col3)union allselect *from (select a) T1(col1)union allselect *from (select bc)T2(col2)col3abcbccol1ab第27頁,共81頁。先Group by再join腳本:rpt_mart_new_comm_mon0400.pl 11小時(shí)Select c
47、ase when b.CUST_Brand_ID is null then 5020 when b.CUST_Brand_ID in(2000,5010) then 5020 else b.CUST_Brand_ID end ,sum(COALESCE(b.Bas_CHRG_DUR_Unit,0) as Thsy_Accum_New_SUBS_CHRG_DUR , sum(case when b.call_type_code =20 then b.Bas_CHRG_DUR_Unit else 0 END) from VTNEW_SUBS_THISYEAR t inner join VTDUR_
48、MON b on t.Subs_ID=b.Subs_ID left join PVIEW.vw_MID_CDE_LONG_CALL_TYPE_LVL c on b.Long_Type_Level_SECND= c.Long_Type_Level_SECND left join PVIEW.vw_MID_CDE_ROAM_TYPE_LVL d on b.Roam_Type_Level_SECND= d.Roam_Type_Level_SECND group by 1;記錄數(shù)情況:t: 580萬,b: 9400萬, c:8, d:8 主要問題:假如連接順序?yàn)椋?( (b join c) join
49、d) join t)則是( (9400萬 join 8) join 8) join 580萬)數(shù)據(jù)分布時(shí)間長(IO多),連接次數(shù)多解決方法: 先執(zhí)行(t join b),然后groupby,再join c,d第28頁,共81頁。先Group by再join (cont.)腳本:rpt_mart_new_comm_mon0400.pl40秒Select case when b.CUST_Brand_ID is null then 5020 when b.CUST_Brand_ID in(2000,5010) then 5020 else b.CUST_Brand_ID end ,sum(COAL
50、ESCE(b.Bas_CHRG_DUR_Unit,0) as Thsy_Accum_New_SUBS_CHRG_DUR , sum(case when b.call_type_code =20 then b.Bas_CHRG_DUR_Unit else 0 END) from (select CUST_Brand_ID, call_type_code, Long_Type_Level_SECND, Roam_Type_Level_SECND, sum(Bas_CHRG_DUR_Unit) Bas_CHRG_DUR_Unit, count(*) quan from VTDUR_MON where
51、 subs_id in (select subs_id from VTNEW_SUBS_THISYEAR) group by 1,2,3,4 )b left join PVIEW.vw_MID_CDE_LONG_CALL_TYPE_LVL c on b.Long_Type_Level_SECND= c.Long_Type_Level_SECND left join PVIEW.vw_MID_CDE_ROAM_TYPE_LVL d on b.Roam_Type_Level_SECND= d.Roam_Type_Level_SECND group by 1;記錄數(shù)情況:t: 580萬,b: 940
52、0萬, c:8, d:8 處理過程: 先執(zhí)行(t join b),然后groupby,再join c,d結(jié)果:1、 VTDUR_MON join VTNEW_SUBS_THISYEAR PI相同,merge join,只需10秒2、經(jīng)過group by,b表只有332記錄3、b join c join d, 就是: 332 8 84、最終結(jié)果:5記錄,共40秒第29頁,共81頁。先Group by再join(cont.)先匯總再連接,可以減少參與連接的數(shù)據(jù)集大小,減少比較次數(shù),從而提高效率。以下面SQL為例,假設(shè)歷史表( History )有1億條記錄左邊的SQL,需要進(jìn)行 1億 90次比較右
53、邊的SQL,則只需要 1億 1 次比較SELECT H.product_id ,sum(H.account_num)FROM History H , Calendar DTWHERE H.sale_date = DT.calendar_date AND DT.quarter = 3 GROUP BY 1;SELECT H.product_id, SUM(H.account_num)FROM History H , (SELECT min(calendar_date) min_date ,max(calendar_date) max_date FROM Calendar WHERE quarte
54、r = 3 ) DT WHERE H.sale_date BETWEEN DT.min_date and DT.max_dateGROUP BY 1;第30頁,共81頁。提取公共SQL形成臨時(shí)表腳本:rpt_nmmart_comm_subs_mon0403.pl出現(xiàn)以下SQL代碼段,共5次,平均每次執(zhí)行需10分鐘 。 FROM PVIEW.VW_MID_VOIC_SVC_QUAN_MON a ,PVIEW.VW_MID_CDE_SUBS_BRAND_LVL b ,vt_subs c WHERE a.CUST_Brand_ID=b.SUBS_Brand_Level_Third AND a.CAL
55、_Month=200708 AND a.SUBS_ID=c.SUBS_ID 。整個(gè)腳本需要掃描以下SQL 14次,平均每次執(zhí)行需3分鐘PVIEW.VW_MID_VOIC_SVC_QUAN_MON where CAL_Month=200708提取公共SQL,形成臨時(shí)表,較少掃描(IO)次數(shù)。該腳本,經(jīng)過優(yōu)化之后,從50分鐘縮減至10分鐘第31頁,共81頁。關(guān)聯(lián)條件 (1)Select A.a2, B.b2 from A join Bon substring(A.a1 from 1 for 7) = B.b1應(yīng)該寫為Select A.a2, B.b2 from (select substring(
56、a1 from 1 for 7) as a1_new,a2 from A ) A_newjoin Bon a1_new = b1第32頁,共81頁。關(guān)聯(lián)條件 (2)Select A.a2, B.b2 from A join Bon TRIM(A.a1 ) = TRIM(B.b1)應(yīng)該寫為Select A.a2, B.b2 from A join Bon A.a1 = B.b1第33頁,共81頁。SQL書寫不當(dāng)可能會(huì)引起笛卡兒積以下面兩個(gè)SQL為例,它們將進(jìn)行笛卡兒積操作。例子1:Select employee.emp_no , employee.emp_nameFrom employee A例
57、子2:SELECT A.EMP_Name, B.Dept_NameFROM employee A, Department BWhere a.dept_no = b.dept_no;表Employee與表A進(jìn)行笛卡兒積表A與表B進(jìn)行笛卡兒積表A與表B進(jìn)行Inner Join第34頁,共81頁。修改表定義常見的表定義修改操作:增加字段修改字段長度建議的操作流程Rename table db.tablex as db.tabley;通過Show table語句獲得原表db.tablex的定義定義新表: db.tablexInsert into db.tablex(。) select 。 From d
58、b.tabley;Drop table db.tabley;Teradata提供ALTER TABLE語句,可進(jìn)行修改表定義但,不建議采用ALTER TABLE方式。第35頁,共81頁。插入/更新/刪除記錄時(shí),盡量不要Abort當(dāng)目標(biāo)表有數(shù)據(jù)時(shí),插入和更新操作,以及部分刪除,都產(chǎn)生TJ如果此時(shí)abort該操作,系統(tǒng)將會(huì)回滾 Delete BASS1.tb_03004 where proc_dt = 200709 ;UPDATE Customer SET Credit_Limit = Credit_Limit * 1.20 ; DELETE FROM Trans WHERE Trans_Date
59、 981231;DROP TABLE Trans;RENAME TABLE Trans_N TO Trans;先建立空表,通過insert / select 方式插入數(shù)據(jù)這是非常快的操作!先備份,然后做變更操作,更加安全!對于大表進(jìn)行Update/DELETE操作,將耗費(fèi)相當(dāng)多的資源與相當(dāng)長的時(shí)間。Update/Delete操作,需要事務(wù)日志TJ(Transient Journal)以防意外中斷導(dǎo)致數(shù)據(jù)受到破壞在Update/Delete操作中途被Cancel,系統(tǒng)則需回滾,這將耗更多的資源與時(shí)間!在經(jīng)分系統(tǒng)中,應(yīng)嚴(yán)防此類事件發(fā)生!DELETE FROM Trans WHERE Trans_D
60、ate 990101; 第37頁,共81頁。經(jīng)分系統(tǒng)的實(shí)體命名規(guī)范實(shí)體的命名,最長不超過30個(gè)字母;通常要求都是大寫。實(shí)體的命名:_后綴前綴:表:基礎(chǔ)表以TB_開頭中間表以MID_開頭應(yīng)用模塊的表以相應(yīng)的主體縮寫開頭視圖:一般地,視圖名稱與表名稱一一對應(yīng)。以VW_開頭。對于TB_開頭的表,把TB_替換成VW;對于其他表,加上VW_即可。宏:以M_或者M(jìn)acro_開頭后綴:歷史表:_HIS月表:_MON日表:_DAILY第38頁,共81頁。實(shí)體的命名規(guī)范示例TB_OFR_SUBS_HIS 用戶歷史Efct_date, End_date示例:Select *From pview.vw_ofr_su
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲(chǔ)空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 無人機(jī)操控與航拍技術(shù)考核試卷
- 圖書館數(shù)字資源長期保存策略考核試卷
- 家電產(chǎn)品品質(zhì)監(jiān)控與質(zhì)量改進(jìn)考核試卷
- 整年運(yùn)輸合同范本
- 大板委托加工合同范本
- 修剪綠化直營合同范本
- 工地個(gè)人水電合同范本
- 小學(xué)生美術(shù)課件制作教學(xué)
- 名片合同范本
- 財(cái)務(wù)支出季度計(jì)劃工作的分解與執(zhí)行要點(diǎn)
- GB/T 18601-2009天然花崗石建筑板材
- 畢業(yè)設(shè)計(jì)論文-貝類脫殼機(jī)設(shè)計(jì)
- 八項(xiàng)規(guī)定學(xué)習(xí)課件
- 《工程電磁場》配套教學(xué)課件
- 《過零丁洋》公開課件
- 從生產(chǎn)工藝角度詳解磷酸鐵鋰
- 全套橋梁施工技術(shù)交底記錄
- 《教師職業(yè)道德》全書word版
- 城市定制型商業(yè)醫(yī)療保險(xiǎn)(惠民保)知識(shí)圖譜
- GB∕T 3836.31-2021 爆炸性環(huán)境 第31部分:由防粉塵點(diǎn)燃外殼“t”保護(hù)的設(shè)備
- AMDAR資料的分析和應(yīng)用
評論
0/150
提交評論