Inatica Grid and Principles.ppt

PowerCenter Enterprise Grid: A real look under the hood
Informatica

Customer case (previously using xx xxxxx)
  • Business driver and customer scenario: a global Sales Compensation and Incentive application that assigns sales credits on revenue generated by selling products.
  • Data volume of the three largest tables: 60 million, 60 million, and 4 million.
  • PowerCenter implementation and execution time: 46 minutes.

PowerCenter SW Architecture

PowerCenter platform architecture
  • Data Access: RDBMS / Applications, Mainframe, Messages, Web Services, Cloud Computing, Flat Files; XML, Unstructured
  • Data Delivery: RDBMS / Applications, Mainframe, Messages, Web Services, Cloud Computing, Flat Files; XML, Unstructured
  • Core Services, Portal Services, Client Tools, Web Service Hub
  • Any data, Anywhere, Any time, Any Latency
  • HA Database: required only with the HA Option
  • Separately licensed products/options.

PowerCenter Enterprise Grid

PowerCenter Grid capabilities
  High Availability
    • Fail-over: automatic restart of services (Repository Service and Integration Service)
    • Recovery: automatic restart or resume of work in progress at fail-over
    • Resilience: automatic retry of failed connections
  Distributed Processing
    • Workflow-on-Grid: sessions run across multiple nodes in the grid (session dispatch granularity)
    • Session-on-Grid: a session runs across multiple nodes in the grid (sub-session dispatch granularity)
  Dynamic Load Balancing
    • Load-, metric-, and history-based decisions on where tasks are run
  Partitioning
    • Parallel "partitions" run independently on "nodes" in a grid, allowing PowerCenter to scale a single session in a grid environment across hardware boundaries.

PowerCenter Services framework
  • Diagram labels: PowerCenter Domain, Client Tools
  • NOTE: Any node can run any type of service. This framework is for discussion purposes only.

High Availability (multiple backup services)
  • HA File System: Shared Directory Access
  • HA Database: Repository dbs
  • Primary (P) and Backup (B) instances per node:
      Node 1: P, B, B
      Node 2: B, P, B
      Node 3: B, B, P

Workflow Recovery
  • When an interruption or exception occurs, a workflow can resume from the point of interruption and run to completion.
  • Workflow recovery is triggered by:
      Failover
      Recovery that is initiated by the user

Session recovery
  • Fail task and continue workflow: the task is failed during the recovery run and the workflow continues.
  • Restart task: the task is restarted during the recovery run.
  • Resume from last check-point: session state is persisted during the normal run and the task is resumed from the last check-point.

PowerCenter Enterprise Grid Reference Architecture
HA/Enterprise Grid Reference Architecture
  • Diagram labels: switch; /infa_shared/; Depends on reqs; 1GB IP or ; Src/tgt dbs; Repository dbs; Domain dbs; LUNs

PowerCenter Load Balancing
Load Balancing Modes
  Round Robin
    • Honors maxNumProcess
  Metric-based
    • Evaluates nodes in round-robin order
    • Honors resource provision thresholds
    • Uses statistics from the last 3 runs of a task to decide which node runs the task
    • If no statistics exist yet, default values are used (40 MB memory, 15% CPU)
  Adaptive
    • Selects the node with the most available CPU
    • Honors resource provision thresholds
    • Uses statistics from the last 3 runs of a task to determine whether the task can run on a node
    • Bypass in dispatch queue: skips tasks in the queue that are more resource intensive and can't be dispatched to any currently available node
  CPU Profile
    • Ranks node CPU performance against a baseline system
    • Used in adaptive dispatch mode

Load Balancing Mode Comparison
  • Round-robin mode: use it if the workloads are similar and the nodes are identical, or if predictability of dispatch is important.
  • Metric-based and Adaptive modes provide the most uniform distribution:
      Recommended for non-uniform workloads
      Recommended if nodes are not identical (as a grid expands, newer nodes are likely to be more powerful)

PowerCenter Session on Grid
PowerCenter Data Smart Parallelism
SonG Session Details
  • Worker DTM Details: Sorter Cache Size, DTM Buffer Size, Agg Index Cache Size, Agg Data Cache Size, Jnr Index Cache Size, Jnr Data Cache Size, Default Buffer Block Size
  • Diagram labels: Intra-process, Inter-process, Other Worker DTMs

Partitioning
Partitioning Myths
  • "There is a set formula: given X CPUs, create Y partitions or Z repartition points." NO: too many variables are involved for simple formulas. In general, more CPUs can handle more partitions, but it all depends.
  • "Partitioning will always improve performance." NO: if you're already strapped for CPUs, or source or target constrained, partitions may just make things worse.
  • "The source and/or target must be partitioned to benefit from partitioning." NO: these help, but they are not mandatory.
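The three dispatch modes listed under Load Balancing Modes can be sketched in Python as follows. This is only an illustration of the rules stated on the slides, not PowerCenter's implementation: the `Node`, `Task`, and `Dispatcher` classes and their fields are hypothetical, and the "resource provision thresholds" are modelled simply as free memory and free CPU on each node.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Defaults quoted on the slide for tasks that have no run statistics yet.
DEFAULT_MEM_MB = 40.0
DEFAULT_CPU_PCT = 15.0


@dataclass
class Node:
    """Hypothetical model of a grid node."""
    name: str
    max_processes: int          # stands in for maxNumProcess
    running: int = 0
    free_mem_mb: float = 0.0
    free_cpu_pct: float = 0.0


@dataclass
class Task:
    """Hypothetical task; stats would come from its last 3 runs."""
    name: str
    mem_mb: Optional[float] = None
    cpu_pct: Optional[float] = None

    def needs(self):
        mem = self.mem_mb if self.mem_mb is not None else DEFAULT_MEM_MB
        cpu = self.cpu_pct if self.cpu_pct is not None else DEFAULT_CPU_PCT
        return mem, cpu


def fits(node: Node, task: Task) -> bool:
    """Assumed threshold check: process slot, memory, and CPU all available."""
    mem, cpu = task.needs()
    return (node.running < node.max_processes
            and node.free_mem_mb >= mem
            and node.free_cpu_pct >= cpu)


class Dispatcher:
    def __init__(self, nodes: List[Node]):
        self.nodes = nodes
        self._next = 0  # round-robin cursor

    def _rotate(self, accept: Callable[[Node], bool]) -> Optional[Node]:
        """Walk nodes in round-robin order and return the first accepted one."""
        n = len(self.nodes)
        for i in range(n):
            idx = (self._next + i) % n
            if accept(self.nodes[idx]):
                self._next = (idx + 1) % n
                return self.nodes[idx]
        return None

    def round_robin(self, task: Task) -> Optional[Node]:
        # Only the process limit is honored.
        return self._rotate(lambda nd: nd.running < nd.max_processes)

    def metric_based(self, task: Task) -> Optional[Node]:
        # Round-robin order, but the node must also satisfy the task's needs.
        return self._rotate(lambda nd: fits(nd, task))

    def adaptive(self, task: Task) -> Optional[Node]:
        # Among nodes that can run the task, pick the most available CPU.
        candidates = [nd for nd in self.nodes if fits(nd, task)]
        return max(candidates, key=lambda nd: nd.free_cpu_pct, default=None)


if __name__ == "__main__":
    grid = Dispatcher([
        Node("node1", 2, free_mem_mb=512, free_cpu_pct=20),
        Node("node2", 2, free_mem_mb=2048, free_cpu_pct=70),
    ])
    heavy = Task("s_heavy", mem_mb=1024, cpu_pct=50)
    print(grid.metric_based(heavy).name)
    print(grid.adaptive(Task("s_new")).name)
```

A real dispatcher would also update node load after each dispatch and apply the CPU profile ranking; both are left out here to keep the comparison of the three modes visible.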

Performance Tuning Example
Server Architecture: Threads
  • Diagram labels: Load Manager; Reader Thread (two); Writer Thread (two)
  • Assume a mapping with an Aggregator, a Rank, and other transformations in a session with two partitions. Pre- and post-session commands would add one thread each.
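The thread arithmetic behind this example can be approximated with a short Python sketch. It assumes that each partition gets one thread per pipeline stage (one reader, one thread per transformation stage, one writer), that the session has one master thread and one mapping thread, and that each pre- or post-session command adds one thread. The function name and parameters are hypothetical, and the result is a rough count rather than PowerCenter's exact accounting.

```python
def dtm_thread_count(partitions, transformation_stages,
                     pre_session_cmd=False, post_session_cmd=False):
    """Estimate session threads under the assumptions stated above."""
    if partitions < 1 or transformation_stages < 0:
        raise ValueError("need at least one partition and no negative stages")
    counts = {
        "master": 1,
        "mapping": 1,
        "reader": partitions,                                  # one per partition
        "transformation": partitions * transformation_stages,  # one per stage per partition
        "writer": partitions,                                  # one per partition
        "pre_post_session": int(pre_session_cmd) + int(post_session_cmd),
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    # Two partitions, three transformation stages, pre- and post-session commands.
    print(dtm_thread_count(2, 3, pre_session_cmd=True, post_session_cmd=True))
```

With three partitions and two transformation stages the same function gives 3 reader, 6 transformation, and 3 writer threads, which agrees with the three-partition figure quoted later in the deck.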

  • Diagram labels: DTM Memory; Mapping Thread; DTM Master Thread; Transformation Thread (six); Rank Threads; Aggregator Threads; LM Memory; Transformation Caches

Target Bottleneck
  • The transformation stage is waiting for target buffers.
  • Diagram labels, first to fourth stage: Busy%, Busy%, Busy% = 15, Busy% = 95

Threads, Partition Points and Stages
  • Threads are created to move data down the pipeline.
  • The data is moved in pipeline stages defined by partition points. Stages run in parallel.
  • By default PowerCenter assigns a partition point at the Source Qualifier, Target, and Aggregator transformations.

Adding Partitions and Partition Points
  • Adding partitions increases the number of threads.
  • Adding partition points increases the number of pipeline stages.
  • Only add partitions or partition points if you have ample CPU bandwidth.
  • Adding partitions and partition points requires the Partitioning option.
  • Diagram: 3 Reader Threads, 6 Transformation Threads, and 3 Writer Threads across four stages (first to fourth), with one row of threads each for partitions 1, 2, and 3.

Monitor System Statistics
  • Look at system performance with an "idle" system.
  • Sometimes "idle" is not really idle, because computers are usually shared resources.

Example - Partitioning in Action

Default Session Thread Stats
  MASTER PETL_24018 Thread READER_1_1_1 created for the read stage of partition point SQ_CNSM_SORTED has completed: Total Run Time = 117.609375 secs, Total Idle Time = 88.093750 secs, Busy Percentage = 25.096320.
  MASTER PETL_24019 Thread TRANSF_1_1_1_1 created for the transformation stage of partition point SQ_CNSM_SORTED has completed: Total Run Time = 281.515625 secs, Total Idle Time = 0.000000 secs, Busy Percentage = 100.000000.
  MASTER PETL_24022 Thread WRITER_1_1_1 created for the write stage of partition point(s) CNSM_DIM_INSERT, CNSM_DIM_UPDATE, CNSM_LOC has completed: Total Run Time = 222.406250 secs, Total Idle Time = 199.640625 secs, Busy Percentage = 10.236055.

Monitor System Statistics
  • Does the system have the capacity to support more threads? Yes!

Add re-partition points
  • Diagram labels: Partition points; Default partition point; Threads/Stages

New (Abbreviated) Thread Stats
  MASTER READER_1_1_1 SQ_CNSM_SORTED Busy Percentage = 17.379066.
  MASTER TRANSF_1_1_1_1 SQ_CNSM_SORTED Busy Percentage = 36.560798.
  MASTER TRANSF_1_1_1_2 AggCalcSFPP_ACCT_CNT Busy Percentage = 96.469898.
  MASTER TRANSF_1_1_1_3 expCalcSCD Busy Percentage = 39.830986.
  MASTER TRANSF_1_1_1_4 NRMTRANS Busy Percentage = 99.021461.
  MASTER WRITER_1_1_1 CNSM_DIM_INSERT, CNSM_DIM_UPDATE, CNSM_LOC Busy Percentage = 9.110226.

Monitor System Statistics
  • There is still available CPU and memory.
  • There are threads reporting 100% busy.
  • Partition!

Add data partition
  • Partition # = 2

Thread Stats for Partition 1
  MASTER READER_1_1_1 SQ_CNSM_SORTED Busy Percentage = 29.878735.
  MASTER TRANSF_1_1_1_1 SQ_CNSM_SORTED Busy Percentage = 69.849308.
  MASTER TRANSF_1_1_1_2 AggCalcSFPP_ACCT_CNT Busy Percentage = 95.968037.
  MASTER TRANSF_1_1_1_3 expCalcSCD Busy Percentage = 20.234279.
  MASTER TRANSF_1_1_1_4 NRMTRANS Busy Percentage = 79.674997.
  MASTER WRITER_1_1_1 CNSM_DIM_INSERT, CNSM_DIM_UPDATE, CNSM_LOC Busy Percentage = 5.099374.

Thread Stats for Partition 2
  MASTER READER_1_1_2 SQ_CNSM_SORTED The total run time was insufficient for any meaningful statistics.
  MASTER TRANSF_1_1_2_1 SQ_CNSM_SORTED has completed. The total run time was insufficient for any meaningful statistics.
  MASTER TRANSF_1_1_2_2 AggCalcSFPP_ACCT_CNT Busy Percentage = 99.673585.
  MASTER TRANSF_1_1_2_3 expCalcSCD Busy Percentage = 20.814423.
  MASTER TRANSF_1_1_2_4 NRMTRANS Busy Percentage = 66.540595.
  MASTER WRITER_1_1_2 CNSM_DIM_INSERT, CNSM_DIM_UPDATE, CNSM_LOC Busy Percentage = 5.416233.

Monitor System Statistics
  • No more CPU resources are available.
  • More partitions and re-partition points will not help.
  • Done!

Grid Topology (Informatica's HACOE)
  • Network and storage links: GB LAN; 4 FC; Cisco Gigabit 24 Port Switch
  • Oracle 11g RAC: 2 HP DL 585 G2
  • 2 HP DL 585 G2 servers, each with:
      4 x AMD Opteron (dual core)
      32 GB RAM
      1 Gb Ethernet
      1 2Gb Fibre Channel
      73 GB HD (RAID 10, 15K RPM)
      RedHat Enterprise Linux v5.x 64-bit
  • 5 GB flat file dataset
  • For this example, the DB is used only to host the repository (metadata).

Default Session Internals
  • Diagram labels: Thread; Process; Session Shared Memory; srtNEW Cache; srtOLD Cache; jnrOldNew Cache; New file; Old file; inserts; updates; deletes

Tuning Strategy
  • Increase sorter caches to the recommended value (3GB).
  • Reader is the bottleneck and we have available CPU (and I/O).
  • Add data partitions (pipelines) to g
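
The busy percentages in the thread statistics of this example follow from the reported times: busy % = (total run time - total idle time) / total run time x 100. The Python sketch below recomputes them from log lines shaped like the ones quoted in the Default Session Thread Stats and picks the busiest thread as the likely bottleneck. The regular expression and the function names are hypothetical helpers written for this illustration, not part of any Informatica tool.

```python
import re

# Matches lines shaped like the full thread statistics quoted in the example.
STAT = re.compile(
    r"Thread (?P<thread>\S+) created for .*?"
    r"Total Run Time = (?P<run>[\d.]+) secs, "
    r"Total Idle Time = (?P<idle>[\d.]+) secs"
)

SAMPLE_LOG = """\
MASTER PETL_24018 Thread READER_1_1_1 created for the read stage of partition point SQ_CNSM_SORTED has completed: Total Run Time = 117.609375 secs, Total Idle Time = 88.093750 secs, Busy Percentage = 25.096320.
MASTER PETL_24019 Thread TRANSF_1_1_1_1 created for the transformation stage of partition point SQ_CNSM_SORTED has completed: Total Run Time = 281.515625 secs, Total Idle Time = 0.000000 secs, Busy Percentage = 100.000000.
MASTER PETL_24022 Thread WRITER_1_1_1 created for the write stage of partition point(s) CNSM_DIM_INSERT, CNSM_DIM_UPDATE, CNSM_LOC has completed: Total Run Time = 222.406250 secs, Total Idle Time = 199.640625 secs, Busy Percentage = 10.236055.
"""


def busy_percentage(run_secs, idle_secs):
    """Share of the run time during which the thread was not idle."""
    if run_secs <= 0:
        raise ValueError("run time must be positive")
    return (run_secs - idle_secs) / run_secs * 100.0


def parse_thread_stats(log_text):
    """Map thread name to busy percentage recomputed from run and idle times."""
    return {
        m["thread"]: busy_percentage(float(m["run"]), float(m["idle"]))
        for m in STAT.finditer(log_text)
    }


def bottleneck(stats):
    """Name of the thread with the highest busy percentage."""
    return max(stats, key=stats.get)


if __name__ == "__main__":
    stats = parse_thread_stats(SAMPLE_LOG)
    for name, pct in stats.items():
        print(f"{name}: {pct:.6f}% busy")
    print("likely bottleneck:", bottleneck(stats))
```

For the default session this points at the transformation thread, which matches the deck's next step of adding re-partition points to split that stage while CPU is still available.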
