Introduction to Microsoft Data Warehousing

Module 1: Introduction to Data Warehousing

Module Overview
  • Overview of Data Warehousing
  • Considerations for a Data Warehouse Solution

Lesson 1: Overview of Data Warehousing
  • The Business Problem
  • What Is a Data Warehouse?
  • Data Warehouse Architectures
  • Components of a Data Warehousing Solution
  • Data Warehousing Projects
  • Data Warehousing Project Roles
  • SQL Server As a Data Warehousing Platform

The Business Problem
  • Key business data is distributed across multiple business systems
  • Finding the information needed for business decisions is time-consuming and error-prone
  • Fundamental business questions are hard to answer

What Is a Data Warehouse?
A centralized store of information used for reporting and data analysis. Typically, a data warehouse:
  • Contains large volumes of historical data
  • Is optimized for querying data (rather than for inserts and updates)
  • Is loaded with new business data at regular intervals
  • Provides the basis for enterprise business intelligence solutions

Data Warehouse Architectures
  • Centralized Data Warehouse
  • Departmental Data Mart
  • Hub and Spoke

Components of a Data Warehousing Solution
  • Data is extracted from business systems and other data sources
  • The extracted data is ultimately loaded into the data warehouse
  • Data cleansing and deduplication ensure the quality of the data in the warehouse
  • Master data management (MDM) provides authoritative definitions of business data entities
[Diagram labels: Data Sources, ETL Staging Process, Staging Database, ETL Load Process, Data Warehouse, Data Cleansing, Master Data Management, Reporting and Analysis]

Data Warehousing Projects
  1. Start by identifying the business questions the data warehouse must answer
  2. Determine the data needed to answer those questions
  3. Identify the data sources for the required data
  4. Assess the value of each question to key business objectives against the feasibility of answering it from the existing data
For projects involving large volumes of data, an incremental approach tends to work well:
  • Break the project down into multiple subprojects
  • Have each subproject deal with one particular subject area

Data Warehousing Project Roles
  • Project manager
  • Solution architect
  • Data modeler
  • Database administrator
  • Infrastructure specialist
  • ETL developer
  • Business users/analysts
  • Testers
  • Data stewards

SQL Server As a Data Warehousing Platform
  • SQL Server Analysis Services
  • SQL Server Database Engine
  • Microsoft SQL Server Integration Services
  • SQL Server Master Data Services
  • SQL Server Data Quality Services
  • Microsoft SQL Azure and the Windows Azure Marketplace
  • Microsoft SharePoint Server
  • Microsoft PowerPivot Technologies
  • Microsoft Excel
      - Data Mining Add-In
      - PowerPivot Add-In
      - MDS Add-In
  • Power View
  • SQL Server Reporting Services
  • Reports, KPIs, and Dashboards
  • Interactive data visualizations
  • Interactive data analysis
[Diagram area labels: Data Warehousing, Business Intelligence]

Lesson 2: Considerations for a Data Warehouse Solution
  • Data Warehouse Database and Storage
  • Data Sources
  • Extract, Transform, and Load Processes
  • Data Quality and Master Data Management

Data Warehouse Database and Storage
Considerations for the data warehouse include:
  • Database schema
      - Logical: typically denormalized for optimal read performance
      - Physical: often partitioned for performance and management
  • Hardware
      - Query processing and memory
      - Storage
      - Network
  • High availability and disaster recovery
      - Hardware redundancy
      - Backup strategy
  • Security
      - Server access
      - Data permissions

Data Sources
  • Data source connection types
  • Credentials and permissions
  • Data formats
  • Data acquisition windows

Extract, Transform, and Load Processes
  • Staging tables: hold data temporarily before it is loaded
  • Required transformations: the data transformation and cleansing needed as data is extracted
  • Incremental ETL: loading only the data that has changed

Data Quality and Master Data Management
  • Data quality:
      - Cleansing data: validating data values, ensuring data consistency, identifying missing values
      - Deduplicating data
  • Master data management:
      - Ensuring consistent business entity definitions across multiple systems
      - Applying business rules to ensure data validity

Module Review and Takeaways
  • Why might you consider including a staging area in your ETL solution?
  • What options might you consider for performing data transf
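
As an illustration of the Lesson 2 ideas on staging, cleansing, deduplication, and incremental ETL, here is a minimal Python sketch. It is not taken from the course and does not use any SQL Server API: the CustomerRow type, the in-memory dictionary standing in for the warehouse table, and the "modified" date used as a change watermark are all hypothetical names chosen for the example. A real solution would more likely implement these steps with SQL Server Integration Services and Data Quality Services.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerRow:
    """Hypothetical staged record; the field names are assumptions."""
    customer_id: int
    email: str
    modified: date


def cleanse(rows):
    """Standardize values and drop rows whose email is missing."""
    cleaned = []
    for row in rows:
        email = row.email.strip().lower()  # enforce a consistent format
        if not email:                      # missing value: reject the row
            continue
        cleaned.append(CustomerRow(row.customer_id, email, row.modified))
    return cleaned


def deduplicate(rows):
    """Keep only the most recently modified row per customer_id."""
    latest = {}
    for row in rows:
        current = latest.get(row.customer_id)
        if current is None or row.modified > current.modified:
            latest[row.customer_id] = row
    return list(latest.values())


def incremental_load(warehouse, staged, last_load):
    """Load rows changed after last_load and return the new watermark."""
    changed = [r for r in staged if r.modified > last_load]
    for row in deduplicate(cleanse(changed)):
        warehouse[row.customer_id] = row  # insert or overwrite by key
    return max((r.modified for r in changed), default=last_load)


# Usage: the staging list plays the role of a staging table.
warehouse = {}
staging = [
    CustomerRow(1, " Ann@Example.com ", date(2024, 1, 5)),
    CustomerRow(1, "ann@example.com", date(2024, 1, 9)),   # newer duplicate
    CustomerRow(2, "", date(2024, 1, 7)),                   # missing email
    CustomerRow(3, "bob@example.com", date(2023, 12, 30)),  # unchanged since last load
]
watermark = incremental_load(warehouse, staging, date(2024, 1, 1))
```

In this sketch the staging list lets cleansing and deduplication run before anything reaches the warehouse, and the returned watermark is what the next run would pass as last_load so that only newer changes are processed.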
