Big Data Analytics Storage Solutions (toBD39.ppt)

Extending from enterprise data to big data
  • Traditional approach: structured, analytical, logical. Systems of Record.
  • New approach: creative, holistic thought, intuition. Systems of Engagement.
  • Systems of Insight: enterprise integration and context accumulation.
  • Traditional sources (structured, repeatable, linear), feeding the data warehouse: internal app data, transaction data, mainframe data, OLTP system data, ERP data.
  • New sources (unstructured, exploratory, dynamic), feeding Hadoop and Streams: web logs, social data, text data (emails), sensor data (images), RFID, multimedia.

The need for a new kind of infrastructure
  • Run business-critical applications in a reliable and secure environment.
  • Access and process massive volumes of data, both structured and unstructured.
  • Speed: respond in time to business opportunities that can appear at any moment, which requires a flexible, real-time infrastructure.
  • The dynamics of SoR and SoE: improve flexibility and efficiency by optimizing how workloads and resources are deployed.
  • Improve IT economics by adopting new technologies, including those based on open standards.
  • System of Record (SoR) and Systems of Engagement (SoE): the right decision, in the right place, at the right time.
  • Big Data & Analytics

A new infrastructure solution for big data analytics
  • IBM Big Data & Analytics Infrastructure
  • Data Zone
  • Application Zone

Case study: Smart Metering. Big data analytics applications deliver real business value
  • Business areas: Smart Metering, Grid Operations, Field Service, Resource Planning, Customer Service / Customer Operations, regulatory compliance.
  • Achieve genuinely effective regulatory compliance.
  • Detect energy losses, electricity theft, and fraud promptly.
  • Improve customer satisfaction.
  • Forecast electricity consumption more accurately.
  • Optimize grid operations and maintenance, with fewer and shorter outages.

Case study: strengthening Smart Metering with big data analytics
  • High availability for analytics, so that customer preferences are visible at all times.
  • Terabyte-scale data requirements across applications: a common virtualized storage platform.
  • Collect, store, and analyze data in real time, at up to 50,000 data points/sec.
  • Complex query processing over historical consumption data.
  • Cleanse and validate data before it is loaded into the data warehouse; the data may come from many customers, billing systems, or outage protection systems.
  • Relationship management: build and maintain a single view of the grid.
  • Navigate all structured and unstructured data across the enterprise and discover value in it.
  • Analyze customer consumption to detect electricity theft, meter tampering, and similar behavior.
  • Predict which customers suit which time-of-use tariffs or demand/response services.
  • Price time-of-use tariffs in real time, or deliver timely demand/response services.

IBM Big Data & Analytics Reference Architecture
  • Big Data Platform Capabilities: Information Ingest, Real-time Analytics, Warehouse & Data Marts, Analytic Appliances.
  • All Data Sources: Data in Motion, Data at Rest, Data in Many Forms.
  • New Infrastructure Leverages Data Types.
  • Information Ingestion and Operational Information: Stream Processing, Data Integration, Master Data, Streams.
  • Real-time Analytics: Video/Audio, Network/Sensor, Entity Analytics, Predictive.
  • Exploration, Integrated Warehouse, and Mart Zones: Discovery, Deep Reflection, Operational, Predictive.
  • Advanced Analytics/New Insights and New/Enhanced Applications: Decision Management, BI and Predictive Analytics, Navigation and Discovery, Intelligence Analysis.
  • Information Governance, Security and Business Continuity.

IBM Big Data Platform
  • InfoSphere BigInsights: Hadoop-based, low-latency analytics for varied, high-volume data at rest.
  • Netezza High Capacity Appliance: queryable archive for structured data.
  • Netezza 1000: BI plus custom analytics on structured data.
  • Smart Analytics System: operational analytics on structured data.
  • Informix Timeseries: time-structured analytics.
  • InfoSphere Warehouse: large-volume analytics on structured data.
  • InfoSphere Streams: low-latency analytics on streaming data (velocity, variety, and volume; data in motion).
  • InfoSphere Information Server: high-volume data integration and transformation.
  • Diagram labels: Streams, Warehouse, MPP Data Warehouse, Stream Computing, Information Integration, Hadoop.

What is Hadoop?
  • What: open source software that distributes data processing across a cluster of commodity servers and storage.
  • Why: traditional computing architectures scale up. They bring data to the CPU through faster SANs, large memory, and multi-level caches, which is relatively expensive.
  • What: Hadoop partitions a large data set into smaller ones and distributes them across many commodity servers. This is a scale-out model.
  • Why: scalable, flexible, cost effective, fault tolerant.
  • Components: MapReduce, HDFS.

IBM Value for Hadoop!
  • HDFS spreads data across multiple storage nodes.
  • HDFS is designed on the assumption that storage nodes can fail, so it keeps three or more copies of each piece of data (three is the HDFS default) on different nodes. This makes the system as a whole reliable.
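The replication idea in the last bullet can be made concrete with a small sketch. The Python below is an illustration only, not HDFS code: the function names, the five-node cluster, and the round-robin placement are assumptions chosen for brevity (real HDFS uses a rack-aware placement policy). It shows that when every block has three replicas on distinct nodes, any two nodes can fail without losing data.

```python
def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    Assumption for illustration only: real HDFS uses a rack-aware placement
    policy, not round-robin.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    ]


# Hypothetical five-node cluster holding ten blocks.
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_replicas(num_blocks=10, nodes=nodes)

# With three replicas on distinct nodes, losing any two nodes loses no data.
survivors = readable_blocks(placement, failed_nodes={"node1", "node2"})
print(len(survivors), "of", len(placement), "blocks still readable")
```

The cost of this reliability is capacity: with a replication factor of three, the cluster needs three times the raw disk space of the data it holds.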

  • An HDFS file system is made up of a cluster of server nodes, each serving blocks of data over the network through an HDFS-specific block protocol.
  • The HDFS NameNode is a potential single point of failure.
  • IBM removes the single point of failure while also improving file system performance: Elastic Storage - SNC (available as beta code).
  • Diagram labels: Hadoop explained (MapReduce, HDFS), Hadoop Stack, What does it look like?

Pain points of typical Hadoop storage
  • It is hard to choose the right components for HDFS (software, servers, network, storage, and so on).
  • Moving from a test environment to production takes too much tuning and adjustment.
  • Long-term operations are too burdensome, for example constantly replacing failed components (disks in particular), which makes it very hard to meet the expected SLA.
  • CPU and storage are coupled: the CPU and memory already meet the compute requirement, but adding disks for capacity forces the purchase of more, unnecessary CPU and memory.
  • Available storage options have clear gaps: local storage utilization is low (25%), and every capacity expansion means adding more servers. Failed disks must be rebuilt, and the more servers there are, the higher the chance of failure and the worse the performance.

IBM Storage for Hadoop
  • Traditional Hadoop clusters use the disks built into the servers. That is acceptable for testing or scientific research, but storage for production business workloads calls for enterprise storage.
  • The Hadoop cluster itself is responsible for data protection and replication.
  • Rebuilding (that is, copying) failed data sets onto other nodes heavily impacts CPU performance and cannot deliver enterprise-grade RAS.
  • Replicating data raises the same problem.
  • Scaling adds compute, network, and storage together, so resources cannot be used to the full. There is no way to separate the three even when one has excess capacity (for example, needing more storage but having to add compute and network).
  • External storage separates the storage load from the Hadoop compute nodes and brings the benefits of enterprise storage.
  • Sell the value of XIV, V7000, SVC, etc.
  • Customers usually deploy with the Hadoop File System; adopting Elastic Storage brings many benefits.

Data acceleration
  • Experience the instant results that come from IBM FlashSystem.
  • Drive as much as 45x faster analytics results on certain workloads.

Data workload diversity and flexibility
  • XIV delivers predictable performance that scales linearly without hotspots, delivering insights from analytics faster with tuning-free data distribution.
  • Scale-out, parallel processing of Elastic Storage software and integration with FlashSystem dramatically accelerate the performance of analytics clusters.
  • Virtual Storage Center with SVC automatically optimizes data warehouse performance and cost across flash and disk.

Mainframe data environments
  • Integration with DB2 and specialty analytics "engines" leveraging DS8870 delivers a 4x reduction in batch times with new High Performance Flash Enclosures.
  • High-speed encryption on every drive type secures data.

Data protection and retention
  • LTFS EE with tape reduces TCO by up to 90% compared with disk for long-term retention of data at rest, using a large open-format tape repository.
  • Reduce the amount of data to be stored by up to 25 times with ProtecTIER de-duplication.

Customer results
  • 12x faster: IBM FlashSystem increased SPLUNK & SAS application efficiency to perform business analytics.
  • 20x improvement in actionable supply chain analytics, 4x reduction in batch times, virtualization for plug & play.
  • 6x time savings: "GPFS allows us to move the metadata from the disk to the FlashSystem online. Once we did that, the backups were reduced down to about an hour."
  • 2 hrs becomes 2 minutes: failover time cut dramatically.

Mapping Characteristics to IBM Storage Products
  • Storage infrastructure requirements: apply to all five use cases.
  • Optimized Multi-Temperature Warehouse.
  • All Flash: FlashSystem.
  • Hybrid: DS8000 EasyTier, XIV + SSD Caching, Storwize EasyTier.
  • FlashSystem Solution (VSC + FlashSystem).
  • PureSystems: PureFlex (XIV or Storwize w/EasyTier), PureData for Transactions (Storwize), PureData for Analytics (Netezza).

IBM Smarter Storage is designed to support big data analytics with an efficient, optimized data infrastructure
  • Categories: Midrange & Entry, Tier 0 Acceleration, Smarter Storage, Integrated Systems, Enterprise Offerings, Open & Extensible.
  • Products: XIV, DS8000, Storwize family, FlashSystem family.
  • zEnterprise Solutions for Analytics with DS8000.
  • PureData System for Operational Analytics with Storwize.
  • PureFlex System with Storwize.
  • Smart Analytics Systems with DS3xxx.

IBM FlashSystem: designed for big data analytics applications, delivering extreme speed for applications and data
  • The extreme performance of IBM FlashSystem makes real-time business decisions possible.
  • It suits Hadoop systems with a modular data storage structure: some or all of the data can be kept on flash, and the rest on XIV.

IBM XIV: Optimized data workload diversity for Big Data & Analytics
  • IBM XIV delivers high performance with no manual configuration and suits a wide variety of storage workloads.
  • IBM XIV is exceptionally efficient and offers industry-leading simplicity, with a built-in, user-friendly interface.
  • IBM XIV resilience is enterprise-grade, ensuring data availability and business continuity.

XIV: built for analytics
  • Unmatched performance: scalable grid storage architecture; supports any read/write workload at any time; on-board flash.
  • Unmatched reliability: fine-grained data distribution; unrivaled disk rebuild times; enterprise-grade availability.
  • Unmatched simplicity: simple planning, provisioning, and flexibility; zero maintenance and zero tuning after go-live.
  • "What attracted us most to XIV was its outstanding performance. Because XIV delivers consistently high performance for our sophisticated analytics applications, we can bring more value to our users."

SAS and the XIV grid architecture: a perfect match
  • Massively parallel computing: sustained optimal performance.
  • Balanced performance: no tuning, year after year.
  • Unprecedented scalability: simply add SAS nodes and XIV modules together.

IBM SVC: Optimized data workload flexibility for Big Data & Analytics
IBM SVC adds flexibility to the IBM big data portfolio through the following capabilities:

  • Full data virtualization and data mobility
  • Advanced clustering and replication
  • Multi-way mirroring, with a read-preferred option
  • Real-time Compression
  • Easy Tier
  • Hot extent caching
  • Products: Storwize V7000/U, IBM SVC

Design principles: Real-time Compression (Storwize V7000/U, IBM SVC)
  • Works on active primary data.
  • Dedicated compression platform: the platform handles all the heavy lifting associated with compression.
  • No performance impact: a compressed file is modified in place efficiently.
  • No changes to user applications: neither users nor admins need to change anything.
  • Processes stay the same: compression happens inline, not after the fact.
  • Industry-standard compression algorithm: the algorithm used has been in service for decades.

Stream computing & IBM FlashSystem
  • Data: own it or store it? Or analyze it and act!
  • Diagram labels: Data in, Data at

InfoSphere Streams: big data stream analytics
  • Built to analyze data in motion.
  • Multiple concurrent input streams.
  • Massive scalability.
  • Analyzes and processes a variety of data: structured, unstructured, video, audio.
  • Advanced analytic operators.
  • Adaptive real-time analytics: with data warehouses, with Hadoop systems.

Stream computing represents a shift in the computing paradigm
  • Traditional computing, historical fact finding: find and analyze information stored on disk; batch paradigm, pull model; query driven (submits queries to static data).
  • Stream computing, current fact finding: analyze data in motion before it lands on disk; low-latency paradigm, push model; data driven (the data drives the analysis).

Real-time analytics: imagine drinking from a fire hydrant
  • Large volumes of data arrive from many varied input sources.
  • Process and filter the data directly, without storing it first.
  • Keep only the data that has value.
  • Involve only the users most interested in the data.
  • Act as the information is produced.

Adaptive analytics: integrating data in motion and data at rest
  • 1. Data Ingest: data ingest, preparation for online analysis, model validation.
  • 2. Bootstrap/Enrich.
  • 3. Adaptive Analytics Model: data integration, data mining, machine learning, statistical modeling.
  • Visualization of real-time and historical insights.
  • Diagram labels: Data, Control flow, InfoSphere BigInsights, Database & Warehouse, InfoSphere Streams.

Adaptive real-time analytics
  • Large volumes of data from many varied input sources.
  • A comprehensive view of past, present, and future.
  • Real-time analysis with low-latency results.
  • Full context for deep analysis.
  • Common analytics across data in motion and data at rest.
  • Adaptive: responds as conditions change.
  • Adapts when unexpected behavior is detected.
  • Analyzes in depth when new meaning is recognized in the data.
  • Meaning that goes unnoticed at first may be recognized only later.
  • Adapts to find data patterns that were not recognized at the outset.

How is InfoSphere Streams used?
  • Stock market: impact of weather on securities prices; analyze market data at ultra-low latencies; momentum calculator.
  • Fraud prevention: detecting multi-party fraud; real-time fraud prevention.
  • e-Science: space weather prediction; detection of transient events; synchrotron atomic research; genomic research.
  • Transportation: intelligent traffic management; automotive telematics.
  • Energy & Utilities: transactive control; Phasor Monitoring Unit; down hole sensor monitoring.
  • Natural systems: wildfire management; water management.
  • Telephony: CDR processing; social analysis; churn prediction; geomapping.
  • Other: manufacturing; text analysis; ERP for commodities.

A new infrastructure solution for big data analytics
  • Speed up the flow of data into the analytics systems, accelerating toward the transaction.
  • An efficient, flexible infrastructure clearly speeds up that flow and balances the needs of different analytics workloads.
  • IBM Big Data & Analytics Infrastructure: Data Zone, Application Zone.

IBM FlashSystem Core Launch Messaging
  • Experience real-time analytical insights with up to 50x better performance than enterprise disk systems using IBM FlashCore technology.
  • Preserve and protect infrastructure continuity while scaling to over 2 petabytes of effective all-flash capacity under a single integrated interface.
  • Deliver agility and data economics with 4x greater capacity in less rack space than competitive all-flash products.
  • Synchronized and complementary to the overarching storage messaging: accelerate time to insights through data without borders. IBM innovation frees data with agile and simple-to-use storage solutions delivering superior data economics.
  • Drive a complete paradigm shift in enterprise storage with the all-new IBM FlashSystem family.

IBM FlashSystem Family 2015 Theme
  • Time to insight. Time to value. Time to market.
  • IBM FlashSystem, it's about time.
  • Flash Realized!

IBM FlashSystem V9000 Foundational Pillars
  • IBM FlashCore Technology is the DNA of the FlashSystem family.

Introducing the New IBM FlashSystem Family Offerings (powered by IBM FlashCore Technology)
  • IBM FlashSystem 900
    Extreme Performance: delivers 100 microsecond response times.
    Macro Efficiency: lowest latency offering with 40% greater capacity at a lower cost per capacity.
    Enterprise Reliability: IBM enhanced Micron MLC flash technology with Flash Wear Guarantee.
  • IBM FlashSystem V9000
    Scalable Performance: grow capacity and performance with up to 2.2PB scaling capability.
    Enduring Economics: next generation flash media with lower cost per capacity.
    Agile Integration: fully integrated system management to simplify m
