




School code: 10128
Student ID: 200920205048

Undergraduate Graduation Project: Translation of Foreign-Language Literature

English title: Software Database An Object-Oriented Perspective
Chinese title: 軟件數據庫的面向對象的視角 (An Object-Oriented Perspective on Software Databases)
Student name: Song Lanlan (宋蘭蘭)
College: College of Information Engineering
Department: Department of Software Engineering
Major: Software Engineering
Class: Software 09-1
Advisor: Guan Yuxin (關玉欣), Lecturer
June 2013

A HISTORICAL PERSPECTIVE

From the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It formed the basis for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems through the 1960s. Bachman was the first recipient of ACM's Turing Award (the computer science equivalent of a Nobel prize) for work in the database area; he received the award in 1973. In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even today in many major installations. IMS formed the basis for an alternative data representation framework called the hierarchical data model. The SABRE system for making airline reservations was jointly developed by American Airlines and IBM around the same time, and it allowed several people to access the same data through a computer network. Interestingly, today the same SABRE system is used to power popular Web-based travel services such as Travelocity!

In 1970, Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data representation framework called the relational data model. This proved to be a watershed in the development of database systems: it sparked rapid development of several DBMSs based on the relational model, along with a rich body of theoretical results that placed the field on a firm foundation. Codd won the 1981 Turing Award for his seminal work. Database systems matured as an academic discipline, and the popularity of relational DBMSs changed the commercial landscape. Their benefits were widely recognized, and the use of DBMSs for managing corporate data became standard practice.

In the 1980s, the relational model consolidated its position as the dominant DBMS
paradigm, and database systems continued to gain widespread use. The SQL query language for relational databases, developed as part of IBM's System R project, is now the standard query language. SQL was standardized in the late 1980s, and the current standard, SQL-92, was adopted by the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO). Arguably, the most widely used form of concurrent programming is the concurrent execution of database programs (called transactions). Users write programs as if they are to be run by themselves, and the responsibility for running them concurrently is given to the DBMS. James Gray won the 1998 Turing Award for his contributions to the field of transaction management in a DBMS.

In the late 1980s and the 1990s, advances were made in many areas of database systems. Considerable research has been carried out into more powerful query languages and richer data models, and there has been a big emphasis on supporting complex analysis of data from all parts of an enterprise. Several vendors (e.g., IBM's DB2, Oracle 8, Informix UDS) have extended their systems with the ability to store new data types such as images and text, and with the ability to ask more complex queries. Specialized systems have been developed by numerous vendors for creating data warehouses, consolidating data from several databases, and for carrying out specialized analysis.

An interesting phenomenon is the emergence of several enterprise resource planning (ERP) and management resource planning (MRP) packages, which add a substantial layer of application-oriented features on top of a DBMS. Widely used packages include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel. These packages identify a set of common tasks (e.g., inventory management, human resources planning, financial analysis) encountered by a large number of organizations and provide a general application layer to carry out these tasks. The data is stored in a relational DBMS, and the application layer can be customized to different companies, leading to lower overall costs for the companies, compared to the cost of building the application layer from scratch.

Most significantly, perhaps, DBMSs have entered the Internet Age. While the first generation of Web sites stored their data exclusively in operating system files, the use of a DBMS to store data that is accessed through a Web browser is becoming widespread. Queries are generated through Web-accessible forms and answers are formatted using a markup language such as HTML, in order to be easily displayed in a browser. All the database vendors are adding features to their DBMS aimed at making it more suitable for deployment over the Internet.

Database management continues to gain importance as more and more data is brought on-line, and made ever more accessible through computer networking. Today the field is being driven by exciting visions such as multimedia databases, interactive video, digital libraries, a host of scientific projects such as the human genome mapping effort and NASA's Earth Observation System project, and the desire of companies to consolidate their decision-making processes and mine their data repositories for useful information about their businesses. Commercially, database management systems represent one of the largest and most vigorous market segments. Thus the study of database systems could prove to be richly rewarding in more ways than one!

INTRODUCTION TO PHYSICAL DATABASE DESIGN

Like all other aspects of database design,
physical design must be guided by the nature of the data and its intended use. In particular, it is important to understand the typical workload that the database must support; the workload consists of a mix of queries and updates. Users also have certain requirements about how fast certain queries or updates must run or how many transactions must be processed per second. The workload description and users' performance requirements are the basis on which a number of decisions have to be made during physical database design.

To create a good physical database design and to tune the system for performance in response to evolving user requirements, the designer needs to understand the workings of a DBMS, especially the indexing and query processing techniques supported by the DBMS. If the database is expected to be accessed concurrently by many users, or is a distributed database, the task becomes more complicated, and other features of a DBMS come into play.

DATABASE WORKLOADS

The key to good physical design is arriving at an accurate description of the expected workload. A workload description includes the following elements:

1. A list of queries and their frequencies, as a fraction of all queries and updates.
2. A list of updates and their frequencies.
3. Performance goals for each type of query and update.

For each query in the workload, we must identify:

- Which relations are accessed.
- Which attributes are retained (in the SELECT clause).
- Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.

Similarly, for each update in the workload, we must identify:

- Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.
- The type of update (INSERT, DELETE, or UPDATE) and the updated relation.
- For UPDATE commands, the fields that are modified by the update.

Remember that queries and updates typically have parameters; for example, a debit or credit operation involves a particular account number. The values of these parameters determine the selectivity of selection and join conditions.

Updates have a query component that is used to find the target tuples. This component can benefit from a good physical design and the presence of indexes. On the other hand, updates typically require additional work to maintain indexes on the attributes that they modify. Thus, while queries can only benefit from the presence of an index, an index may either speed up or slow down a given update. Designers should keep this trade-off in mind when creating indexes.

NEED FOR DATABASE TUNING

Accurate, detailed workload information may
be hard to come by while doing the initial design of the system. Consequently, tuning a database after it has been designed and deployed is important: we must refine the initial design in the light of actual usage patterns to obtain the best possible performance.

The distinction between database design and database tuning is somewhat arbitrary. We could consider the design process to be over once an initial conceptual schema is designed and a set of indexing and clustering decisions is made. Any subsequent changes to the conceptual schema or the indexes, say, would then be regarded as a tuning activity. Alternatively, we could consider some refinement of the conceptual schema (and physical design decisions affected by this refinement) to be part of the physical design process.

Where we draw the line between design and tuning is not very important.

OVERVIEW OF DATABASE TUNING

After the initial
phase of database design, actual use of the database provides a valuable source of detailed information that can be used to refine the initial design. Many of the original assumptions about the expected workload can be replaced by observed usage patterns; in general, some of the initial workload specification will be validated, and some of it will turn out to be wrong. Initial guesses about the size of data can be replaced with actual statistics from the system catalogs (although this information will keep changing as the system evolves). Careful monitoring of queries can reveal unexpected problems; for example, the optimizer may not be using some indexes as intended to produce good plans.

Continued database tuning is important to get the best possible performance.

TUNING THE CONCEPTUAL SCHEMA

In the course of database design, we may realize that our current choice of relation schemas does
not enable us to meet our performance objectives for the given workload with any (feasible) set of physical design choices. If so, we may have to redesign our conceptual schema (and re-examine physical design decisions that are affected by the changes that we make).

We may realize that a redesign is necessary during the initial design process or later, after the system has been in use for a while. Once a database has been designed and populated with data, changing the conceptual schema requires a significant effort in terms of mapping the contents of relations that are affected. Nonetheless, it may sometimes be necessary to revise the conceptual schema in light of experience with the system. We now consider the issues involved in conceptual schema (re)design from the point of view of performance.

Several options must be considered while tuning the conceptual schema:

- We may decide to settle for a 3NF design instead of a BCNF design.
- If there are two ways to decompose a given schema into 3NF or BCNF, our choice should be guided by the workload.
- Sometimes we might decide to further decompose a relation that is already in BCNF.
- In other situations we might denormalize. That is, we might choose to replace a collection of relations obtained by a decomposition from a larger relation with the original (larger) relation, even though it suffers from some redundancy problems. Alternatively, we might choose to add some fields to certain relations to speed up some important queries, even if this leads to redundant storage of some information (and consequently, a schema that is in neither 3NF nor BCNF).
- This discussion of normalization has concentrated on the technique of decomposition, which amounts to vertical partitioning of a relation. Another technique to consider is horizontal partitioning of a relation, which would lead to our having two relations with identical schemas. Note that we are not talking about physically partitioning the tuples of a single relation; rather, we want to create two distinct relations (possibly with different constraints and indexes on each).

Incidentally, when we redesign the conceptual schema, especially if we are tuning an existing database schema, it is worth considering whether we should create views to mask these changes from users for whom the original schema is more natural.

TUNING QUERIES AND VIEWS

If we notice that a query is running much
slower than we expected, we have to examine the query carefully to find the problem. Some rewriting of the query, perhaps in conjunction with some index tuning, can often fix the problem. Similar tuning may be called for if queries on some view run slower than expected. When tuning a query, the first thing to verify is that the system is using the plan that you expect it to use. It may be that the system is not finding the best plan for a variety of reasons. Some common situations that are not handled efficiently by many optimizers follow:

- A selection condition involving null values.
- Selection conditions involving arithmetic or string expressions or conditions using the OR connective. For example, if we have a condition E.age = 2*D.age in the WHERE clause, the optimizer may correctly utilize an available index on E.age but fail to utilize an available index on D.age. Replacing the condition by E.age/2 = D.age would reverse the situation.
- Inability to recognize a sophisticated plan such as an index-only scan for an aggregation query involving a GROUP BY clause.

If the optimizer is not smart enough to find the best plan (using access methods and evaluation strategies supported by the DBMS), some systems allow users to guide the choice of a plan by providing hints to the optimizer; for example, users might be able to force the use of a particular index or choose the join order and join method. A user who wishes to guide optimization in this manner should have a thorough understanding of both optimization and the capabilities of the given DBMS.

(8) OTHER TOPICS

MOBILE DATABASES

The availability of portable computers and wireless communications has created a new breed of nomadic database users. At one level these users are simply accessing a database through a network, which is similar
to distributed DBMSs. At another level the network as well as data and user characteristics now have several novel properties, which affect basic assumptions in many components of a DBMS, including the query engine, transaction manager, and recovery manager.

- Users are connected through a wireless link whose bandwidth is one-tenth that of Ethernet and one-hundredth that of ATM networks. Communication costs are therefore significantly higher in proportion to I/O and CPU costs.
- Users' locations are constantly changing, and mobile computers have a limited battery life. Therefore, the true communication cost includes connection time and battery usage in addition to bytes transferred, and it changes constantly depending on location. Data is frequently replicated to minimize the cost of accessing it from different locations.
- As a user moves around, data could be accessed from multiple database servers within a single transaction. The likelihood of losing connections is also much greater than in a traditional network. Centralized transaction management may therefore be impractical, especially if some data is resident at the mobile computers. We may in fact have to give up on ACID transactions and develop alternative notions of consistency for user programs.

MAIN MEMORY DATABASES

The price of main memory is now low enough that we can buy enough main memory to hold the entire database for many applications; with 64-bit addressing, modern CPUs also have very large address spaces. Some commercial systems now have several gigabytes of main memory. This shift prompts a reexamination of some basic DBMS design decisions, since disk accesses no longer dominate processing time for a memory-resident database:

- Main memory does not survive system crashes, and so we still have to implement logging and recovery to ensure transaction atomicity and durability. Log records must be written to stable storage at commit time, and this process could become a bottleneck. To minimize this problem, rather than commit each transaction as it completes, we can collect completed transactions and commit them in batches; this is called group commit. Recovery algorithms can also be optimized since pages rarely have to be written out to make room for other pages.
- The implementation of in-memory operations has to be optimized carefully since disk accesses are no longer the limiting factor for performance.
- A new criterion must be considered while optimizing queries, namely the amount of space required to execute a plan. It is important to minimize the space overhead because exceeding available physical memory would lead to swapping pages to disk (through the operating system's virtual memory mechanisms), greatly slowing down execution.
- Page-oriented data structures become less important (since pages are no longer the unit of data retrieval), and clustering is not important (since the cost of accessing any region of main memory is uniform).

(1) A HISTORICAL REVIEW

From the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It laid the foundation for the network data model. The network data model was standardized by the Conference on Data Systems Languages (CODASYL) and had a great influence on database systems throughout the 1960s. For his contributions to the database field, Bachman became the first recipient of the ACM Turing Award (the computer science equivalent of a Nobel prize), receiving the award in 1973. In the late 1960s, IBM developed the Information Management System (IMS) DBMS, which is still used in many installations today. IMS laid the foundation for another data representation framework, the hierarchical data model. Around the same time, American Airlines and IBM jointly developed the SABRE system for airline reservations, which allowed multiple users to access the same data through a computer network.
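The group commit technique described under MAIN MEMORY DATABASES can be illustrated with a minimal Python sketch. It is not taken from any particular DBMS: the class `GroupCommitLog`, its `batch_size` parameter, the log record format, and the helper `run_demo` are all hypothetical names chosen for illustration. The sketch buffers commit records in memory and forces them to stable storage with a single `os.fsync` call per batch, so that the cost of the forced write is shared by several transactions.

```python
import os
import tempfile

# Hypothetical sketch: class name, parameters, and record format are
# illustrative assumptions, not the API of any real DBMS.


class GroupCommitLog:
    """Buffers commit records and forces them to stable storage in batches."""

    def __init__(self, path, batch_size=4):
        self.path = path
        self.batch_size = batch_size  # assumed tuning knob: records per batch
        self.pending = []             # commit records that are not yet durable
        self.flush_count = 0          # number of forced writes performed so far
        self._file = open(path, "ab")

    def commit(self, txn_id, payload):
        """Queue a commit record; return True if this call flushed the batch."""
        self.pending.append(f"COMMIT {txn_id} {payload}\n".encode("utf-8"))
        if len(self.pending) >= self.batch_size:
            self.flush()
            return True
        return False

    def flush(self):
        """Write all pending records with a single forced write."""
        if not self.pending:
            return
        self._file.write(b"".join(self.pending))
        self._file.flush()
        os.fsync(self._file.fileno())  # one fsync covers the whole batch
        self.flush_count += 1
        self.pending.clear()

    def close(self):
        """Flush any partial batch, then close the log file."""
        self.flush()
        self._file.close()


def run_demo(num_txns=10, batch_size=4):
    """Commit num_txns transactions; return (forced writes, records on disk)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        log = GroupCommitLog(path, batch_size=batch_size)
        for txn_id in range(num_txns):
            log.commit(txn_id, "debit-account")
        log.close()
        with open(path, "rb") as f:
            records = f.read().splitlines()
        return log.flush_count, len(records)


if __name__ == "__main__":
    flushes, records = run_demo()
    print(f"{records} commit records made durable with {flushes} forced writes")
```

With ten transactions and a batch size of four, the log is forced three times (two full batches plus the final partial batch at close) instead of ten. A transaction should only be reported as committed once its batch has been flushed. A production system would typically also flush on a timer so that a transaction in a partially filled batch does not wait indefinitely; that is omitted here for brevity.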