More Pipeline

Basic RISC Pipelining
- Basic idea: each instruction spends 1 clock cycle in each of the 5 execution stages.
- During 1 clock cycle, the pipeline can process (in different stages) 5 different instructions.

Simple RISC Datapath
[Figure: datapath split into the IF, ID, EX, MEM, and WB stages; labels include Program Counter, Next PC, Inst., Reg., Load fr. Mem., and Data]

Description of Pipe Stages

Hazards

The hazards of pipelining
- Pipeline hazards prevent the next instruction from executing during its designated clock cycle.
- There are 3 classes of hazards:
  - Structural hazards: arise from resource conflicts, when the HW cannot support all possible combinations of instructions.
  - Data hazards: occur when a given instruction depends on data from an instruction ahead of it in the pipeline.
  - Control hazards: result from branches and other instructions that change the flow of the program (i.e. change the PC).

How do we deal with hazards?
- Often, the pipeline must be stalled.
- Stalling the pipeline usually lets some instruction(s) in the pipeline proceed while another/others wait for data, a resource, etc.

Stalls and performance
- Stalls impede the progress of a pipeline and result in deviation from 1 instruction executing per clock cycle.
- Pipelining can be viewed as a way to decrease the CPI or the clock cycle time per instruction.
- Let's see what effect stalls have on CPI:

    CPI pipelined = Ideal CPI + Pipeline stall cycles per instruction
                  = 1 + Pipeline stall cycles per instruction

- Ignoring overhead and assuming stages are balanced: [equation not recovered]

Even more pipeline performance issues!
- This results in: [equation not recovered]
- Which leads to: [equation not recovered]
- If there are no stalls, in the ideal case speedup == number of pipeline stages.

1. Structural hazards
- Most common instances of structural hazards:
  - When a functional unit is not fully pipelined
  - When some resource is not duplicated enough
- One way to avoid structural hazards is to duplicate resources.
- Pipeline stalls are the result of hazards; CPI is increased from the usual "1".

An example of a structural hazard
[Figure: pipeline diagram over time for Load and Instructions 1 to 4, each passing through Mem, Reg, ALU, DM, Reg]
- What's the problem here? The processor has a combined instruction + data memory with only 1 read port.

How is it resolved?
[Figure: the same pipeline diagram with a Stall row of Bubbles inserted between Instruction 2 and Instruction 3]
- The pipeline is generally stalled by inserting a "bubble" or NOP.

Or alternatively...

    Clock number  1     2     3     4     5     6     7     8     9     10
    LOAD          IF    ID    EX    MEM   WB
    Inst. i+1           IF    ID    EX    MEM   WB
    Inst. i+2                 IF    ID    EX    MEM   WB
    Inst. i+3                       stall IF    ID    EX    MEM   WB
    Inst. i+4                                   IF    ID    EX    MEM   WB
    Inst. i+5                                         IF    ID    EX    MEM
    Inst. i+6                                               IF    ID    EX

- The LOAD instruction "steals" an instruction fetch cycle, which causes the pipeline to stall.
- Thus, no instruction completes on clock cycle 8.

Remember the common case!
- In some cases it may be better to allow structural hazards than to eliminate them.
- These are situations a computer architect might have to consider:
  - Is pipelining functional units or duplicating them costly in terms of HW?
  - Does the structural hazard occur often?
  - What's the common case?

2. Data hazards
- Why do they exist?
  - Pipelining changes the order of read/write accesses to operands.
  - The order differs from the order seen by sequentially executing instructions on an unpipelined machine.
- Consider this example:

    ADD R1, R2, R3
    SUB R4, R1, R5
    AND R6, R1, R7
    OR  R8, R1, R9
    XOR R10, R1, R11

- All instructions after ADD use the result of ADD.
- ADD writes the register in WB but SUB needs it in ID. This is a data hazard.

Illustrating a data hazard
[Figure: pipeline diagram over time for ADD R1, R2, R3; SUB R4, R1, R5; AND R6, R1, R7; OR R8, R1, R9; XOR R10, R1, R11]
- The ADD instruction causes a hazard in the next 3 instructions because the register is not written until after those 3 read it.

Data hazard specifics
- There are actually 3 different kinds of data hazards!
  - Read After Write (RAW)
  - Write After Write (WAW)
  - Write After Read (WAR)
- Assume that hazards involve instructions i and j. i is always issued before j. Thus, i will always be further along in the pipeline than j.
- With an in-order issue/in-order completion machine, we're not as concerned with WAW and WAR.

Three Types of Data Hazards
- Let i be an earlier instruction, j a later one.
- RAW (read after write): j tries to read a value before i writes it.
- WAW (write after write): i and j write to the same place, but in the wrong order.
  - Condition: only occurs if more than 1 pipeline stage can write (in-order).
- WAR (write after read): j writes a new value to a location before i has read the old one.
  - Condition: only occurs if writes can happen before reads in the pipeline (in-order).

Read after write (RAW) hazards
- With a RAW hazard, instruction j tries to read a source operand before instruction i writes it.
- Thus, j would incorrectly receive an old or incorrect value.
- Graphically/Example:
  [Figure: instruction stream "... j i ..."]
  - Instruction i is a write instruction issued before j.
  - Instruction j is a read instruction issued after i.

    i: ADD R1, R2, R3
    j: SUB R4, R1, R6

- Stalling or forwarding can be used to resolve this hazard.

Forwarding
- The problem can actually be solved relatively easily, with forwarding.
- In this example, the result of the ADD instruction is not really needed until after ADD actually produces it.
- Can we move the result from the EX/MEM register to the beginning of the ALU (where SUB needs it)?
- Generally speaking: forwarding occurs when a result is passed directly to the functional unit that requires it. The result goes from the output of one unit to the input of another.

When can we forward?
[Figure: pipeline diagram over time for ADD R1, R2, R3; SUB R4, R1, R5; AND R6, R1, R7; OR R8, R1, R9; XOR R10, R1, R11, with forwarding paths drawn]
- SUB gets info. from the EX/MEM pipe register.
- AND gets info. from the MEM/WB pipe register.
- OR gets info. by forwarding from the register file.
- Rule of thumb: if the line goes "forward" you can do forwarding. If it's drawn backward, it's physically impossible.

Data Hazard Detection

Hazard Detection Logic
- Example: detecting whether an instruction that has just been fetched needs to be stalled because of a preceding load.

Forwarding Situations in DLX

HW Change for Forwarding
[Figure: datapath with Mux, Mux, ALU, Zero?, Data memory, and the ID/EX, EX/MEM, MEM/WB pipeline registers]

Forwarding: It doesn't always work
[Figure: pipeline diagram over time for LW R1, 0(R2); SUB R4, R1, R5; AND R6, R1, R7; OR R8, R1, R9]
- A load has a latency that forwarding can't solve.
- The pipeline must stall until the hazard is cleared (starting with the instruction that wants to use the data, until the source produces it).

The solution
[Figure: the same pipeline diagram for LW R1, 0(R2); SUB R4, R1, R5; AND R6, R1, R7; OR R8, R1, R9 with Bubbles inserted]
- Insertion of the bubble causes the # of cycles to complete this sequence to grow by 1.

Data hazards and the compiler
- The compiler should be able to help eliminate some stalls caused by data hazards.
- For example, the compiler could avoid generating a LOAD instruction that is immediately followed by an instruction that uses the LOAD's destination register.
- The technique is called "pipeline/instruction scheduling".

A simple Example
- A clever compiler can often reschedule instructions to avoid a stall.
- A simple example:

Original code:

    lw   r2, 0(r4)
    add  r1, r2, r3    # Note: stall happens here!
    lw   r5, 4(r4)

Transformed code:

    lw   r2, 0(r4)
    lw   r5, 4(r4)
    add  r1, r2, r3    # No stall needed!
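The difference between the two orderings can be checked mechanically. The sketch below is a minimal Python model, not part of the original slides. It assumes a one-cycle load delay, so a load followed immediately by a consumer of its destination register costs one stall. The helper names `parse` and `count_load_use_stalls` are hypothetical, and the parser only handles `lw` and three-register ALU instructions.

```python
import re

def parse(line):
    """Split e.g. 'lw r2, 0(r4)' into (opcode, destination, sources)."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"r\d+", rest)
    # Assumption: the first register named is the destination (true for lw/add).
    return op, regs[0], regs[1:]

def count_load_use_stalls(program):
    """Count loads whose result is used by the very next instruction."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = parse(prev)
        _, _, curr_srcs = parse(curr)
        if prev_op == "lw" and prev_dest in curr_srcs:
            stalls += 1
    return stalls

original = ["lw r2, 0(r4)", "add r1, r2, r3", "lw r5, 4(r4)"]
transformed = ["lw r2, 0(r4)", "lw r5, 4(r4)", "add r1, r2, r3"]

# prints: 1 0
print(count_load_use_stalls(original), count_load_use_stalls(transformed))
```

A real scheduler would also have to prove that moving the second `lw` does not change the program's result; here that holds because it neither reads nor writes r1, r2, or r3.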

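The forwarding cases described earlier (SUB from EX/MEM, AND from MEM/WB, OR from the register file, and the load that still needs one bubble) depend only on how far the consumer is behind the producer. The following Python sketch summarizes them under stated assumptions: a 5-stage pipeline with full forwarding, and a register file that is written in the first half of a cycle and read in the second half. The function name `operand_source` is hypothetical.

```python
def operand_source(distance, producer_is_load=False):
    """Return (stall_cycles, source) for a consumer that is `distance`
    instructions behind the instruction producing its operand."""
    if distance < 1:
        raise ValueError("the consumer must come after the producer")
    if distance == 1:
        if producer_is_load:
            # Load data only exists after MEM, so one bubble is needed.
            return 1, "MEM/WB"
        return 0, "EX/MEM"
    if distance == 2:
        return 0, "MEM/WB"
    # Assumption: write in first half of the cycle, read in second half.
    return 0, "register file"

consumers = ["SUB R4, R1, R5", "AND R6, R1, R7", "OR R8, R1, R9", "XOR R10, R1, R11"]
for distance, text in enumerate(consumers, start=1):
    stalls, source = operand_source(distance)
    print(f"{text}: {stalls} stall(s), R1 from {source}")
```

Calling `operand_source(1, producer_is_load=True)` gives the LW followed by SUB case: one stall, then the value is taken from MEM/WB.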
Simple RISC Pipeline Stall Statistics
[Figure: % of loads that cause a stall]

Write after write (WAW) hazards
- With a WAW hazard, instruction j tries to write an operand before instruction i writes it.
- The writes are performed in the wrong order, leaving the value written by the earlier instruction.
- Graphically/Example:
  [Figure: instruction stream "... j i ..."]
  - Instruction i is a write instruction issued before j.
  - Instruction j is a write instruction issued after i.

    i: DIV F1, F2, F3
    j: SUB F1, F4, F6

Write after read (WAR) hazards
- With a WAR hazard, instruction j tries to write an operand before instruction i reads it.
- Instruction i would incorrectly receive a newer value of its operand; instead of getting the old value, it could receive some newer, undesired value.
- Graphically/Example:
  [Figure: instruction stream "... j i ..."]
  - Instruction i is a read instruction issued before j.
  - Instruction j is a write instruction issued after i.

    i: DIV F7, F1, F3
    j: SUB F1, F4, F6

3. Control (Branch) Hazards
- Suppose the new PC value is not computed until the MEM stage.
- Then we must stall 3 clocks after every branch!

Branch Hazards
- We need to consider hazards involving branches.
- Example:

    40: beq $1, $3, 28
    44: and $12, $2, $5
    48: or  $13, $6, $2
    52: add $14, $2, $2
    72: lw  $4, 50($7)

Pipeline impact on branch
- How do we deal with this?
  - Always stall
  - Assume branch-not-taken
  - Branch delay slots

Assume branch not taken
- On average, branches are taken 1/2 the time.
- If the branch is not taken: continue normal processing.
- Else, if the branch is taken: need to flush the improper instructions from the pipeline.
- Cuts overall time for branch processing in 1/2.

Assume branch not taken, Case 1: not taken
- Execution proceeds normally; no penalty.

Assume branch not taken, Case 2: taken branch
- Bubbles are injected into 3 stages during cycle 5.

Sum: Branch Penalty Impact
- Assume 16% of all instructions are branches:
  - 4% unconditional branches: 3 cycle penalty
  - 12% conditional: 50% taken, 3 cycle penalty
- For a sequence of N instructions (assume N is large):
  - N cycles to initiate each
  - 3 * 0.04 * N delays due to unconditional branches
  - 0.5 * 3 * 0.12 * N delays due to conditional taken
  - Also, an extra 4 cycles for the pipeline to empty
- Total: 1.3 * N + 4 total cycles (or 1.3 cycles/instruction) (CPI)
- 30% performance hit!!! (Bad thing)

Branch delay slot
- Delay slot: find one instruction that will be executed no matter which way the branch goes.
- Branches always execute the next 1 or 2 instructions.
- An instruction so executed is said to be in the delay slot.

    branch instruction
    Delay slot instruction 1
    Delay slot instruction 2
    ...
    Delay slot instruction n
    branch target if taken

  Branch delay slot of length n

Scheduling Delayed Branch
[Figure: three ways to fill the delay slot, in panels "From before", "From target", and "From fall through", using ADD R1, R2, R3; SUB R4, R5, R6; OR R7, R8, R9 around the branches "if R2 = 0 then" and "if R1 = 0 then"]

Scheduling Delayed Branch
- Where to get instructions to fill the branch delay slot?
  - Before the branch instruction: always valuable
  - From the target address: only valuable when the branch is taken
  - From fall through: only valuable when the branch is not taken

Fast Branch Resolution
- The performance penalty could be more than 30%.
  - Deeper pipelines; some code is very branch heavy.
- Fast branch resolution:
  - Adder in ID for PC + immediate targets
  - Only works for simple conditions (compare to 0)
  - Comparing two register values could be too slow

New Pipeline Logic

Example
- Assume the following MIPS instruction mix:

    Type          Frequency
    Arith/Logic   40%
    Load          30%, of which 25% are followed immediately by an instruction using the loaded value
    Store         10%
    Branch        20%, of which 45% are taken

- What is the resulting CPI for the pipelined MIPS with forwarding and branch address calculation in the ID stage when using a branch not-taken scheme?

    CPI = Ideal CPI + Pipeline stall clock cycles per instruction
        = 1 + stalls by loads + stalls by branches
        = 1 + .3 x .25 x 1 + .2 x .45 x 1
        = 1 + .075 + .09
        = 1.165

Exceptions

Types of Exceptions (Interrupts, Faults)
- I/O device request, timer event
- Invoking OS services from a user program
- Tracing (single-stepping) through a program
- Breakpoints
- Integer arithmetic overflow, divide by zero
- FP arithmetic anomaly (overflow, underflow, etc.)
- Page fault (page not in physical memory)
- Misaligned memory access
- Memory-protection violation (accessing memory not allocated to the process)
- Illegal (undefined or unimplemented) instruction
- Hardware malfunction
- Power-related interrupt (e.g. battery low, power failure)
- ...

Exception Characterization 1
- Synchronous vs. asynchronous
  - Is the event synchronized with program execution?
  - Synchronous: the event occurs at the same place every time.
  - Asynchronous: caused by devices external to the CPU & memory, also hw malfunctions.
- User requested vs. coerced
  - Is the event caused intentionally by the user program?
  - Requested: the user task asks for it.
  - Coerced: hw event not under the control of the user program.

Exception Characterization 2
- User maskable (can be disabled) or not
  - Can the event be disabled?
  - Maskable: an event that can be disabled by the user task.
- Within instructions or between instructions
  - Does the event prevent the instruction from completing?
  - Within: during execution of the task; hard to handle; usually synchronous since the instruction is the trigger.
- Resume vs. terminate
  - Does the program continue from where it left off after the exception is handled, or does it stop?
  - Terminating: execution always stops after the interrupt.

Restartable Exceptions
- Requirements:
  - The exception may occur within an instruction.
  - The program must continue after the exception is handled.
- Examples: virtual memory page fault.
- Difficult because: pipeline state must be saved.
- One approach, for easy cases:
  1. Force a trap inst. into the pipeline on the next IF.
  2. Clear the pipeline behind the faulting instruction.
  3. The exception handler saves the PC of the faulting instr.

Precise vs. Imprecise Handling
- Machines may support either or both modes of exception handling.
- Precise exception handling:
  - Correctly implement all possible combinations of exceptions in all circumstances.
  - May be a requirement for some systems/applications.
  - May be 10x slower!
  - Easier for integer than floating-point.
  - Useful for debugging code.
- Imprecise exception handling:
  - Only correctly implement the most common cases.
  - Software may avoid some exceptions.
  - Only statistical guarantees of correctness, through testing.

Exceptions in DLX pipeline
- Instruction Fetch & Memory stages
  - Page fault on instruction/data fetch
  - Misaligned memory access
  - Memory-protection violation
- Instruction Decode stage
  - Undefined/illegal opcode
- Execution stage
  - Arithmetic exception
- Write-Back stage
  - None!

Out-of-Order Exceptions
- Consider the following code sequence:

    LW    IF    ID    EX    MEM   WB
    ADD         IF    ID    EX    MEM   WB

- The ADD may cause an exception during IF, before LW causes an exception during MEM!
- Can't restart the PC on the ADD!
- Solution:
  - Note the exception in a status vector, carried along.
  - Disable writes for that instruction.
  - Resolve all exceptions at a late stage (e.g. WB).

Pipelining Complications
- Complex addressing modes and instructions
  - Autoincrement address modes cause a register change during instruction execution.
  - Interrupts? Need to restore the register state.
  - Adds WAR and WAW hazards since writes are no longer in the last stage.
- Floating point: long execution time; out-of-order completion

Stopping and Starting Execution
- The most difficult exception occurrences have 2 properties:
  - They occur within instructions.
  - They must be restartable.
- The pipeline must be shut down safely and the state must be saved for correct restarting.
- Restarting is usually done by saving the PC of the instruction at which to start.
- Branches and delayed branches require special treatment.
- Precise exceptions allow instructions just before the exception to be completed, while restarting instructions after the exception.

Multi-cycle Operations

Multi-cycle Operations for FP

Pipelined Multiple-Issue FPU

Out-of-order complete
- Notice that instructions may complete out of order:

    MULTD   IF ID M1 M2 M3 M4 M5 M6 M7 ME WB
    ADDD    IF ID A1 A2 A3 A4 ME WB
    LD      IF ID EX ME WB
    SD      IF ID EX ME WB

Typical FP Code Seq. with RAW Stalls

    Clock cycle     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16    17
    L.D F4,0(R2)    IF    ID    EX    ME    WB
    MUL.D F0,F4,F6        IF    ID    stall M1    M2    M3    M4    M5    M6    M7    ME    WB
    ADD.D F2,F0,F8              IF    stall ID    stall stall stall stall stall stall A1    A2    A3    A4    ME    WB
    S.D F2,0(R2)                            IF    stall stall stall stall stall stall ID    EX    stall stall stall ME

Structural hazards

Sum: multiple-cycles problems
- Raises the possibility of WAW hazards, and of structural hazards in the MEM & WB stages.
- Structural hazards may occur especially often with a non-pipelined DIV unit.
- Out-of-order completion impacts exception handling.

Appendix: The MIPS R4000 Pipeline

The MIPS R4300 Pipeline
- Manufactured by NEC
- 64-bit processor implements MIPS64 ISA
- Used in embedded applications
  - Nintendo-64 game processor, network router, ...
- Multiple EX stages for floating-point pipeline
- Out-of-order completion, precise exceptions
- NEC VR4122: Integer data
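The two CPI calculations in the branch-penalty summary and the instruction-mix example follow the same pattern: ideal CPI plus, for each stall source, the fraction of instructions affected times the penalty in cycles. The Python sketch below reproduces both numbers. The function names are hypothetical. The `speedup` helper uses the standard textbook form (pipeline depth divided by 1 plus stall cycles per instruction), which is an assumption here because the speedup equations did not survive in this copy; it does agree with the statement that speedup equals the number of stages when there are no stalls.

```python
def pipelined_cpi(stall_sources, ideal_cpi=1.0):
    """stall_sources: iterable of (fraction_of_instructions, penalty_cycles)."""
    return ideal_cpi + sum(fraction * penalty for fraction, penalty in stall_sources)

def speedup(pipeline_depth, stall_cycles_per_instruction):
    """Assumed textbook form: balanced stages, no pipelining overhead."""
    return pipeline_depth / (1.0 + stall_cycles_per_instruction)

# Branch penalty summary: 4% unconditional, 12% conditional with 50% taken,
# 3-cycle penalty for every taken branch.
branch_cpi = pipelined_cpi([(0.04, 3), (0.12 * 0.5, 3)])

# Instruction-mix example: 30% loads (25% followed by a use, 1 stall each),
# 20% branches (45% taken, 1 stall each).
mix_cpi = pipelined_cpi([(0.30 * 0.25, 1), (0.20 * 0.45, 1)])

print(f"branch example CPI = {branch_cpi:.3f}")   # 1.300
print(f"mix example CPI    = {mix_cpi:.3f}")      # 1.165
print(f"5-stage speedup at that CPI = {speedup(5, mix_cpi - 1.0):.2f}")
```

The extra 4 cycles needed to drain the pipeline are left out on purpose: for large N they do not change the per-instruction figure.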

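The RAW, WAW, and WAR definitions used throughout these slides reduce to comparing the destination and source registers of an earlier instruction i and a later instruction j. The following Python sketch is one way to express that, using the slides' own example pairs. The function name `classify_dependences` and the `(destination, sources)` tuple encoding are assumptions made for illustration. The function reports dependences only; whether one becomes an actual hazard depends on the pipeline, as the slides note for in-order machines.

```python
def classify_dependences(i, j):
    """i is issued before j; each is (destination, sources).
    Returns the set of dependence types that could become hazards."""
    i_dest, i_srcs = i
    j_dest, j_srcs = j
    found = set()
    if i_dest is not None and i_dest in j_srcs:
        found.add("RAW")   # j reads what i writes
    if i_dest is not None and i_dest == j_dest:
        found.add("WAW")   # both write the same register
    if j_dest is not None and j_dest in i_srcs:
        found.add("WAR")   # j writes what i reads
    return found

# i: ADD R1, R2, R3   j: SUB R4, R1, R6
raw_pair = (("R1", ("R2", "R3")), ("R4", ("R1", "R6")))
# i: DIV F1, F2, F3   j: SUB F1, F4, F6
waw_pair = (("F1", ("F2", "F3")), ("F1", ("F4", "F6")))
# i: DIV F7, F1, F3   j: SUB F1, F4, F6
war_pair = (("F7", ("F1", "F3")), ("F1", ("F4", "F6")))

for name, (i, j) in [("RAW", raw_pair), ("WAW", waw_pair), ("WAR", war_pair)]:
    print(name, "example ->", sorted(classify_dependences(i, j)))
```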