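1. Mathematical model

For a destination pixel, the inverse transform maps its coordinates to floating-point coordinates (i+u, j+v) in the source image, where i and j are non-negative integers and u and v are floating-point numbers in the interval [0, 1). Bicubic interpolation uses the 16 neighbours surrounding (i+u, j+v). The destination pixel value f(i+u, j+v) is given by the following interpolation formula:

$$f(i+u,\ j+v) = A \cdot B \cdot C$$

$$A = \begin{bmatrix} S(u+1) & S(u+0) & S(u-1) & S(u-2) \end{bmatrix}$$

$$B = \begin{bmatrix}
f(i-1,\ j-1) & f(i-1,\ j+0) & f(i-1,\ j+1) & f(i-1,\ j+2) \\
f(i+0,\ j-1) & f(i+0,\ j+0) & f(i+0,\ j+1) & f(i+0,\ j+2) \\
f(i+1,\ j-1) & f(i+1,\ j+0) & f(i+1,\ j+1) & f(i+1,\ j+2) \\
f(i+2,\ j-1) & f(i+2,\ j+0) & f(i+2,\ j+1) & f(i+2,\ j+2)
\end{bmatrix}$$

$$C = \begin{bmatrix} S(v+1) \\ S(v+0) \\ S(v-1) \\ S(v-2) \end{bmatrix}$$

$$S(x) = \begin{cases}
1 - 2|x|^2 + |x|^3, & 0 \le |x| < 1 \\
4 - 8|x| + 5|x|^2 - |x|^3, & 1 \le |x| \le 2 \\
0, & \text{otherwise}
\end{cases}$$

S(x) is the interpolation kernel. It approximates sin(πx)/(πx).

2. Computation flow

1. Obtain the coordinates of the 16 points P1, P2, ..., P16.
2. Use the kernel formula S(x) to compute the kernel vectors Su and Sv for the x and y directions.
3. Perform the matrix multiplication to obtain the interpolated result:

    iTemp1 = Su0 * P1 + Su1 * P5 + Su2 * P9 + Su3 * P13
    iTemp2 = Su0 * P2 + Su1 * P6 + Su2 * P10 + Su3 * P14
    iTemp3 = Su0 * P3 + Su1 * P7 + Su2 * P11 + Su3 * P15
    iTemp4 = Su0 * P4 + Su1 * P8 + Su2 * P12 + Su3 * P16
    iResult = Sv0 * iTemp1 + Sv1 * iTemp2 + Sv2 * iTemp3 + Sv3 * iTemp4

4. The interpolated image showed "burrs" (isolated spike artifacts), so we post-process the result. Let pSrc be the value of the corresponding pixel in the source image. If abs(iResult - pSrc) exceeds a threshold, we treat the interpolated value as likely to contaminate the image and replace it with the source value pSrc.

3. Algorithm optimization

Bicubic interpolation needs the 16 surrounding points to compute a single output value, which costs 20 multiplications and 15 additions (16 + 4 multiplications and 12 + 3 additions in the formulas of step 3). The computational load is therefore large and optimization is necessary. We chose Intel's SSE2 instruction set, which is available only on Pentium 4 and later processors. Whether the current CPU supports SSE2 can be determined with the CPUID instruction:

    BOOL g_bSSE2 = FALSE;
    _asm
    {
        mov   eax, 1
        cpuid
        test  edx, 0x04000000    ; bit 26 of EDX is the SSE2 feature flag
        jz    NotSupport
        mov   g_bSSE2, 1
    NotSupport:
    }

A CPU that supports SSE2 provides eight 128-bit registers, so one register can hold 4 points (RGB), which favours parallel computation. For the detailed code, see the function Optimize_Bicubic in Transform.cpp.

Problems encountered during optimization:

1. Each image point consists of R, G and B channels. Because one SSE2 register holds 16 bytes, reading 4 pixels wastes 4 bytes, and time must also be spent realigning the data, that is, converting BRGB | RGBR | GBRG | BRGB into 0RGB | 0RGB | 0RGB | 0RGB.
2. When 16 bytes are read into a register, the image address is not guaranteed to be 16-byte aligned, so the slower MOVDQU instruction (6 or more clock cycles) must be used. If the address can be made 16-byte aligned, MOVDQA (1 clock cycle) can be used instead.
3. To eliminate division and floating-point arithmetic, the weights are scaled by 256. Each kernel coefficient then needs 2 bytes, whereas the image data is 1 byte per channel, so once the operands are aligned for multiplication half of each SSE2 register is wasted and the computation takes longer. Reducing the kernel precision so that a coefficient fits in 1 byte lowers the accuracy of the result considerably.
4. We lacked experience with the cycle counts of individual instructions and with whether a given sequence of instructions can be pipelined in parallel.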
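The interpolation formula and the computation steps described above can be sketched in scalar C++ as follows. This is only a minimal illustration under stated assumptions, not the code of Optimize_Bicubic: the names cubic_kernel, bicubic_sample and suppress_spike are hypothetical, the 4x4 neighbourhood is assumed to be passed row by row (P1..P4 in the first row), and a single channel is processed in floating point rather than with the 256x fixed-point weights.

```cpp
#include <cmath>
#include <cstdlib>

// Interpolation kernel S(x): a piecewise cubic with support [-2, 2].
double cubic_kernel(double x) {
    const double a = std::fabs(x);
    if (a < 1.0) return 1.0 - 2.0 * a * a + a * a * a;
    if (a < 2.0) return 4.0 - 8.0 * a + 5.0 * a * a - a * a * a;
    return 0.0;
}

// Hypothetical helper: p[r][c] holds f(i-1+r, j-1+c), i.e. P1..P16 row by row.
// u and v are the fractional offsets in [0, 1).
double bicubic_sample(const double p[4][4], double u, double v) {
    // Kernel vectors Su and Sv (step 2).
    const double su[4] = {cubic_kernel(u + 1.0), cubic_kernel(u),
                          cubic_kernel(u - 1.0), cubic_kernel(u - 2.0)};
    const double sv[4] = {cubic_kernel(v + 1.0), cubic_kernel(v),
                          cubic_kernel(v - 1.0), cubic_kernel(v - 2.0)};
    double result = 0.0;
    for (int c = 0; c < 4; ++c) {
        // iTemp(c+1) = Su0*P(c+1) + Su1*P(c+5) + Su2*P(c+9) + Su3*P(c+13)
        double temp = 0.0;
        for (int r = 0; r < 4; ++r) temp += su[r] * p[r][c];
        // iResult accumulates Sv(c) * iTemp(c+1) (step 3).
        result += sv[c] * temp;
    }
    return result;
}

// Post-processing (step 4): fall back to the source pixel when the
// interpolated value deviates from it by more than the threshold.
int suppress_spike(int result, int src, int threshold) {
    return std::abs(result - src) > threshold ? src : result;
}
```

For u = 0.5 the four weights are -0.125, 0.625, 0.625 and -0.125, which sum to 1, so a constant neighbourhood is reproduced unchanged. The threshold passed to suppress_spike is left to the caller because the text does not state the value that was used.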
Appendix: SSE2 instruction summary

Arithmetic instructions:

ADDPD - Packed Double-Precision Floating-Point Add (SSE2)
Adds 2 doubles element-wise.
    ADDPD xmm0, xmm1/m128

ADDPS - Packed Single-Precision Floating-Point Add (SSE)
Adds 4 floats element-wise.
    ADDPS xmm0, xmm1/m128

ADDSD - Scalar Double-Precision Floating-Point Add (SSE2)
Adds 1 double (the low element).
    ADDSD xmm0, xmm1/m64

ADDSS - Scalar Single-Precision Floating-Point Add (SSE)
Adds 1 float (the low element).
    ADDSS xmm0, xmm1/m32

PADDB/PADDW/PADDD - Packed Add

| Opcode | Instruction | Description |
|---|---|---|
| 0F FC /r | PADDB mm, mm/m64 | Add packed byte integers from mm/m64 and mm. |
| 66 0F FC /r | PADDB xmm1, xmm2/m128 | Add packed byte integers from xmm2/m128 and xmm1. |
| 0F FD /r | PADDW mm, mm/m64 | Add packed word integers from mm/m64 and mm. |
| 66 0F FD /r | PADDW xmm1, xmm2/m128 | Add packed word integers from xmm2/m128 and xmm1. |
| 0F FE /r | PADDD mm, mm/m64 | Add packed doubleword integers from mm/m64 and mm. |
| 66 0F FE /r | PADDD xmm1, xmm2/m128 | Add packed doubleword integers from xmm2/m128 and xmm1. |

PADDQ - Packed Quadword Add

| Opcode | Instruction | Description |
|---|---|---|
| 0F D4 /r | PADDQ mm1, mm2/m64 | Add quadword integer mm2/m64 to mm1. |
| 66 0F D4 /r | PADDQ xmm1, xmm2/m128 | Add packed quadword integers xmm2/m128 to xmm1. |

PADDSB/PADDSW - Packed Add with Saturation

| Opcode | Instruction | Description |
|---|---|---|
| 0F EC /r | PADDSB mm, mm/m64 | Add packed signed byte integers from mm/m64 and mm and saturate the results. |
| 66 0F EC /r | PADDSB xmm1, xmm2/m128 | Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results. |
| 0F ED /r | PADDSW mm, mm/m64 | Add packed signed word integers from mm/m64 and mm and saturate the results. |
| 66 0F ED /r | PADDSW xmm1, xmm2/m128 | Add packed signed word integers from xmm2/m128 and xmm1 and saturate the results. |

PADDUSB/PADDUSW - Packed Add Unsigned with Saturation

| Opcode | Instruction | Description |
|---|---|---|
| 0F DC /r | PADDUSB mm, mm/m64 | Add packed unsigned byte integers from mm/m64 and mm and saturate the results. |
| 66 0F DC /r | PADDUSB xmm1, xmm2/m128 | Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results. |
| 0F DD /r | PADDUSW mm, mm/m64 | Add packed unsigned word integers from mm/m64 and mm and saturate the results. |
| 66 0F DD /r | PADDUSW xmm1, xmm2/m128 | Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results. |

PMADDWD - Packed Multiply and Add

| Opcode | Instruction | Description |
|---|---|---|
| 0F F5 /r | PMADDWD mm, mm/m64 | Multiply the packed words in mm by the packed words in mm/m64. Add the 32-bit pairs of results and store in mm as doubleword. |
| 66 0F F5 /r | PMADDWD xmm1, xmm2/m128 | Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, and add the adjacent doubleword results. |

PSADBW - Packed Sum of Absolute Differences

| Opcode | Instruction | Description |
|---|---|---|
| 0F F6 /r | PSADBW mm1, mm2/m64 | Absolute difference of packed unsigned byte integers from mm2/m64 and mm1; differences are then summed to produce an unsigned word integer result. |
| 66 0F F6 /r | PSADBW xmm1, xmm2/m128 | Absolute difference of packed unsigned byte integers from xmm2/m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two word integer results. |

PSUBB/PSUBW/PSUBD - Packed Subtract

| Opcode | Instruction | Description |
|---|---|---|
| 0F F8 /r | PSUBB mm, mm/m64 | Subtract packed byte integers in mm/m64 from packed byte integers in mm. |
| 66 0F F8 /r | PSUBB xmm1, xmm2/m128 | Subtract packed byte integers in xmm2/m128 from packed byte integers in xmm1. |
| 0F F9 /r | PSUBW mm, mm/m64 | Subtract packed word integers in mm/m64 from packed word integers in mm. |
| 66 0F F9 /r | PSUBW xmm1, xmm2/m128 | Subtract packed word integers in xmm2/m128 from packed word integers in xmm1. |
| 0F FA /r | PSUBD mm, mm/m64 | Subtract packed doubleword integers in mm/m64 from packed doubleword integers in mm. |
| 66 0F FA /r | PSUBD xmm1, xmm2/m128 | Subtract packed doubleword integers in xmm2/m128 from packed doubleword integers in xmm1. |

PSUBQ - Packed Subtract Quadword

| Opcode | Instruction | Description |
|---|---|---|
| 0F FB /r | PSUBQ mm1, mm2/m64 | Subtract quadword integer in mm2/m64 from mm1. |
| 66 0F FB /r | PSUBQ xmm1, xmm2/m128 | Subtract packed quadword integers in xmm2/m128 from xmm1. |

PSUBSB/PSUBSW - Packed Subtract with Saturation

| Opcode | Instruction | Description |
|---|---|---|
| 0F E8 /r | PSUBSB mm, mm/m64 | Subtract signed packed bytes in mm/m64 from signed packed bytes in mm and saturate results. |
| 66 0F E8 /r | PSUBSB xmm1, xmm2/m128 | Subtract packed signed byte integers in xmm2/m128 from packed signed byte integers in xmm1 and saturate results. |
| 0F E9 /r | PSUBSW mm, mm/m64 | Subtract signed packed words in mm/m64 from signed packed words in mm and saturate results. |
| 66 0F E9 /r | PSUBSW xmm1, xmm2/m128 | Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results. |

PSUBUSB/PSUBUSW - Packed Subtract Unsigned with Saturation

| Opcode | Instruction | Description |
|---|---|---|
| 0F D8 /r | PSUBUSB mm, mm/m64 | Subtract unsigned packed bytes in mm/m64 from unsigned packed bytes in mm and saturate result. |
| 66 0F D8 /r | PSUBUSB xmm1, xmm2/m128 | Subtract packed unsigned byte integers in xmm2/m128 from packed unsigned byte integers in xmm1 and saturate result. |
| 0F D9 /r | PSUBUSW mm, mm/m64 | Subtract unsigned packed words in mm/m64 from unsigned packed words in mm and saturate result. |
| 66 0F D9 /r | PSUBUSW xmm1, xmm2/m128 | Subtract packed unsigned word integers in xmm2/m128 from packed unsigned word integers in xmm1 and saturate result. |

SUBPD - Packed Double-Precision Floating-Point Subtract

| Opcode | Instruction | Description |
|---|---|---|
| 66 0F 5C /r | SUBPD xmm1, xmm2/m128 | Subtract packed double-precision floating-point values in xmm2/m128 from xmm1. |

SUBPS - Packed Single-Precision Floating-Point Subtract

| Opcode | Instruction | Description |
|---|---|---|
| 0F 5C /r | SUBPS xmm1, xmm2/m128 | Subtract packed single-precision floating-point values in xmm2/m128 from xmm1. |

SUBSD - Scalar Double-Precision Floating-Point Subtract

| Opcode | Instruction | Description |
|---|---|---|
| F2 0F 5C /r | SUBSD xmm1, xmm2/m64 | Subtract the low double-precision floating-point value in xmm2/m64 from xmm1. |

SUBSS - Scalar Single-FP Subtract

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F 5C /r | SUBSS xmm1, xmm2/m32 | Subtract the low single-precision floating-point value in xmm2/m32 from xmm1. |

PMULHUW - Packed Multiply High Unsigned

| Opcode | Instruction | Description |
|---|---|---|
| 0F E4 /r | PMULHUW mm1, mm2/m64 | Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
| 66 0F E4 /r | PMULHUW xmm1, xmm2/m128 | Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |

PMULHW - Packed Multiply High Signed

| Opcode | Instruction | Description |
|---|---|---|
| 0F E5 /r | PMULHW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
| 66 0F E5 /r | PMULHW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |

PMULLW - Packed Multiply Low Signed

| Opcode | Instruction | Description |
|---|---|---|
| 0F D5 /r | PMULLW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the low 16 bits of the results in mm1. |
| 66 0F D5 /r | PMULLW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1. |

PMULUDQ - Multiply Doubleword Unsigned

| Opcode | Instruction | Description |
|---|---|---|
| 0F F4 /r | PMULUDQ mm1, mm2/m64 | Multiply unsigned doubleword integer in mm1 by unsigned doubleword integer in mm2/m64, and store the quadword result in mm1. |
| 66 0F F4 /r | PMULUDQ xmm1, xmm2/m128 | Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1. |

PMULUDQ instruction with 64-bit operands:
    DEST[63-0] ← DEST[31-0] * SRC[31-0];
PMULUDQ instruction with 128-bit operands:
    DEST[63-0] ← DEST[31-0] * SRC[31-0];
    DEST[127-64] ← DEST[95-64] * SRC[95-64];

MULPD - Packed Double-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
|---|---|---|
| 66 0F 59 /r | MULPD xmm1, xmm2/m128 | Multiply packed double-precision floating-point values in xmm2/m128 by xmm1. |

    DEST[63-0] ← DEST[63-0] * SRC[63-0];
    DEST[127-64] ← DEST[127-64] * SRC[127-64];

MULPS - Packed Single-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
|---|---|---|
| 0F 59 /r | MULPS xmm1, xmm2/m128 | Multiply packed single-precision floating-point values in xmm2/m128 by xmm1. |

    DEST[31-0] ← DEST[31-0] * SRC[31-0];
    DEST[63-32] ← DEST[63-32] * SRC[63-32];
    DEST[95-64] ← DEST[95-64] * SRC[95-64];
    DEST[127-96] ← DEST[127-96] * SRC[127-96];

MULSD - Scalar Double-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
|---|---|---|
| F2 0F 59 /r | MULSD xmm1, xmm2/m64 | Multiply the low double-precision floating-point value in xmm2/m64 by the low double-precision floating-point value in xmm1. |

    DEST[63-0] ← DEST[63-0] * xmm2/m64[63-0];
    * DEST[127-64] remains unchanged *;

MULSS - Scalar Single-FP Multiply

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F 59 /r | MULSS xmm1, xmm2/m32 | Multiply the low single-precision floating-point value in xmm2/m32 by the low single-precision floating-point value in xmm1. |

    DEST[31-0] ← DEST[31-0] * SRC[31-0];
    * DEST[127-32] remains unchanged *;

DIVPD - Packed Double-Precision Floating-Point Divide
    DIVPD xmm0, xmm1/m128

    DEST[63-0] ← DEST[63-0] / SRC[63-0];
    DEST[127-64] ← DEST[127-64] / SRC[127-64];

DIVPS - Packed Single-Precision Floating-Point Divide
    DIVPS xmm0, xmm1/m128

    DEST[31-0] ← DEST[31-0] / SRC[31-0];
    DEST[63-32] ← DEST[63-32] / SRC[63-32];
    DEST[95-64] ← DEST[95-64] / SRC[95-64];
    DEST[127-96] ← DEST[127-96] / SRC[127-96];

DIVSD - Scalar Double-Precision Floating-Point Divide
    DIVSD xmm0, xmm1/m64

    DEST[63-0] ← DEST[63-0] / SRC[63-0];
    * DEST[127-64] remains unchanged *;

DIVSS - Scalar Single-Precision Floating-Point Divide
    DIVSS xmm0, xmm1/m32

    DEST[31-0] ← DEST[31-0] / SRC[31-0];
    * DEST[127-32] remains unchanged *;

PAVGB/PAVGW - Packed Average

| Opcode | Instruction | Description |
|---|---|---|
| 0F E0 /r | PAVGB mm1, mm2/m64 | Average packed unsigned byte integers from mm2/m64 and mm1, with rounding. |
| 66 0F E0 /r | PAVGB xmm1, xmm2/m128 | Average packed unsigned byte integers from xmm2/m128 and xmm1, with rounding. |
| 0F E3 /r | PAVGW mm1, mm2/m64 | Average packed unsigned word integers from mm2/m64 and mm1, with rounding. |
| 66 0F E3 /r | PAVGW xmm1, xmm2/m128 | Average packed unsigned word integers from xmm2/m128 and xmm1, with rounding. |

PMAXSW - Packed Signed Integer Word Maximum

| Opcode | Instruction | Description |
|---|---|---|
| 0F EE /r | PMAXSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for maximum values. |
| 66 0F EE /r | PMAXSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for maximum values. |

PMAXUB - Packed Unsigned Integer Byte Maximum

| Opcode | Instruction | Description |
|---|---|---|
| 0F DE /r | PMAXUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for maximum values. |
| 66 0F DE /r | PMAXUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for maximum values. |

PMINSW - Packed Signed Integer Word Minimum

| Opcode | Instruction | Description |
|---|---|---|
| 0F EA /r | PMINSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for minimum values. |
| 66 0F EA /r | PMINSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for minimum values. |

PMINUB - Packed Unsigned Integer Byte Minimum

| Opcode | Instruction | Description |
|---|---|---|
| 0F DA /r | PMINUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for minimum values. |
| 66 0F DA /r | PMINUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for minimum values. |

RCPPS - Packed Single-Precision Floating-Point Reciprocal

| Opcode | Instruction | Description |
|---|---|---|
| 0F 53 /r | RCPPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the packed single-precision floating-point values in xmm2/m128. |

    DEST[31-0] ← APPROXIMATE(1.0/(SRC[31-0]));
    DEST[63-32] ← APPROXIMATE(1.0/(SRC[63-32]));
    DEST[95-64] ← APPROXIMATE(1.0/(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/(SRC[127-96]));

RCPSS - Scalar Single-Precision Floating-Point Reciprocal

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F 53 /r | RCPSS xmm1, xmm2/m32 | Returns to xmm1 an approximation of the reciprocal of the low single-precision floating-point value in xmm2/m32. |

    DEST[31-0] ← APPROX(1.0/(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

RSQRTPS - Packed Single-Precision Floating-Point Square Root Reciprocal

| Opcode | Instruction | Description |
|---|---|---|
| 0F 52 /r | RSQRTPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128. |

    DEST[31-0] ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    DEST[63-32] ← APPROXIMATE(1.0/SQRT(SRC[63-32]));
    DEST[95-64] ← APPROXIMATE(1.0/SQRT(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/SQRT(SRC[127-96]));

RSQRTSS - Scalar Single-Precision Floating-Point Square Root Reciprocal

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F 52 /r | RSQRTSS xmm1, xmm2/m32 | Returns to xmm1 an approximation of the reciprocal of the square root of the low single-precision floating-point value in xmm2/m32. |

    DEST[31-0] ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

SQRTPD - Packed Double-Precision Floating-Point Square Root

| Opcode | Instruction | Description |
|---|---|---|
| 66 0F 51 /r | SQRTPD xmm1, xmm2/m128 | Computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the results in xmm1. |

SQRTPS - Packed Single-Precision Floating-Point Square Root

| Opcode | Instruction | Description |
|---|---|---|
| 0F 51 /r | SQRTPS xmm1, xmm2/m128 | Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1. |

SQRTSD - Scalar Double-Precision Floating-Point Square Root

| Opcode | Instruction | Description |
|---|---|---|
| F2 0F 51 /r | SQRTSD xmm1, xmm2/m64 | Computes square root of the low double-precision floating-point value in xmm2/m64 and stores the results in xmm1. |

SQRTSS - Scalar Single-Precision Floating-Point Square Root

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F 51 /r | SQRTSS xmm1, xmm2/m32 | Computes square root of the low single-precision floating-point value in xmm2/m32 and stores the results in xmm1. |

Move instructions:

MASKMOVDQU - Mask Move of Double Quadword Unaligned
    MASKMOVDQU xmm0, xmm1

MASKMOVQ - Mask Move of Quadword
    MASKMOVQ mm0, mm1

MOVAPD - Move Aligned Packed Double-Precision Floating-Point Values
    MOVAPD xmm0, xmm1/m128
    MOVAPD xmm1/m128, xmm0

MOVAPS - Move Aligned Packed Single-Precision Floating-Point Values
    MOVAPS xmm0, xmm1/m128

MOVD - Move Doubleword

| Instruction | Description |
|---|---|
| MOVD mm, r/m32 | Move doubleword from r/m32 to mm. |
| MOVD r/m32, mm | Move doubleword from mm to r/m32. |
| MOVD xmm, r/m32 | Move doubleword from r/m32 to xmm. |
| MOVD r/m32, xmm | Move doubleword from xmm register to r/m32. |

MOVDQ2Q - Move Quadword

| Instruction | Description |
|---|---|
| MOVDQ2Q mm, xmm | Move low quadword from xmm to mmx register. |

MOVQ2DQ - Move Quadword

| Opcode | Instruction | Description |
|---|---|---|
| F3 0F D6 | MOVQ2DQ xmm, mm | Move quadword from mmx to low quadword of xmm. |

    DEST[63-0] ← SRC[63-0];
    DEST[127-64] ← 00000000000000000H;

MOVDQA - Move Aligned Double Quadword

| Instruction | Description |
|---|---|
| MOVDQA xmm1, xmm2/m128 | Move aligned double quadword from xmm2/m128 to xmm1. |
| MOVDQA xmm2/m128, xmm1 | Move aligned double quadword from xmm1 to xmm2/m128. |

MOVDQU - Move Unaligned Double Quadword

| Instruction | Description |
|---|---|
| MOVDQU xmm1, xmm2/m128 | Move unaligned double quadword from xmm2/m128 to xmm1. |
| MOVDQU xmm2/m128, xmm1 | Move unaligned double quadword from xmm1 to xmm2/m128. |

MOVHLPS - Move Packed Single-Precision Floating-Point Values High to Low

| Instruction | Description |
|---|---|
| MOVHLPS xmm1, xmm2 | Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1. |

    DEST[63-0] ← SRC[127-64];
    * DEST[127-64] unchanged *;

MOVLHPS - Move Packed Single-Precision Floating-Point Values Low to High

| Instruction | Description |
|---|---|
| MOVLHPS xmm1, xmm2 | Move two packed single-precision floating-point values from low quadword of xmm2 to high quadword of xmm1. |

MOVHPD - Move High Packed Double-Precision Floating-Point Value

| Instruction | Description |
|---|---|
| MOVHPD xmm, m64 | Move double-precision floating-point value from m64 to high quadword of xmm. |
| MOVHPD m64, xmm | Move double-precision floating-point value from high quadword of xmm to m64. |

MOVHPD instruction for memory to XMM move:
    DEST[127-64] ← SRC;
    * DEST[63-0] unchanged *;
MOVHPD instruction for XMM to memory move:
    DEST ← SRC[127-64];

MOVHPS - Move High Packed Single-Precision Floating-Point Values

| Instruction | Description |
|---|---|
| MOVHPS xmm, m64 | Move two packed single-precision floating-point values from m64 to high quadword of xmm. |
| MOVHPS m64, xmm | Move two packed single-precision floating-point values from high quadword of xmm to m64. |

MOVLPD - Move Low Packed Double-Precision Floating-Point Value

| Instruction | Description |
|---|---|
| MOVLPD xmm, m64 | Move double-precision floating-point value from m64 to low quadword of xmm register. |
| MOVLPD m64, xmm | Move double-precision floating-point value from low quadword of xmm register to m64. |

MOVLPS - Move Low Packed Single-Precision Floating-Point Values

| Opcode | Instruction | Description |
|---|---|---|
| 0F 12 /r | MOVLPS xmm, m64 | Move two packed single-precision floating-point values from m64 to low quadword of xmm. |
| 0F 13 /r | MOVLPS m64, xmm | Move two packed single-precision floating-point values from low quadword of xmm to m64. |

MOVMSKPD - Extract Packed Double-Precision Floating-Point Sign Mask
    MOVMSKPD r32, xmm

    DEST[0] ← SRC[63];
    DEST[1] ← SRC[127];
    DEST[3-2] ← 00B;
    DEST[31-4] ← 0000000H;

MOVMSKPS - Extract Packed Single-Precision Floating-Point Sign Mask
    MOVMSKPS r32, xmm

    DEST[0] ← SRC[31];
    DEST[1] ← SRC[63];
    DEST[2] ← SRC[95];
    DEST[3] ← SRC[127];
    DEST[31-4] ← 0000000H;

MOVNTDQ - Move Double Quadword Non-Temporal

| Opcode | Instruction | Description |
|---|---|---|
| 66 0F E7 /r | MOVNTDQ m12 | Move double quadword from xmm to m128, |
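To relate the instruction list to the fixed-point weights discussed in section 3, the following sketch shows how PMADDWD can combine 16-bit weights (scaled by 256) with pixel values widened to 16 bits. It is an illustration only, not the code of Optimize_Bicubic: the function name madd_words and the macro BICUBIC_HAVE_SSE2 are hypothetical, the SSE2 path uses the compiler intrinsics _mm_loadu_si128 (MOVDQU), _mm_madd_epi16 (PMADDWD) and _mm_storeu_si128 from emmintrin.h, and a scalar fallback with the same result is assumed for targets without SSE2.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define BICUBIC_HAVE_SSE2 1
#else
#define BICUBIC_HAVE_SSE2 0
#endif

// Hypothetical helper: multiplies eight signed 16-bit values pairwise and
// adds each adjacent pair of 32-bit products, which is what PMADDWD does.
void madd_words(const std::int16_t a[8], const std::int16_t b[8],
                std::int32_t out[4]) {
#if BICUBIC_HAVE_SSE2
    // Unaligned loads (MOVDQU), since the arrays are not guaranteed to be
    // 16-byte aligned.
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    const __m128i vr = _mm_madd_epi16(va, vb);  // PMADDWD
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), vr);
#else
    // Scalar fallback producing the same four sums.
    for (int k = 0; k < 4; ++k) {
        out[k] = static_cast<std::int32_t>(a[2 * k]) * b[2 * k] +
                 static_cast<std::int32_t>(a[2 * k + 1]) * b[2 * k + 1];
    }
#endif
}
```

For example, the kernel weights for u = 0.5 scaled by 256 are -32, 160, 160 and -32. Applied to four pixels of value 100, the two partial sums are 12800 each, and their total 25600 equals 100 * 256, so dividing by 256 (a right shift by 8) recovers the pixel value.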