Bicubic Interpolation and Its Optimization

Uploaded by: d***; IP location: Tianjin; uploaded: 2022-09-19; format: DOCX; pages: 97; size: 382.04 KB


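1. Mathematical model

For a destination pixel, the inverse transform maps its coordinates to floating-point coordinates (i+u, j+v) in the source image, where i and j are non-negative integers and u and v are floating-point numbers in the interval [0, 1). Bicubic interpolation considers the 16 neighbors around the floating-point coordinate (i+u, j+v). The destination pixel value f(i+u, j+v) is given by the following interpolation formula:

\[
f(i+u,\, j+v) = A \cdot B \cdot C
\]

\[
A = \begin{bmatrix} S(u+1) & S(u+0) & S(u-1) & S(u-2) \end{bmatrix}
\]

\[
B = \begin{bmatrix}
f(i-1,\, j-1) & f(i-1,\, j+0) & f(i-1,\, j+1) & f(i-1,\, j+2) \\
f(i+0,\, j-1) & f(i+0,\, j+0) & f(i+0,\, j+1) & f(i+0,\, j+2) \\
f(i+1,\, j-1) & f(i+1,\, j+0) & f(i+1,\, j+1) & f(i+1,\, j+2) \\
f(i+2,\, j-1) & f(i+2,\, j+0) & f(i+2,\, j+1) & f(i+2,\, j+2)
\end{bmatrix}
\]

\[
C = \begin{bmatrix} S(v+1) \\ S(v+0) \\ S(v-1) \\ S(v-2) \end{bmatrix}
\]

\[
S(x) = \begin{cases}
1 - 2|x|^2 + |x|^3, & 0 \le |x| < 1 \\
4 - 8|x| + 5|x|^2 - |x|^3, & 1 \le |x| \le 2 \\
0, & |x| > 2
\end{cases}
\]

S(x) is an approximation of \sin(\pi x)/(\pi x) and serves as the interpolation kernel.

2. Computation procedure

1. Fetch the 16 points P1, P2, ..., P16 (numbered row by row, so that P1 to P4 form the first row of B).
2. Use the kernel formula S(x) to compute the interpolation kernel vectors Su and Sv for the x and y directions.
3. Carry out the matrix operation to obtain the interpolation result:

    iTemp1  = Su0 * P1 + Su1 * P5 + Su2 * P9  + Su3 * P13
    iTemp2  = Su0 * P2 + Su1 * P6 + Su2 * P10 + Su3 * P14
    iTemp3  = Su0 * P3 + Su1 * P7 + Su2 * P11 + Su3 * P15
    iTemp4  = Su0 * P4 + Su1 * P8 + Su2 * P12 + Su3 * P16
    iResult = Sv0 * iTemp1 + Sv1 * iTemp2 + Sv2 * iTemp3 + Sv3 * iTemp4

4. The interpolated image showed spike artifacts ("burrs"), so we added a post-processing step. Let pSrc be the pixel value at this point in the source image. If abs(iResult - pSrc) exceeds a chosen threshold, we assume the interpolated point may corrupt the image and use the source value pSrc instead.

3. Algorithm optimization

Computing the value of one point with bicubic interpolation needs the 16 surrounding points and takes 20 multiplications and 15 additions (16 + 4 multiplications and 12 + 3 additions in the formulas above). This is a heavy workload, so optimization is necessary. We chose Intel's SSE2 technology, which is available only on Pentium 4 and later machines. Whether the current CPU supports SSE2 can be determined with the CPUID instruction:

    BOOL g_bSSE2 = FALSE;
    _asm
    {
        mov  eax, 1              ; CPUID function 1: feature flags
        cpuid
        test edx, 0x04000000     ; EDX bit 26 is the SSE2 flag
        jz   NotSupport
        mov  g_bSSE2, 1
    NotSupport:
    }

CPUs that support SSE2 provide eight 128-bit registers, so a single register can hold 4 points (RGB), which favors parallel computation. For the detailed code, see the function Optimize_Bicubic in Transform.cpp.

Problems encountered during optimization:

1. Each image point consists of R, G and B channels. Because one SSE2 register holds 16 bytes, 4 bytes are wasted after 4 pixels are loaded, and time is also spent realigning the data from BRGB | RGBR | GBRG | BRGB to 0RGB | 0RGB | 0RGB | 0RGB.
2. When 16 bytes are read into a register, the image address is not guaranteed to be 16-byte aligned, so the slower MOVDQU instruction (6 or more clock cycles) has to be used. If the address can be made 16-byte aligned, the MOVDQA instruction (1 clock cycle) can be used instead.
3. To eliminate division and floating-point arithmetic, the weights are scaled by 256. Each kernel coefficient then needs 2 bytes, while each image sample is 1 byte. When the operands are lined up for multiplication, half of the SSE2 register space is wasted, which lengthens the computation. If the kernel precision is reduced so that a coefficient fits in 1 byte, the accuracy of the result drops considerably.
4. We lacked experience with the cycle counts of individual instructions and with whether a given sequence of instructions can be pipelined in parallel.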
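The model in section 1 and the procedure in section 2 can be sketched in plain C as follows. This is a minimal, unoptimized sketch for a single 8-bit channel, not the Optimize_Bicubic routine from Transform.cpp. The names bicubic_kernel, clamp_index and bicubic_sample, the row-major image layout, and the clamp-to-edge border handling are assumptions made for illustration; the source text does not specify them.

```c
#include <assert.h>
#include <stdlib.h>

/* Interpolation kernel S(x) from section 1. */
static double bicubic_kernel(double x)
{
    double a = x < 0.0 ? -x : x;               /* |x| */
    if (a < 1.0)
        return 1.0 - 2.0 * a * a + a * a * a;
    if (a <= 2.0)
        return 4.0 - 8.0 * a + 5.0 * a * a - a * a * a;
    return 0.0;
}

/* Clamp an index into [0, n-1]. Assumed border policy (not from the source). */
static int clamp_index(int k, int n)
{
    return k < 0 ? 0 : (k >= n ? n - 1 : k);
}

/* Sample a row-major, single-channel 8-bit image at (i+u, j+v),
 * where i is the row and j the column, then apply the threshold
 * post-processing step from section 2. */
static int bicubic_sample(const unsigned char *img, int rows, int cols,
                          int i, int j, double u, double v, int threshold)
{
    double su[4], sv[4], result = 0.0;
    int k, r, c, value, src;

    assert(img != NULL && rows > 0 && cols > 0);

    for (k = 0; k < 4; ++k) {
        su[k] = bicubic_kernel(u + 1.0 - k);   /* S(u+1), S(u), S(u-1), S(u-2) */
        sv[k] = bicubic_kernel(v + 1.0 - k);   /* S(v+1), S(v), S(v-1), S(v-2) */
    }

    for (c = 0; c < 4; ++c) {                  /* one iTemp per column of B */
        double temp = 0.0;
        for (r = 0; r < 4; ++r) {
            int rr = clamp_index(i - 1 + r, rows);
            int cc = clamp_index(j - 1 + c, cols);
            temp += su[r] * img[rr * cols + cc];
        }
        result += sv[c] * temp;                /* iResult accumulates Sv * iTemp */
    }

    /* Round to nearest and clamp to the 8-bit range. */
    value = result <= 0.0 ? 0 : (int)(result + 0.5);
    if (value > 255)
        value = 255;

    /* Post-processing: fall back to the source pixel on a large deviation. */
    src = img[clamp_index(i, rows) * cols + clamp_index(j, cols)];
    if (abs(value - src) > threshold)
        value = src;
    return value;
}
```

For example, bicubic_sample(img, rows, cols, 1, 1, 0.5, 0.5, 255) samples the point halfway between four pixels; a threshold of 255 effectively disables the post-processing step.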

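The third optimization problem (weights scaled by 256 and stored in 2 bytes each) can be illustrated with a small sketch. It assumes 16-bit fixed-point weights and uses the PMADDWD instruction through the _mm_madd_epi16 intrinsic from <emmintrin.h> when the compiler targets SSE2, with a scalar fallback otherwise. The helper names to_fixed8 and weighted_sum4 are hypothetical, and this is not the code from Transform.cpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Convert a kernel weight to fixed point scaled by 256, rounding to nearest. */
static int16_t to_fixed8(double w)
{
    return (int16_t)(w * 256.0 + (w >= 0.0 ? 0.5 : -0.5));
}

/* Weighted sum of 4 samples with 4 weights scaled by 256.
 * Returns the rounded result clamped to 0..255. */
static int weighted_sum4(const uint8_t p[4], const int16_t w[4])
{
    int32_t acc;

    assert(p != NULL && w != NULL);

#if defined(__SSE2__)
    /* Samples are widened to 16 bits so they line up with the weights;
     * only 8 values fit in a register instead of 16 bytes. */
    __m128i px = _mm_set_epi16(0, 0, 0, 0, p[3], p[2], p[1], p[0]);
    __m128i wt = _mm_set_epi16(0, 0, 0, 0, w[3], w[2], w[1], w[0]);
    /* PMADDWD: 32-bit lanes hold p0*w0 + p1*w1 and p2*w2 + p3*w3. */
    __m128i prod = _mm_madd_epi16(px, wt);
    acc = _mm_cvtsi128_si32(prod) +
          _mm_cvtsi128_si32(_mm_srli_si128(prod, 4));
#else
    acc = p[0] * w[0] + p[1] * w[1] + p[2] * w[2] + p[3] * w[3];
#endif

    acc += 128;                 /* rounding term for the division by 256 */
    if (acc < 0)
        return 0;
    acc >>= 8;                  /* undo the 256x scaling */
    return acc > 255 ? 255 : (int)acc;
}
```

With u = 0.5 the kernel values -0.125, 0.625, 0.625, -0.125 become -32, 160, 160, -32, which still sum to 256, so a flat region keeps its value after the shift.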
Appendix: SSE2 instruction summary

Arithmetic instructions:

ADDPD - Packed Double-Precision Floating-Point Add (SSE2)
    Adds 2 doubles element-wise.
    ADDPD xmm0, xmm1/m128

ADDPS - Packed Single-Precision Floating-Point Add (SSE)
    Adds 4 floats element-wise.
    ADDPS xmm0, xmm1/m128

ADDSD - Scalar Double-Precision Floating-Point Add (SSE2)
    Adds 1 double (the low element).
    ADDSD xmm0, xmm1/m64

ADDSS - Scalar Single-Precision Floating-Point Add (SSE)
    Adds 1 float (the low element).
    ADDSS xmm0, xmm1/m32

PADDB/PADDW/PADDD - Packed Add
    Opcode        Instruction                 Description
    0F FC /r      PADDB mm, mm/m64            Add packed byte integers from mm/m64 and mm.
    66 0F FC /r   PADDB xmm1, xmm2/m128       Add packed byte integers from xmm2/m128 and xmm1.
    0F FD /r      PADDW mm, mm/m64            Add packed word integers from mm/m64 and mm.
    66 0F FD /r   PADDW xmm1, xmm2/m128       Add packed word integers from xmm2/m128 and xmm1.
    0F FE /r      PADDD mm, mm/m64            Add packed doubleword integers from mm/m64 and mm.
    66 0F FE /r   PADDD xmm1, xmm2/m128       Add packed doubleword integers from xmm2/m128 and xmm1.

PADDQ - Packed Quadword Add
    Opcode        Instruction                 Description
    0F D4 /r      PADDQ mm1, mm2/m64          Add quadword integer mm2/m64 to mm1.
    66 0F D4 /r   PADDQ xmm1, xmm2/m128       Add packed quadword integers xmm2/m128 to xmm1.

PADDSB/PADDSW - Packed Add with Saturation
    Opcode        Instruction                 Description
    0F EC /r      PADDSB mm, mm/m64           Add packed signed byte integers from mm/m64 and mm and saturate the results.
    66 0F EC /r   PADDSB xmm1, xmm2/m128      Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results.
    0F ED /r      PADDSW mm, mm/m64           Add packed signed word integers from mm/m64 and mm and saturate the results.
    66 0F ED /r   PADDSW xmm1, xmm2/m128      Add packed signed word integers from xmm2/m128 and xmm1 and saturate the results.

PADDUSB/PADDUSW - Packed Add Unsigned with Saturation
    Opcode        Instruction                 Description
    0F DC /r      PADDUSB mm, mm/m64          Add packed unsigned byte integers from mm/m64 and mm and saturate the results.
    66 0F DC /r   PADDUSB xmm1, xmm2/m128     Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results.
    0F DD /r      PADDUSW mm, mm/m64          Add packed unsigned word integers from mm/m64 and mm and saturate the results.
    66 0F DD /r   PADDUSW xmm1, xmm2/m128     Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results.

PMADDWD - Packed Multiply and Add
    Opcode        Instruction                 Description
    0F F5 /r      PMADDWD mm, mm/m64          Multiply the packed words in mm by the packed words in mm/m64. Add the 32-bit pairs of results and store in mm as doublewords.
    66 0F F5 /r   PMADDWD xmm1, xmm2/m128     Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, and add the adjacent doubleword results.

PSADBW - Packed Sum of Absolute Differences
    Opcode        Instruction                 Description
    0F F6 /r      PSADBW mm1, mm2/m64         Absolute difference of packed unsigned byte integers from mm2/m64 and mm1; differences are then summed to produce an unsigned word integer result.
    66 0F F6 /r   PSADBW xmm1, xmm2/m128      Absolute difference of packed unsigned byte integers from xmm2/m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two word integer results.

PSUBB/PSUBW/PSUBD - Packed Subtract
    Opcode        Instruction                 Description
    0F F8 /r      PSUBB mm, mm/m64            Subtract packed byte integers in mm/m64 from packed byte integers in mm.
    66 0F F8 /r   PSUBB xmm1, xmm2/m128       Subtract packed byte integers in xmm2/m128 from packed byte integers in xmm1.
    0F F9 /r      PSUBW mm, mm/m64            Subtract packed word integers in mm/m64 from packed word integers in mm.
    66 0F F9 /r   PSUBW xmm1, xmm2/m128       Subtract packed word integers in xmm2/m128 from packed word integers in xmm1.
    0F FA /r      PSUBD mm, mm/m64            Subtract packed doubleword integers in mm/m64 from packed doubleword integers in mm.
    66 0F FA /r   PSUBD xmm1, xmm2/m128       Subtract packed doubleword integers in xmm2/m128 from packed doubleword integers in xmm1.

PSUBQ - Packed Subtract Quadword
    Opcode        Instruction                 Description
    0F FB /r      PSUBQ mm1, mm2/m64          Subtract quadword integer in mm2/m64 from mm1.
    66 0F FB /r   PSUBQ xmm1, xmm2/m128       Subtract packed quadword integers in xmm2/m128 from xmm1.

PSUBSB/PSUBSW - Packed Subtract with Saturation
    Opcode        Instruction                 Description
    0F E8 /r      PSUBSB mm, mm/m64           Subtract signed packed bytes in mm/m64 from signed packed bytes in mm and saturate results.
    66 0F E8 /r   PSUBSB xmm1, xmm2/m128      Subtract packed signed byte integers in xmm2/m128 from packed signed byte integers in xmm1 and saturate results.
    0F E9 /r      PSUBSW mm, mm/m64           Subtract signed packed words in mm/m64 from signed packed words in mm and saturate results.
    66 0F E9 /r   PSUBSW xmm1, xmm2/m128      Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results.

PSUBUSB/PSUBUSW - Packed Subtract Unsigned with Saturation
    Opcode        Instruction                 Description
    0F D8 /r      PSUBUSB mm, mm/m64          Subtract unsigned packed bytes in mm/m64 from unsigned packed bytes in mm and saturate result.
    66 0F D8 /r   PSUBUSB xmm1, xmm2/m128     Subtract packed unsigned byte integers in xmm2/m128 from packed unsigned byte integers in xmm1 and saturate result.
    0F D9 /r      PSUBUSW mm, mm/m64          Subtract unsigned packed words in mm/m64 from unsigned packed words in mm and saturate result.
    66 0F D9 /r   PSUBUSW xmm1, xmm2/m128     Subtract packed unsigned word integers in xmm2/m128 from packed unsigned word integers in xmm1 and saturate result.

SUBPD - Packed Double-Precision Floating-Point Subtract
    Opcode        Instruction                 Description
    66 0F 5C /r   SUBPD xmm1, xmm2/m128       Subtract packed double-precision floating-point values in xmm2/m128 from xmm1.

SUBPS - Packed Single-Precision Floating-Point Subtract
    Opcode        Instruction                 Description
    0F 5C /r      SUBPS xmm1, xmm2/m128       Subtract packed single-precision floating-point values in xmm2/m128 from xmm1.

SUBSD - Scalar Double-Precision Floating-Point Subtract
    Opcode        Instruction                 Description
    F2 0F 5C /r   SUBSD xmm1, xmm2/m64        Subtract the low double-precision floating-point value in xmm2/m64 from xmm1.

SUBSS - Scalar Single-FP Subtract
    Opcode        Instruction                 Description
    F3 0F 5C /r   SUBSS xmm1, xmm2/m32        Subtract the low single-precision floating-point value in xmm2/m32 from xmm1.

PMULHUW - Packed Multiply High Unsigned
    Opcode        Instruction                 Description
    0F E4 /r      PMULHUW mm1, mm2/m64        Multiply the packed unsigned word integers in mm1 and mm2/m64, and store the high 16 bits of the results in mm1.
    66 0F E4 /r   PMULHUW xmm1, xmm2/m128     Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.

PMULHW - Packed Multiply High Signed
    Opcode        Instruction                 Description
    0F E5 /r      PMULHW mm, mm/m64           Multiply the packed signed word integers in mm and mm/m64, and store the high 16 bits of the results in mm.
    66 0F E5 /r   PMULHW xmm1, xmm2/m128      Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.

PMULLW - Packed Multiply Low Signed
    Opcode        Instruction                 Description
    0F D5 /r      PMULLW mm, mm/m64           Multiply the packed signed word integers in mm and mm/m64, and store the low 16 bits of the results in mm.
    66 0F D5 /r   PMULLW xmm1, xmm2/m128      Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1.

PMULUDQ - Multiply Doubleword Unsigned
    Opcode        Instruction                 Description
    0F F4 /r      PMULUDQ mm1, mm2/m64        Multiply unsigned doubleword integer in mm1 by unsigned doubleword integer in mm2/m64, and store the quadword result in mm1.
    66 0F F4 /r   PMULUDQ xmm1, xmm2/m128     Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1.

    PMULUDQ instruction with 64-bit operands:
        DEST[63-0] ← DEST[31-0] * SRC[31-0];
    PMULUDQ instruction with 128-bit operands:
        DEST[63-0] ← DEST[31-0] * SRC[31-0];
        DEST[127-64] ← DEST[95-64] * SRC[95-64];

MULPD - Packed Double-Precision Floating-Point Multiply
    Opcode        Instruction                 Description
    66 0F 59 /r   MULPD xmm1, xmm2/m128       Multiply packed double-precision floating-point values in xmm2/m128 by xmm1.

        DEST[63-0] ← DEST[63-0] * SRC[63-0];
        DEST[127-64] ← DEST[127-64] * SRC[127-64];

MULPS - Packed Single-Precision Floating-Point Multiply
    Opcode        Instruction                 Description
    0F 59 /r      MULPS xmm1, xmm2/m128       Multiply packed single-precision floating-point values in xmm2/m128 by xmm1.

        DEST[31-0] ← DEST[31-0] * SRC[31-0];
        DEST[63-32] ← DEST[63-32] * SRC[63-32];
        DEST[95-64] ← DEST[95-64] * SRC[95-64];
        DEST[127-96] ← DEST[127-96] * SRC[127-96];

MULSD - Scalar Double-Precision Floating-Point Multiply
    Opcode        Instruction                 Description
    F2 0F 59 /r   MULSD xmm1, xmm2/m64        Multiply the low double-precision floating-point value in xmm2/m64 by the low double-precision floating-point value in xmm1.

        DEST[63-0] ← DEST[63-0] * xmm2/m64[63-0];
        * DEST[127-64] remains unchanged *;

MULSS - Scalar Single-FP Multiply
    Opcode        Instruction                 Description
    F3 0F 59 /r   MULSS xmm1, xmm2/m32        Multiply the low single-precision floating-point value in xmm2/m32 by the low single-precision floating-point value in xmm1.

        DEST[31-0] ← DEST[31-0] * SRC[31-0];
        * DEST[127-32] remains unchanged *;

DIVPD - Packed Double-Precision Floating-Point Divide
    DIVPD xmm0, xmm1/m128

        DEST[63-0] ← DEST[63-0] / SRC[63-0];
        DEST[127-64] ← DEST[127-64] / SRC[127-64];

DIVPS - Packed Single-Precision Floating-Point Divide
    DIVPS xmm0, xmm1/m128

        DEST[31-0] ← DEST[31-0] / SRC[31-0];
        DEST[63-32] ← DEST[63-32] / SRC[63-32];
        DEST[95-64] ← DEST[95-64] / SRC[95-64];
        DEST[127-96] ← DEST[127-96] / SRC[127-96];

DIVSD - Scalar Double-Precision Floating-Point Divide
    DIVSD xmm0, xmm1/m64

        DEST[63-0] ← DEST[63-0] / SRC[63-0];
        * DEST[127-64] remains unchanged *;

DIVSS - Scalar Single-Precision Floating-Point Divide
    DIVSS xmm0, xmm1/m32

        DEST[31-0] ← DEST[31-0] / SRC[31-0];
        * DEST[127-32] remains unchanged *;

PAVGB/PAVGW - Packed Average
    Opcode        Instruction                 Description
    0F E0 /r      PAVGB mm1, mm2/m64          Average packed unsigned byte integers from mm2/m64 and mm1, with rounding.
    66 0F E0 /r   PAVGB xmm1, xmm2/m128       Average packed unsigned byte integers from xmm2/m128 and xmm1, with rounding.
    0F E3 /r      PAVGW mm1, mm2/m64          Average packed unsigned word integers from mm2/m64 and mm1, with rounding.
    66 0F E3 /r   PAVGW xmm1, xmm2/m128       Average packed unsigned word integers from xmm2/m128 and xmm1, with rounding.

PMAXSW - Packed Signed Integer Word Maximum
    Opcode        Instruction                 Description
    0F EE /r      PMAXSW mm1, mm2/m64         Compare signed word integers in mm2/m64 and mm1 for maximum values.
    66 0F EE /r   PMAXSW xmm1, xmm2/m128      Compare signed word integers in xmm2/m128 and xmm1 for maximum values.

PMAXUB - Packed Unsigned Integer Byte Maximum
    Opcode        Instruction                 Description
    0F DE /r      PMAXUB mm1, mm2/m64         Compare unsigned byte integers in mm2/m64 and mm1 for maximum values.
    66 0F DE /r   PMAXUB xmm1, xmm2/m128      Compare unsigned byte integers in xmm2/m128 and xmm1 for maximum values.

PMINSW - Packed Signed Integer Word Minimum
    Opcode        Instruction                 Description
    0F EA /r      PMINSW mm1, mm2/m64         Compare signed word integers in mm2/m64 and mm1 for minimum values.
    66 0F EA /r   PMINSW xmm1, xmm2/m128      Compare signed word integers in xmm2/m128 and xmm1 for minimum values.

PMINUB - Packed Unsigned Integer Byte Minimum
    Opcode        Instruction                 Description
    0F DA /r      PMINUB mm1, mm2/m64         Compare unsigned byte integers in mm2/m64 and mm1 for minimum values.
    66 0F DA /r   PMINUB xmm1, xmm2/m128      Compare unsigned byte integers in xmm2/m128 and xmm1 for minimum values.

RCPPS - Packed Single-Precision Floating-Point Reciprocal
    Opcode        Instruction                 Description
    0F 53 /r      RCPPS xmm1, xmm2/m128       Returns to xmm1 the packed approximations of the reciprocals of the packed single-precision floating-point values in xmm2/m128.

        DEST[31-0] ← APPROXIMATE(1.0 / SRC[31-0]);
        DEST[63-32] ← APPROXIMATE(1.0 / SRC[63-32]);
        DEST[95-64] ← APPROXIMATE(1.0 / SRC[95-64]);
        DEST[127-96] ← APPROXIMATE(1.0 / SRC[127-96]);

RCPSS - Scalar Single-Precision Floating-Point Reciprocal
    Opcode        Instruction                 Description
    F3 0F 53 /r   RCPSS xmm1, xmm2/m32        Returns to xmm1 an approximation of the reciprocal of the low single-precision floating-point value in xmm2/m32.

        DEST[31-0] ← APPROXIMATE(1.0 / SRC[31-0]);
        * DEST[127-32] remains unchanged *;

RSQRTPS - Packed Single-Precision Floating-Point Square Root Reciprocal
    Opcode        Instruction                 Description
    0F 52 /r      RSQRTPS xmm1, xmm2/m128     Returns to xmm1 the packed approximations of the reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128.

        DEST[31-0] ← APPROXIMATE(1.0 / SQRT(SRC[31-0]));
        DEST[63-32] ← APPROXIMATE(1.0 / SQRT(SRC[63-32]));
        DEST[95-64] ← APPROXIMATE(1.0 / SQRT(SRC[95-64]));
        DEST[127-96] ← APPROXIMATE(1.0 / SQRT(SRC[127-96]));

RSQRTSS - Scalar Single-Precision Floating-Point Square Root Reciprocal
    Opcode        Instruction                 Description
    F3 0F 52 /r   RSQRTSS xmm1, xmm2/m32      Returns to xmm1 an approximation of the reciprocal of the square root of the low single-precision floating-point value in xmm2/m32.

        DEST[31-0] ← APPROXIMATE(1.0 / SQRT(SRC[31-0]));
        * DEST[127-32] remains unchanged *;

SQRTPD - Packed Double-Precision Floating-Point Square Root
    Opcode        Instruction                 Description
    66 0F 51 /r   SQRTPD xmm1, xmm2/m128      Computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the results in xmm1.

SQRTPS - Packed Single-Precision Floating-Point Square Root
    Opcode        Instruction                 Description
    0F 51 /r      SQRTPS xmm1, xmm2/m128      Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1.

SQRTSD - Scalar Double-Precision Floating-Point Square Root
    Opcode        Instruction                 Description
    F2 0F 51 /r   SQRTSD xmm1, xmm2/m64       Computes square root of the low double-precision floating-point value in xmm2/m64 and stores the result in xmm1.

SQRTSS - Scalar Single-Precision Floating-Point Square Root
    Opcode        Instruction                 Description
    F3 0F 51 /r   SQRTSS xmm1, xmm2/m32       Computes square root of the low single-precision floating-point value in xmm2/m32 and stores the result in xmm1.

Move instructions:

MASKMOVDQU - Mask Move of Double Quadword Unaligned
    MASKMOVDQU xmm0, xmm1

MASKMOVQ - Mask Move of Quadword
    MASKMOVQ mm0, mm1

MOVAPD - Move Aligned Packed Double-Precision Floating-Point Values
    MOVAPD xmm0, xmm1/m128
    MOVAPD xmm1/m128, xmm0

MOVAPS - Move Aligned Packed Single-Precision Floating-Point Values
    MOVAPS xmm0, xmm1/m128

MOVD - Move Doubleword
    Instruction                 Description
    MOVD mm, r/m32              Move doubleword from r/m32 to mm.
    MOVD r/m32, mm              Move doubleword from mm to r/m32.
    MOVD xmm, r/m32             Move doubleword from r/m32 to xmm.
    MOVD r/m32, xmm             Move doubleword from xmm register to r/m32.

MOVDQ2Q - Move Quadword
    Instruction                 Description
    MOVDQ2Q mm, xmm             Move low quadword from xmm to mmx register.

MOVQ2DQ - Move Quadword
    Opcode        Instruction                 Description
    F3 0F D6      MOVQ2DQ xmm, mm             Move quadword from mmx to low quadword of xmm.

        DEST[63-0] ← SRC[63-0];
        DEST[127-64] ← 00000000000000000H;

MOVDQA - Move Aligned Double Quadword
    Instruction                 Description
    MOVDQA xmm1, xmm2/m128      Move aligned double quadword from xmm2/m128 to xmm1.
    MOVDQA xmm2/m128, xmm1      Move aligned double quadword from xmm1 to xmm2/m128.

MOVDQU - Move Unaligned Double Quadword
    Instruction                 Description
    MOVDQU xmm1, xmm2/m128      Move unaligned double quadword from xmm2/m128 to xmm1.
    MOVDQU xmm2/m128, xmm1      Move unaligned double quadword from xmm1 to xmm2/m128.

MOVHLPS - Move Packed Single-Precision Floating-Point Values High to Low
    Instruction                 Description
    MOVHLPS xmm1, xmm2          Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1.

        DEST[63-0] ← SRC[127-64];
        * DEST[127-64] unchanged *;

MOVLHPS - Move Packed Single-Precision Floating-Point Values Low to High
    Instruction                 Description
    MOVLHPS xmm1, xmm2          Move two packed single-precision floating-point values from low quadword of xmm2 to high quadword of xmm1.

MOVHPD - Move High Packed Double-Precision Floating-Point Value
    Instruction                 Description
    MOVHPD xmm, m64             Move double-precision floating-point value from m64 to high quadword of xmm.
    MOVHPD m64, xmm             Move double-precision floating-point value from high quadword of xmm to m64.

    MOVHPD instruction for memory to XMM move:
        DEST[127-64] ← SRC;
        * DEST[63-0] unchanged *;
    MOVHPD instruction for XMM to memory move:
        DEST ← SRC[127-64];

MOVHPS - Move High Packed Single-Precision Floating-Point Values
    Instruction                 Description
    MOVHPS xmm, m64             Move two packed single-precision floating-point values from m64 to high quadword of xmm.
    MOVHPS m64, xmm             Move two packed single-precision floating-point values from high quadword of xmm to m64.

MOVLPD - Move Low Packed Double-Precision Floating-Point Value
    Instruction                 Description
    MOVLPD xmm, m64             Move double-precision floating-point value from m64 to low quadword of xmm register.
    MOVLPD m64, xmm             Move double-precision floating-point value from low quadword of xmm register to m64.

MOVLPS - Move Low Packed Single-Precision Floating-Point Values
    Opcode        Instruction                 Description
    0F 12 /r      MOVLPS xmm, m64             Move two packed single-precision floating-point values from m64 to low quadword of xmm.
    0F 13 /r      MOVLPS m64, xmm             Move two packed single-precision floating-point values from low quadword of xmm to m64.

MOVMSKPD - Extract Packed Double-Precision Floating-Point Sign Mask
    MOVMSKPD r32, xmm

        DEST[0] ← SRC[63];
        DEST[1] ← SRC[127];
        DEST[3-2] ← 00B;
        DEST[31-4] ← 0000000H;

MOVMSKPS - Extract Packed Single-Precision Floating-Point Sign Mask
    MOVMSKPS r32, xmm

        DEST[0] ← SRC[31];
        DEST[1] ← SRC[63];
        DEST[2] ← SRC[95];
        DEST[3] ← SRC[127];
        DEST[31-4] ← 000000H;

MOVNTDQ - Move Double Quadword Non-Temporal
    Opcode        Instruction                 Description
    66 0F E7 /r   MOVNTDQ m12...              Move double quadword from xmm to m128, ...
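The difference between the MOVDQA and MOVDQU entries above, which is also the second optimization problem in section 3, can be sketched with compiler intrinsics. The sketch assumes <emmintrin.h> is available when the compiler targets SSE2, where _mm_load_si128 and _mm_store_si128 correspond to MOVDQA and _mm_loadu_si128 and _mm_storeu_si128 correspond to MOVDQU; otherwise it falls back to memcpy. The helper names is_aligned16 and copy16 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Returns 1 when p is 16-byte aligned, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Copy 16 bytes, using the aligned move where the address allows it. */
static void copy16(uint8_t *dst, const uint8_t *src)
{
    assert(dst != NULL && src != NULL);

#if defined(__SSE2__)
    /* MOVDQA requires a 16-byte aligned address; MOVDQU accepts any address. */
    __m128i v = is_aligned16(src)
                    ? _mm_load_si128((const __m128i *)src)    /* MOVDQA */
                    : _mm_loadu_si128((const __m128i *)src);  /* MOVDQU */
    if (is_aligned16(dst))
        _mm_store_si128((__m128i *)dst, v);                   /* MOVDQA */
    else
        _mm_storeu_si128((__m128i *)dst, v);                  /* MOVDQU */
#else
    memcpy(dst, src, 16);                                     /* scalar fallback */
#endif
}
```

In a real scan-line loop the alignment test would normally be done once per row rather than once per 16-byte block, for example by allocating rows with a 16-byte aligned stride.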
