![Computer Vision: Computational Theory and Algorithmic Foundations (courseware), page 1](http://file3.renrendoc.com/fileroot_temp3/2022-2/21/5ecef277-86ed-401c-b6ca-27974b2f2fd1/5ecef277-86ed-401c-b6ca-27974b2f2fd11.gif)
![Computer Vision: Computational Theory and Algorithmic Foundations (courseware), page 2](http://file3.renrendoc.com/fileroot_temp3/2022-2/21/5ecef277-86ed-401c-b6ca-27974b2f2fd1/5ecef277-86ed-401c-b6ca-27974b2f2fd12.gif)
![Computer Vision: Computational Theory and Algorithmic Foundations (courseware), page 3](http://file3.renrendoc.com/fileroot_temp3/2022-2/21/5ecef277-86ed-401c-b6ca-27974b2f2fd1/5ecef277-86ed-401c-b6ca-27974b2f2fd13.gif)
![Computer Vision: Computational Theory and Algorithmic Foundations (courseware), page 4](http://file3.renrendoc.com/fileroot_temp3/2022-2/21/5ecef277-86ed-401c-b6ca-27974b2f2fd1/5ecef277-86ed-401c-b6ca-27974b2f2fd14.gif)
![Computer Vision: Computational Theory and Algorithmic Foundations (courseware), page 5](http://file3.renrendoc.com/fileroot_temp3/2022-2/21/5ecef277-86ed-401c-b6ca-27974b2f2fd1/5ecef277-86ed-401c-b6ca-27974b2f2fd15.gif)
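Lecture topics and dates
- Background and geometric foundations of computer vision (2/13, week 1)
- Geometric camera calibration (3/6, week 4)
- Rigid-body motion and pose estimation (3/27, week 7)
- Pose estimation (II) (or the correspondence problem) (4/17, week 10)
- Applications (5/8, week 13)

Requirements
- Attend the 5 lectures and take an active part in questions and discussion (each lecture leaves about 15-20 minutes for this).
- Complete at least one of the 3 lab assignments (program + report). The lab location will be settled within the first two weeks; I will announce it.
- Write one "academic" paper (related to the lab assignment).
- Final grade:
  - Undergraduates: 60% (lab) + 40% (paper)
  - Graduate students: 40% (lab) + 60% (paper)

Outline
- What is CV? When did it begin to develop?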
  What does it study? Which disciplines and fields is it related to? Open problems and application prospects of CV
- Geometric foundations; probabilistic foundations
- Some related resources

Definitions of CV (1)
- “Today, the study of extracting 3-D information from video images and building a 3-D model of the scene, called computer vision or image understanding, is one of the research areas that attract the most attention all over the world.” From K. Kanatani, “Statistical Optimization for Geometric Computation: Theory and Practice”, 1996.

Definitions of CV (2)
- “Vision is not only the sensing of light signals; it covers the whole process of acquiring, transmitting, processing, storing, and understanding visual information. After signal processing theory and computers appeared, people tried to capture images of the environment with cameras, convert them into digital signals, and carry out the whole process of visual information processing on a computer. In this way a new discipline took shape: computer vision.” “The research goal of computer vision is to give computers the ability to perceive 3-D environmental information from 2-D images.” From Ma Songde and Zhang Zhengyou, “Computer Vision: Computational Theory and Algorithmic Foundations”, 1998.
- “Computer vision is currently a very active area of computer science research. The field aims to develop, for computers and robots, visual capabilities comparable to those of humans. Research on computer vision began in the early 1960s, but most of the important advances in the underlying basic research were made after the 1980s.” From “http:/
- image transformation, image restoration, image enhancement, thresholding, region labelling, and shape characterization.
- “Tried to identify and classify objects in images by techniques of pattern recognition, which had been developed for the purpose of recognizing 2-D characters and symbols by feature extraction and statistical decision making by learning”.
- “Many pattern recognition researchers believed that the paradigm of pattern recognition would also lead to intelligent vision systems that could understand 3-D scenes”.
- “However, they soon realized the crucial fact that 3-D objects look very different from viewpoint to viewpoint beyond the capability of 2-D feature-based learning; 3-D meanings of 2-D images cannot be understood unless some a
  priori knowledge about the scene is given. Thus, knowledge came to play an essential role”.
- “This type of knowledge-based high-level reasoning is called the top-down (or goal-driven) approach.”
- “In a sense, this approach corresponds to the psychological view toward human perception that humans understand the environment by unconsciously matching the vast amount of knowledge accumulated from experience in the process of growth.”
- “This view can be compared to what is known as the Gestalt psychology, which regards human perception as integration of the environment and experience.”
- Thus, the problem of how to represent and organize such knowledge became a major concern, and many symbolic schemes were derived. Establishing such symbolic representations is one of the central themes of artificial intelligence, and machine vision was regarded as problem solving by artificial intelligence.
- “However, the inherent difficulty of this approach was soon realized: the amount of necessary knowledge, most of which has the form of “if ... then ... else ...”, is limitless, heavily depending on the domain of each application (“office scene”, “outdoor scene”, etc.) and constantly changing (e.g., today, many telephones are no longer black and do not have dials). However large the amount of knowledge is, exceptions are bound to appear, and computation time blows up exponentially as the amount of knowledge increases.”
- Many combinatorial techniques were proposed so as to find plausible interpretation efficiently without doing exhaustive search. Such techniques include various types of heuristic search as well as special techniques such as constraint propagation and probabilistic relaxation.
- “Realizing that such computational problems are inevitable as long as knowledge is directly matched with features extracted from raw images, researchers began to pay attention to “physical/optical laws” governing 3-D scenes. In analyzing 2-D images, such laws can provide clues to the 3-D shapes and positions of objects.”
- “For example, the surface gradients of objects can be estimated by analyzing shading intensities (shape from shading). The orientation of a surface in the scene can also be estimated by analyzing the perspective distortion of a texture on it (shape from texture). If objects are moving in the scene (or the camera is moving relative to the objects), the 3-D shapes of the objects and their 3-D motions (or the camera motion) can be computed (shape from motion or structure from motion).”
- “Although such analyses require appropriate assumptions about surface reflectance, illumination, perspective distortion, and rigid motion,
  they do not depend on specific application domains; they are called constraints in contrast to knowledge for the top-down approach.
- This approach is in line with the psychological view toward human vision that human perception occurs automatically when visual signals trigger computation in the brain and that this computational functionality is innate, acquired in the process of evolution.”
- This view was asserted by J. J. Gibson, who had a great influence on not only psychologists but also machine vision researchers.
- Thus, a new paradigm was established. First, primitive features are extracted from raw images by edge detection and image segmentation, resulting in primal sketches; next, approximate shapes and surface orientations are estimated by applying available constraints (shading, texture, motion, stereo, etc.), resulting in 2.5-D sketches;
- then, appropriate 3-D models (e.g., generalized cylinders) are fitted to such data, resulting in a numerical and symbolic representation of the scene; finally, high-level inference is made from such representations. This is called the bottom-up (or data-driven) approach, which is also known as the Marr paradigm after David Marr, who strongly endorsed this approach.

Marr's computational theory framework for vision
- Taking the viewpoint of an information-processing system, Marr held that a vision system should be studied at three levels: computational theory, representation and algorithm, and hardware implementation.
- The computational-theory level answers what each part of the system computes and by what strategy: what the inputs and outputs of each part are, and which transformation or constraint relates them.
- The representation-and-algorithm level specifies the inputs, outputs, and internal information representation of each part, together with the algorithms that achieve the goals set by the computational theory.
- The hardware-implementation level answers “how to implement the above algorithms in hardware”.

- A major drawback of this approach
  is its susceptibility to noise. Computation solely based on physical/optical constraints is likely to produce meaningless interpretations in the presence of noise. This is because 3-D reconstruction from 2-D data is a typical inverse problem, for which solutions are known to be generally unstable with respect to noise.
- “In order to cope with this inherent ill-posedness, many optimization techniques were devised so as to force the solution to have required properties. Such techniques are generally called regularization. Other types of optimization include a stochastic relaxation technique called simulated annealing, which was constructed by analogy with statistical mechanics, and the use of neural networks, which gave rise to a new view toward human cognition called connectionism.”
- “Today, many attempts are being made to enhance the reliability of image data. One approach is to actively control the motion of the camera so that the resulting 3-D interpretation becomes stable (active vision). Another approach is using multiple sensors (stereo, range sensing, etc.) and fusing the data (sensor fusion).”
- “In order to fuse data, the reliability of individual data must be evaluated in quantitative terms so that reliable data contribute more than unreliable data.”
- “Some researchers are attempting to use only minimum information that is enough to achieve a specific goal such as object avoidance (qualitative vision, purposive vision, etc.).”
- For detailed information, read “intro_KKanatani.doc”.

Related fields
- Mathematics, physics
- Brain science (or neurophysiology)
- Psychology, cognitive science, AI
- “Computer vision has benefited from research on animal visual systems in neurophysiology, psychology, and cognitive science, but it has developed its own independent body of computational theory and algorithms; it does not deliberately try to imitate biological visual systems.”

How related disciplines and courses connect
(Diagram. Labels: digital image processing, computer vision, pattern recognition, machine vision, computer graphics, linear algebra, set theory, high-level language programming, data structures, advanced algebra, optimization methods, signals and systems, computational geometry, special topics in computer vision (e.g., image and visual computing), basic knowledge. Legend: order of study; the amount of overlap reflects the degree of relatedness.)

Overview (1)
Geometric foundations of computer vision
- Camera models
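  - Single camera (pinhole model / perspective transformation)
  - Two cameras (epipolar geometry: fundamental/essential matrix)
  - Three or more cameras (multi-view geometry)
- Motion estimation
  - Correspondence problem
  - Optical flow computation
  - Rigid-body motion parameter estimation (minimal projective reconstruction)
    - 2-view, 7 points in correspondence (Faugeras)
    - 3-view, 6 points in correspondence (Quan Long)
    - 3-view, 8 points with one missing in one of the three views (Quan Long)
- Geometric reconstruction
  - Stereo vision
  - Shape from X (shading/motion/texture/contour/focus/de-focus/...)

Overview (2)
Physical foundations of computer vision
- The camera and its imaging process
  - Viewpoint, light source, light rays in space, light rays at a surface, ...
  - Shading, shadow
- Optics/color (light/color)
  - Radiometry, irradiance, ...
- Surface properties of objects
  - Diffuse (isotropic) surface: Lambertian surface
  - BRDF (bidirectional reflectance distribution function)

Overview (3)
Image-model foundations of computer vision
- Camera model and its calibration
  - Intrinsic parameters, extrinsic parameters
- Image features
  - Edges, corners, contours, texture, shape
- Image-sequence features (motion)
  - Corresponding points, optical flow

Overview (4)
Signal-processing levels of computer vision
- Low-level vision
  - Single image: filtering / edge detection / texture
  - Multiple images: geometry / stereo / affine or perspective structure from motion
- Mid-level vision
  - Clustering for segmentation; fitting lines, curves, and contours
  - Probabilistic clustering for segmentation and fitting
  - Tracking
- High-level vision
  - Matching
  - Pattern classification / aspect graph recognition
- Applications
  - Range data / image data retrieval / image-based rendering

Overview (5)
Mathematical foundations of computer vision
- Projective and affine geometry, differential geometry
- Probability, statistics, and stochastic processes
- Numerical computation and optimization methods
- Machine learning

Basic analytical tools and mathematical models of computer vision
- Signal processing approach: FFT, filtering, wavelets, ...
- Subspace approach: PCA, LDA, ICA, ...
- Bayesian inference approach: EM, Condensation / sequential importance sampling (SIS), Markov chain Monte Carlo (MCMC), ...
- Machine learning approach: SVM / kernel machine, Boosting / AdaBoost, k-NN / regression, ...
- HMM, BN/DBN (Dynamic Bayesian Network), ...
- Gibbs, MRF, ...

Overview (6)
Characteristics of computer vision problems
- The intrinsic dimensionality of high-dimensional data is very low, which makes modeling possible. High-dimensional image/video data lie in a very low-dimensional manifold.
- Non-uniqueness of solutions: under-constrained inverse problems
- Optimization problems

Open problems and application prospects of CV
- A basic vision system looks as follows:
  (Diagram: image → feature detection → low-level features → Shape from X → position and shape → recognition → object description)
- Topics covered: problems in the study of modules and systems, and some new ideas that have emerged, such as “the task of a visual information processing system”, “the modularity problem”, “local versus global features”, “object modeling”, and so on.
- 3-D computer vision will have extremely broad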
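  application prospects, for example: human-computer interaction; multimedia technology, databases, and image communication; production automation; medicine; autonomous navigation; 3-D scene modeling and visualization.

Outline
- What is CV? When did it begin to develop? What does it study? Which disciplines and fields is it related to? Open problems and application prospects of CV
- Geometric foundations; probabilistic foundations
- Some related resources

A brief introduction to projective geometry
- Euclidean geometry:
  - Rotations and translations are Euclidean transformations.
  - Euclidean geometry studies the properties that remain invariant under Euclidean transformations (Euclidean properties).
  - Parallelism, length, and angle, for example, are all Euclidean properties.
- Projective geometry:
  - Image formation in a camera is a projection (a perspective, or central, projection).
  - It does not preserve Euclidean properties; for example, parallel lines are no longer parallel.
  - Projective geometry studies the properties of projective space that remain invariant under projective transformations (projective properties).

Elements at infinity
- Parallel lines meet at a point at infinity.
- Parallel planes meet in a line at infinity.
- A line has exactly one point at infinity.
- All lines in a family of parallel lines share one point at infinity.
- In a plane, all points at infinity form a line, called the line at infinity of that plane.
- In 3-dimensional space, all points at infinity form a plane, called the plane at infinity of that space.

Projective space
- Adding the elements at infinity to n-dimensional Euclidean space, and making no distinction between finite elements and elements at infinity, yields n-dimensional projective space.
- 1-dimensional projective space is a projective line: a Euclidean line together with its point at infinity.
- 2-dimensional projective space is a projective plane: a Euclidean plane together with its line at infinity.
- 3-dimensional projective space is 3-dimensional Euclidean space together with the plane at infinity.

Homogeneous coordinates
- Once a coordinate system is set up in Euclidean space, points and coordinates are in one-to-one correspondence. Points at infinity, however, have no such coordinates; homogeneous coordinates are introduced to describe them.
- In n-dimensional Euclidean space with a Cartesian coordinate system, let a point have coordinates $(m_1, \dots, m_n)$. Any $n+1$ numbers $x_1, \dots, x_n, x_0$ that satisfy $x_0 \neq 0$ and $x_i / x_0 = m_i$ $(i = 1, \dots, n)$ form homogeneous coordinates $(x_1, \dots, x_n, x_0)$ of the point, and $(m_1, \dots, m_n)$ are called its inhomogeneous coordinates.
- Coordinates $(x_1, \dots, x_n, 0)$, with $x_1, \dots, x_n$ not all zero, are called homogeneous coordinates of a point at infinity.
- Example. Let an ordinary point on the Euclidean line have coordinate $x$. Any two numbers with $x_1 / x_0 = x$ give homogeneous coordinates $(x_1, x_0)$ of the point, and $x$ is its inhomogeneous coordinate. For any $x_1 \neq 0$, $(x_1, 0)$ are homogeneous coordinates of the point at infinity.

Projective parameter

Cross-ratio

Projective transformation

Duality in the projective plane
- “Point” and “line” are called dual elements of the projective plane.
- “Drawing a line through a point” and “taking a point on a line” are called dual constructions.
- Take a proposition in the projective plane built from points, lines, and their incidence and order relations. Replacing each element by its dual element and each construction by its dual construction yields another proposition. The two are called plane-dual propositions.
- Principle of duality: in the projective plane, if a proposition holds, then its dual proposition also holds.

Harmonic relation
- If the cross-ratio of the point pairs $(P_1, P_2)$ and $(P_3, P_4)$ is $-1$, that is, $(P_1, P_2; P_3, P_4) = -1$, then $(P_1, P_2)$ and $(P_3, P_4)$ are said to be harmonic.
- The point pairs $(P_1, P_2)$ and $(P_3, P_4)$ are harmonic if and only if $(\lambda_1 + \lambda_2)(\lambda_3 + \lambda_4) = 2(\lambda_1 \lambda_2 + \lambda_3 \lambda_4)$, where $\lambda_i$ is the projective parameter of $P_i$ $(i = 1, \dots, 4)$.

Harmonic relation in the complete quadrangle (quadrilateral)

Conics

Absolute conic

Pole and polar
- For a conic $C$ and a point $A$ (as a vector), the line determined by $L = CA$ (in line coordinates) is called the … of point $A$ with respect to the conic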
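
The definitions above (homogeneous coordinates, points at infinity, the harmonic condition, and the relation L = CA) can be tried out in a few lines of Python. This is only a minimal sketch under stated assumptions, not the lecture's own material: all function names are hypothetical, planar points are ordered (x1, x2, x0) with the scale coordinate last, a line a·x1 + b·x2 + c·x0 = 0 is stored as (a, b, c), the cross-ratio formula in terms of projective parameters is one common convention (the slide's own definition did not survive), and the conic is assumed to be given as a symmetric 3×3 matrix.

```python
from fractions import Fraction


def to_homogeneous(m, x0=1):
    """Return (x1, ..., xn, x0) with xi / x0 = mi; x0 must be nonzero."""
    if x0 == 0:
        raise ValueError("x0 must be nonzero for a finite point")
    return tuple(Fraction(mi) * x0 for mi in m) + (Fraction(x0),)


def to_inhomogeneous(x):
    """Return (m1, ..., mn), or None when x0 == 0 (a point at infinity)."""
    *xs, x0 = x
    if x0 == 0:
        return None
    return tuple(Fraction(xi) / x0 for xi in xs)


def cross(a, b):
    """Cross product of two 3-vectors.

    In the projective plane it gives the meet of two lines
    (or, by duality, the join of two points).
    """
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cross_ratio(l1, l2, l3, l4):
    """(P1, P2; P3, P4) from projective parameters (assumed convention)."""
    return Fraction((l1 - l3) * (l2 - l4), (l2 - l3) * (l1 - l4))


def is_harmonic(l1, l2, l3, l4):
    """Harmonic condition (l1 + l2)(l3 + l4) = 2(l1*l2 + l3*l4)."""
    return (l1 + l2) * (l3 + l4) == 2 * (l1 * l2 + l3 * l4)


def polar_line(C, A):
    """L = C A for a 3x3 conic matrix C and a homogeneous point A."""
    return tuple(sum(C[i][j] * A[j] for j in range(3)) for i in range(3))


if __name__ == "__main__":
    # The parallel lines x - y = 0 and x - y + 1 = 0 meet at a point
    # whose last coordinate is 0, i.e. a point at infinity.
    meet = cross((1, -1, 0), (1, -1, 1))
    print(meet, to_inhomogeneous(meet))

    # Parameters 0, 3 and 2, 6 form a harmonic set: both tests agree.
    print(cross_ratio(0, 3, 2, 6), is_harmonic(0, 3, 2, 6))

    # Unit circle x^2 + y^2 - 1 = 0 as a matrix, and the point (2, 0).
    unit_circle = ((1, 0, 0), (0, 1, 0), (0, 0, -1))
    print(polar_line(unit_circle, (2, 0, 1)))
```

With the unit circle and the point (2, 0), `polar_line` returns the line coordinates of 2·x1 − x0 = 0, that is, the vertical line x = 1/2. Exact rational arithmetic via `Fraction` is used here so that the harmonic test can be an equality check; with floating-point data a tolerance would be needed instead.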