计算机视觉——两视图几何求解投影矩阵

上文我提到了通过图像匹配得到基本矩阵，接下来我们要接着求解投影矩阵。

计算投影矩阵思路

假设两个投影矩阵为规范化相机，因此采用基本矩阵进行恢复。在规范化相机下， $P = [I ∣0]$ , $P^{'} = [M ∣ m]$ 。
我们知道一对 $(P, P^{'})$ 可以唯一确定基本矩阵 $F=[m]_xM$ ，而基本矩阵则在相差一个射影变换的意义下才能唯一对应一组 $(P, P^{'})$ 。
$P = [I ∣0]$ , $P'=[[e']_xF|e']$ ,其中 $e^{'}$ 为 $e'^TF=0$ 的对极点。因此我们需要求解对极点。而对极点是极线聚焦的点：
$l_1 ×l_2=e$
同时我们可以通过两点确定一条直线：
$p_1 × p_2 = l$

这种点叉积为线，线叉积为点叫做对偶

def drawlines(img1,img2,lines,pts1,pts2):

    r,c,_ = img1.shape
    for r,pt1,pt2 in zip(lines,pts1,pts2):
        color = tuple(np.random.randint(0,255,3).tolist())
        x0,y0 = map(int, [0, -r[2]/r[1] ])
        x1,y1 = map(int, [c, -(r[2]+r[0]*c)/r[1] ])
        img1 = cv2.line(img1, (x0,y0), (x1,y1), color,1)
        img1 = cv2.circle(img1,tuple(pt1),5,color,-1)
        img2 = cv2.circle(img2,tuple(pt2),5,color,-1)
    return img1,img2

pts1 = np.int32(src_points)
pts2 = np.int32(dst_points)
lines1 = cv2.computeCorrespondEpilines(pts2.reshape(-1,1,2), 2,F)
lines1 = lines1.reshape(-1,3)
img_line1,_ = drawlines(left_img,right_img,lines1,pts1,pts2)

lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1,1,2), 1,F)
lines2 = lines2.reshape(-1,3)
img_line2,_ = drawlines(right_img,left_img,lines2,pts2,pts1)

在这里插入图片描述
然后我们统计一下对极点，还是采用SVD分解。因为两条直线相交能够唯一确定一个点，而我们这里有数十条直线，它们不一定能确保相交于一个点。因此我们需要找到一个点 $e$ ，使得 $e$ 到每一条直线的距离和最小，也就是：
$\underset{e}{argmin}\sum_{i}^{N} dist(l_i,e)$
然后根据我上面写的公式 $P = [I ∣0]$ , $P'=[[e']_xF|e']$ 即可算出P矩阵

Ⅰ. $a]_x$ 表示a×b中a的矩阵写法。当我们计算两个向量的叉积时，可以将向量写成一个反对称矩阵 $a]_x$ 的形式。
Ⅱ. $设\vec{a}=(a_1,a_2,a_3)，[a]_x=\begin{bmatrix} 0 & -a_3 &a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}$
Ⅲ. np.array格式的矩阵可以用@进行相乘，当然还有np.dot()，np.multiply()，np.matmul()，可以查一下区别。

def getEpiPoint(A):
    U,sigma,VT = np.linalg.svd(A)
    pts = VT.T[:,-1]
    pts  = pts / pts[2]
    return pts
Epts1 = getEpiPoint(lines1) 
Epts2 = getEpiPoint(lines2)
e2x = np.array([[0,-Epts2[2],Epts2[1]],[Epts2[2],0,-Epts2[0]],[-Epts2[1],Epts2[0],0]])
P1 = np.array([[1,0,0,0],
      [0,1,0,0],
      [0,0,1,0]])
P2 = np.column_stack((e2x @ F,np.array(Epts2).T))

对投影矩阵分解得到内参

我们知道 $P = K [R ∣ t] = [K R ∣ K t]$ ，设 $P$ 前三列为 $\hat P$ ，那么我们对 $\hat P$ 进行QR分解可以得到两矩阵。在QR分解中，左Q矩阵为正交矩阵，右R矩阵为上三角矩阵，而 $\hat{P}=KR$ 中K为上三角矩阵，R为正交矩阵，与QR分解得到的矩阵并不能对应上。因此我们应该先对 $\hat{P}$ 进行求逆， $\hat{P}^{-1}=R^{-1}K^{-1}$ ，而正交矩阵和上三角矩阵的逆仍保持自身的性质，因此我们对 $\hat{P}$ 进行QR分解，得到矩阵再求一遍逆即可。

def decompose_projection_matrix(projection_matrix):

    R_inv, K_inv = np.linalg.qr(np.linalg.inv(projection_matrix[:, :3])) 
    K_inv_tmp = K_inv
    K_inv /= K_inv[2, 2]  # 归一化

    K = np.linalg.inv(K_inv)
    R = np.linalg.inv(R_inv)
    # if np.linalg.det(K) < 0:
    #     K *= -1
    extrinsic_matrix = np.matmul(K_inv_tmp, projection_matrix)

    return K, R, extrinsic_matrix

线性三角测量

投影矩阵是将世界三维点投影到图像二维点，因此我们想通过图像二位点和投影矩阵恢复出三维点，从而恢复出图像中物体三维结构，这个过程叫做三维重建。想要恢复出较好的三维图，需要大量的多角度图片进行拍摄和计算。而两张图片只能用于简单的三维坐标计算。
$\lambda \begin{bmatrix} x\\ y\\ 1 \end{bmatrix}=P\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$
想要使用SVD分解就需要构造 $A X = 0$ 的形式。我们发现x与PX是成比例的的，因此x × PX=0，从而得到：
$\begin{align*} x(p^{3T}X)-(p^{1T}X) & = 0 \\ y(p^{3T}X)-(p^{2T}X) & =0\\ x(p^{2T}X)-y(p^{1T}X) & = 0 \end{align*}$
其中 $p^{iT}$ 是 $P$ 的行的转置，这三个方程的系数的秩为2。
然后结合x × PX=0、x’ × P’X=0得到：
$\begin{bmatrix} xp^{3T}-p^{1T} \\ yp^{3T}-p^{2T} \\ x'p'^{3T}-p'^{1T} \\ y'p'^{3T}-p'^{2T} \end{bmatrix},AX=0$
从而能够解出X的坐标。

def triangulate(P1, P2, x1, x2):
    A = np.vstack((x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]))
    _, _, VT = np.linalg.svd(A)
    X_homogeneous = VT.T[:,-1]
    
    X_homogeneous /= X_homogeneous[3] 
    X = X_homogeneous[:3] 
    
    return X