Mar 14, 2024
[Runyi Yang](<https://runyiyang.github.io/>) / [Runyi’s Blogs](<https://runyiyang.notion.site/Runyi-s-Blogs-f52d6bf73e104c51a4f5e80529b6a9b6>)
3D content combines a shape model and an appearance model, and can be rendered into 2D images from different viewpoints.
View synthesis is the problem of generating a synthetic image that looks as if it were taken from a novel viewpoint.
Imagine you have a set of images of a scene that is sufficient to describe the details of that scene or object in the 3D world. You could readily imagine what it would look like from a novel view. The task is to teach algorithms / computers to do the same.
PSNR: Peak Signal to Noise Ratio
$$ MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1} [I(i,j) - K(i, j)]^2 $$
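$$ PSNR = 10 \cdot \log_{10}\left(\frac{MAX^2_I}{MSE}\right) $$
where $I$ and $K$ are the two $m \times n$ images being compared and $MAX_I$ is the maximum possible pixel value (e.g. 255 for 8-bit images). A higher PSNR means the two images are closer.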
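As a rough illustration of the two formulas above, here is a minimal pure-Python sketch. The function names `mse` and `psnr`, the `max_value` parameter, and the tiny 2×2 example images are illustrative assumptions, not part of any particular library; in practice one would typically use array operations instead of nested lists.

```python
import math

def mse(img_i, img_k):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    m, n = len(img_i), len(img_i[0])
    total = sum(
        (img_i[r][c] - img_k[r][c]) ** 2
        for r in range(m)
        for c in range(n)
    )
    return total / (m * n)

def psnr(img_i, img_k, max_value=255.0):
    """PSNR in decibels; max_value is MAX_I (assumed 255 for 8-bit images)."""
    err = mse(img_i, img_k)
    if err == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

# Hypothetical 2x2 grayscale images used only for illustration.
reference = [[0, 0], [255, 255]]
noisy = [[0, 10], [255, 245]]

print(mse(reference, noisy))             # 50.0
print(round(psnr(reference, noisy), 2))  # about 31.14 dB
```

Two pixels differ by 10, so the MSE is (100 + 100) / 4 = 50, and the PSNR follows from 10 · log10(255² / 50).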
SSIM: Structural Similarity Index Measure
SSIM describes the similarity between two images and is used to determine whether an image has been distorted.
It compares three properties of the images: luminance, contrast, and structure.
Luminance
$$ \mu_x = \frac{1}{N}\sum^N_{i=1}x_i \\ l(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2+\mu_y^2+C_1} $$
Contrast
$$ \sigma_x = (\frac{1}{N-1}\sum_{i=1}^N(x_i-\mu_x)^2)^\frac{1}{2} \\ c(x,y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2+\sigma_y^2+C_2} $$
Structure
$$ \sigma_{xy}=\frac{1}{N-1}\sum_{i=1}^N(x_i-\mu_x)(y_i-\mu_y) \\ s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3} $$
SSIM
$$ SSIM(x,y) = l(x,y)^\alpha \cdot c(x,y)^\beta \cdot s(x,y)^\gamma $$
where $\alpha, \beta, \gamma$ are hyperparameters and are usually set to 1.
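The way the three terms combine can be sketched in a few lines of Python. This is a simplified version that computes the statistics once over the whole image (flattened to a list of pixel values); practical implementations usually evaluate SSIM over local windows and average the results. The constants $C_1 = (0.01L)^2$, $C_2 = (0.03L)^2$ and $C_3 = C_2 / 2$ are assumptions not stated above (they are common defaults, with $L$ the pixel value range), and the name `ssim_global` is illustrative.

```python
import math

def ssim_global(x, y, max_value=255.0, alpha=1.0, beta=1.0, gamma=1.0):
    """SSIM over two flattened images x and y, using one global window."""
    n = len(x)
    # Stabilizing constants (assumed defaults, not given in the text above).
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    c3 = c2 / 2.0

    # Means (luminance statistics).
    mu_x = sum(x) / n
    mu_y = sum(y) / n

    # Unbiased variances, standard deviations, and covariance.
    var_x = sum((v - mu_x) ** 2 for v in x) / (n - 1)
    var_y = sum((v - mu_y) ** 2 for v in y) / (n - 1)
    sigma_x = math.sqrt(var_x)
    sigma_y = math.sqrt(var_y)
    sigma_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (var_x + var_y + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)

    # With the default exponents of 1 this is a plain product.
    return luminance ** alpha * contrast ** beta * structure ** gamma

# Hypothetical pixel values used only for illustration.
pixels = [10.0, 20.0, 30.0, 40.0]
brighter = [v + 50.0 for v in pixels]  # same contrast and structure, shifted mean
reversed_pixels = pixels[::-1]         # same mean and contrast, inverted structure

print(ssim_global(pixels, pixels))           # very close to 1
print(ssim_global(pixels, brighter))         # below 1: only luminance differs
print(ssim_global(pixels, reversed_pixels))  # negative: structure is anti-correlated
```

The three examples isolate the terms: a brightness shift only lowers the luminance term, while reversing the pixel order leaves the mean and variance unchanged but flips the sign of the covariance.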
LPIPS: Learned Perceptual Image Patch Similarity
LPIPS compares images in deep feature space. This addresses the problem that L2, PSNR, and SSIM fail to recognize smoothed (blurred) images as perceptually different.
A deep neural network (e.g. AlexNet) extracts features from the two images, and the two sets of features are compared using an L2 / MSE distance.
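The feature comparison step can be sketched as follows. This is only a sketch under stated assumptions: the feature values are made up by hand, whereas in practice they would come from the layers of a pretrained network such as AlexNet. The unit normalization of each channel vector and the optional per-channel `weights` reflect how LPIPS is commonly described, and the function names `unit_normalize` and `feature_distance` are hypothetical.

```python
import math

def unit_normalize(vec, eps=1e-10):
    """Scale a channel vector to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def feature_distance(feats_a, feats_b, weights=None):
    """Perceptual-style distance between two stacks of feature maps.

    feats_a, feats_b: list of layers; each layer is a list of spatial
    positions; each position is a list of channel activations.
    weights: optional per-layer list of per-channel weights (assumed to be
    learned elsewhere); defaults to 1.0 for every channel.
    """
    total = 0.0
    for layer_idx, (layer_a, layer_b) in enumerate(zip(feats_a, feats_b)):
        n_channels = len(layer_a[0])
        w = weights[layer_idx] if weights is not None else [1.0] * n_channels
        layer_sum = 0.0
        for pos_a, pos_b in zip(layer_a, layer_b):
            a = unit_normalize(pos_a)
            b = unit_normalize(pos_b)
            # Weighted squared L2 distance across channels at this position.
            layer_sum += sum(wc * (ac - bc) ** 2 for wc, ac, bc in zip(w, a, b))
        # Average over spatial positions, then accumulate over layers.
        total += layer_sum / len(layer_a)
    return total

# Made-up features: one layer, two spatial positions, two channels.
feats_ref = [[[1.0, 0.0], [0.0, 2.0]]]
feats_other = [[[0.0, 1.0], [0.0, 2.0]]]

print(feature_distance(feats_ref, feats_ref))    # 0.0
print(feature_distance(feats_ref, feats_other))  # close to 1.0
```

In the second call only the first position differs: its normalized vectors are orthogonal, giving a squared distance of about 2, which averages to about 1 over the two positions.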