Video Super-Resolution Project
¹Chih-Chun Hsu, ¹Chia-Wen Lin, and Li-Wei Kang
¹Department of Electrical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
Each demo video shows six videos side by side. We compare the proposed method with sparse-coding SR (Sc-SR), ASDS-SR, NLIBP-SR, and TS-SR. The native resolution of videos #5, #7, and #8 is 1280x720, while the resolution of the remaining videos is 640x480. For a fair comparison, we suggest setting the YouTube player to the highest available resolution.
All demo videos (without cropping) can be downloaded from Here.
Video #1: Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
The proposed method reconstructs fine details with temporally coherent results. Although the video reconstructed by TS-SR also shows fine details, it suffers from temporal incoherence. NLIBP-SR tends to over-sharpen, while ASDS-SR and Sc-SR produce over-smoothed results.
Table #1 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #1
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 80.00% | 85.00% | 90.00% | 95.00% | 90.00% | 88.00% |
ASDS-SR | 20.00% | - | 50.00% | 60.00% | 90.00% | 70.00% | 58.00% |
NLIBP-SR | 15.00% | 50.00% | - | 60.00% | 90.00% | 70.00% | 57.00% |
SC-SR | 10.00% | 40.00% | 40.00% | - | 80.00% | 60.00% | 46.00% |
Bicubic | 5.00% | 10.00% | 10.00% | 20.00% | - | 40.00% | 17.00% |
TS-SR | 10.00% | 30.00% | 30.00% | 40.00% | 60.00% | - | 34.00% |
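The page does not state how the Average column is derived. The following minimal sketch assumes it is the plain mean of the five pairwise winning percentages in each row, and that the entries for a pair (A vs. B and B vs. A) are meant to sum to 100%. The names `METHODS`, `WINS`, `row_averages`, and `is_complementary` are hypothetical and chosen only for illustration; the numbers are copied from Table #1.

```python
# Relative winning percentages from Table #1 (row method vs. column method).
METHODS = ["Proposed", "ASDS-SR", "NLIBP-SR", "SC-SR", "Bicubic", "TS-SR"]

# None marks the diagonal: a method is never compared with itself.
WINS = [
    [None, 80, 85, 90, 95, 90],
    [20, None, 50, 60, 90, 70],
    [15, 50, None, 60, 90, 70],
    [10, 40, 40, None, 80, 60],
    [5, 10, 10, 20, None, 40],
    [10, 30, 30, 40, 60, None],
]


def row_averages(wins):
    """Mean winning percentage of each method over its opponents (assumed rule)."""
    averages = []
    for row in wins:
        scores = [value for value in row if value is not None]
        averages.append(sum(scores) / len(scores))
    return averages


def is_complementary(wins):
    """True if wins[i][j] + wins[j][i] == 100 for every pair of distinct methods."""
    n = len(wins)
    return all(
        wins[i][j] + wins[j][i] == 100
        for i in range(n)
        for j in range(i + 1, n)
    )


if __name__ == "__main__":
    for name, average in zip(METHODS, row_averages(WINS)):
        print(f"{name:10s} {average:.2f}%")
    print("complementary:", is_complementary(WINS))
```

Under these assumptions the computed row means match the Average column of Table #1, and every pair of opposing entries sums to 100%.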
Video #2: Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
In this video, the proposed method reconstructs more fine detail than the other methods. The dynamic background (grass) also remains temporally coherent.
Table #2 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #2
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 75.00% | 80.00% | 85.00% | 90.00% | 85.00% | 83.00% |
ASDS-SR | 25.00% | - | 60.00% | 60.00% | 90.00% | 70.00% | 61.00% |
NLIBP-SR | 20.00% | 40.00% | - | 55.00% | 90.00% | 75.00% | 56.00% |
SC-SR | 15.00% | 40.00% | 45.00% | - | 90.00% | 60.00% | 50.00% |
Bicubic | 10.00% | 10.00% | 10.00% | 10.00% | - | 40.00% | 16.00% |
TS-SR | 15.00% | 30.00% | 25.00% | 40.00% | 60.00% | - | 34.00% |
Video #3: Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
The rendered, super-resolved video produced by the proposed method shows high-quality reconstruction with temporal coherence. Compared with the other SR methods, the proposed method performs better: it is preferred in 80% to 90% of the paired comparisons in Table #3.
Table #3 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #3
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 85.00% | 80.00% | 85.00% | 90.00% | 85.00% | 85.00% |
ASDS-SR | 15.00% | - | 55.00% | 55.00% | 80.00% | 60.00% | 53.00% |
NLIBP-SR | 20.00% | 45.00% | - | 60.00% | 70.00% | 75.00% | 54.00% |
SC-SR | 15.00% | 45.00% | 40.00% | - | 70.00% | 55.00% | 45.00% |
Bicubic | 10.00% | 20.00% | 30.00% | 30.00% | - | 45.00% | 27.00% |
TS-SR | 15.00% | 40.00% | 25.00% | 45.00% | 55.00% | - | 36.00% |
Video #4: Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
This video contains fast motion. We can observe that the proposed method still generates a high-quality reconstruction that outperforms the results of the other methods. In addition, the dynamic background (grass) is reconstructed in a temporally coherent way, and the static background (i.e., the stone wall) is reconstructed without the jitter-like artifacts seen with TS-SR.
Table #4 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #4
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 85.00% | 90.00% | 90.00% | 90.00% | 95.00% | 90.00% |
ASDS-SR | 15.00% | - | 60.00% | 70.00% | 80.00% | 65.00% | 58.00% |
NLIBP-SR | 10.00% | 40.00% | - | 60.00% | 80.00% | 65.00% | 51.00% |
SC-SR | 10.00% | 30.00% | 40.00% | - | 95.00% | 50.00% | 45.00% |
Bicubic | 10.00% | 20.00% | 20.00% | 5.00% | - | 40.00% | 19.00% |
TS-SR | 5.00% | 35.00% | 35.00% | 50.00% | 60.00% | - | 37.00% |
Video #5 (Lucy): Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
The details of the building, especially in the windows, are reconstructed significantly better by the proposed method than by other methods such as ASDS-SR and NLIBP-SR. Notably, the performance of the proposed method on general video is comparable with that of other advanced SR methods.
Table #5 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #5
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 80.00% | 90.00% | 90.00% | 95.00% | 90.00% | 89.00% |
ASDS-SR | 20.00% | - | 45.00% | 65.00% | 90.00% | 70.00% | 58.00% |
NLIBP-SR | 10.00% | 55.00% | - | 60.00% | 80.00% | 70.00% | 55.00% |
SC-SR | 10.00% | 35.00% | 40.00% | - | 80.00% | 55.00% | 44.00% |
Bicubic | 5.00% | 10.00% | 20.00% | 20.00% | - | 40.00% | 19.00% |
TS-SR | 10.00% | 30.00% | 30.00% | 45.00% | 60.00% | - | 35.00% |
Video #7 (Campus): Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
The larger motion of the dynamic textures (trees) results in smoothed synthesized dynamic textures. Nevertheless, the visual quality of the video reconstructed by the proposed method still slightly outperforms that of the other methods.
Table #6 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #7
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 65.00% | 75.00% | 80.00% | 90.00% | 90.00% | 80.00% |
ASDS-SR | 35.00% | - | 65.00% | 60.00% | 85.00% | 75.00% | 64.00% |
NLIBP-SR | 25.00% | 35.00% | - | 50.00% | 85.00% | 85.00% | 56.00% |
SC-SR | 20.00% | 40.00% | 50.00% | - | 90.00% | 60.00% | 52.00% |
Bicubic | 10.00% | 15.00% | 15.00% | 10.00% | - | 55.00% | 21.00% |
TS-SR | 10.00% | 25.00% | 15.00% | 40.00% | 45.00% | - | 27.00% |
Video #8 (Ocean): Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
In this video, the dynamic textures are quite limited, but the synthesized result still outperforms the other methods.
Table #7 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #8
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 85.00% | 80.00% | 95.00% | 95.00% | 85.00% | 88.00% |
ASDS-SR | 15.00% | - | 65.00% | 75.00% | 90.00% | 70.00% | 63.00% |
NLIBP-SR | 20.00% | 35.00% | - | 75.00% | 85.00% | 75.00% | 58.00% |
SC-SR | 5.00% | 25.00% | 25.00% | - | 90.00% | 45.00% | 38.00% |
Bicubic | 5.00% | 10.00% | 15.00% | 10.00% | - | 40.00% | 16.00% |
TS-SR | 15.00% | 30.00% | 25.00% | 55.00% | 60.00% | - | 37.00% |
Video #6 (Building): Top-left: Ground truth, Top-middle: NLIBP-SR [6], Top-right: The proposed method, Bottom-left: Sc-SR [9], Bottom-middle: ASDS-SR [11], and Bottom-right: TS-SR [26].
In this video, the details of the building are well reconstructed using TS-SR. The proposed DTS-SR method is then used to maintain temporal consistency. In conclusion, the proposed method has superior performance over existing SR methods.
Table #8 Subjective “visual quality” evaluation by paired comparisons (in relative winning percentage) for the HR video #6
Method | Proposed | ASDS-SR | NLIBP-SR | SC-SR | Bicubic | TS-SR | Average |
Proposed | - | 90.00% | 85.00% | 95.00% | 95.00% | 85.00% | 90.00% |
ASDS-SR | 10.00% | - | 55.00% | 60.00% | 90.00% | 70.00% | 57.00% |
NLIBP-SR | 15.00% | 45.00% | - | 75.00% | 85.00% | 75.00% | 59.00% |
SC-SR | 5.00% | 40.00% | 25.00% | - | 90.00% | 60.00% | 44.00% |
Bicubic | 5.00% | 10.00% | 15.00% | 10.00% | - | 50.00% | 18.00% |
TS-SR | 15.00% | 30.00% | 25.00% | 40.00% | 50.00% | - | 32.00% |
[6] W. Dong, L. Zhang, G. Shi, and X. Wu, “Nonlocal back-projection for adaptive image enlargement,” in Proc. IEEE Int. Conf. Image Process., Cairo, Egypt, Nov. 2009, pp. 349–352.
[9] J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, Nov. 2010.
[11] W. Dong, L. Zhang, G. Shi, and X. Wu, “Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization,” IEEE Trans. Image Process., vol. 20, no. 7, pp. 1838–1857, July 2011.
[26] Y. HaCohen, R. Fattal, and D. Lischinski, “Image upsampling via texture hallucination,” in Proc. IEEE Int. Conf. Comput. Photography, Cambridge, MA, USA, Mar. 2010, pp. 20–30.