In recent years, most efforts have built on the Swin Transformer as a base model, trying to enlarge the receptive field or design complex modules to strengthen feature extraction. However, we observed a common phenomenon: the intensity distribution of the feature maps tends to widen as network depth increases, but is suppressed into a much smaller range near the end of the network. This may implicitly limit the model's performance upper bound, because spatial information is lost, or nearly vanishes, when it runs into an information bottleneck, which in turn forces people to design very complex networks or modules to push SR performance any higher.
Our research and competition work has always emphasized one point: practice and theory carry equal weight. We propose the Dense-residual-connected Transformer (DRCT), which incorporates dense connections to stabilize the forward process without adding extra modules, letting the SR model "stay away" from the information bottleneck. Compared with current SOTA models, it delivers a significant performance gain while using 33% fewer parameters. This design matches the concept we value: effective, but not complex.
[Paper] https://arxiv.org/abs/2404.00722
[Project Page] https://allproj002.github.io/drct.github.io/
============
Congratulations to Chia-Ming Li and Yi-Hsuan Nelly Chou for achieving the Top 3% Performance Award (6/199) in the CVPR NTIRE 2024 Image Super-Resolution Competition!
This is a remarkable achievement, especially considering that they were competing against large research teams and companies. Moreover, their paper from this competition has been accepted by NTIRE. NTIRE is a well-known workshop held annually at CVPR that aims to foster broader understanding and discussion among researchers on key computer vision tasks such as image restoration, enhancement, and editing. Specifically, it focuses on recovering degraded image content, filling in missing information, and achieving desired targets (improving perceptual quality, content, or the performance of applications that process such images).
In recent years, many works have used the Swin Transformer as the base model, attempting to enlarge the receptive field or to design complex modules that strengthen feature extraction. However, the authors observed a common phenomenon: the intensity distribution of the feature maps tends to widen as network depth increases, but is suppressed into a smaller range at the final stage. This may implicitly limit the upper bound of the model's performance, as spatial information can be lost, or nearly vanish, at this information bottleneck, so very complex networks or modules end up being required to push SR performance further.
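To make this observation concrete, here is a minimal sketch (not from the paper) of how one might log the intensity of feature maps across depth with PyTorch forward hooks. Hooking every leaf module is a simplifying assumption; in practice one would target the SR network's residual/Swin blocks directly.

```python
import torch
import torch.nn as nn

def log_feature_intensity(model: nn.Module, x: torch.Tensor):
    """Run one forward pass and record the std of every block's output."""
    stats = []
    hooks = []

    def make_hook(name):
        def hook(module, inputs, output):
            # Only tensor outputs carry a measurable intensity.
            if torch.is_tensor(output):
                stats.append((name, output.detach().float().std().item()))
        return hook

    # Attach a hook to every leaf module (a stand-in for the network's blocks).
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return stats  # [(layer_name, std), ...] in execution order
```

Plotting these per-layer standard deviations against depth is one simple way to see whether the intensity grows through the middle layers and then collapses near the output.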
Their research and competition work has always centered on one key point: practice and theory are equally important.
They propose the Dense-residual-connected Transformer (DRCT), which incorporates dense connections to stabilize the forward process without adding extra modules, allowing the SR model to "stay away" from the information bottleneck. Compared with current SOTA models, it delivers a significant performance gain while using 33% fewer parameters. This design embodies their guiding concept: effective, but not complex.
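As an illustration of the dense-residual idea, here is a minimal, self-contained PyTorch sketch, not the official DRCT implementation: plain convolutional blocks stand in for the paper's Swin-based layers, and the channel counts and growth rate are hypothetical.

```python
import torch
import torch.nn as nn

class DenseResidualGroup(nn.Module):
    """Each block sees the concatenation of all previous features
    (dense connection); a 1x1 fusion plus a skip keeps the group residual."""

    def __init__(self, channels: int = 64, growth: int = 32, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList()
        in_ch = channels
        for _ in range(n_blocks):
            self.blocks.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
            ))
            in_ch += growth  # inputs grow as features accumulate
        # Fuse all accumulated features back to the base width.
        self.fuse = nn.Conv2d(in_ch, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for block in self.blocks:
            feats.append(block(torch.cat(feats, dim=1)))
        # Residual connection keeps the forward signal stable.
        return x + self.fuse(torch.cat(feats, dim=1))

# Usage: a 64-channel feature map passes through with its shape unchanged.
y = DenseResidualGroup()(torch.randn(1, 64, 48, 48))
```

The dense concatenation lets every later block reuse all earlier features, while the residual skip preserves the input signal, which is the property the authors credit with keeping feature intensities stable and steering the forward process away from the bottleneck.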