Maheshkumar H. Kolekar

CFAT: Unleashing Triangular Windows for Image Super-resolution

Mar 24, 2024
Abhisek Ray, Gaurav Kumar, Maheshkumar H. Kolekar

Transformer-based models have revolutionized the field of image super-resolution (SR) by harnessing their inherent ability to capture complex contextual features. The overlapping rectangular shifted-window technique is now common practice in transformer-based SR models to improve the quality and robustness of image upscaling. However, it suffers from distortion at the boundaries and has limited unique shifting modes. To overcome these weaknesses, we propose a non-overlapping triangular window technique that works synchronously with the rectangular one to mitigate boundary-level distortion and give the model access to more unique shifting modes. In this paper, we propose a Composite Fusion Attention Transformer (CFAT) that combines triangular-rectangular window-based local attention with channel-based global attention for image super-resolution. As a result, CFAT enables attention mechanisms to be activated on more image pixels and captures long-range, multi-scale features to improve SR performance. Extensive experimental results and an ablation study demonstrate the effectiveness of CFAT in the SR domain. Our proposed model shows a significant 0.7 dB performance improvement over other state-of-the-art SR architectures.

* Accepted to CVPR 2024


Visual Interest Prediction with Attentive Multi-Task Transfer Learning

May 27, 2020
Deepanway Ghosal, Maheshkumar H. Kolekar


Visual interest and affect prediction is an interesting area of research in computer vision. In this paper, we propose a neural network model based on transfer learning and an attention mechanism to predict visual interest and affective dimensions in digital photos. Learning the multi-dimensional affects is addressed through a multi-task learning framework. Through various experiments, we show the effectiveness of the proposed approach. Evaluation of our model on the benchmark dataset shows a large improvement over current state-of-the-art systems.
