Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chang-Su Kim

Semantic Line Combination Detector

May 01, 2024
Jinwon Ko, Dongkwon Jin, Chang-Su Kim

A novel algorithm, called semantic line combination detector (SLCD), to find an optimal combination of semantic lines is proposed in this paper. It processes all lines in each line combination at once to assess the overall harmony of the lines. First, we generate various line combinations from reliable lines. Second, we estimate the score of each line combination and determine the best one. Experimental results demonstrate that the proposed SLCD outperforms existing semantic line detectors on various datasets. Moreover, it is shown that SLCD can be applied effectively to three vision tasks of vanishing point detection, symmetry axis detection, and composition-based image retrieval. Our codes are available at https://github.com/Jinwon-Ko/SLCD.

* CVPR 2024 accepted

Via

Access Paper or Ask Questions

Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement

Apr 30, 2024
Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

The main function of depth completion is to compensate for an insufficient and unpredictable number of sparse depth measurements of hardware sensors. However, existing research on depth completion assumes that the sparsity -- the number of points or LiDAR lines -- is fixed for training and testing. Hence, the completion performance drops severely when the number of sparse depths changes significantly. To address this issue, we propose the sparsity-adaptive depth refinement (SDR) framework, which refines monocular depth estimates using sparse depth points. For SDR, we propose the masked spatial propagation network (MSPN) to perform SDR with a varying number of sparse depths effectively by gradually propagating sparse depth information throughout the entire depth map. Experimental results demonstrate that MPSN achieves state-of-the-art performance on both SDR and conventional depth completion scenarios.

Via

Access Paper or Ask Questions

Clicks2Line: Using Lines for Interactive Image Segmentation

Apr 29, 2024
Chaewon Lee, Chang-Su Kim

For click-based interactive segmentation methods, reducing the number of clicks required to obtain a desired segmentation result is essential. Although recent click-based methods yield decent segmentation results, we observe that substantial amount of clicks are required to segment elongated regions. To reduce the amount of user-effort required, we propose using lines instead of clicks for such cases. In this paper, an interactive segmentation algorithm which adaptively adopts either clicks or lines as input is proposed. Experimental results demonstrate that using lines can generate better segmentation results than clicks for several cases.

Via

Access Paper or Ask Questions

MFP: Making Full Use of Probability Maps for Interactive Image Segmentation

Apr 29, 2024
Chaewon Lee, Seon-Ho Lee, Chang-Su Kim

In recent interactive segmentation algorithms, previous probability maps are used as network input to help predictions in the current segmentation round. However, despite the utilization of previous masks, useful information contained in the probability maps is not well propagated to the current predictions. In this paper, to overcome this limitation, we propose a novel and effective algorithm for click-based interactive image segmentation, called MFP, which attempts to make full use of probability maps. We first modulate previous probability maps to enhance their representations of user-specified objects. Then, we feed the modulated probability maps as additional input to the segmentation network. We implement the proposed MFP algorithm based on the ResNet-34, HRNet-18, and ViT-B backbones and assess the performance extensively on various datasets. It is demonstrated that MFP meaningfully outperforms the existing algorithms using identical backbones. The source codes are available at \href{https://github.com/cwlee00/MFP}{https://github.com/cwlee00/MFP}.

* Accepted to CVPR 2024

Via

Access Paper or Ask Questions

Recursive Video Lane Detection

Aug 22, 2023
Dongkwon Jin, Dahyun Kim, Chang-Su Kim

Figure 1 for Recursive Video Lane Detection

Figure 2 for Recursive Video Lane Detection

Figure 3 for Recursive Video Lane Detection

Figure 4 for Recursive Video Lane Detection

A novel algorithm to detect road lanes in videos, called recursive video lane detector (RVLD), is proposed in this paper, which propagates the state of a current frame recursively to the next frame. RVLD consists of an intra-frame lane detector (ILD) and a predictive lane detector (PLD). First, we design ILD to localize lanes in a still frame. Second, we develop PLD to exploit the information of the previous frame for lane detection in a current frame. To this end, we estimate a motion field and warp the previous output to the current frame. Using the warped information, we refine the feature map of the current frame to detect lanes more reliably. Experimental results show that RVLD outperforms existing detectors on video lane datasets. Our codes are available at https://github.com/dongkwonjin/RVLD.

* ICCV 2023 accepted

Via

Access Paper or Ask Questions

BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

Apr 05, 2023
Junheum Park, Jintae Kim, Chang-Su Kim

Figure 1 for BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

Figure 2 for BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

Figure 3 for BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

Figure 4 for BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

A novel 4K video frame interpolator based on bilateral transformer (BiFormer) is proposed in this paper, which performs three steps: global motion estimation, local motion refinement, and frame synthesis. First, in global motion estimation, we predict symmetric bilateral motion fields at a coarse scale. To this end, we propose BiFormer, the first transformer-based bilateral motion estimator. Second, we refine the global motion fields efficiently using blockwise bilateral cost volumes (BBCVs). Third, we warp the input frames using the refined motion fields and blend them to synthesize an intermediate frame. Extensive experiments demonstrate that the proposed BiFormer algorithm achieves excellent interpolation performance on 4K datasets. The source codes are available at https://github.com/JunHeum/BiFormer.

* Accepted to CVPR2023

Via

Access Paper or Ask Questions

Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

Mar 20, 2023
Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

Figure 1 for Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

Figure 2 for Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

Figure 3 for Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

Figure 4 for Versatile Depth Estimator Based on Common Relative Depth Estimation and Camera-Specific Relative-to-Metric Depth Conversion

A typical monocular depth estimator is trained for a single camera, so its performance drops severely on images taken with different cameras. To address this issue, we propose a versatile depth estimator (VDE), composed of a common relative depth estimator (CRDE) and multiple relative-to-metric converters (R2MCs). The CRDE extracts relative depth information, and each R2MC converts the relative information to predict metric depths for a specific camera. The proposed VDE can cope with diverse scenes, including both indoor and outdoor scenes, with only a 1.12\% parameter increase per camera. Experimental results demonstrate that VDE supports multiple cameras effectively and efficiently and also achieves state-of-the-art performance in the conventional single-camera scenario.

Via

Access Paper or Ask Questions

Context-Based Trit-Plane Coding for Progressive Image Compression

Mar 13, 2023
Seungmin Jeon, Kwang Pyo Choi, Youngo Park, Chang-Su Kim

Figure 1 for Context-Based Trit-Plane Coding for Progressive Image Compression

Figure 2 for Context-Based Trit-Plane Coding for Progressive Image Compression

Figure 3 for Context-Based Trit-Plane Coding for Progressive Image Compression

Figure 4 for Context-Based Trit-Plane Coding for Progressive Image Compression

Trit-plane coding enables deep progressive image compression, but it cannot use autoregressive context models. In this paper, we propose the context-based trit-plane coding (CTC) algorithm to achieve progressive compression more compactly. First, we develop the context-based rate reduction module to estimate trit probabilities of latent elements accurately and thus encode the trit-planes compactly. Second, we develop the context-based distortion reduction module to refine partial latent tensors from the trit-planes and improve the reconstructed image quality. Third, we propose a retraining scheme for the decoder to attain better rate-distortion tradeoffs. Extensive experiments show that CTC outperforms the baseline trit-plane codec significantly in BD-rate on the Kodak lossless dataset, while increasing the time complexity only marginally. Our codes are available at https://github.com/seungminjeon-github/CTC.

* Accepted to CVPR 2023

Via

Access Paper or Ask Questions

Applying Eigencontours to PolarMask-Based Instance Segmentation

Aug 24, 2022
Wonhui Park, Dongkwon Jin, Chang-Su Kim

Figure 1 for Applying Eigencontours to PolarMask-Based Instance Segmentation

Figure 2 for Applying Eigencontours to PolarMask-Based Instance Segmentation

Figure 3 for Applying Eigencontours to PolarMask-Based Instance Segmentation

Figure 4 for Applying Eigencontours to PolarMask-Based Instance Segmentation

Eigencontours are the first data-driven contour descriptors based on singular value decomposition. Based on the implementation of ESE-Seg, eigencontours were applied to the instance segmentation task successfully. In this report, we incorporate eigencontours into the PolarMask network for instance segmentation. Experimental results demonstrate that the proposed algorithm yields better results than PolarMask on two instance segmentation datasets of COCO2017 and SBD. Also, we analyze the characteristics of eigencontours qualitatively. Our codes are available at https://github.com/dnjs3594/Eigencontours.

Via

Access Paper or Ask Questions

Depth Map Decomposition for Monocular Depth Estimation

Aug 23, 2022
Jinyoung Jun, Jae-Han Lee, Chul Lee, Chang-Su Kim

Figure 1 for Depth Map Decomposition for Monocular Depth Estimation

Figure 2 for Depth Map Decomposition for Monocular Depth Estimation

Figure 3 for Depth Map Decomposition for Monocular Depth Estimation

Figure 4 for Depth Map Decomposition for Monocular Depth Estimation

We propose a novel algorithm for monocular depth estimation that decomposes a metric depth map into a normalized depth map and scale features. The proposed network is composed of a shared encoder and three decoders, called G-Net, N-Net, and M-Net, which estimate gradient maps, a normalized depth map, and a metric depth map, respectively. M-Net learns to estimate metric depths more accurately using relative depth features extracted by G-Net and N-Net. The proposed algorithm has the advantage that it can use datasets without metric depth labels to improve the performance of metric depth estimation. Experimental results on various datasets demonstrate that the proposed algorithm not only provides competitive performance to state-of-the-art algorithms but also yields acceptable results even when only a small amount of metric depth data is available for its training.

Via

Access Paper or Ask Questions