Jingdong Wang

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

Jun 05, 2024
Via arXiv

OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding

Jun 04, 2024
Via arXiv

StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond

Jun 04, 2024
Via arXiv

Towards Unified Multi-granularity Text Detection with Interactive Attention

May 30, 2024
Via arXiv

Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?

May 28, 2024
Via arXiv

Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation

May 22, 2024
Via arXiv

Dense Connector for MLLMs

May 22, 2024
Via arXiv

RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting

May 01, 2024
Via arXiv

Training-Free Unsupervised Prompt for Vision-Language Models

Apr 25, 2024
Via arXiv

CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding

Apr 22, 2024
Via arXiv