Tsinghua University - Jifeng DAI - Foundation Models for Visual Perception

Research Focus

Foundation models for visual perception aim to address the fundamental tasks of object recognition and localization. This line of research focuses on visual backbone networks and object detection models, providing foundational architectures for general-purpose visual perception.

Representative Works:

High-Accuracy, High-Efficiency Object Detection Foundation Models

R-FCN: Object Detection via Region-based Fully Convolutional Networks
[3rd Most Influential Paper at NeurIPS 2016]
[Included in the PyTorch Vision Operator Library]

Deformable DETR: Deformable Transformers for End-to-End Object Detection
[2nd Most Influential Paper at ICLR 2021]

Visual Backbone Networks Centered on Deformable Convolutions and Large-Scale General-Purpose Visual Foundation Models

Deformable Convolutional Networks v1/v2
[6th Most Influential Paper at ICCV 2017]
[Included in the PyTorch Vision Operator Library]

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
[CVPR 2023 Highlight Paper]

Doctoral Degree in Engineering

Jifeng DAI
