My name is Min-Hung (Steve) Chen (陳敏弘 in Chinese). I am a Senior Research Scientist at NVIDIA Research, working on Vision+X Multi-Modal AI. I received my Ph.D. degree from Georgia Tech, advised by Prof. Ghassan AlRegib and in collaboration with Prof. Zsolt Kira. Before joining NVIDIA, I was a Research Engineer II at Microsoft Azure AI, working on Cutting-edge AI Research for Cognitive Services. Before Microsoft, I was a Senior AI Engineer at MediaTek, working on Deep Learning Research for Edge-AI and Vision Transformer.
My research interest is mainly in Learning without Full Supervision, including domain adaptation, continual learning, self-/semi-supervised learning, etc.
In addition, I have also conducted research on
transformers, attention, domain adaptation, transfer learning, action segmentation, action recognition, and video understanding.
[Recruiting] We are hiring full-time and intern Research Scientists in Taiwan. I am also open to research collaboration. Please drop me an email if you are interested.
PhD in Electrical and Computer Engineering, 2020
Georgia Institute of Technology
MSc in Integrated Circuits and Systems, 2012
National Taiwan University
BSc in Electrical Engineering, 2010
National Taiwan University
A comprehensive paper list of Vision Transformer and Attention, including papers, code, and related websites.
Deep Learning and Computer Vision system for real-time autonomous retail stores using only RGB cameras.
The Learned Smartphone ISP Challenge for the CVPR 2021 MAI Workshop.
Cross-domain action segmentation by aligning temporal feature spaces.
Two methods (TS-LSTM and Temporal-Inception) to exploit spatiotemporal dynamics for activity recognition.
Cross-domain action recognition with new datasets and novel video-based DA approaches.
A large-scale traffic sign detection dataset with various challenging conditions.
[CVPR 2020] Cross-domain action segmentation by aligning feature spaces across multiple temporal scales with self-supervised learning to reduce spatio-temporal variability.
[ICCV 2019 (Oral)] Cross-domain action recognition with new datasets and novel attention-based DA approaches.
Graduate Teaching Assistant