Min-Hung Chen

Research Engineer II

Microsoft

About Me

My name is Min-Hung (Steve) Chen (陳敏弘 in Chinese). I am a Research Engineer II at Microsoft Azure AI, working on cutting-edge AI research for Cognitive Services. I received my Ph.D. from Georgia Tech, advised by Prof. Ghassan AlRegib and in collaboration with Prof. Zsolt Kira. Before joining Microsoft, I was a Senior AI Engineer at MediaTek, working on deep learning research for Edge-AI and Vision Transformers.

My research interest is mainly in learning without full supervision, including continual learning, self-/semi-supervised learning, federated learning, etc.
In addition, I have conducted research on transformers, attention, domain adaptation, transfer learning, action segmentation, action recognition, and video understanding.

Interests

  • Transfer Learning
  • Unsupervised Learning
  • Video Understanding
  • Computer Vision
  • Deep Learning
  • Machine Learning

Education

  • PhD in Electrical and Computer Engineering, 2020

    Georgia Institute of Technology

  • MSc in Integrated Circuits and Systems, 2012

    National Taiwan University

  • BSc in Electrical Engineering, 2010

    National Taiwan University

News

Work Experience

Research Engineer II

Microsoft

Jan 2022 – Present Taipei, Taiwan
Cutting-edge AI Research for Azure Cognitive Services

Senior AI Engineer

MediaTek Inc.

Oct 2020 – Dec 2021 Hsinchu, Taiwan
Research and develop cutting-edge methodologies for Edge-AI
Coordinate academic-industry collaboration for the ecosystem (e.g., co-hosting a CVPR'21 workshop)

Research Intern

Baidu USA

May 2019 – Dec 2019 Sunnyvale, CA, US
Cross-domain action segmentation with self-supervised learning

Research Intern

PlayStation

May 2018 – Aug 2018 San Mateo, CA, US
Cross-domain action recognition

Deep Learning Engineer Intern

Aipoly

Aug 2017 – Dec 2017 San Francisco, CA, US
Vision-based autonomous retail store

Ph.D. Research

Georgia Institute of Technology

Aug 2014 – Aug 2020 Atlanta, GA, US
Video understanding beyond full supervision
Human action understanding
Robust machine learning for autonomous vehicles

Research Assistant

Academia Sinica

Jul 2013 – Jul 2014 Taipei City, Taiwan
Multi-modal action recognition

Projects

Vision-based Autonomous Retail Store

Deep Learning and Computer Vision system for real-time autonomous retail stores using only RGB cameras.

Deep Learning for Smartphone ISP

The Learned Smartphone ISP Challenge for the CVPR 2021 MAI Workshop.

Action Segmentation with Temporal Domain Adaptation

Cross-domain action segmentation by aligning temporal feature spaces.

Activity Recognition with RNN and Temporal-ConvNet

Two methods (TS-LSTM and Temporal-Inception) to exploit spatiotemporal dynamics for activity recognition.

Temporal Attentive Alignment for Video Domain Adaptation

Cross-domain action recognition with new datasets and novel video-based DA approaches.

Traffic Sign Detection under Challenging Conditions

A large-scale traffic sign detection dataset with various challenging conditions.

Professional Activities

Competition committees

Professional Talks

Conference reviewers

  • International Conference on Learning Representations (ICLR)
  • International Conference on Computer Vision (ICCV)
  • International Conference on Machine Learning (ICML)
  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR), including Workshop (CVPRW)
  • Association for the Advancement of Artificial Intelligence (AAAI)
  • Advances in Neural Information Processing Systems (NeurIPS)
  • European Conference on Computer Vision (ECCV)
  • IEEE International Conference on Image Processing (ICIP)
  • IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
  • International Conference on Image Analysis and Processing (ICIAP)
  • IEEE International Workshop on Multimedia Signal Processing (MMSP)
  • European Signal Processing Conference (EUSIPCO)

Journal reviewers

  • Elsevier Pattern Recognition (PR)
  • Springer International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Intelligent Transportation Systems (TITS)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  • IEEE Access

Recent & Upcoming Talks

Learned Smartphone ISP Challenge

10-min invited presentation for the MAI workshop at CVPR 2021

Bridging Distributional Discrepancy with Temporal Dynamics for Video Understanding

Talk at Academia Sinica, invited by Dr. Jun-Cheng Chen

My Research Journey for Video Understanding

Talk at NYCU, invited by Prof. Yen-Yu Lin

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

5-min invited presentation for the WebVision workshop at CVPR 2020

Temporal Attentive Alignment for Large-Scale Video Domain Adaptation

5-min video for the oral presentation at ICCV 2019

Selected Publications

Please see my Google Scholar for the complete publication list.

Learned Smartphone ISP on Mobile NPUs With Deep Learning, Mobile AI 2021 Challenge: Report

[CVPRW 2021] The report paper for the Learned Smartphone ISP Challenge at the CVPR 2021 MAI Workshop.

Network Space Search for Pareto-Efficient Spaces

[CVPRW 2021 (Oral)] A novel AutoML paradigm to directly search for favorable network spaces automatically instead of a single architecture.

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

[CVPR 2020] Cross-domain action segmentation by aligning feature spaces across multiple temporal scales with self-supervised learning to reduce spatio-temporal variability.

Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding

[ICASSP 2020] Driving behavior classification based on temporal and causal reasoning.

Action Segmentation with Mixed Temporal Domain Adaptation

[WACV 2020] Cross-domain action segmentation by aligning temporal feature spaces to reduce spatio-temporal variability.

Color learning

[US Patent] Color-component spatio-temporal learning for traffic sign detection.

Temporal Attentive Alignment for Large-Scale Video Domain Adaptation

[ICCV 2019 (Oral)] Cross-domain action recognition with new datasets and novel attention-based DA approaches.

Traffic Sign Detection Under Challenging Conditions: A Deeper Look into Performance Variations and Spectral Characteristics

[TITS 2019] A large-scale traffic sign detection dataset with various challenging conditions.

TS-LSTM and temporal-inception: Exploiting spatiotemporal dynamics for activity recognition

[SPIC 2019] Simple but effective CNN- and RNN-based approaches to exploit temporal dynamics for videos.

Depth and Skeleton Associated Action Recognition without Online Accessible RGB-D Cameras

[CVPR 2014] Multi-modal adaptation with multiple kernel learning for action recognition.

Honors & Awards

  • Outstanding Reviewer for ICCV (Fall 2021)
  • Outstanding Reviewer for CVPR (Summer 2021)
  • Student Travel Grant Award for ICCV (Fall 2019)
  • Ministry of Education Technologies Incubation Scholarship, Taiwan (Fall 2014 - Spring 2017)
  • Otto F. and Jenny H. Krauss Fellowship, Georgia Institute of Technology (Fall 2014 - Spring 2015)

Teaching Experience

Graduate Teaching Assistant

Georgia Institute of Technology

National Taiwan University

  • Statistical Image Processing (Spring 2012)
  • Computer Programming (Fall 2011)

Contact