Bridging Distributional Discrepancy with Temporal Dynamics for Video Understanding

Abstract

Video has become one of the major media in our society, attracting considerable interest in the development of video analysis techniques for various applications. Temporal dynamics, which characterize how information changes over time, are a key component of videos. However, it is still not clear how temporal dynamics benefit video tasks, especially in the cross-domain setting, which is closer to real-world scenarios. The objective of this thesis is therefore to effectively exploit temporal dynamics in videos to tackle distributional discrepancy problems in video understanding.

To achieve this objective, the benefits of exploiting temporal dynamics in videos are first investigated by proposing Temporal Segment LSTM (TS-LSTM) and Inception-style Temporal-ConvNet (Temporal-Inception) for general video understanding, and by demonstrating that temporal dynamics can help reduce temporal variations in cross-domain video understanding. Since most previous work evaluates performance only on small-scale datasets with little domain discrepancy, two large-scale datasets for video domain adaptation, UCF-HMDBfull and Kinetics-Gameplay, are collected to facilitate cross-domain video research, and the Temporal Attentive Adversarial Adaptation Network (TA3N) is proposed to simultaneously attend to, align, and learn temporal dynamics across domains. Finally, to utilize temporal dynamics from unlabeled videos for action segmentation, Self-Supervised Temporal Domain Adaptation (SSTDA) is proposed to jointly align cross-domain feature spaces embedded with local and global temporal dynamics through two self-supervised auxiliary tasks, binary and sequential domain prediction, demonstrating the usefulness of adapting to unlabeled videos across domain variations.
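Both TA3N and SSTDA align source and target domains by training the feature extractor adversarially against a domain classifier. As a minimal sketch, assuming PyTorch, the snippet below illustrates the generic mechanism behind this kind of adversarial alignment and behind SSTDA's binary domain prediction task: a gradient reversal layer placed before a binary source-vs-target classifier, so that minimizing the domain loss pushes the upstream features toward domain invariance. All names and sizes here (GradReverse, BinaryDomainClassifier, feat_dim, beta) are illustrative assumptions, not the exact architectures in the dissertation.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, beta):
        ctx.beta = beta
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Negated gradient flows back to the feature extractor; no grad for beta.
        return grad_output.neg() * ctx.beta, None

class BinaryDomainClassifier(nn.Module):
    """Predicts source (0) vs. target (1) from a frame/clip feature.

    feat_dim and the hidden size are hypothetical, chosen for illustration.
    """
    def __init__(self, feat_dim=256, beta=1.0):
        super().__init__()
        self.beta = beta
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 2),
        )

    def forward(self, feat):
        reversed_feat = GradReverse.apply(feat, self.beta)
        return self.net(reversed_feat)

# Usage: the domain loss trains the classifier to separate domains, while the
# reversed gradients train whatever produced `feat` to make them inseparable.
clf = BinaryDomainClassifier(feat_dim=256)
src_feat = torch.randn(8, 256, requires_grad=True)  # stand-in source features
tgt_feat = torch.randn(8, 256, requires_grad=True)  # stand-in target features
logits = clf(torch.cat([src_feat, tgt_feat]))
labels = torch.cat([torch.zeros(8), torch.ones(8)]).long()
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()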

Type
PhD Dissertation

Publication
Georgia Institute of Technology

Resources

Dissertation
Code (TS-LSTM)
Code (TA3N)
Code (SSTDA)

Citation

Min-Hung Chen, “Bridging Distributional Discrepancy with Temporal Dynamics for Video Understanding”, PhD Dissertation, Georgia Institute of Technology, 2020.

BibTeX

@phdthesis{chen2020bridging,
  title={Bridging Distributional Discrepancy with Temporal Dynamics for Video Understanding},
  author={Chen, Min-Hung},
  year={2020},
  school={Georgia Institute of Technology}
}

Members

Georgia Institute of Technology

Min-Hung Chen

Senior Research Scientist

My research interest is learning without full supervision.
