[Open DMQA Seminar] Multimodal Representation Learning
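Multimodal learning is a deep learning approach that integrates different types of data (modalities), such as images, text, and speech, to obtain more comprehensive information. This seminar introduces recent research trends in representation learning within multimodal learning. The first paper [1] proposes CorrMCNN, which adds step correlation to learn representations of multimodal data effectively. The second paper [2] introduces CorrRNN, a new method for representation learning that accounts for temporal structure in step-wise multimodal data. Finally, the third paper [3] proposes an architecture called MultiMAE, which uses a masking strategy to handle multiple modalities and tasks simultaneously. Together, these studies show how multimodal data can be used effectively to solve complex problems in fields such as computer vision, natural language processing, and speech recognition.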
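As a rough illustration of the correlation idea behind the CorrMCNN and CorrRNN papers, the sketch below scores how well the representations of two modalities agree. It assumes the score is the per-dimension Pearson correlation summed over hidden dimensions. The function name `correlation_score` and the toy vectors are hypothetical, and this is not the authors' implementation.

```python
import math


def correlation_score(h_a, h_b):
    """Sum of per-dimension Pearson correlations between two sets of
    representations (assumed form of the correlation term).

    h_a, h_b: lists of N vectors of length D, one vector per paired sample.
    """
    n = len(h_a)
    d = len(h_a[0])
    total = 0.0
    for j in range(d):
        col_a = [row[j] for row in h_a]
        col_b = [row[j] for row in h_b]
        mean_a = sum(col_a) / n
        mean_b = sum(col_b) / n
        cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
        var_a = sum((a - mean_a) ** 2 for a in col_a)
        var_b = sum((b - mean_b) ** 2 for b in col_b)
        denom = math.sqrt(var_a * var_b)
        # A constant dimension has zero variance; count it as uncorrelated.
        total += cov / denom if denom > 0 else 0.0
    return total


# Toy hidden representations for three paired samples (hypothetical values).
image_repr = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
text_repr = [[2.0, 1.0], [4.0, 0.0], [6.0, -1.0]]
score = correlation_score(image_repr, text_repr)
```

In a training setup of this kind, such a score would typically be maximized (or its negative added to a reconstruction loss) so that the two modalities map to aligned representations.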
[1] Bhatt, G., Jha, P., & Raman, B. (2019). Representation learning using step-based deep multi-modal autoencoders. Pattern Recognition, 95, 12-23.
[2] Yang, X., Ramesh, P., Chitta, R., Madhvanath, S., Bernal, E. A., & Luo, J. (2017). Deep multimodal representation learning from temporal data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5447-5455).
[3] Bachmann, R., Mizrahi, D., Atanov, A., & Zamir, A. (2022, October). MultiMAE: Multi-modal multi-task masked autoencoders. In European Conference on Computer Vision (pp. 348-367). Cham: Springer Nature Switzerland.
Video: [Open DMQA Seminar] Multimodal Representation Learning, from the channel 김성범 [Professor / School of Industrial and Management Engineering]
Video information
August 16, 2024, 7:26:17
00:39:24