I am a PhD student at the Korea Advanced Institute of Science and Technology (KAIST), advised by Prof. Jinwoo Shin. Prior to this, I received a B.S. in Mathematical Science and Computer Science from KAIST in 2023. My research interests lie in representation learning and robotic foundation models, along with their applications. Recently, I have focused on developing vision-language-action models (VLAs) that better perceive visual content and understand the physical world.
RLDX-1 Technical Report
Technical Report, 2026
Modular Sensory Stream for Integrating Physical Feedback in Vision-Language-Action Models
arXiv preprint, 2026
RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
arXiv preprint, 2026
Dual-stream diffusion for world-model augmented vision-language-action model
ICML 2026
ContextVLA: Vision-Language-Action Model with Amortized Multi-Frame Context
arXiv preprint, 2025
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
NeurIPS 2025
SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning
NeurIPS 2025 Workshop on Space in Vision, Language, and Embodied AI
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction
CVPR 2025
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
ICLR 2025, Oral Presentation (207/11672=1.77%)
TrackIME: Enhanced Video Point Tracking via Instance Motion Estimation
NeurIPS 2024, Spotlight Presentation (326/15671=2%)
Adversarial Robustification via Text-to-Image Diffusion Models
ECCV 2024, Oral Presentation (200/8585=2.3%)
Visual Representation Learning with Stochastic Frame Prediction
ICML 2024
Modality-agnostic Self-supervised Learning with Meta-learned Masked Auto-encoder
NeurIPS 2023
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
ICLR 2023, Spotlight Presentation (280/4956=5.6%)
NeurIPSW-MetaLearn 2022
AltUB: Alternating Training Method to Update Base Distribution of Normalizing Flow for Anomaly Detection
arXiv preprint, 2022
Korea Advanced Institute of Science and Technology (KAIST), Mar. 2023 - Present
Ph.D. Student in Artificial Intelligence
Korea Advanced Institute of Science and Technology (KAIST), Mar. 2019 - Feb. 2023
B.S. in Mathematical Science and Computer Science
AI Research Intern, Oct. 2022 - Oct. 2023
i-SENS
Conference Reviewer, IJCAI'23; NeurIPS'24-26; ICLR'25-26; ICML'25-26; CVPR'25-26; ICCV'25; ECCV'26; AISTATS'25
Journal Reviewer, IJCV
Travel Award, International Conference on Machine Learning (ICML) 2024, Jul. 2024
Travel Award, Conference on Neural Information Processing Systems (NeurIPS) 2023, Dec. 2023
Travel Award, International Conference on Learning Representations (ICLR) 2023, May 2023
Recipient, Google Conference Scholarships (APAC), May 2023
Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder
Samsung Electronics Device Solution (DS), Oct. 2024
Samsung AI Forum (SAIF) 2023, Nov. 2023
Samsung Advanced Institute of Technology (SAIT), Jun. 2023
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
International Conference on Learning Representations (ICLR), May 2023