Sungyong Park

MS Student in Digital Media (Artificial Intelligence) at Soongsil University

#602, 50 Sadang-ro, Dongjak-gu
Seoul, Republic of Korea, 07027

ejqdl010 at gmail dot com

About

I am a master's student at the Reality Lab in the Department of Digital Media at Soongsil University, advised by Prof. Heewon Kim.

My research explores scalable data generation and utilization in robotics, with a recent focus on foundation models for robotic manipulation.

Research Interests

My research centers on embodied AI systems that perceive and interact effectively in challenging real-world settings. Key areas of interest include:

  • Foundation Models for Robotic Manipulation: Exploring large-scale, generalizable visual manipulation models trained on multi-modal data for robotic tasks.
  • Scalable Data Generation: Building scalable and realistic simulation environments for generating diverse training data, with a focus on physical dynamics and high-fidelity visual rendering.
  • Image Restoration in Real-World Settings: Tackling low-level vision problems like denoising and deblurring in the presence of realistic degradations, including lens contamination.

Publications

    (* indicates equal contribution)
  • DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI

    Sangmin Lee*, Sungyong Park*, Heewon Kim, IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025

    Paper Slides
    Robotic manipulation in embodied AI critically depends on large-scale, high-quality datasets that reflect realistic object interactions and physical dynamics. However, existing data collection pipelines are often slow, expensive, and heavily reliant on manual efforts. We present DynScene, a diffusion-based framework for generating dynamic robotic manipulation scenes directly from textual instructions. Unlike prior methods that focus solely on static environments or isolated robot actions, DynScene decomposes the generation into two phases, static scene synthesis and action trajectory generation, allowing fine-grained control and diversity. Our model enhances realism and physical feasibility through scene refinement (layout sampling, quaternion quantization) and leverages a residual action representation to enable action augmentation, generating multiple diverse trajectories from a single static configuration. Experiments show DynScene achieves 26.8x faster generation, 1.84x higher accuracy, and 28% greater action diversity than human-crafted data. Furthermore, agents trained with DynScene exhibit up to 19.4% higher success rates across complex manipulation tasks. Our approach paves the way for scalable, automated dataset generation in robot learning.
    @inproceedings{lee2025dynscene,
      title={DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI},
      author={Lee, Sangmin and Park, Sungyong and Kim, Heewon},
      booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
      pages={12166--12175},
      year={2025}
    }
                                        
  • SIDL: A Real-World Dataset for Restoring Smartphone Images with Dirty Lenses

    Sooyoung Choi*, Sungyong Park*, Heewon Kim, AAAI Conference on Artificial Intelligence, 2025

    PDF Website Talk Slides
    Smartphone cameras are ubiquitous in daily life, yet their performance can be severely impacted by dirty lenses, leading to degraded image quality. This issue is often overlooked in image restoration research, which assumes ideal or controlled lens conditions. To address this gap, we introduce SIDL (Smartphone Images with Dirty Lenses), a novel dataset designed to restore images captured through contaminated smartphone lenses. SIDL contains diverse real-world images taken under various lighting conditions and environments. These images feature a wide range of lens contaminants, including water drops, fingerprints, and dust. Each contaminated image is paired with a clean reference image, enabling supervised learning approaches for restoration tasks. To evaluate the challenge posed by SIDL, various state-of-the-art restoration models were trained and compared on this dataset. They achieved some level of restoration but did not adequately address the diverse and realistic nature of the lens contaminants in SIDL. This challenge highlights the need for more robust and adaptable techniques for restoring images with dirty lenses.
    @inproceedings{choi2025sidl,
      title={SIDL: A Real-World Dataset for Restoring Smartphone Images with Dirty Lenses},
      author={Choi, Sooyoung and Park, Sungyong and Kim, Heewon},
      booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
      volume={39},
      number={3},
      pages={2545--2554},
      year={2025}
    }
                                        

Awards

  • 1st Place in the ARNOLD Challenge

    Dowon Kim, Chaewoo Lim, Sungyong Park, Sangmin Lee, Heewon Kim
    CVPR 2025 Embodied AI Workshop

    Challenge Page Slides
  • 3rd Place in the ARNOLD Challenge

    Sangmin Lee, Sungyong Park, Heewon Kim
    CVPR 2024 Embodied AI Workshop

    Challenge Page Slides