
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Kwan Yun, Seokhyeon Hong, Chaelin Kim, Junyong Noh

KAIST, Visual Media Lab


Accepted to CVPR 2025

AnyMoLe performs motion in-betweening for arbitrary characters using only the given inputs, without any external data.



Abstract

Despite recent advancements in learning-based motion in-betweening, a key limitation has been overlooked: the requirement for character-specific datasets. In this work, we introduce AnyMoLe, a novel method that addresses this limitation by leveraging video diffusion models to generate motion in-between frames for arbitrary characters without external data. Our approach employs a two-stage frame generation process to enhance contextual understanding. Furthermore, to bridge the domain gap between real-world and rendered character animations, we introduce ICAdapt, a fine-tuning technique for video diffusion models. Additionally, we propose a "motion-video mimicking" optimization technique, enabling seamless motion generation for characters with arbitrary joint structures using 2D and 3D-aware features. AnyMoLe significantly reduces data dependency while generating smooth and realistic transitions, making it applicable to a wide range of motion in-betweening tasks.
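To make the "motion-video mimicking" idea concrete, here is a minimal optimization sketch: per-frame joint rotations are optimized so that the character's projected 2D joints and a 3D-aware feature of its pose match targets extracted from the generated in-between video. All names (render_joints_2d, feat_head) and the temporal smoothness term are illustrative assumptions, not the paper's implementation; a real system would use a differentiable renderer and the paper's actual feature extractors.

```python
# Hedged sketch of a "motion-video mimicking"-style optimization loop.
# Every component here is a stand-in, not the authors' released code.
import torch

T, J = 16, 24          # number of in-between frames, joints of the arbitrary rig

# Stand-in for "render the character and project its joints to 2D"
# (a real pipeline would use a differentiable renderer of the rigged character).
proj = torch.randn(J * 3, J * 2)
def render_joints_2d(rotations):             # (T, J, 3) -> (T, J, 2)
    return (rotations.reshape(T, -1) @ proj).reshape(T, J, 2)

# Targets taken from the generated in-between video:
# 2D joints from a scene-specific estimator and 3D-aware video features.
target_2d = torch.randn(T, J, 2)
target_feat = torch.randn(T, 128)
feat_head = torch.nn.Linear(J * 3, 128)      # stand-in for a 3D-aware feature map

# Optimize per-frame joint rotations so the character mimics the video.
rotations = torch.zeros(T, J, 3, requires_grad=True)   # axis-angle per joint
optim = torch.optim.Adam([rotations], lr=1e-2)

for step in range(200):
    optim.zero_grad()
    loss_2d = (render_joints_2d(rotations) - target_2d).pow(2).mean()
    loss_3d = (feat_head(rotations.reshape(T, -1)) - target_feat).pow(2).mean()
    loss_smooth = (rotations[1:] - rotations[:-1]).pow(2).mean()  # assumed regularizer
    loss = loss_2d + loss_3d + 0.1 * loss_smooth
    loss.backward()
    optim.step()
```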

AnyMoLe results





Additional results





Comparisons





Application to two objects

Overview of AnyMoLe


Overview of AnyMoLe: First, the video diffusion model is fine-tuned via ICAdapt without using any external data, while a scene-specific joint estimator is trained. Next, the fine-tuned video generation model produces an in-between video through two-stage frame generation, which is then refined through motion-video mimicking to generate the final in-between motion.
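The skeleton below restates this overview as code, purely to show how the stages connect. Every function body is a placeholder and every name (icadapt_finetune, train_joint_estimator, generate_inbetween_video, motion_video_mimicking) is an assumed, illustrative label rather than the released API.

```python
# Rough end-to-end skeleton of the AnyMoLe overview above; placeholders only.
from dataclasses import dataclass
from typing import List

@dataclass
class Motion:
    frames: List[list]            # per-frame joint parameters (placeholder)

def icadapt_finetune(video_model, context_frames):
    """Fine-tune the video diffusion model on the rendered input frames only
    (no external data), bridging the real-vs-rendered domain gap."""
    return video_model            # placeholder: return the "fine-tuned" model

def train_joint_estimator(context_frames, context_motion):
    """Train a scene-specific 2D joint estimator from the same input frames."""
    return lambda frame: []       # placeholder estimator

def generate_inbetween_video(video_model, context_frames):
    """Two-stage frame generation: coarse frames first, then dense in-betweens."""
    return context_frames         # placeholder: pass frames through

def motion_video_mimicking(video, joint_estimator, context_motion):
    """Optimize joint parameters so the character reproduces the generated video."""
    return Motion(frames=[[] for _ in video])

def anymole(video_model, context_frames, context_motion):
    model = icadapt_finetune(video_model, context_frames)
    estimator = train_joint_estimator(context_frames, context_motion)
    video = generate_inbetween_video(model, context_frames)
    return motion_video_mimicking(video, estimator, context_motion)

# Illustrative call with dummy inputs.
inbetween_motion = anymole(video_model=None, context_frames=[None] * 4, context_motion=None)
```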