Discrete vs. continuous motion tokens. Discrete: generation is constrained to a finite set of codebook entries, leading to quantization artifacts and piecewise transitions under the same prompt. Continuous: smoother kinematics and joint trajectories, with natural phase transitions free of staircase artifacts and jitter.
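The staircase effect of discrete tokens can be illustrated with a toy nearest-codebook quantizer; the codebook size, trajectory, and variable names below are all illustrative, not the paper's actual VQ setup:

```python
import numpy as np

# Toy illustration: a discrete motion token maps each pose feature to
# its nearest codebook entry, producing piecewise-constant (staircase)
# values; a continuous representation keeps the smooth trajectory.
codebook = np.linspace(-1.0, 1.0, 8)            # 8 toy codebook entries
trajectory = np.sin(np.linspace(0, np.pi, 50))  # smooth joint trajectory

# Nearest-neighbor quantization (the VQ bottleneck)
idx = np.abs(trajectory[:, None] - codebook[None, :]).argmin(axis=1)
quantized = codebook[idx]

# Quantization error is bounded by half the codebook spacing,
# which is exactly the "staircase" resolution limit.
max_err = np.abs(trajectory - quantized).max()
assert max_err <= (codebook[1] - codebook[0]) / 2
```

The quantized signal takes at most 8 distinct values regardless of how finely the trajectory is sampled, which is the source of the jitter and piecewise transitions mentioned above.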
Overview of SafeMoEngine. We first classify harmful texts and rewrite those at Levels 2 & 3, route Level 1 texts to their original motions, then compose text conditions and synthesize motions via two generative models to construct SafeMoVAE-29K and SafeMoVQ-29K, respectively.
Overview of SafeMo. Stage 1 (top): the unsafe stream is optimized with a harmful-motion-specific loss and a random decoupling strategy, while the safe stream applies a negative preservation divergence. Only the LoRA adapters on DiP are updated, yielding a pure harmful task vector. Stage 2 (bottom): we negate the learned harmful task vector with a motion-class-aware α, so that the model suppresses unsafe behaviors on unsafe prompts while preserving performance on safe prompts.
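The task-vector negation in Stage 2 can be sketched as simple weight arithmetic. This is a minimal sketch under the usual task-arithmetic formulation (τ = θ_harm − θ_base, then θ_base − α·τ); the function, parameter names, and per-layer dictionary layout are assumptions, not SafeMo's actual implementation:

```python
import numpy as np

def negate_task_vector(theta_base, theta_harm, alpha):
    """Subtract the harmful task vector tau = theta_harm - theta_base,
    scaled by alpha, from the base weights (per-layer dict of arrays)."""
    return {name: theta_base[name] - alpha * (theta_harm[name] - theta_base[name])
            for name in theta_base}

# Toy example: one LoRA adapter weight before and after harmful fine-tuning.
base = {"lora.A": np.ones((2, 2))}         # original adapter weights
harm = {"lora.A": np.ones((2, 2)) * 3.0}   # after fine-tuning on unsafe data

# tau = 2 everywhere; with alpha = 0.5 the edit yields 1 - 0.5 * 2 = 0.
edited = negate_task_vector(base, harm, alpha=0.5)
# → edited["lora.A"] is all zeros
```

A motion-class-aware α would replace the scalar with a per-class value (larger α for classes where unsafe behavior must be suppressed more aggressively), but the arithmetic is unchanged.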
@article{wang2026safemo,
title={SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation},
author={Wang, Yiling and Zhang, Zeyu and Wang, Yiran and Tang, Hao},
journal={arXiv preprint arXiv:2601.00590},
year={2026}
}