AITopics | cose

CoSE: Compositional Stroke Embeddings

Neural Information Processing Systems | Dec-24-2025, 04:22:12 GMT

We present a generative model for stroke-based drawing tasks which is able to model complex free-form structures. While previous approaches rely on sequence-based models for drawings of basic objects or handwritten text, we propose a model that treats drawings as a collection of strokes that can be composed into complex structures such as diagrams (e.g., flow-charts). At the core of the approach lies a novel auto-encoder that projects variable-length strokes into a latent space of fixed dimension. This representation space allows a relational model, operating in latent space, to better capture the relationship between strokes and to predict subsequent strokes. We demonstrate qualitatively and quantitatively that our proposed approach is able to model the appearance of individual strokes, as well as the compositional structure of larger diagram drawings. Our approach is suitable for interactive use cases such as auto-completing diagrams. We make code and models publicly available at https://eth-ait.github.io/cose.
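The abstract's central interface is a map from a variable-length stroke to a vector of fixed dimension. The learned auto-encoder itself is not described in this listing, so the sketch below swaps in a much simpler, non-learned stand-in: arc-length resampling to a fixed number of points, flattened into one vector. The names `resample_stroke` and `embed_stroke` and the default of 16 samples are illustrative assumptions, not part of the CoSE code.

```python
import math

def resample_stroke(points, n_samples=16):
    """Resample a polyline of (x, y) points to n_samples points spaced evenly by arc length.

    Non-learned stand-in for a stroke encoder; assumes n_samples >= 2.
    """
    pts = [(float(x), float(y)) for x, y in points]
    # Cumulative arc length at every input point.
    arc = [0.0]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        arc.append(arc[-1] + math.hypot(x1 - x0, y1 - y0))
    total = arc[-1]
    if total == 0.0:
        # Degenerate stroke (single point or repeated point).
        return [pts[0]] * n_samples
    out = []
    seg = 0
    for k in range(n_samples):
        target = total * k / (n_samples - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(pts) - 2 and arc[seg + 1] < target:
            seg += 1
        span = arc[seg + 1] - arc[seg]
        w = 0.0 if span == 0.0 else (target - arc[seg]) / span
        (x0, y0), (x1, y1) = pts[seg], pts[seg + 1]
        out.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return out

def embed_stroke(points, n_samples=16):
    """Flatten the resampled stroke into a vector of length 2 * n_samples."""
    return [c for p in resample_stroke(points, n_samples) for c in p]

short_stroke = [(0, 0), (1, 0)]
long_stroke = [(i * 0.1, math.sin(i * 0.1)) for i in range(200)]
print(len(embed_stroke(short_stroke)), len(embed_stroke(long_stroke)))
```

Both strokes end up with the same vector length regardless of how many points they started with, which is the property a downstream relational model over strokes would rely on. A learned encoder would replace the resampling step with a trained network.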

compositional stroke embedding, electronic proceedings, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

723e8f97fde15f7a8d5ff8d558ea3f16-Paper.pdf

Neural Information Processing Systems | Oct-3-2025, 05:42:46 GMT

artificial intelligence, machine learning, prediction, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement

Yang, Gang, Lei, Yue, Tai, Wenxin, Wu, Jin, Chen, Jia, Zhong, Ting, Zhou, Fan

arXiv.org Artificial Intelligence | Sep-23-2025

Diffusion and flow matching (FM) models have achieved remarkable progress in speech enhancement (SE), yet their dependence on multi-step generation is computationally expensive and vulnerable to discretization errors. Recent advances in one-step generative modeling, particularly MeanFlow, provide a promising alternative by reformulating dynamics through average velocity fields. In this work, we present COSE, a one-step FM framework tailored for SE. To address the high training overhead of Jacobian-vector product (JVP) computations in MeanFlow, we introduce a velocity composition identity to compute average velocity efficiently, eliminating expensive computation while preserving theoretical consistency and achieving competitive enhancement quality. Extensive experiments on standard benchmarks show that COSE delivers up to 5x faster sampling and reduces training cost by 40%, all without compromising speech quality. Code is available at https://github.com/ICDM-UESTC/COSE.
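The abstract does not spell out the velocity composition identity. One plausible reading, assumed here, is the additivity of the time integral that defines average velocity: for r < s < t, (t - r) u(r, t) = (s - r) u(r, s) + (t - s) u(s, t). The sketch below checks that relation numerically on a toy one-dimensional velocity field v(t) = cos(t) that depends only on time. The function names and the toy field are illustrative and are not taken from the COSE repository.

```python
import math

def velocity(t):
    """Toy instantaneous velocity field (an assumption for illustration)."""
    return math.cos(t)

def displacement(r, t):
    """Exact integral of velocity(tau) from r to t for the toy field."""
    return math.sin(t) - math.sin(r)

def average_velocity(r, t):
    """Average velocity over [r, t]: displacement divided by elapsed time."""
    return displacement(r, t) / (t - r)

def composed_average_velocity(r, s, t):
    """Average velocity over [r, t], composed from the two sub-intervals [r, s] and [s, t]."""
    return ((s - r) * average_velocity(r, s)
            + (t - s) * average_velocity(s, t)) / (t - r)

direct = average_velocity(0.0, 1.0)
composed = composed_average_velocity(0.0, 0.4, 1.0)
print(abs(direct - composed) < 1e-12)
```

In this toy setting the composed value matches the direct one up to floating-point error, and no derivative of the velocity field is evaluated. How the paper applies such an identity inside training is not stated in the abstract.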

artificial intelligence, machine learning, speech enhancement, (15 more...)

arXiv.org Artificial Intelligence

2509.15952

Country: Asia > China > Sichuan Province > Chengdu (0.40)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Review for NeurIPS paper: CoSE: Compositional Stroke Embeddings

Neural Information Processing Systems | Jan-25-2025, 17:23:20 GMT

Weaknesses: The rebuttal and discussion clarified my concerns about [1,2] (although I would strongly encourage citing these works for a more complete related-work section). However, I remain unconvinced of the novelty of the approach: the fact that transformer-based models work better than simple VAE-based models is not surprising to the general NeurIPS audience. I do agree that, from the point of view of stroke-based generative models, the work is novel and makes a good contribution to this specific field. Novelty with respect to [1] is not clear: both methods use a transformer-based architecture to model long-range dependencies in strokes. The advantage of combining an autoregressive structure with transformers is also not clear, as transformers already contain self-attention layers to capture long-range dependencies.

compositional stroke embedding, neurips paper, transformer, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (0.88)

CoSE: Compositional Stroke Embeddings

Neural Information Processing Systems | Oct-10-2024, 12:37:52 GMT

We present a generative model for stroke-based drawing tasks which is able to model complex free-form structures. While previous approaches rely on sequence-based models for drawings of basic objects or handwritten text, we propose a model that treats drawings as a collection of strokes that can be composed into complex structures such as diagrams (e.g., flow-charts). At the core of the approach lies a novel auto-encoder that projects variable-length strokes into a latent space of fixed dimension. This representation space allows a relational model, operating in latent space, to better capture the relationship between strokes and to predict subsequent strokes. We demonstrate qualitatively and quantitatively that our proposed approach is able to model the appearance of individual strokes, as well as the compositional structure of larger diagram drawings.

compositional stroke embedding, cose, latent space, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.44)

CoSE: Compositional Stroke Embeddings

Aksan, Emre, Deselaers, Thomas, Tagliasacchi, Andrea, Hilliges, Otmar

arXiv.org Machine Learning | Jun-17-2020

We present a generative model for stroke-based drawing tasks which is able to model complex free-form structures. While previous approaches rely on sequence-based models for drawings of basic objects or handwritten text, we propose a model that treats drawings as a collection of strokes that can be composed into complex structures such as diagrams (e.g., flow-charts). At the core of the approach lies a novel auto-encoder that projects variable-length strokes into a latent space of fixed dimension. This representation space allows a relational model, operating in latent space, to better capture the relationship between strokes and to predict subsequent strokes. We demonstrate qualitatively and quantitatively that our proposed approach is able to model the appearance of individual strokes, as well as the compositional structure of larger diagram drawings. Our approach is suitable for interactive use cases such as auto-completing diagrams.

artificial intelligence, machine learning, prediction, (20 more...)

arXiv.org Machine Learning

2006.0993

Country: