Complex Instruction Following with Diverse Style Policies in Football Games

Sun, Chenglu, Shen, Shuo, Hu, Haonan, Zhou, Wei, Chen, Chen

arXiv.org Artificial Intelligence

Despite advancements in language-controlled reinforcement learning (LC-RL) for basic domains and straightforward commands (e.g., object manipulation and navigation), effectively extending LC-RL to comprehend and execute high-level or abstract instructions in complex, multi-agent environments, such as football games, remains a significant challenge. To address this gap, we introduce Language-Controlled Diverse Style Policies (LCDSP), a novel LC-RL paradigm specifically designed for complex scenarios. LCDSP comprises two key components: a Diverse Style Training (DST) method and a Style Interpreter (SI). The DST method efficiently trains a single policy capable of exhibiting a wide range of diverse behaviors by modulating agent actions through style parameters (SP). The SI is designed to accurately and rapidly translate high-level language instructions into the corresponding SP. Through extensive experiments in a complex 5v5 football environment, we demonstrate that LCDSP effectively comprehends abstract tactical instructions and accurately executes the desired diverse behavioral styles, showcasing its potential for complex, real-world applications.


Appendix of Memory with No Forgetting

Neural Information Processing Systems

Figure 7: (a) The generator architecture adopted in this paper. It inherits the architecture of GP-GAN, which is the same as the green/frozen part of our GAN memory (see Figure 1(a)). Given a target task/data (e.g., Flowers, Cathedrals, or Cats), all the parameters are trainable and fine-tuned to fit the target data. Although illustrated for one FC/Conv layer, the modulation is applied to all layers in the real implementation. "None" means no style modulation from the target data is applied; "All" means all style parameters come from the target data; both are obtained in a similar way to Figure 2(a). "FC" is obtained by applying newly designed style parameters; "Our" is our GAN memory; "NoNorm" is a modified version of our GAN memory that removes the normalization; "NoBias" is a modified version that removes the bias term. Here we discuss the detailed techniques for interpolation among different generative processes with our GAN memory and show more examples in Figures 8, 9, 10, and 11.



OpenVoice: Versatile Instant Voice Cloning

Qin, Zengyi, Zhao, Wenliang, Yu, Xumin, Sun, Xin

arXiv.org Artificial Intelligence

We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation, in addition to replicating the tone color of the reference speaker. The voice styles are not directly copied from and constrained by the style of the reference speaker. Previous approaches lacked the ability to flexibly manipulate voice styles after cloning. 2) Zero-Shot Cross-Lingual Voice Cloning. OpenVoice achieves zero-shot cross-lingual voice cloning for languages not included in the massive-speaker training set. Unlike previous approaches, which typically require an extensive massive-speaker multi-lingual (MSML) dataset for all languages, OpenVoice can clone voices into a new language without any massive-speaker training data for that language. OpenVoice is also computationally efficient, costing tens of times less than commercially available APIs that offer even inferior performance. To foster further research in the field, we have made the source code and trained model publicly accessible. We also provide qualitative results on our demo website. Prior to its public release, our internal version of OpenVoice was used tens of millions of times by users worldwide between May and October 2023, serving as the backend of MyShell.


GANzilla: User-Driven Direction Discovery in Generative Adversarial Networks

Evirgen, Noyan, Chen, Xiang 'Anthony'

arXiv.org Artificial Intelligence

Generative Adversarial Networks (GANs) are widely adopted in numerous application areas, such as data preprocessing, image editing, and creativity support. However, a GAN's "black box" nature prevents non-expert users from controlling what data a model generates, spawning a plethora of prior work focused on algorithm-driven approaches to extract editing directions to control GANs. Complementarily, we propose GANzilla: a user-driven tool that empowers a user with the classic scatter/gather technique to iteratively discover directions to meet their editing goals. In a study with 12 participants, GANzilla users were able to discover directions that (i) edited images to match provided examples (closed-ended tasks) and that (ii) met a high-level goal, e.g., making the face happier, while showing diversity across individuals (open-ended tasks).


Generating, With Style: The Mechanics Behind NVIDIA's Highly Realistic GAN Images

#artificialintelligence

Although the z vector is just sampled randomly, our ultimate goal is to create a mapping between the distribution of images and our reference distribution Z, such that each vector in z corresponds to a plausibly real image. As a result, despite being meaningless at first, each particular z ends up corresponding to and encoding properties of the image that it will produce. In their simplest form, transposed convolutions work by learning a filter matrix (for example, 3x3), and multiplying that by the value at each pixel to expand its information outward spatially. Each of the single "pixels" in a 4x4 representation influences the values in a 3x3 patch of output; these patches overlap and sum to create the final "blown out" representation. The visual above, while good for building simplified intuition, is a little misleading, since it makes it look like the values of the enlarged patch have to be spun out of a single piece of information from a single pixel.
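The overlap-and-sum behavior described above can be made concrete with a minimal NumPy sketch (the function name and toy filter are illustrative, not from any particular library): each input "pixel" scales the learned filter, and the resulting patches are added into overlapping regions of the output.

```python
import numpy as np

def transposed_conv2d(x, w, stride=2):
    """Naive 2D transposed convolution: every input pixel spreads a
    scaled copy of the k x k filter into the output; overlapping
    patches sum, producing the spatially 'blown out' representation."""
    h_in, w_in = x.shape
    k = w.shape[0]
    h_out = (h_in - 1) * stride + k
    w_out = (w_in - 1) * stride + k
    out = np.zeros((h_out, w_out))
    for i in range(h_in):
        for j in range(w_in):
            # one input value influences a k x k patch of the output
            out[i * stride:i * stride + k,
                j * stride:j * stride + k] += x[i, j] * w
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # 4x4 input map
w = np.ones((3, 3))                           # toy stand-in for a learned 3x3 filter
y = transposed_conv2d(x, w, stride=2)
print(y.shape)  # (9, 9): the 4x4 map has been expanded spatially
```

Note how an interior output value receives contributions from several neighboring input pixels at once, which is exactly why the single-pixel intuition in the visual is misleading.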