Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
Hu, Xinshuo, Li, Dongfang, Zheng, Zihao, Liu, Zhenyu, Hu, Baotian, Zhang, Min
Large language models (LLMs) have been widely used in various applications but are known to suffer from issues related to untruthfulness and toxicity. While parameter-efficient modules (PEMs) have demonstrated their effectiveness in equipping models with new skills, leveraging PEMs for deficiency unlearning remains underexplored. In this work, we propose a PEM operation approach, namely Extraction-before-Subtraction (Ext-Sub), to enhance the truthfulness and detoxification of LLMs through the integration of an ``expert'' PEM and an ``anti-expert'' PEM. Remarkably, even anti-expert PEMs possess valuable capabilities due to their proficiency in generating fabricated content, which requires language modeling and logical narrative competence. Rather than merely negating the parameters, our approach extracts and eliminates only the deficiency capability within the anti-expert PEM while preserving its general capabilities. To evaluate the effectiveness of our approach in terms of truthfulness and detoxification, we conduct extensive experiments on LLMs, also covering additional abilities such as language modeling and mathematical reasoning. Our empirical results demonstrate that our approach effectively improves truthfulness and detoxification while largely preserving the fundamental abilities of LLMs.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- North America > United States > Maryland > Baltimore (0.04)
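One plausible reading of extraction-before-subtraction can be sketched on flattened weight vectors. This is a simplified illustration under our own assumptions, not the paper's exact algorithm: split the anti-expert PEM into a component parallel to the expert (treated as shared general capability) and a residual (treated as the deficiency direction), then subtract only the residual.

```python
import numpy as np

def ext_sub(expert: np.ndarray, anti_expert: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Illustrative extraction-before-subtraction on flattened PEM weights.

    Rather than subtracting the whole anti-expert vector, first split it
    into a component parallel to the expert (shared capability) and a
    residual (here taken as the deficiency direction), then subtract only
    the residual. Simplified sketch, not the paper's exact method.
    """
    unit = expert / np.linalg.norm(expert)
    parallel = np.dot(anti_expert, unit) * unit   # component shared with the expert
    deficiency = anti_expert - parallel           # residual "deficiency" direction
    return expert - alpha * deficiency

# toy example with random stand-in "PEM" weight vectors
rng = np.random.default_rng(0)
expert = rng.normal(size=8)
anti = rng.normal(size=8)
merged = ext_sub(expert, anti)
```

Note the contrast with plain negation (`expert - anti`), which would also remove the capability the two modules share.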
Visual Information Matters for ASR Error Correction
Kumar, Vanya Bannihatti, Cheng, Shanbo, Peng, Ningxin, Zhang, Yuchen
Aiming to improve Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus on using text and/or speech data, which limits the performance gain when not only text and speech information but also other modalities, such as visual information, are critical for EC. The challenges are mainly twofold: first, previous work fails to emphasize visual information, so it remains rarely explored; second, the community lacks a high-quality benchmark where visual information matters for EC models. Therefore, this paper provides 1) simple yet effective methods, namely gated fusion and image captions as prompts, to incorporate visual information to help EC; and 2) a large-scale benchmark dataset, namely Visual-ASR-EC, where each item in the training data consists of visual, speech, and text information, and the test data are carefully selected by human annotators to ensure that even humans could make mistakes when visual information is missing. Experimental results show that using captions as prompts can effectively exploit the visual information and surpass state-of-the-art methods by up to 1.2% in Word Error Rate (WER), which also indicates that visual information is critical in our proposed Visual-ASR-EC dataset.
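The gated-fusion idea can be sketched as follows. This is a minimal reconstruction under our own assumptions, not the paper's trained model: an element-wise sigmoid gate decides, per feature dimension, how much visual evidence to mix into the text representation; the parameter names (`W_g`, `b_g`) are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_feat, vis_feat, W_g, b_g):
    """Element-wise gated fusion (illustrative sketch):
    g = sigmoid(W_g @ [text; vis] + b_g);  fused = g * text + (1 - g) * vis.
    """
    z = np.concatenate([text_feat, vis_feat])
    g = sigmoid(W_g @ z + b_g)                    # per-dimension mixing weights in (0, 1)
    return g * text_feat + (1.0 - g) * vis_feat

rng = np.random.default_rng(1)
d = 4
text_feat = rng.normal(size=d)    # stand-in features of the ASR hypothesis
vis_feat = rng.normal(size=d)     # stand-in image features
W_g = rng.normal(size=(d, 2 * d))
b_g = np.zeros(d)
fused = gated_fusion(text_feat, vis_feat, W_g, b_g)
```

Because the gate outputs lie in (0, 1), each fused dimension is a convex combination of the two modalities, so visual information can be admitted or suppressed per dimension.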
Better speech synthesis through scaling
In recent years, the field of image generation has been revolutionized by the application of autoregressive transformers and DDPMs. These approaches model the process of image generation as a step-wise probabilistic process and leverage large amounts of compute and data to learn the image distribution. This methodology of improving performance need not be confined to images. This paper describes a way to apply advances in the image generative domain to speech synthesis. The result is TorToise -- an expressive, multi-voice text-to-speech system. All model code and trained weights have been open-sourced at https://github.com/neonbjb/tortoise-tts.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.94)
- Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.92)
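The "step-wise probabilistic process" framing can be illustrated with a toy autoregressive sampler: one token is drawn per step, conditioned on the prefix. Everything here is illustrative; the logits function is a stand-in for a trained transformer and is not TorToise's actual architecture.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sample_stepwise(step_logits_fn, n_steps, vocab_size, seed=0):
    """Toy autoregressive sampling: generation as a step-wise probabilistic
    process. `step_logits_fn(prefix)` stands in for a trained model."""
    rng = np.random.default_rng(seed)
    tokens = []
    for _ in range(n_steps):
        probs = softmax(step_logits_fn(tokens))          # distribution over next token
        tokens.append(int(rng.choice(vocab_size, p=probs)))
    return tokens

# stand-in "model": uniform logits over an 8-token vocabulary
out = sample_stepwise(lambda prefix: np.zeros(8), n_steps=5, vocab_size=8)
```

DDPMs follow the same step-wise pattern but iterate over noise levels rather than token positions.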
AI Tom Hanks didn't offer me a job, but it sure sounds like he did
Tom Hanks didn't just call me to pitch me a part, but it sure sounds like it. Ever since PCWorld began covering the rise of various AI applications like AI art, I've been poking around in the code repositories on GitHub and links on Reddit, where people post tweaks to their own AI models for various approaches. Some of these models actually end up on commercial sites, which either roll their own algorithms or adapt others that have been published as open source. A great example of an existing AI audio site is Uberduck.ai. Enter the text in the text field and you can have a virtual Elon Musk, Bill Gates, Peggy Hill, Daffy Duck, Alex Trebek, Beavis, The Joker, or even Siri read out your pre-programmed lines.
- Information Technology (0.31)
- Consumer Products & Services (0.31)
When machine learning meets surrealist art meets Reddit, you get DALL-E mini
An image of babies doing parkour generated by DALL-E mini. DALL-E mini is the AI bringing to life all of the goofy "what if" questions you never asked: What if Voldemort was a member of Green Day? What if there was a McDonald's in Mordor? What if scientists sent a Roomba to the bottom of the Mariana Trench?
What are machine "learning" and artificial intelligence?
An important feature of human intelligence is the ability to learn. The remarkable learning abilities of the human brain enable babbling infants to grow into knowledgeable, articulate adults. For human beings, learning is an innate ability, and because it is so universal we tend to overlook how strange and precious it is. For artificial intelligence research, how to endow machines with this most universal of human capabilities is a very challenging research direction. Across different research paths, the subjects, content, and methods of learning differ considerably.
Self-driving SCOOTERS will drive themselves back to charging points and busy areas
Self-driving scooters could soon be whizzing around without riders in a tech development designed to improve cities' transport-sharing networks. A California-based start-up, Tortoise, is working on self-driving technology with which bikes and scooters drive themselves home after someone has used them. And taxi-hailing app Uber announced earlier in the year that it was working towards the same goal. Set to launch next month, the initiative is a progression for the temporary bike, scooter and Segway hire schemes which already exist around the world. It could mean fewer of the bikes and scooters are left lying around in obscure places and that they all return to charging stations or busy areas when not in use.
- North America > United States > California (0.26)
- Europe (0.05)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
Why We Stink at Tackling Climate Change - Issue 69: Patterns
If human beings are as Hamlet suggested, "noble in reason, infinite in faculty," then why are we facing so many problems? In many ways, people are better off than ever before: reduced infant mortality, longer lifespans, less poverty, fewer epidemic diseases, even fewer deaths per capita due to violence. And yet global threats abound and by nearly all measures they are getting worse: environmental destruction and wildlife extinction, ethnic and religious hatred, the specter of nuclear war, and above all, the disaster of global climate change. For some religious believers, the primary culprit is original sin. For ideologues of left, right, and otherwise, it's ill-functioning political structures.
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Public Health (0.89)
- Education > Health & Safety > School Nutrition (0.47)
Preventing Aggressive Behavior, Robotically!
In my last blog post, I introduced readers to William Grey Walter (1910 – 1977), a renowned neurophysiologist, cybernetician, and roboticist. His futuristic aim was to construct mechanical models--robots--that were capable of realistically simulating the behavior of living beings. Grey Walter's most famous robotic creations were his Cybernetic Tortoises. Elmer and Elsie, his first two robots, were constructed between 1948 and 1949. They appeared to exhibit intelligent action: they were goal-directed (they moved toward light and stopped doing so when they reached it) and they avoided obstacles that blocked their way to the goal. In a truly remarkable coincidence, robotic tortoises have very recently made the news!
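The goal-directed behavior described above (steer toward the light, stop on arrival) can be mimicked with a toy simulation. This is purely illustrative and not a model of Grey Walter's actual analog circuitry.

```python
import numpy as np

def phototaxis(pos, light, step=0.2, stop_radius=0.5, n_steps=50):
    """Toy model of a light-seeking tortoise: move toward the light source
    in fixed-size steps, and stop once close enough to the goal."""
    pos = np.asarray(pos, dtype=float)
    light = np.asarray(light, dtype=float)
    for _ in range(n_steps):
        delta = light - pos
        dist = np.linalg.norm(delta)
        if dist < stop_radius:            # goal reached: stop moving
            break
        pos = pos + step * delta / dist   # steer one step toward the light
    return pos

final = phototaxis(pos=[0.0, 0.0], light=[3.0, 4.0])
```

Obstacle avoidance could be grafted on by adding a repulsive term near obstacles, which is roughly how later reactive robots (e.g., Braitenberg-style vehicles) are often sketched.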