CLEVER: Stream-based Active Learning for Robust Semantic Perception from Human Instructions
Lee, Jongseok, Birr, Timo, Triebel, Rudolph, Asfour, Tamim
We propose CLEVER, an active learning system for robust semantic perception with Deep Neural Networks (DNNs). For data arriving in streams, our system seeks human support when encountering failures and adapts DNNs online based on human instructions. In this way, CLEVER can eventually accomplish the given semantic perception tasks. Our main contribution is the design of a system that meets several desiderata of realizing the aforementioned capabilities. The key enabler herein is our Bayesian formulation that encodes domain knowledge through priors. Empirically, we not only motivate CLEVER's design but further demonstrate its capabilities with a user validation study as well as experiments on humanoid and deformable objects. To our knowledge, we are the first to realize stream-based active learning on a real robot, providing evidence that the robustness of the DNN-based semantic perception can be improved in practice. The project website can be accessed at https://sites.google.com/view/thecleversystem.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
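The stream-based loop the abstract describes (process data as it arrives, defer to a human when the model is unsure, adapt online from the instruction) can be caricatured in a few lines. This is a minimal sketch, not CLEVER's method: the entropy trigger and every function name here are assumptions, whereas the paper's failure detection rests on a Bayesian formulation with domain-knowledge priors.

```python
import math

def entropy(probs):
    """Shannon entropy of a categorical predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def stream_active_learning(stream, predict, threshold, ask_human):
    """Process samples one by one; defer to the human only when the
    model's predictive entropy exceeds the threshold (a hypothetical
    trigger standing in for the paper's Bayesian criterion)."""
    labels, queries = [], 0
    for x in stream:
        probs = predict(x)
        if entropy(probs) > threshold:
            labels.append(ask_human(x))  # human instruction; the DNN
            queries += 1                 # would be updated online here
        else:
            labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels, queries
```

The point of the sketch is only the control flow: most samples are handled autonomously, and human effort is spent on the uncertain ones.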
MixLoRA-DSI: Dynamically Expandable Mixture-of-LoRA Experts for Rehearsal-Free Generative Retrieval over Dynamic Corpora
Huynh, Tuan-Luc, Vu, Thuy-Trang, Wang, Weiqing, Le, Trung, Gašević, Dragan, Li, Yuan-Fang, Do, Thanh-Toan
Continually updating model-based indexes in generative retrieval with new documents remains challenging, as full retraining is computationally expensive and impractical under resource constraints. We propose MixLoRA-DSI, a novel framework that combines an expandable mixture of Low-Rank Adaptation experts with a layer-wise out-of-distribution (OOD)-driven expansion strategy. Instead of allocating new experts for each new corpus, our proposed expansion strategy enables sublinear parameter growth by selectively introducing new experts only when a significant number of OOD documents is detected. Experiments on NQ320k and MS MARCO Passage demonstrate that MixLoRA-DSI outperforms full-model update baselines, with minimal parameter overhead and substantially lower training costs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (3 more...)
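The layer-wise expansion rule the abstract outlines (add capacity only when enough OOD documents appear, so the parameter count grows sublinearly in the number of corpora) reduces to a simple gate. A minimal sketch under assumed names; the paper's actual OOD detector and LoRA expert structure are far richer:

```python
def maybe_expand(experts, ood_flags, min_ood):
    """Simplified expansion gate: append a new (placeholder) LoRA expert
    only if the count of OOD-flagged documents reaches min_ood; otherwise
    reuse the existing experts, keeping parameter growth sublinear.
    The dict fields are illustrative, not the framework's API."""
    n_ood = sum(ood_flags)
    if n_ood >= min_ood:
        experts.append({"id": len(experts), "trained_on_ood": n_ood})
    return experts
```

Because in-distribution batches never trigger the gate, repeated updates over similar corpora leave the expert count unchanged.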
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art
Lei, Yiming, Zhang, Chenkai, Liu, Zeming, Leng, Haitao, Liu, Shaoguo, Gao, Tingting, Liu, Qingjie, Wang, Yunhong
Video Comment Art enhances user engagement by providing creative content that conveys humor, satire, or emotional resonance, requiring a nuanced and comprehensive grasp of cultural and contextual subtleties. Although Multimodal Large Language Models (MLLMs) and Chain-of-Thought (CoT) have demonstrated strong reasoning abilities in STEM tasks (e.g. mathematics and coding), they still struggle to generate creative expressions such as resonant jokes and insightful satire. Moreover, existing benchmarks are constrained by their limited modalities and insufficient categories, hindering the exploration of comprehensive creativity in video-based Comment Art creation. To address these limitations, we introduce GODBench, a novel benchmark that integrates video and text modalities to systematically evaluate MLLMs' abilities to compose Comment Art. Furthermore, inspired by the propagation patterns of waves in physics, we propose Ripple of Thought (RoT), a multi-step reasoning framework designed to enhance the creativity of MLLMs. Extensive experiments reveal that existing MLLMs and CoT methods still face significant challenges in understanding and generating creative video comments. In contrast, RoT provides an effective approach to improve creative composing, highlighting its potential to drive meaningful advancements in MLLM-based creativity. GODBench is publicly available at https://github.com/stan-lei/GODBench-ACL2025.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (5 more...)
Contrastive Learning Via Equivariant Representation
Song, Sifan, Wang, Jinfeng, Zhao, Qiaochu, Li, Xiang, Wu, Dufan, Stefanidis, Angelos, Su, Jionglong, Zhou, S. Kevin, Li, Quanzheng
Invariant-based Contrastive Learning (ICL) methods have achieved impressive performance across various domains. However, the absence of a representation for distortion (augmentation)-related information in the latent space makes ICL sub-optimal in terms of training efficiency and robustness in downstream tasks. Recent studies suggest that introducing equivariance into Contrastive Learning (CL) can improve overall performance. In this paper, we rethink the roles of augmentation strategies and equivariance in improving CL efficacy. We propose a novel Equivariant-based Contrastive Learning (ECL) framework, CLeVER (Contrastive Learning Via Equivariant Representation), compatible with augmentation strategies of arbitrary complexity for various mainstream CL methods and model frameworks. Experimental results demonstrate that CLeVER effectively extracts and incorporates equivariant information from data, thereby improving the training efficiency and robustness of baseline models in downstream tasks.
- North America > United States > Massachusetts (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
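Equivariance, the property CLeVER builds on, means the encoder commutes with the input transformation: augmentation information survives in the latent code instead of being discarded, as it is under invariance-only training. A toy illustration (the elementwise map and permutation augmentations are stand-ins, not the paper's setup):

```python
def encode(x):
    """Toy equivariant encoder: an elementwise square. It commutes with
    input permutations, so permuting the input permutes the latent code
    the same way, and the applied augmentation remains recoverable."""
    return [v * v for v in x]

def permute(x, order):
    """Apply a permutation augmentation given as a list of indices."""
    return [x[i] for i in order]
```

For any permutation, encode(permute(x, order)) equals permute(encode(x), order), which is exactly the commuting diagram that defines equivariance.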
Visually Grounded Commonsense Knowledge Acquisition
Yao, Yuan, Yu, Tianyu, Zhang, Ao, Li, Mengdi, Xie, Ruobing, Weber, Cornelius, Liu, Zhiyuan, Zheng, Hai-Tao, Wermter, Stefan, Chua, Tat-Seng, Sun, Maosong
Large-scale commonsense knowledge bases empower a broad range of AI applications, where the automatic extraction of commonsense knowledge (CKE) is a fundamental and challenging problem. CKE from text is known to suffer from the inherent sparsity and reporting bias of commonsense in text. Visual perception, on the other hand, contains rich commonsense knowledge about real-world entities, e.g., (person, can_hold, bottle), which can serve as promising sources for acquiring grounded commonsense knowledge. In this work, we present CLEVER, which formulates CKE as a distantly supervised multi-instance learning problem, where models learn to summarize commonsense relations from a bag of images about an entity pair without any human annotation on image instances. To address the problem, CLEVER leverages vision-language pre-training models for deep understanding of each image in the bag, and selects informative instances from the bag to summarize commonsense entity relations via a novel contrastive attention mechanism. Comprehensive experimental results in held-out and human evaluation show that CLEVER can extract commonsense knowledge of promising quality, outperforming pre-trained language model-based methods by 3.9 AUC and 6.4 mAUC points. The predicted commonsense scores show a strong correlation with human judgment, with a 0.78 Spearman coefficient. Moreover, the extracted commonsense can also be grounded into images with reasonable interpretability. The data and codes can be obtained at https://github.com/thunlp/CLEVER.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > New Jersey (0.04)
- (2 more...)
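The instance-selection step the abstract mentions (weighting the informative images in a bag to summarize a relation) is, at its core, attention pooling. The following is a schematic stand-in for CLEVER's contrastive attention, with hypothetical names and toy features:

```python
import math

def attention_pool(instance_scores, instance_feats):
    """Bag-level summary via softmax attention over image instances:
    images with higher scores contribute more to the pooled feature
    used for relation prediction. Scores and features are toy inputs;
    the paper derives them from a vision-language model."""
    m = max(instance_scores)                      # subtract max for stability
    w = [math.exp(s - m) for s in instance_scores]
    z = sum(w)
    w = [wi / z for wi in w]                      # normalized attention weights
    dim = len(instance_feats[0])
    return [sum(w[i] * instance_feats[i][d] for i in range(len(w)))
            for d in range(dim)]
```

With equal scores the pool is a plain average; as one instance's score grows, the summary converges to that instance's feature, which is the "select informative instances" behavior in miniature.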
Alessandro LANTERI, PhD, CPA on LinkedIn: "#AI and #platforms are rewriting the rules of strategy. Karim Lakhani and Marco Iansiti's forthcoming book is a great place to catch up with the new rules. I say this because their research was a source for my book #CLEVER."
How will you compete in the Age of AI? Nice article in the HBS Digital Initiative online site on our forthcoming book with Marco Iansiti - look out for AI-driven collisions in every sector.
Catching Bugs Without Really Trying
Finding and fixing bugs is critical to delivering quality software. Oftentimes, new bugs are introduced into a system when changes are uploaded to a software repository. Changes may be due to adding new features or correcting existing bugs. Mozilla is planning on taking advantage of research being done by Ubisoft using artificial intelligence (AI) and machine learning (ML) to automatically catch software bugs when source code is committed to a software repository. The software is called CLEVER, for Combining Levels of Bug Prevention and Resolution techniques.
Dash Robotics Acquires Bots Alive for Clever, Affordable Robot Toys
It is with much rejoicing that today we can share that one of our favorite robotics startups, Dash Robotics, is acquiring another of our favorite robotics startups, Bots Alive. Usually, we don't cover acquisitions, or when we do, it's with resigned skepticism: all too often, one company gets completely swallowed by another, and the things that made them unique and exciting simply vanish. The sense that we get from talking with Dash Robotics' CEO Nick Kohut and Bots Alive founder Brad Knox is that the amazing things that Bots Alive does fit right in with the equally amazing but totally different things that Dash Robotics does, and that together, they'll be able to come up with some totally cool (and totally affordable) robotic toys with sophisticated personalities built right in. Part of the reason that we're fans of Dash Robotics and Bots Alive is that they're both successful examples of taking robotics research and turning it directly into a compelling product. Dash Robotics turned UC Berkeley's DASH pop-up hexapod robot into a skittery and blisteringly fast toy called Kamigami that's now being sold in partnership with Mattel for US $50, while Bots Alive's software runs on your phone and gives a $20 Hexbug more brains and personality than an enthusiastic and mildly well-trained puppy.
- Education (0.55)
- Leisure & Entertainment > Games > Computer Games (0.31)
Essay: When Artificial Intelligence Gets Too Clever by Half
There's an anthill in the way, but the engineers don't care or even notice; they flood the area anyway, and too bad for the ants. Now replace the ants with humans, happily going about their own business, and the engineers with a race of superintelligent computers that happen to have other priorities. Just as we now have power to dictate the fate of less intelligent beings, so might such computers someday exert life-and-death power over us. That's the analogy the superstar physicist Stephen Hawking used in 2015 to describe the mounting perils he sees in the current explosion of artificial intelligence.
What Makes You Clever: The Puzzle of Intelligence: Derek Partridge: 9789814513043: Amazon.com: Books
A detailed look at the progress we have made so far in understanding human intelligence. Topics include the Turing Test from the 1950s that we still rely on today, the Deep Blue computer that beat Kasparov, and the Watson computer that beat the Jeopardy champion. Do these triumphs over the human species represent IQ? The author thinks we still do not have the answer. Playing blitz chess reveals the computer's weakness.