Enhancing CBMs Through Binary Distillation with Applications to Test-Time Intervention
Shen, Matthew, Hsu, Aliyah, Agarwal, Abhineet, Yu, Bin
Concept bottleneck models~(CBMs) aim to improve model interpretability by predicting human-level ``concepts'' in a bottleneck within a deep learning model architecture. However, how the predicted concepts are used in predicting the target either remains black-box or is simplified to maintain interpretability at the cost of prediction performance. We propose to use Fast Interpretable Greedy Sum-Trees~(FIGS) to obtain Binary Distillation~(BD). This new method, called FIGS-BD, distills a binary-augmented concept-to-target portion of the CBM into an interpretable tree-based model, while mimicking the competitive prediction performance of the CBM teacher. FIGS-BD can be used in downstream tasks to explain and decompose CBM predictions into interpretable binary-concept-interaction attributions and to guide adaptive test-time intervention. Across $4$ datasets, we demonstrate that adaptive test-time intervention identifies key concepts that significantly improve performance in realistic human-in-the-loop settings that allow only limited concept interventions.
- Health & Medicine (0.46)
- Transportation (0.34)
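The distillation idea in the abstract above can be sketched in miniature. The snippet below is *not* the actual FIGS algorithm; it is a boosting-style stand-in that greedily fits a sum of depth-1 stumps on binary concepts to mimic a hypothetical black-box concept-to-target head. The teacher function, concepts, and number of stumps are all illustrative.

```python
# Toy binary distillation: fit a small sum-of-stumps student (a crude
# stand-in for FIGS) to mimic a black-box concept-to-target teacher.

def teacher(concepts):
    # Hypothetical opaque concept-to-target head of a CBM.
    c1, c2, c3 = concepts
    return 2.0 * c1 + 1.5 * c2 * c3 - 0.5 * c1 * c3

def fit_stump(X, residuals):
    """Pick the single binary concept whose split best reduces squared error."""
    best = None
    for j in range(len(X[0])):
        left = [r for x, r in zip(X, residuals) if x[j] == 0]
        right = [r for x, r in zip(X, residuals) if x[j] == 1]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, j, lv, rv)
    _, j, lv, rv = best
    return j, lv, rv

def distill(X, y, n_stumps=4):
    residuals = list(y)
    stumps = []
    for _ in range(n_stumps):
        j, lv, rv = fit_stump(X, residuals)
        stumps.append((j, lv, rv))
        residuals = [r - (rv if x[j] else lv) for x, r in zip(X, residuals)]
    return stumps

def predict(stumps, x):
    return sum(rv if x[j] else lv for j, lv, rv in stumps)

# Enumerate every binary concept vector as training data for the student.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [teacher(x) for x in X]
stumps = distill(X, y)
```

Each stump's (concept, contribution) pair is readable off the fitted student, which is the kind of additive, concept-level attribution that could guide which concept to intervene on at test time.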
Codebook Reduction and Saturation: Novel observations on Inductive Thematic Saturation for Large Language Models and initial coding in Thematic Analysis
De Paoli, Stefano, Mathis, Walter Stan
This paper reflects on the process of performing Thematic Analysis with Large Language Models (LLMs). Specifically, it deals with the problem of analytical saturation of initial codes as produced by LLMs. Thematic Analysis is a well-established qualitative analysis method composed of interlinked phases. A key phase is initial coding, where the analysts assign labels to discrete components of a dataset. Saturation is a way to measure the validity of a qualitative analysis and relates to the recurrence and repetition of initial codes. In this paper, we reflect on how well LLMs achieve analytical saturation and also propose a novel technique to measure Inductive Thematic Saturation (ITS). This technique leverages a programming framework called DSPy and allows a precise measurement of ITS.
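The paper's DSPy-based technique is not reproduced here, but the underlying idea of saturation can be illustrated simply: track how many *new* initial codes each additional document contributes, and call the analysis saturated once recent documents stop adding codes. The code sets, window size, and saturation rule below are illustrative assumptions.

```python
# Minimal sketch of inductive thematic saturation as a cumulative
# unique-code curve over documents coded in order.

def saturation_curve(coded_docs):
    """coded_docs: one set of initial codes per document, in coding order.
    Returns the cumulative count of distinct codes after each document."""
    seen, curve = set(), []
    for codes in coded_docs:
        seen.update(codes)
        curve.append(len(seen))
    return curve

def is_saturated(curve, window=3):
    """Saturated if the last `window` documents added no new codes."""
    return len(curve) > window and curve[-1] == curve[-1 - window]

docs = [
    {"cost", "trust"},
    {"trust", "privacy"},
    {"privacy"},
    {"cost"},
    {"trust", "cost"},
]
curve = saturation_curve(docs)  # [2, 3, 3, 3, 3]
```

On this toy corpus the curve flattens after the second document, so `is_saturated(curve)` reports saturation.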
NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research
Chandrasekhar, Achuth, Farimani, Omid Barati, Ajenifujah, Olabode T., Ock, Janghoon, Farimani, Amir Barati
This paper presents the development and application of a Large Language Model Retrieval-Augmented Generation (LLM-RAG) system tailored for nanotechnology research. The system leverages the capabilities of a sophisticated language model to serve as an intelligent research assistant, enhancing the efficiency and comprehensiveness of literature reviews in the nanotechnology domain. Central to this LLM-RAG system is its advanced query backend retrieval mechanism, which integrates data from multiple reputable sources. The system retrieves relevant literature by utilizing Google Scholar's advanced search and by scraping open-access papers from Elsevier, Springer Nature, and ACS Publications. This multifaceted approach ensures a broad and diverse collection of up-to-date scholarly articles and papers. The proposed system demonstrates significant potential in aiding researchers by providing a streamlined, accurate, and exhaustive literature retrieval process, thereby accelerating research advancements in nanotechnology. The effectiveness of the LLM-RAG system is validated through rigorous testing, illustrating its capability to significantly reduce the time and effort required for comprehensive literature reviews, while maintaining high accuracy and query relevance, and outperforming standard, publicly available LLMs.
- Europe (0.28)
- Asia > China (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Research Report (1.00)
- Overview > Innovation (0.45)
- Law (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Energy > Oil & Gas > Upstream (1.00)
- (9 more...)
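The retrieval step in a RAG pipeline like the one described above can be sketched in a few lines: rank candidate abstracts against the query and keep the top-k as context for the LLM. The bag-of-words cosine scoring, the toy corpus, and `k` are illustrative; the actual system combines Google Scholar search with publisher scraping.

```python
# Minimal sketch of query-driven retrieval for a RAG pipeline.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query, best first."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in docs]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

docs = [
    "gold nanoparticle synthesis for drug delivery",
    "carbon nanotube field effect transistors",
    "perovskite solar cell efficiency",
]
hits = retrieve("nanoparticle drug delivery", docs, k=1)
```

The retrieved `hits` would then be stuffed into the LLM prompt as grounding context; production systems would swap the toy scorer for embedding similarity.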
Automated Fact-Checking of Climate Change Claims with Large Language Models
Leippold, Markus, Vaghefi, Saeid Ashraf, Stammbach, Dominik, Muccione, Veruska, Bingler, Julia, Ni, Jingwei, Colesanti-Senni, Chiara, Wekhof, Tobias, Schimanski, Tobias, Gostlow, Glen, Yu, Tingyu, Luterbacher, Juerg, Huggel, Christian
This paper presents Climinator, a novel AI-based tool designed to automate the fact-checking of climate change claims. Utilizing an array of Large Language Models (LLMs) informed by authoritative sources like the IPCC reports and peer-reviewed scientific literature, Climinator employs an innovative Mediator-Advocate framework. This design allows Climinator to effectively synthesize varying scientific perspectives, leading to robust, evidence-based evaluations. Our model demonstrates remarkable accuracy when testing claims collected from Climate Feedback and Skeptical Science. Notably, when integrating an advocate with a climate science denial perspective in our framework, Climinator's iterative debate process reliably converges towards scientific consensus, underscoring its adeptness at reconciling diverse viewpoints into science-based, factual conclusions. While our research is subject to certain limitations and necessitates careful interpretation, our approach holds significant potential. We hope to stimulate further research and encourage exploring its applicability in other contexts, including political fact-checking and legal domains.
- North America > United States (1.00)
- Europe > Switzerland > Zürich > Zürich (0.14)
- South America > Brazil (0.14)
- (4 more...)
- Media > News (1.00)
- Law > Environmental Law (1.00)
- Health & Medicine (1.00)
- (6 more...)
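The Mediator-Advocate framework described above can be sketched structurally, with the LLM calls replaced by stubs: each advocate assesses a claim from its own perspective, the mediator checks for consensus, and the debate iterates until verdicts converge. The advocate behaviors, verdict labels, and round limit below are illustrative, not Climinator's actual prompts or logic.

```python
# Structural sketch of an iterative Mediator-Advocate debate loop.

def ipcc_advocate(claim, round_no):
    # Stub for an evidence-grounded advocate; holds its position.
    return "inaccurate"

def contrarian_advocate(claim, round_no):
    # Stub for a denial-perspective advocate that concedes after being
    # confronted with the mediator's synthesis in a later round.
    return "accurate" if round_no == 0 else "inaccurate"

def mediate(claim, advocates, max_rounds=5):
    """Run debate rounds until all advocates agree or rounds run out."""
    for round_no in range(max_rounds):
        verdicts = [adv(claim, round_no) for adv in advocates]
        if len(set(verdicts)) == 1:  # consensus reached
            return verdicts[0], round_no + 1
    return "undecided", max_rounds

claim = "Global warming stopped in 1998."
verdict, rounds = mediate(claim, [ipcc_advocate, contrarian_advocate])
```

In this toy run the debate converges in two rounds, mirroring the paper's observation that the iterative process pulls a contrarian advocate toward the evidence-based consensus.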
Learning the References of Online Model Predictive Control for Urban Self-Driving
Wang, Yubin, Peng, Zengqi, Ghazzai, Hakim, Ma, Jun
In this work, we propose a novel learning-based online model predictive control (MPC) framework for motion synthesis of self-driving vehicles. In this framework, the decision variables are generated as instantaneous references to modulate the cost functions of online MPC, where the constraints of collision avoidance and drivable surface boundaries are latently represented in soft form. Hence, the embodied maneuvers of the ego vehicle are empowered to adapt to complex and dynamic traffic environments, even with unmodeled uncertainties of other traffic participants. Furthermore, we implement a deep reinforcement learning (DRL) framework for policy search to cast the step actions as the decision variables, where practical and lightweight observations are considered as the input features of the policy network. The proposed approach is implemented in a high-fidelity simulator involving complex urban driving scenarios, and the results demonstrate that it adapts remarkably well to complex and dynamic traffic environments, achieving a success rate of 85%. Its advantages in terms of safety, maneuverability, and robustness are also illustrated.
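The cost structure described in the abstract can be sketched roughly: a learned policy outputs reference values that modulate the MPC tracking cost, while collision avoidance enters as a soft penalty rather than a hard constraint. The dynamics-free stage cost, the weights, and the penalty form below are all illustrative assumptions, not the paper's formulation.

```python
# Hedged sketch of an MPC cost with policy-generated references and a
# soft collision-avoidance penalty.
import math

def stage_cost(state, ref, obstacle, w_track=1.0, w_soft=10.0, safe_dist=2.0):
    # Tracking term: quadratic deviation from the learned reference.
    track = sum((s - r) ** 2 for s, r in zip(state, ref))
    # Soft constraint: penalize intrusion inside the safety distance.
    dist = math.dist(state, obstacle)
    soft = max(0.0, safe_dist - dist) ** 2
    return w_track * track + w_soft * soft

def horizon_cost(traj, refs, obstacle):
    """Total cost of a candidate trajectory over the planning horizon."""
    return sum(stage_cost(x, r, obstacle) for x, r in zip(traj, refs))

refs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # references from the policy
obstacle = (1.0, 3.0)
safe_cost = horizon_cost(refs, refs, obstacle)
risky_cost = horizon_cost([(0.0, 0.0), (1.0, 2.5), (2.0, 0.0)], refs, obstacle)
```

Because the obstacle penalty is soft, a trajectory that drifts inside the safety distance is still evaluable, just expensive, which is what lets the optimizer trade tracking against safety smoothly.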
Hybrid AI Inferencing managed with Microsoft Azure Arc-Enabled Kubernetes
Cloud-native deployment with Kubernetes orchestration has enabled the "Write Once, Deploy Anywhere" paradigm for applications. This application development and deployment model enables scale and agility in today's hybrid and multi-cloud environments. Applications or services packaged as containers can be deployed and managed with the same Kubernetes-based ecosystem tools in the public cloud, on premises, or at edge locations. Microsoft Azure Arc-Enabled Kubernetes (Reference 1) can be viewed as one such ecosystem tool that enables central management of Kubernetes clusters deployed in on-premises locations or across different public clouds. Kubernetes-based offerings from different vendors are supported, and they need not be based on Azure Kubernetes Service (AKS) (Reference 2).
ML in Action: Campaign to Collect and Share Machine Learning Use Cases
ML in Action is a virtual event to collect and share cool and useful machine learning (ML) use cases that leverage multiple Google ML products. This is the first run of an ML use case campaign by the ML Developer Programs team. Let us announce the winners right now, right here. They have showcased practical uses of ML, and how ML was adapted to real life situations. We hope these projects can spark new applied ML project ideas and provide opportunities for ML community leaders to discuss ML use cases.
- Asia > India (0.16)
- Africa > Nigeria (0.07)
- North America > United States (0.05)
- (2 more...)
FMCS Introduction to Deep Learning via @Algorithmia #AI #DeepLearning #Reference
Deep Learning is at the cutting edge of what machines can do, and developers and business leaders absolutely need to understand what it is and how it works. This unique type of algorithm has far surpassed any previous benchmarks for classification of images, text, and voice. WHY IT MATTERS: good introduction to a technology that will impact all our lives in the very near future.
Artificial Intelligence for Recruitment 101
It seems that "AI is whatever hasn't been done yet", and "As soon as AI successfully solves a problem, the problem is no longer a part of AI". ASI is the AI of science fiction – the artificial intelligence that surpasses human intelligence in every respect. There are undoubtedly more tasks that should be on the list, but let's run with the following: "Sell", "Source", "Match" and "Manage". "Sell", "Source" and "Match" are hopefully obvious tasks, with "Manage" including managing the entire recruitment process, and relationships with clients and candidates. It's interesting to see that 100% of the Source and Match tasks are automatable, whereas only 20% and 25% respectively of Manage and Sell are automatable.
Machine learning algorithm can identify drunken tweeting
To do that, he and his team collected thousands of geotagged posts tweeted between July 2013 and July 2014 in New York state, and then winnowed them down to tweets containing booze-related keywords (ranging from "beer keg" to "shitfaced"). Each tweet passed through three human "Turkers," who were asked three questions: Q1: Does the tweet make any reference to drinking alcoholic beverages? Q3: If so, is it likely that the tweet was sent at the time and place the tweeter was drinking alcoholic beverages? The success rate--that is, the rate at which the machines' answers matched the Turkers' consensus--ranged from 92 percent for the algorithm answering Q1, to 82 percent for the drunk-spotting algorithm answering Q3.
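Two of the steps described above are easy to sketch: winnowing tweets by alcohol-related keywords, and scoring a classifier's "success rate" as agreement with the human annotators' majority answer. The keywords, tweets, and labels below are made up for illustration; the real study used far larger keyword lists and annotation batches.

```python
# Rough sketch: keyword winnowing plus agreement with Turker consensus.

BOOZE_KEYWORDS = {"beer", "keg", "drunk", "wine", "shots"}

def winnow(tweets):
    """Keep only tweets containing at least one booze-related keyword."""
    return [t for t in tweets if BOOZE_KEYWORDS & set(t.lower().split())]

def majority(answers):
    """Consensus answer among the annotators for one tweet."""
    return max(set(answers), key=answers.count)

def success_rate(predictions, turker_answers):
    """Fraction of machine answers matching the annotators' consensus."""
    consensus = [majority(a) for a in turker_answers]
    hits = sum(p == c for p, c in zip(predictions, consensus))
    return hits / len(consensus)

tweets = ["free beer at the keg party", "lovely sunset tonight", "drunk again lol"]
filtered = winnow(tweets)  # keeps only the two alcohol-related tweets
rate = success_rate(["yes", "no"],
                    [["yes", "yes", "no"], ["no", "no", "yes"]])
```

The reported 92 and 82 percent figures correspond to this kind of agreement rate, computed for the Q1 and Q3 classifiers respectively.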