AITopics | Banff

A Neural Column Generation Approach to the Vehicle Routing Problem with Two-Dimensional Loading and Last-In-First-Out Constraints

arXiv.org Artificial Intelligence Jun-18-2024

The vehicle routing problem with two-dimensional loading constraints (2L-CVRP) and the last-in-first-out (LIFO) rule presents significant practical and algorithmic challenges. While numerous heuristic approaches have been proposed to address its complexity, stemming from two NP-hard problems: the vehicle routing problem (VRP) and the two-dimensional bin packing problem (2D-BPP), less attention has been paid to developing exact algorithms. Bridging this gap, this article presents an exact algorithm that integrates advanced machine learning techniques, specifically a novel combination of attention and recurrence mechanisms. This integration accelerates the state-of-the-art exact algorithm by a median of 29.79% across various problem instances. Moreover, the proposed algorithm successfully resolves an open instance in the standard test-bed, demonstrating significant improvements brought about by the incorporation of machine learning models. Code is available at https://github.com/xyfffff/NCG-for-2L-CVRP.

algorithm, constraint, loading constraint, (16 more...)

arXiv.org Artificial Intelligence

2406.12454

Country:

North America > United States (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Spain (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.82)

Industry: Transportation > Freight & Logistics Services (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models

Wang, Yibin, Shi, Haizhou, Han, Ligong, Metaxas, Dimitris, Wang, Hao

arXiv.org Machine Learning Jun-18-2024

Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, such post-training approaches' performance is severely limited by the parameters learned during training. In this paper, we go beyond post-training Bayesianization and propose Bayesian Low-Rank Adaptation by Backpropagation (BLoB), an algorithm that continuously and jointly adjusts both the mean and covariance of LLM parameters throughout the whole fine-tuning process. Our empirical results verify the effectiveness of BLoB in terms of generalization and uncertainty estimation, when evaluated on both in-distribution and out-of-distribution data.

arxiv preprint arxiv, blob, uncertainty estimation, (13 more...)

arXiv.org Machine Learning

2406.11675

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(4 more...)

Genre: Research Report (1.00)

A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression

Zhu, Yufan, Khoo, Zi-Yu, Low, Jonathan Sze Choong, Bressan, Stephane

arXiv.org Artificial Intelligence Jun-17-2024

Interleaved practice enhances the memory and problem-solving ability of students in undergraduate courses. We introduce a personalized learning tool built on a Large Language Model (LLM) that can provide immediate and personalized attention to students as they complete homework containing problems interleaved from undergraduate physics courses. Our tool leverages the dimensional analysis method, enhancing students' qualitative thinking and problem-solving skills for complex phenomena. Our approach combines LLMs for symbolic regression with dimensional analysis via prompt engineering and offers students a unique perspective to comprehend relationships between physics variables. This fosters a broader and more versatile understanding of physics and mathematical principles and complements a conventional undergraduate physics education that relies on interpreting and applying established equations within specific contexts. We test our personalized learning tool on the equations from Feynman's lectures on physics. Our tool can correctly identify relationships between physics variables for most equations, underscoring its value as a complementary personalized learning tool for undergraduate physics students.

equation, language model, regression, (14 more...)

arXiv.org Artificial Intelligence

2407.00065

Country:

Asia > Singapore > Central Region > Singapore (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(10 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting > Higher Education (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models

Yang, Shuo, Yuan, Chenchen, Rong, Yao, Steinbauer, Felix, Kasneci, Gjergji

arXiv.org Artificial Intelligence Jun-17-2024

A multitude of industries depend on accurate and reasonable tabular data augmentation for their business processes. Contemporary methodologies in generating tabular data revolve around utilizing Generative Adversarial Networks (GAN) or fine-tuning Large Language Models (LLM). However, GAN-based approaches are documented to produce samples with common-sense errors attributed to the absence of external knowledge. On the other hand, LLM-based methods exhibit a limited capacity to capture the disparities between synthesized and actual data distribution due to the absence of feedback from a discriminator during training. Furthermore, the decoding of LLM-based generation introduces gradient breakpoints, impeding the backpropagation of loss from a discriminator, thereby complicating the integration of these two approaches. To solve this challenge, we propose using proximal policy optimization (PPO) to apply GANs, guiding LLMs to enhance the probability distribution of tabular features. This approach enables the utilization of LLMs as generators for GANs in synthesizing tabular data. Our experiments demonstrate that PPO leads to an approximately 4% improvement in the accuracy of models trained on synthetically generated data over state-of-the-art across three real-world datasets.

dataset, explanation, tabular data, (17 more...)

arXiv.org Artificial Intelligence

2406.11391

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.68)
Banking & Finance (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On the Feasibility of Fidelity$^-$ for Graph Pruning

Shin, Yong-Min, Shin, Won-Yong

arXiv.org Artificial Intelligence Jun-17-2024

As one of the popular quantitative metrics to assess the quality of explanations of graph neural networks (GNNs), fidelity measures the output difference after removing unimportant parts of the input graph. Fidelity has been widely used due to its straightforward interpretation that the underlying model should produce similar predictions when features deemed unimportant from the explanation are removed. This raises a natural question: "Does fidelity induce a global (soft) mask for graph pruning?" To answer this, we aim to explore the potential of the fidelity measure to be used for graph pruning, eventually enhancing the GNN models for better efficiency. To this end, we propose Fidelity$^-$-inspired Pruning (FiP), an effective framework to construct global edge masks from local explanations. Our empirical observations using 7 edge attribution methods demonstrate that, surprisingly, general eXplainable AI methods outperform methods tailored to GNNs in terms of graph pruning performance.
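The fidelity idea in this abstract can be illustrated with a minimal sketch. It is not the FiP framework itself: `toy_model`, `TOY_WEIGHTS`, the edge scores, and the `keep_ratio` parameter are hypothetical stand-ins, and the "model" is just a sum of fixed edge weights rather than a GNN. The sketch only shows the measurement described above, namely the change in output after the lowest-scored edges are removed.

```python
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]


def fidelity_minus(
    model: Callable[[List[Edge]], float],
    edges: List[Edge],
    scores: Dict[Edge, float],
    keep_ratio: float,
) -> float:
    """Output difference after removing the edges an explanation ranks lowest."""
    # Rank edges from most to least important according to the explanation.
    ranked = sorted(edges, key=lambda e: scores[e], reverse=True)
    n_keep = max(1, round(keep_ratio * len(ranked)))
    kept = ranked[:n_keep]
    # A small gap means the removed edges mattered little to the model.
    return model(edges) - model(kept)


# Hypothetical stand-in for a trained GNN: the output is a sum of edge weights.
TOY_WEIGHTS: Dict[Edge, float] = {
    (0, 1): 0.5,
    (1, 2): 0.25,
    (2, 3): 0.125,
    (0, 3): 0.125,
}


def toy_model(edges: List[Edge]) -> float:
    return sum(TOY_WEIGHTS[e] for e in edges)


edges = list(TOY_WEIGHTS)
# Hypothetical edge attributions produced by some explanation method.
scores = {(0, 1): 0.9, (1, 2): 0.7, (2, 3): 0.2, (0, 3): 0.1}
gap = fidelity_minus(toy_model, edges, scores, keep_ratio=0.5)
print(gap)
```

In this toy setting the two lowest-scored edges are dropped, so the gap equals their combined weight. A pruning mask built this way is only as good as the attribution scores that feed it.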

explanation, neural network, pruning, (13 more...)

arXiv.org Artificial Intelligence

2406.11504

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.05)
(13 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.49)

On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

Wang, Weiqi, Fang, Tianqing, Shi, Haochen, Xu, Baixuan, Ding, Wenxuan, Zhang, Liyu, Fan, Wei, Bai, Jiaxin, Li, Haoran, Liu, Xin, Song, Yangqiu

arXiv.org Artificial Intelligence Jun-16-2024

Entity- and event-level conceptualization, as fundamental elements of human cognition, plays a pivotal role in generalizable reasoning. This process involves abstracting specific instances into higher-level concepts and forming abstract knowledge that can be applied in unfamiliar or novel situations, which can enhance models' inferential capabilities and support the effective transfer of knowledge across various domains. Despite its significance, there is currently a lack of a systematic overview that comprehensively examines existing works in the definition, execution, and application of conceptualization to enhance reasoning tasks. In this paper, we address this gap by presenting the first comprehensive survey of 150+ papers, categorizing various definitions, resources, methods, and downstream applications related to conceptualization into a unified taxonomy, with a focus on the entity and event levels. Furthermore, we shed light on potential future directions in this field and hope to garner more attention from the community.

computational linguistic, conceptualization, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2406.10885

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.05)
(42 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(6 more...)

Order-theoretic models for decision-making: Learning, optimization, complexity and computation

Hack, Pedro

arXiv.org Artificial Intelligence Jun-15-2024

The study of intelligent systems explains behaviour in terms of economic rationality. This results in an optimization principle involving a function or utility, which states that the system will evolve until the configuration of maximum utility is achieved. Recently, this theory has incorporated constraints, i.e., the optimum is achieved when the utility is maximized while respecting some information-processing constraints. This is reminiscent of thermodynamic systems. As such, the study of intelligent systems has benefited from the tools of thermodynamics. The first aim of this thesis is to clarify the applicability of these results in the study of intelligent systems. We can think of the local transition steps in thermodynamic or intelligent systems as being driven by uncertainty. In fact, the transitions in both systems can be described in terms of majorization. Hence, real-valued uncertainty measures like Shannon entropy are simply a proxy for their more involved behaviour. More generally, real-valued functions are fundamental to the study of optimization and complexity in the order-theoretic approach to several topics, including economics, thermodynamics, and quantum mechanics. The second aim of this thesis is to improve on this classification. The basic similarity between thermodynamic and intelligent systems is based on an uncertainty notion expressed by a preorder. We can also think of the transitions in the steps of a computational process as a decision-making procedure. In fact, by adding some requirements on the considered order structures, we can build an abstract model of uncertainty reduction that allows us to incorporate computability, that is, to distinguish the objects that can be constructed by following a finite set of instructions from those that cannot. The third aim of this thesis is to clarify the requirements on the order structure that allow such a framework.
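The relationship between majorization and Shannon entropy mentioned in this abstract can be made concrete with a small sketch. It uses the standard definition of majorization for finite probability vectors (compare partial sums of the entries sorted in decreasing order); the function names and the tolerance `tol` are illustrative choices, not taken from the thesis.

```python
import math
from typing import Sequence


def majorizes(p: Sequence[float], q: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if p majorizes q (p is at least as concentrated as q)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    p_sorted = sorted(p, reverse=True)
    q_sorted = sorted(q, reverse=True)
    cum_p = cum_q = 0.0
    for a, b in zip(p_sorted, q_sorted):
        cum_p += a
        cum_q += b
        # Every partial sum of p must dominate the matching partial sum of q.
        if cum_p < cum_q - tol:
            return False
    # The totals must agree (both are 1 for probability distributions).
    return abs(cum_p - cum_q) <= tol


def shannon_entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(x * math.log2(x) for x in p if x > 0)


certain = [1.0, 0.0, 0.0]
skewed = [0.5, 0.25, 0.25]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(majorizes(certain, skewed), majorizes(skewed, uniform), majorizes(uniform, skewed))
print(shannon_entropy(certain), shannon_entropy(skewed), shannon_entropy(uniform))
```

Majorization is only a preorder, so some pairs of distributions are incomparable, while entropy always yields a single number. Entropy is Schur-concave, which means that if p majorizes q then the entropy of p is at most that of q, but the converse does not hold. That asymmetry is one way to read the abstract's remark that entropy is a proxy for the underlying order.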

information theory and statistical mechanics, multi-utility injective monotone strict monotone, recursive function and effective computability, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.18725/OPARU-52612

2406.1073

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Alameda County > Berkeley (0.13)
North America > United States > California > San Francisco County > San Francisco (0.13)
(14 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Education > Educational Setting (0.45)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(2 more...)

Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last

Hacohen, Guy, Tuytelaars, Tinne

arXiv.org Artificial Intelligence Jun-14-2024

Catastrophic forgetting poses a significant challenge in continual learning, where models often forget previous tasks when trained on new data. Our empirical analysis reveals a strong correlation between catastrophic forgetting and the learning speed of examples: examples learned early are rarely forgotten, while those learned later are more susceptible to forgetting. We demonstrate that replay-based continual learning methods can leverage this phenomenon by focusing on mid-learned examples for rehearsal. We introduce Goldilocks, a novel replay buffer sampling method that filters out examples learned too quickly or too slowly, keeping those learned at an intermediate speed. Goldilocks improves existing continual learning algorithms, leading to state-of-the-art performance across several image classification tasks.
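The sampling idea described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' Goldilocks implementation: learning speed is approximated here as the fraction of training epochs in which an example was classified correctly, and the cut-offs `drop_fast` and `drop_slow` are hypothetical parameters.

```python
import random
from typing import List, Sequence


def learning_speed(correct_per_epoch: Sequence[bool]) -> float:
    """Assumed proxy: fraction of epochs in which the example was correct."""
    return sum(correct_per_epoch) / len(correct_per_epoch)


def goldilocks_candidates(
    speeds: Sequence[float], drop_fast: float = 0.2, drop_slow: float = 0.2
) -> List[int]:
    """Indices of examples learned neither too quickly nor too slowly."""
    order = sorted(range(len(speeds)), key=lambda i: speeds[i])  # slowest first
    n = len(order)
    lo = int(drop_slow * n)       # skip the slowest-learned examples
    hi = n - int(drop_fast * n)   # skip the fastest-learned examples
    return order[lo:hi]


def sample_buffer(speeds: Sequence[float], buffer_size: int, seed: int = 0) -> List[int]:
    """Uniformly sample replay-buffer indices from the intermediate-speed pool."""
    rng = random.Random(seed)
    pool = goldilocks_candidates(speeds)
    return rng.sample(pool, min(buffer_size, len(pool)))


speeds = [0.1, 0.9, 0.5, 0.3, 0.7, 1.0, 0.0, 0.6, 0.4, 0.8]
middle = goldilocks_candidates(speeds)
buffer = sample_buffer(speeds, buffer_size=3)
print(middle, buffer)
```

With ten examples and 20% trimmed from each end, six mid-speed examples remain eligible for rehearsal. How much to trim from each side is a tuning decision this sketch does not attempt to settle.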

buffer, learning, replay buffer, (16 more...)

arXiv.org Artificial Intelligence

2406.09935

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(6 more...)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

MirrorCheck: Efficient Adversarial Defense for Vision-Language Models

Fares, Samar, Ziu, Klea, Aremu, Toluwani, Durasov, Nikita, Takáč, Martin, Fua, Pascal, Nandakumar, Karthik, Laptev, Ivan

arXiv.org Artificial Intelligence Jun-13-2024

Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in safeguarding VLMs against adversarial threats. To mitigate this vulnerability, we propose a novel, yet elegantly simple approach for detecting adversarial samples in VLMs. Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs. Subsequently, we calculate the similarities of the embeddings of both input and generated images in the feature space to identify adversarial samples. Empirical evaluations conducted on different datasets validate the efficacy of our approach, outperforming baseline methods adapted from image classification domains. Furthermore, we extend our methodology to classification tasks, showcasing its adaptability and model-agnostic nature. Theoretical analyses and empirical findings also show the resilience of our approach against adaptive attacks, positioning it as an excellent defense mechanism for real-world deployment against adversarial threats.
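The detection step described above, comparing embeddings of the input image and of the image regenerated from the VLM's caption, can be sketched as below. The captioning, text-to-image, and image-encoder stages are left out entirely; the embeddings are plain lists of floats. Cosine similarity and the `threshold` value are assumptions made for illustration, since the abstract does not specify the similarity measure or the decision rule.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_adversarial(
    input_embedding: Sequence[float],
    generated_embedding: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, to be tuned on clean data
) -> bool:
    """Flag the input when the regenerated image drifts too far from it."""
    return cosine_similarity(input_embedding, generated_embedding) < threshold


# Toy embeddings: a clean input regenerates to something similar,
# while an attacked input yields a caption (and image) about something else.
clean_flag = is_adversarial([3.0, 4.0], [3.0, 4.0])
attacked_flag = is_adversarial([1.0, 0.0], [0.0, 1.0])
print(clean_flag, attacked_flag)
```

The intuition is that an adversarial perturbation changes the caption, so the image generated from that caption no longer resembles the input. In practice a threshold like this would be calibrated on held-out clean samples.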

adv-query 0, adv-transfer 0, mirrorcheck, (14 more...)

arXiv.org Artificial Intelligence

2406.0925

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.87)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Unveiling Incomplete Modality Brain Tumor Segmentation: Leveraging Masked Predicted Auto-Encoder and Divergence Learning

Sun, Zhongao, Li, Jiameng, Wang, Yuhan, Cheng, Jiarong, Zhou, Qing, Li, Chun

arXiv.org Artificial Intelligence Jun-12-2024

Brain tumor segmentation remains a significant challenge, particularly in the context of multi-modal magnetic resonance imaging (MRI) where missing modality images are common in clinical settings, leading to reduced segmentation accuracy. To address this issue, we propose a novel strategy, which is called masked predicted pre-training, enabling robust feature learning from incomplete modality data. Additionally, in the fine-tuning phase, we utilize a knowledge distillation technique to align features between complete and missing modality data, simultaneously enhancing model robustness. Notably, we leverage the Holder pseudo-divergence instead of the KLD for distillation loss, offering improved mathematical interpretability and properties. Extensive experiments on the BRATS2018 and BRATS2020 datasets demonstrate significant performance enhancements compared to existing state-of-the-art methods.

divergence, modality, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2406.08634

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
(17 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.87)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
