 rsr


An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks

Dehghankar, Mohsen, Erfanian, Mahdi, Asudeh, Abolfazl

arXiv.org Artificial Intelligence

Despite their tremendous success and versatility, Large Language Models (LLMs) suffer from inference inefficiency while relying on advanced computational infrastructure. To address these challenges and make LLMs more accessible and cost-effective, in this paper, we propose algorithms to improve the inference time and memory efficiency of 1.58-bit LLMs with ternary weight matrices. Particularly focusing on matrix multiplication as the bottleneck operation of inference, we observe that, once trained, the weight matrices of a model no longer change. This allows us to preprocess these matrices and create indices that help reduce the storage requirements by a logarithmic factor while enabling our efficient inference algorithms. Specifically, for an $n$ by $n$ weight matrix, our efficient algorithm guarantees a time complexity of $O(\frac{n^2}{\log n})$, a logarithmic factor improvement over the standard $O(n^2)$ vector-matrix multiplication. Besides theoretical analysis, we conduct extensive experiments to evaluate the practical efficiency of our algorithms. Our results confirm the superiority of the approach both with respect to time and memory, as we observed a reduction in inference time of up to 29x and in memory usage of up to 6x.
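The abstract above does not spell out the index it builds, so the following is only a minimal sketch of one way fixed ternary weights can be preprocessed to speed up vector-matrix products; it is not claimed to be the authors' algorithm. The function names `preprocess` and `multiply` and the block width `k` are assumptions made for illustration. The idea: within a block of `k` columns, many rows share the same ternary pattern, so the entries of the input vector can be summed per pattern first and then multiplied by a small table of all patterns.

```python
import numpy as np

def preprocess(W, k):
    """Index a fixed ternary matrix W (entries in {-1, 0, 1}) by column blocks of width k.

    Hypothetical helper: done once, after training, since W no longer changes.
    """
    n_cols = W.shape[1]
    index = []
    for start in range(0, n_cols, k):
        block = W[:, start:start + k]
        width = block.shape[1]
        powers = 3 ** np.arange(width)
        # Encode each row of the block as a base-3 code (digit = entry + 1).
        codes = (block + 1) @ powers
        # Table of every possible ternary pattern of this width, one per row.
        all_codes = np.arange(3 ** width)
        patterns = (all_codes[:, None] // powers) % 3 - 1
        index.append((start, width, codes, patterns))
    return index

def multiply(v, index, n_cols):
    """Compute v @ W using only the index built by preprocess."""
    out = np.zeros(n_cols)
    for start, width, codes, patterns in index:
        # Sum the entries of v over all rows that share the same pattern.
        sums = np.bincount(codes, weights=v, minlength=3 ** width)
        # Multiply the short vector of sums by the small pattern table.
        out[start:start + width] = sums @ patterns
    return out
```

Each block costs one pass over `v` plus a multiplication by a table with `3**k` rows, so under the assumption that `k` is chosen on the order of `log n` the table stays small relative to `n`. A quick check against the direct product is `np.allclose(multiply(v, preprocess(W, 3), W.shape[1]), v @ W)`.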


Learning Randomized Reductions and Program Properties

Erata, Ferhat, Paradise, Orr, Antonopoulos, Timos, Nguyen, ThanhVu, Goldwasser, Shafi, Piskac, Ruzica

arXiv.org Artificial Intelligence

The correctness of computations remains a significant challenge in computer science, with traditional approaches relying on automated testing or formal verification. Self-testing/correcting programs introduce an alternative paradigm, allowing a program to verify and correct its own outputs via randomized reductions, a concept that previously required manual derivation. In this paper, we present Bitween, a method and tool for automated learning of randomized (self)-reductions and program properties in numerical programs. Bitween combines symbolic analysis and machine learning, with a surprising finding: polynomial-time linear regression, a basic optimization method, is not only sufficient but also highly effective for deriving complex randomized self-reductions and program invariants, often outperforming sophisticated mixed-integer linear programming solvers. We establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite for evaluating Bitween's capabilities on scientific and machine learning functions. Our empirical results show that Bitween surpasses state-of-the-art tools in scalability, stability, and sample efficiency when evaluated on nonlinear invariant benchmarks like NLA-DigBench. Bitween is open-source as a Python package and accessible via a web interface that supports C language programs.
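To make the notion of a randomized self-reduction concrete, here is a minimal sketch of the classic textbook case for a linear function, where f(x) = f(x + r) - f(r) holds for any r. It does not use Bitween and does not reflect its API; the modulus `P`, the deliberately faulty `buggy_f`, and the helper `self_correct` are all hypothetical names chosen for illustration.

```python
import random
from collections import Counter

P = 101  # hypothetical prime modulus for the example

def buggy_f(x):
    """Intended to compute 7*x mod P, but deliberately wrong on some inputs."""
    if x % 10 == 3:
        return 0
    return (7 * x) % P

def self_correct(program, x, trials=51, rng=None):
    """Recover f(x) from a mostly-correct program via f(x) = f(x + r) - f(r).

    Each trial queries the program at two random-looking points and the
    majority answer is returned, so isolated faults are outvoted.
    """
    rng = rng or random.Random()
    votes = Counter()
    for _ in range(trials):
        r = rng.randrange(P)
        votes[(program((x + r) % P) - program(r)) % P] += 1
    return votes.most_common(1)[0][0]
```

Calling `buggy_f(3)` returns the faulty value 0, whereas `self_correct(buggy_f, 3)` returns the intended 21 with high probability, because each random query lands on a faulty input only about one time in ten.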


Raising the Stakes: Performance Pressure Improves AI-Assisted Decision Making

Haduong, Nikita, Smith, Noah A.

arXiv.org Artificial Intelligence

The potential is not necessarily realized, however, because of several challenges: debates on ethical responsibility of decisions [8, 26, 44], the human ability to recognize when AI advice should be taken [43], mental models (biases) regarding AI performance and ability [12, 27] to perform well on subjective tasks, and effects of how the AI advice is delivered [46]. Many research directions thus aim to resolve these barriers to complementarity in human-AI performance, including examining the effects of having AI systems explain predictions [4] using explainable AI (XAI) methods, introducing cognitive forcing functions when presenting AI advice [6], adjusting AI advice interactions/presentation methods [40], and adjusting task framing to account for mental models about the types of tasks AI can work with [9]. In AI-assisted decision making, the human makes the final decision, bearing full responsibility for its consequences. Performance pressure from responsibility can influence decision making behavior [2]. The bulk of research working towards complementary human-AI performance isolates human behavior away from the effects of performance pressure because the field is rapidly evolving its understanding of how humans perceive and work with AI tools. Intrinsically high and low stakes tasks are used in these experiments, but the stakes have little tangible effect or implication for evaluators. Hence, we observe a gap in the literature of how people rely on AI assistants under performance pressure, or when stakes matter personally. In this work, we seek to understand how performance pressure affects AI advice usage when AI advice is provided as a second opinion. We induce performance pressure through a pay-by-performance scheme framed as a loss.


Can Large Language Models Automatically Jailbreak GPT-4V?

Wu, Yuanwei, Huang, Yue, Liu, Yixin, Li, Xiang, Zhou, Pan, Sun, Lichao

arXiv.org Artificial Intelligence

GPT-4V has attracted considerable attention due to its extraordinary capacity for integrating and processing multimodal information. At the same time, its face recognition capability raises new safety concerns about privacy leakage. Despite researchers' efforts in safety alignment through RLHF or preprocessing filters, vulnerabilities might still be exploited. In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization. We leverage Large Language Models (LLMs) for red-teaming to refine the jailbreak prompt and employ weak-to-strong in-context learning prompts to boost efficiency. Furthermore, we present an effective search method that incorporates early stopping to minimize optimization time and token expenditure. Our experiments demonstrate that AutoJailbreak significantly surpasses conventional methods, achieving an Attack Success Rate (ASR) exceeding 95.3%. This research sheds light on strengthening GPT-4V security, underscoring the potential for LLMs to be exploited in compromising GPT-4V integrity.


Design and Experimental Evaluation of a Haptic Robot-Assisted System for Femur Fracture Surgery

Alruwaili, Fayez H., Clancy, Michael P., Saeedi-Hosseiny, Marzieh S., Logar, Jacob A., Papachristou, Charalampos, Haydel, Christopher, Parvizi, Javad, Iordachita, Iulian I., Abedin-Nasab, Mohammad H.

arXiv.org Artificial Intelligence

In the face of challenges encountered during femur fracture surgery, such as the high rates of malalignment and X-ray exposure to operating personnel, robot-assisted surgery has emerged as an alternative to conventional state-of-the-art surgical methods. This paper introduces the development of Robossis, a haptic system for robot-assisted femur fracture surgery. Robossis comprises a 7-DOF haptic controller and a 6-DOF surgical robot. A unilateral control architecture is developed to address the kinematic mismatch and the motion transfer between the haptic controller and the Robossis surgical robot. A real-time motion control pipeline is designed to address the motion transfer and evaluated through experimental testing. The analysis illustrates that the Robossis surgical robot can adhere to the desired trajectory from the haptic controller with an average translational error of 0.32 mm and a rotational error of 0.07 deg. Additionally, a haptic rendering pipeline is developed to resolve the kinematic mismatch by constraining the haptic controller (user hand) movement within the permissible joint limits of the Robossis surgical robot. Lastly, in a cadaveric lab test, the Robossis system assisted surgeons during a mock femur fracture surgery. The result shows that Robossis can provide an intuitive solution for surgeons to perform femur fracture surgery.


Haptic-Enhanced Virtual Reality Simulator for Robot-Assisted Femur Fracture Surgery

Alruwaili, Fayez H., Halim-Banoub, David W., Rodgers, Jessica, Dalkilic, Adam, Haydel, Christopher, Parvizi, Javad, Iordachita, Iulian I., Abedin-Nasab, Mohammad H.

arXiv.org Artificial Intelligence

In this paper, we develop a virtual reality (VR) simulator for the Robossis robot-assisted femur fracture surgery. Due to the steep learning curve for such procedures, a VR simulator is essential for training surgeon(s) and staff. The Robossis Surgical Simulator (RSS) is designed to immerse user(s) in a realistic surgery setting using the Robossis system as completed in a previous real-world cadaveric procedure. The RSS is designed to interface the Sigma-7 Haptic Controller with the Robossis Surgical Robot (RSR) and the Meta Quest VR headset. Results show that the RSR follows user commands in 6 DOF and prevents the overlapping of bone segments. This development demonstrates a promising avenue for future implementation of the Robossis system.


Dubins Curve Based Continuous-Curvature Trajectory Planning for Autonomous Mobile Robots

Huang, Xuanhao, Yan, Chao-Bo

arXiv.org Artificial Intelligence

Autonomous mobile robots (AMRs) are widely used in factories to replace manual labor, reduce costs, and improve efficiency. However, it is often difficult for logistics robots to plan the optimal trajectory, and unreasonable trajectory planning can lead to low transport efficiency and high energy consumption. In this paper, we propose a method to directly calculate the optimal trajectory for short distances on the basis of the Dubins set, which completes the calculation of the Dubins path. Additionally, as an improvement on the Dubins path, we smooth it with clothoids, which makes the curvature vary linearly. An AMR can adjust its steering wheels while following this trajectory. The experiments show that the Dubins path can be calculated quickly and smoothed well.
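The property that makes clothoids useful for smoothing is that curvature changes linearly with arc length, so the steering angle never has to jump. The sketch below only illustrates that property for a single segment starting at the origin with heading zero; it is not the paper's smoothing procedure, and the function name `clothoid_segment` and its parameters are assumptions for illustration.

```python
import numpy as np

def clothoid_segment(kappa0, kappa1, length, n=201):
    """Sample one clothoid segment whose curvature goes linearly from kappa0 to kappa1.

    Hypothetical helper: starts at the origin with heading 0.
    """
    s = np.linspace(0.0, length, n)           # arc length samples
    rate = (kappa1 - kappa0) / length         # constant curvature rate
    kappa = kappa0 + rate * s                 # linear curvature profile
    theta = kappa0 * s + 0.5 * rate * s**2    # heading = integral of curvature
    # Trapezoidal integration of the unit tangent gives the position.
    ds = np.diff(s)
    dx = 0.5 * (np.cos(theta[:-1]) + np.cos(theta[1:])) * ds
    dy = 0.5 * (np.sin(theta[:-1]) + np.sin(theta[1:])) * ds
    x = np.concatenate(([0.0], np.cumsum(dx)))
    y = np.concatenate(([0.0], np.cumsum(dy)))
    return s, kappa, theta, x, y
```

For example, `clothoid_segment(0.0, 1.0, 2.0)` describes a transition from a straight line into a turn of curvature 1; joining a straight segment to a circular arc directly would instead produce a curvature step at the junction.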


Can representation learning for multimodal image registration be improved by supervision of intermediate layers?

Wetzer, Elisabeth, Lindblad, Joakim, Sladoje, Nataša

arXiv.org Artificial Intelligence

Multimodal imaging and correlative analysis typically require image alignment. Contrastive learning can generate representations of multimodal images, reducing the challenging task of multimodal image registration to a monomodal one. Previously, additional supervision on intermediate layers in contrastive learning has improved biomedical image classification. We evaluate if a similar approach improves representations learned for registration to boost registration performance. We explore three approaches to add contrastive supervision to the latent features of the bottleneck layer in the U-Nets encoding the multimodal images and evaluate three different critic functions. Our results show that representations learned without additional supervision on latent features perform best in the downstream task of registration on two public biomedical datasets. We investigate the performance drop by exploiting recent insights in contrastive learning in classification and self-supervised learning. We visualize the spatial relations of the learned representations by means of multidimensional scaling, and show that additional supervision on the bottleneck layer can lead to partial dimensional collapse of the intermediate embedding space.


AMD RSR vs. FSR: What's the difference, and which should you use?

PCWorld

AMD recently released its Radeon Super Resolution (RSR) feature, which promises to speed up the performance of your games. AMD already offered a similar technology dubbed "FidelityFX Super Resolution," or FSR, and it's been adopted at a blistering rate since its introduction last summer. Don't let the names get lost in translation, however. We'll explain exactly what each one does, along with when you should use them. And trust us--having both of these available means better PC gaming in general. AMD, Intel, and Nvidia all have their own versions of upsampling technology.


Path Symmetries in Undirected Uniform-Cost Grids

Harabor, Daniel Damir (NICTA and The Australian National University) | Botea, Adi (NICTA and The Australian National University) | Kilby, Philip (NICTA and The Australian National University)

AAAI Conferences

We explore a symmetry-based reformulation technique which can speed up optimal pathfinding on undirected uniform-cost grid maps by over 30 times. Our offline approach decomposes grid maps into a set of empty rectangles, removing from each all interior nodes and possibly some from along the perimeter. We then add macro-edges between selected pairs of remaining perimeter nodes to facilitate provably optimal traversal through each rectangle. To further speed up search, we also develop a novel online pruning technique. Our algorithm is fast, memory efficient and retains both optimality and completeness during search.