

Vision GNN: An Image is Worth Graph of Nodes

Han, Kai, Wang, Yunhe

Neural Information Processing Systems

Given an FFN module, the diversity $\gamma(\mathrm{FFN}(X))$ of its output features satisfies $\gamma(\mathrm{FFN}(X)) \le \lambda\,\gamma(X)$ (2), where $\lambda$ is the Lipschitz constant of FFN with respect to the $p$-norm for $p \in [1, \infty]$.
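The bound can be checked numerically. The snippet below is a minimal sketch under stated assumptions, not the paper's code: the excerpt does not define the diversity measure, so `diversity` is assumed here to be the Frobenius distance from X to the nearest matrix with identical rows, the FFN is assumed to be a two-layer ReLU network, and the product of the Frobenius norms of the weights is used as an upper bound on the Lipschitz constant in the 2-norm. All function names are hypothetical.

```python
import math
import random


def diversity(X):
    """Assumed diversity: Frobenius distance from X to the nearest matrix
    whose rows are all identical (that matrix repeats the column means)."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    return math.sqrt(sum((row[j] - mean[j]) ** 2 for row in X for j in range(d)))


def frobenius(W):
    """Frobenius norm, which upper-bounds the spectral norm of W."""
    return math.sqrt(sum(w * w for row in W for w in row))


def linear(x, W, b):
    """Compute x @ W + b for a single row vector x."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]


def ffn(X, W1, b1, W2, b2):
    """Two-layer ReLU network applied to each row (node) independently."""
    out = []
    for x in X:
        hidden = [max(0.0, h) for h in linear(x, W1, b1)]
        out.append(linear(hidden, W2, b2))
    return out


def check_bound(seed=0, n=8, d=4, hidden=6):
    """Return (diversity(FFN(X)), lam * diversity(X)) for random data."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    X = rand_matrix(n, d)
    W1, W2 = rand_matrix(d, hidden), rand_matrix(hidden, d)
    b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    b2 = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # ReLU is 1-Lipschitz, so the product of layer norms bounds lambda.
    lam = frobenius(W1) * frobenius(W2)
    return diversity(ffn(X, W1, b1, W2, b2)), lam * diversity(X)


lhs, rhs = check_bound()
print(lhs <= rhs)
```

Because the Frobenius norm overestimates the spectral norm, the right-hand side here is looser than the inequality requires; a tighter check would use the spectral norms of the weight matrices.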




OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation

Wang, Zilong, Cui, Yuedong, Zhong, Li, Zhang, Zimin, Yin, Da, Lin, Bill Yuchen, Shang, Jingbo

arXiv.org Artificial Intelligence

Office automation significantly enhances human productivity by automatically finishing routine tasks in the workflow. Beyond the basic information extraction studied in much of the prior document AI literature, office automation research should be extended to more realistic office tasks that require integrating various information sources in the office system and producing outputs through a series of decision-making processes. We introduce OfficeBench, one of the first office automation benchmarks for evaluating current LLM agents' capability to address office tasks in realistic office workflows. OfficeBench requires LLM agents to perform feasible long-horizon planning, proficiently switch between applications in a timely manner, and accurately ground their actions within a large combined action space, based on the contextual demands of the workflow. Applying our customized evaluation methods on each task, we find that GPT-4 Omni achieves the highest pass rate of 47.00%, demonstrating decent performance in handling office tasks. However, this is still far below the human performance and accuracy standards required by real-world office workflows. We further observe that most issues are related to operation redundancy and hallucinations, as well as limitations in switching between multiple applications, which may provide valuable insights for developing effective agent frameworks for office automation.


A Collision Cone Approach for Control Barrier Functions

Tayal, Manan, Goswami, Bhavya Giri, Rajgopal, Karthik, Singh, Rajpal, Rao, Tejas, Keshavan, Jishnu, Jagtap, Pushpak, Kolathaya, Shishir

arXiv.org Artificial Intelligence

This work presents a unified approach for collision avoidance using Collision-Cone Control Barrier Functions (CBFs) in both ground (UGV) and aerial (UAV) unmanned vehicles. We propose a novel CBF formulation inspired by collision cones, to ensure safety by constraining the relative velocity between the vehicle and the obstacle to always point away from each other. The efficacy of this approach is demonstrated through simulations and hardware implementations on the TurtleBot, Stoch-Jeep, and Crazyflie 2.1 quadrotor robot, showcasing its effectiveness in avoiding collisions with dynamic obstacles in both ground and aerial settings. The real-time controller is developed using CBF Quadratic Programs (CBF-QPs). Comparative analysis with the state-of-the-art CBFs highlights the less conservative nature of the proposed approach. Overall, this research contributes a novel control formulation that can guarantee collision avoidance in unmanned vehicles by modifying the control inputs from existing path-planning controllers.
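The abstract does not state the barrier function itself. As a rough illustration of the collision-cone idea only, the sketch below evaluates one candidate safety value, assumed here to be h = <p_rel, v_rel> + |v_rel| * sqrt(|p_rel|^2 - r^2), where r is a safety radius around the obstacle. The function name, argument names, and this exact form are assumptions for illustration; the snippet does not implement the CBF-QP controller.

```python
import math


def collision_cone_value(p_rel, v_rel, radius):
    """Evaluate an assumed collision-cone safety value h.

    p_rel:  obstacle position minus vehicle position.
    v_rel:  obstacle velocity minus vehicle velocity.
    radius: assumed safety radius around the obstacle.

    h >= 0 means the relative velocity lies outside the cone of
    directions that would lead to a collision; h < 0 means it lies inside.
    """
    dist_sq = sum(p * p for p in p_rel)
    if dist_sq <= radius ** 2:
        raise ValueError("vehicle is already inside the safety radius")
    dot = sum(p * v for p, v in zip(p_rel, v_rel))
    speed = math.sqrt(sum(v * v for v in v_rel))
    return dot + speed * math.sqrt(dist_sq - radius ** 2)


# Static obstacle 10 m ahead; vehicle drives straight at it at 1 m/s,
# so the relative velocity (obstacle minus vehicle) is (-1, 0).
head_on = collision_cone_value((10.0, 0.0), (-1.0, 0.0), 1.0)

# Same geometry, but the vehicle drives directly away from the obstacle.
retreat = collision_cone_value((10.0, 0.0), (1.0, 0.0), 1.0)

print(head_on < 0, retreat > 0)
```

In a CBF-QP setting, a value like this would enter as a constraint on the control input rather than being checked after the fact.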


Foundation Models and the Path Towards a Universal Algorithm – Towards AI

#artificialintelligence

Originally published on Towards AI.


ChatGPT Will Kill Search and Open a Path to Web3

#artificialintelligence

NFT and open metaverse enthusiasts have debated for some time about what would drive mass adoption of their projects and lead to their longed-for disintermediation of the dominant internet platforms. Would it be the deployment of digital collectibles in gaming? Would it come from household consumer brands and entertainment companies developing direct NFT-based engagement strategies to forge "ownership" relationships with their customers and fans? Would it lie in the new models of collective value creation and shared intellectual property spearheaded by projects such as Yuga Labs' Bored Ape Yacht Club?


A Path Towards Clinical Adaptation of Accelerated MRI

Yao, Michael S., Hansen, Michael S.

arXiv.org Artificial Intelligence

Accelerated MRI reconstructs images of clinical anatomies from sparsely sampled signal data to reduce patient scan times. While recent works have leveraged deep learning to accomplish this task, such approaches have often only been explored in simulated environments without signal corruption or resource limitations. In this work, we explore augmentations to neural network MRI image reconstructors to enhance their clinical relevance. Namely, we propose a ConvNet model for detecting sources of image artifacts that achieves a classifier $F_2$ score of 79.1%. We also demonstrate that training reconstructors on MR signal data with variable acceleration factors can improve their average performance during a clinical patient scan by up to 2%. We offer a loss function to overcome catastrophic forgetting when models learn to reconstruct MR images of multiple anatomies and orientations. Finally, we propose a method for using simulated phantom data to pre-train reconstructors in situations with limited clinically acquired datasets and compute capabilities. Our results provide a potential path forward for clinical adaptation of accelerated MRI.
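For reference, the $F_2$ score is the $\beta = 2$ case of the standard $F_\beta$ measure, which weights recall more heavily than precision. Writing $P$ for precision and $R$ for recall (symbols introduced here, not used in the abstract):

```latex
F_\beta = (1 + \beta^2)\,\frac{P \cdot R}{\beta^2 P + R},
\qquad
F_2 = \frac{5\,P\,R}{4P + R}.
```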


Is Artificial Intelligence Going Down the Path of Nuclear Weapons?

#artificialintelligence

This story is syndicated from the Substack newsletter Big Technology. In front of a packed house last week at Amsterdam's World Summit AI, I asked senior researchers at Meta, Google, IBM, and The University of Sussex to speak up if they did not want AI to mirror human intelligence. After a few silent moments, no hands went up. The response reflected the AI industry's ambition to build human-level cognition, even if it might lose control of it. AI is not sentient now--and won't be for some time, if ever--but a determined AI industry is already releasing programs that can chat, see, and draw like humans as it tries to get there.


Council Post: Translation, Localization And The Many Paths To AI Innovation

#artificialintelligence

Mohammad Omar is cofounder and CEO at LXT, an emerging leader in global AI training data that powers intelligent technology. I believe that artificial intelligence (AI) is one of our most important technological innovations but that we're still in the early stages of AI maturity, with much still to be achieved across industries. This pivotal technology will have an endless number of applications, and there will be many paths for innovators to shape its future. Technology that helps machines understand the way people communicate is one of the most promising new breeds of AI. As globalization continues, the translation and localization industry represents a key area for AI innovation, and several companies in the space have undergone a transformation into AI-powered businesses to inform new language-oriented applications.