AITopics | Chen, Cailian


Chen, Cailian


CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs

Wang, Siyu, Chen, Cailian, Le, Xinyi, Xu, Qimin, Xu, Lei, Zhang, Yanzhou, Yang, Jie

arXiv.org Artificial Intelligence Dec-27-2024

Computer-aided design (CAD) significantly enhances the efficiency, accuracy, and innovation of design processes by enabling precise 2D and 3D modeling, extensive analysis, and optimization. Existing methods for creating CAD models rely on latent vectors or point clouds, which are difficult to obtain and costly to store. Recent advances in Multimodal Large Language Models (MLLMs) have inspired researchers to use natural language instructions and images for CAD model construction. However, these models still struggle with inferring accurate 3D spatial location and orientation, leading to inaccuracies in determining the spatial 3D starting points and extrusion directions for constructing geometries. This work introduces CAD-GPT, a CAD synthesis method with spatial reasoning-enhanced MLLM that takes either a single image or a textual description as input. To achieve precise spatial inference, our approach introduces a 3D Modeling Spatial Mechanism. This method maps 3D spatial positions and 3D sketch plane rotation angles into a 1D linguistic feature space using a specialized spatial unfolding mechanism, while discretizing 2D sketch coordinates into an appropriate planar space to enable precise determination of spatial starting position, sketch orientation, and 2D sketch coordinate translations. Extensive experiments demonstrate that CAD-GPT consistently outperforms existing state-of-the-art methods in CAD model synthesis, both quantitatively and qualitatively.
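The idea of mapping a 3D position into a 1D token space can be illustrated with a minimal sketch. The grid resolution `K`, the coordinate range [-1, 1], and the function names below are assumptions made for illustration only; they are not taken from the paper, whose actual unfolding mechanism may differ.

```python
K = 8  # hypothetical number of bins per axis (assumption, not from the paper)


def quantize(v, lo=-1.0, hi=1.0, bins=K):
    """Map a float in [lo, hi] to an integer bin index in [0, bins - 1]."""
    v = min(max(v, lo), hi)  # clamp out-of-range inputs
    idx = int((v - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # the upper edge falls into the last bin


def unfold(x, y, z, bins=K):
    """Flatten a 3D position into a single 1D token index (row-major order)."""
    ix, iy, iz = (quantize(c, bins=bins) for c in (x, y, z))
    return (ix * bins + iy) * bins + iz


def fold(token, bins=K):
    """Recover the three bin indices from a 1D token index."""
    ix, rem = divmod(token, bins * bins)
    iy, iz = divmod(rem, bins)
    return ix, iy, iz


token = unfold(0.0, 0.5, -0.5)
print(token, fold(token))
```

Because every grid cell gets exactly one index, a language model can emit a position as a single vocabulary item rather than three free-form numbers. The cost is a quantization error bounded by the cell size.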

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.19663

Country: North America > United States (0.28)

Genre: Research Report (0.84)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)


GTA: A Benchmark for General Tool Agents

Wang, Jize, Ma, Zerun, Li, Yining, Zhang, Songyang, Chen, Cailian, Chen, Kai, Le, Xinyi

arXiv.org Artificial Intelligence Jul-11-2024

Significant focus has been placed on integrating large language models (LLMs) with various tools in developing general-purpose agents. This poses a challenge to LLMs' tool-use capabilities. However, there are evident gaps between existing tool-use evaluations and real-world scenarios. Current evaluations often use AI-generated queries, single-step tasks, dummy tools, and text-only interactions, failing to reveal the agents' real-world problem-solving abilities effectively. To address this, we propose GTA, a benchmark for General Tool Agents, featuring three main aspects: (i) Real user queries: human-written queries with simple real-world objectives but implicit tool-use, requiring the LLM to reason about which tools are suitable and plan the solution steps. (ii) Real deployed tools: an evaluation platform equipped with tools across perception, operation, logic, and creativity categories to evaluate the agents' actual task execution performance. (iii) Real multimodal inputs: authentic image files, such as spatial scenes, web page screenshots, tables, code snippets, and printed/handwritten materials, used as the query contexts to align closely with real-world scenarios. We design 229 real-world tasks and executable tool chains to evaluate mainstream LLMs. Our findings show that real-world user queries are challenging for existing LLMs, with GPT-4 completing less than 50% of the tasks and most LLMs achieving below 25%. This evaluation reveals the bottlenecks in the tool-use capabilities of current LLMs in real-world scenarios, which provides future direction for advancing general-purpose tool agents. The code and dataset are available at https://github.com/open-compass/GTA.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2407.08713

Country:

North America > United States (0.46)
Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Consumer Products & Services (1.00)
Health & Medicine (0.93)
Transportation > Air (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Preserving Topology of Network Systems: Metric, Analysis, and Optimal Design

Li, Yushan, Wang, Zitong, He, Jianping, Chen, Cailian, Guan, Xinping

arXiv.org Artificial Intelligence Jul-31-2023

Preserving the topology from being inferred by external adversaries has become a paramount security issue for network systems (NSs), and adding random noises to the nodal states is a promising way to do so. Nevertheless, recent works have revealed that the topology cannot be preserved under i.i.d. noises in the asymptotic sense. How to effectively characterize the non-asymptotic preservation performance remains an open issue. Inspired by the deviation quantification of concentration inequalities, this paper proposes a novel metric named trace-based variance-expectation ratio. This metric effectively captures the decaying rate of the topology inference error, where a slower rate indicates better non-asymptotic preservation performance. We prove that the inference error will always decay to zero asymptotically, as long as the added noises are non-increasing and independent (milder than the i.i.d. condition). Then, the optimal noise design that produces the slowest decaying rate for the error is obtained. More importantly, we amend the noise design by introducing one-lag time dependence, achieving the zero state deviation and the non-zero topology inference error in the asymptotic sense simultaneously. Extensions to a general class of noises with multi-lag time dependence are provided. Comprehensive simulations verify the theoretical findings.

artificial intelligence, machine learning, noise, (16 more...)

arXiv.org Artificial Intelligence

2307.16705

Country: Asia > China (0.94)

Genre:

Research Report (1.00)
Personal (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)


Inferring Topology of Networked Dynamical Systems by Active Excitations

Li, Yushan, He, Jianping, Chen, Cailian, Guan, Xinping

arXiv.org Artificial Intelligence Aug-23-2022

Topology inference for networked dynamical systems (NDSs) has received considerable attention in recent years. The majority of pioneering works have dealt with inferring the topology from abundant observations of NDSs, so as to approximate the real one asymptotically. Leveraging the characteristic that NDSs will react to various disturbances and the disturbance's influence will consistently spread, this paper focuses on inferring the topology by a few active excitations. The key challenge is to distinguish different influences of system noises and excitations from the exhibited state deviations, where the influences will decay with time and the excitation cannot be arbitrarily large. In practice, we propose a one-shot excitation based inference method to infer $h$-hop neighbors of a node. The excitation conditions for accurate one-hop neighbor inference are first derived with probability guarantees. Then, we extend the results to $h$-hop neighbor inference and multiple excitations cases, providing the explicit relationships between the inference accuracy and excitation magnitude. Specifically, the excitation based inference method is not only suitable for scenarios where abundant observations are unavailable, but can also be leveraged as an auxiliary means to improve the accuracy of existing methods. Simulations are conducted to verify the analytical results.
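A toy illustration of the one-shot excitation idea follows. It assumes linear dynamics x(t+1) = W x(t) + noise, a system resting at zero, bounded uniform noise, and a fixed detection threshold. The matrix `W`, the noise bound, the threshold, and all function names are hypothetical choices for this sketch; the paper's actual conditions and probability guarantees are not reproduced here.

```python
import random

# Hypothetical 4-node weight matrix: W[i][j] != 0 means node j influences node i.
W = [
    [0.5, 0.0, 0.0, 0.3],
    [0.3, 0.4, 0.0, 0.0],
    [0.0, 0.2, 0.5, 0.0],
    [0.2, 0.0, 0.3, 0.4],
]


def step(state, noise_bound, rng):
    """One step of the assumed dynamics x(t+1) = W x(t) + bounded noise."""
    n = len(state)
    return [
        sum(W[i][j] * state[j] for j in range(n))
        + rng.uniform(-noise_bound, noise_bound)
        for i in range(n)
    ]


def infer_one_hop(source, magnitude=1.0, noise_bound=0.05, threshold=0.1, seed=0):
    """Excite `source` once and report the nodes that react one step later."""
    rng = random.Random(seed)
    state = [0.0] * len(W)
    state[source] = magnitude  # the single active excitation
    nxt = step(state, noise_bound, rng)
    # A node counts as a one-hop neighbor if its deviation exceeds the threshold.
    return [i for i, v in enumerate(nxt) if i != source and abs(v) > threshold]


for node in range(len(W)):
    print(node, infer_one_hop(node))
```

In this toy setting the smallest nonzero weight times the excitation magnitude (0.2) exceeds the threshold by more than the noise bound, so detection is exact. Shrinking the excitation or enlarging the noise breaks that margin, which mirrors the trade-off between inference accuracy and excitation magnitude that the abstract describes.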

artificial intelligence, excitation, probability, (15 more...)

arXiv.org Artificial Intelligence

2208.11276

Genre: Research Report (0.40)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)


Efficient Metropolitan Traffic Prediction Based on Graph Recurrent Neural Network

Wang, Xiaoyu, Chen, Cailian, Min, Yang, He, Jianping, Yang, Bo, Zhang, Yang

arXiv.org Artificial Intelligence Nov-2-2018

Traffic prediction is a fundamental and vital task in Intelligent Transportation Systems (ITS), but it is very challenging to achieve high accuracy while maintaining low computational complexity due to the spatiotemporal characteristics of traffic flow, especially in metropolitan settings. In this work, a new topological framework, called Linkage Network, is proposed to model the road networks and present the propagation patterns of traffic flow. Based on the Linkage Network model, a novel online predictor, named Graph Recurrent Neural Network (GRNN), is designed to learn the propagation patterns in the graph. It can simultaneously predict traffic flow for all road segments based on the information gathered from the whole graph, which reduces the computational complexity significantly from O(nm) to O(n+m) while keeping high accuracy. Moreover, it can also predict the variations of traffic trends. Experiments based on real-world data demonstrate that the proposed method outperforms the existing prediction methods.

deep learning, neural network, prediction, (21 more...)

arXiv.org Artificial Intelligence

1811.0074

Country:

Asia > China (0.15)
North America (0.14)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
