Fang, Xi
Intelligent System for Automated Molecular Patent Infringement Assessment
Shi, Yaorui, Li, Sihang, Zhang, Taiyan, Fang, Xi, Wang, Jiankun, Liu, Zhiyuan, Zhao, Guojiang, Zhu, Zhengdan, Gao, Zhifeng, Zhong, Renxin, Zhang, Linfeng, Ke, Guolin, E, Weinan, Cai, Hengxing, Wang, Xiang
Automated drug discovery offers significant potential for accelerating the development of novel therapeutics by substituting labor-intensive human workflows with machine-driven processes. However, molecules generated by artificial intelligence may unintentionally infringe on existing patents, posing legal and financial risks that impede the full automation of drug discovery pipelines. This paper introduces PatentFinder, a novel multi-agent and tool-enhanced intelligence system that can accurately and comprehensively evaluate small molecules for patent infringement. PatentFinder features five specialized agents that collaboratively analyze patent claims and molecular structures with heuristic and model-based tools, generating interpretable infringement reports. To support systematic evaluation, we curate MolPatent-240, a benchmark dataset tailored for patent infringement assessment algorithms. On this benchmark, PatentFinder outperforms baseline methods that rely solely on large language models or specialized chemical tools, achieving a 13.8% improvement in F1-score and a 12% increase in accuracy. Additionally, PatentFinder autonomously generates detailed and interpretable patent infringement reports. This combination of high accuracy and interpretability makes PatentFinder a valuable and reliable tool for automating patent infringement assessments, offering a practical solution for integrating patent protection analysis into the drug discovery pipeline.
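For illustration only, the sketch below shows the kind of structure-matching heuristic an agent in such a system might call as one of its chemical tools, using RDKit substructure matching as a stand-in. It is not the PatentFinder pipeline itself, and the claim-scaffold SMARTS pattern is a hypothetical example.

```python
# Hypothetical sketch: one heuristic check of whether a candidate molecule
# falls under a patent claim expressed as a SMARTS scaffold pattern.
# This is not the PatentFinder multi-agent system; it only illustrates the
# kind of low-level tool an agent might invoke.
from rdkit import Chem

def matches_claim_scaffold(candidate_smiles: str, claim_scaffold_smarts: str) -> bool:
    """Return True if the candidate molecule contains the claimed scaffold."""
    mol = Chem.MolFromSmiles(candidate_smiles)
    scaffold = Chem.MolFromSmarts(claim_scaffold_smarts)
    if mol is None or scaffold is None:
        raise ValueError("Could not parse the SMILES/SMARTS input.")
    return mol.HasSubstructMatch(scaffold)

# Example: does aspirin contain an aryl acetate motif? (expected: True)
print(matches_claim_scaffold("CC(=O)Oc1ccccc1C(=O)O", "c1ccccc1OC(=O)C"))
```

In a full system, such a check would be one signal among many; claim scope, Markush variations, and legal interpretation still require the higher-level reasoning the paper attributes to its agents.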
Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey
Fang, Xi, Xu, Weijie, Tan, Fiona Anting, Zhang, Jiani, Hu, Ziqing, Qi, Yanjun, Nickleach, Scott, Socolinsky, Diego, Sengamedu, Srinivasan, Faloutsos, Christos
Recent breakthroughs in large language models (LLMs) have facilitated rigorous exploration of their application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently a lack of a comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain. This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough taxonomy of the datasets, metrics, and methodologies utilized. It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field. It also provides references to relevant code and datasets. Through this comprehensive review, we hope to provide interested readers with pertinent references and insightful perspectives, empowering them with the necessary tools and knowledge to effectively navigate and address the prevailing challenges in the field.
Pruner: An Efficient Cross-Platform Tensor Compiler with Dual Awareness
Qiao, Liang, Shi, Jun, Hao, Xiaoyu, Fang, Xi, Zhao, Minfan, Zhu, Ziqi, Chen, Junshi, An, Hong, Li, Bing, Yuan, Honghui, Wang, Xinyang
Tensor program optimization on Deep Learning Accelerators (DLAs) is critical for efficient model deployment. Although search-based Deep Learning Compilers (DLCs) have achieved significant performance gains compared to manual methods, they still suffer from the persistent challenges of low search efficiency and poor cross-platform adaptability. In this paper, we propose $\textbf{Pruner}$, following hardware/software co-design principles to hierarchically boost tensor program optimization. Pruner comprises two primary components: a Parameterized Static Analyzer ($\textbf{PSA}$) and a Pattern-aware Cost Model ($\textbf{PaCM}$). The former serves as a hardware-aware and formulaic performance analysis tool, guiding the pruning of the search space, while the latter enables the performance prediction of tensor programs according to critical data-flow patterns. Furthermore, to ensure effective cross-platform adaptation, we design a Momentum Transfer Learning ($\textbf{MTL}$) strategy using a Siamese network, which establishes a bidirectional feedback mechanism to improve the robustness of the pre-trained cost model. The extensive experimental results demonstrate the effectiveness and advantages of the proposed Pruner in various tensor program tuning tasks across both online and offline scenarios, with low resource overhead. The code is available at https://github.com/qiaolian9/Pruner.
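As a rough illustration of the two-stage idea described above, and not the released Pruner code, the sketch below assumes a cheap hardware-aware scoring function and a learned cost model (both hypothetical callables) and prunes candidate tensor programs before ranking the survivors.

```python
# Hypothetical sketch of a two-stage search: a formula-based static score
# first prunes the candidate set (PSA-like role), then a learned cost model
# ranks only the survivors (PaCM-like role). Candidate programs are plain
# dicts here purely for illustration.
from typing import Callable, Sequence

def two_stage_search(candidates: Sequence[dict],
                     static_score: Callable[[dict], float],
                     cost_model: Callable[[dict], float],
                     keep_ratio: float = 0.1) -> dict:
    # Stage 1: keep only the top fraction under the cheap static score.
    k = max(1, int(len(candidates) * keep_ratio))
    survivors = sorted(candidates, key=static_score, reverse=True)[:k]
    # Stage 2: pick the survivor with the highest predicted performance
    # (the cost model is assumed to return "higher is better" scores).
    return max(survivors, key=cost_model)
```

The point of the split is that the expensive learned model is only ever evaluated on the small pruned set, which is where the search-efficiency gain comes from.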
HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent
Xu, Weijie, Huang, Zicheng, Hu, Wenxiang, Fang, Xi, Cherukuri, Rajesh Kumar, Nayyar, Naumaan, Malandri, Lorenzo, Sengamedu, Srinivasan H.
Recent advancements in Large Language Models (LLMs) have been reshaping Natural Language Processing (NLP) tasks across several domains. Their use in the field of Human Resources (HR) still has room for expansion and could benefit several time-consuming tasks. Examples such as time-off submissions, medical claims filing, and access requests are noteworthy, but they are by no means the sole instances. However, such developments must grapple with the pivotal challenge of constructing a high-quality training dataset. On one hand, most conversation datasets address problems for customers rather than employees. On the other hand, gathering conversations with HR could raise privacy concerns. To address this, we introduce HR-MultiWOZ, a fully-labeled dataset of 550 conversations spanning 10 HR domains for evaluating LLM agents. Our work makes the following contributions: (1) It is the first labeled open-sourced conversation dataset in the HR domain for NLP research. (2) It provides a detailed recipe for the data generation procedure along with data analysis and human evaluations. The data generation pipeline is transferable and can be easily adapted for labeled conversation data generation in other domains. (3) The proposed data-collection pipeline is mostly based on LLMs with minimal human involvement for annotation, which is time- and cost-efficient.
Soft-tissue Driven Craniomaxillofacial Surgical Planning
Fang, Xi, Kim, Daeseung, Xu, Xuanang, Kuang, Tianshu, Lampen, Nathan, Lee, Jungwook, Deng, Hannah H., Gateno, Jaime, Liebschner, Michael A. K., Xia, James J., Yan, Pingkun
In craniomaxillofacial (CMF) surgery, the planning of bony movement to achieve a desired facial outcome is a challenging task. Current bone-driven approaches focus on normalizing the bone with the expectation that the facial appearance will be corrected accordingly. However, due to the complex non-linear relationship between bony structure and facial soft-tissue, such bone-driven methods are insufficient to correct facial deformities. Despite efforts to simulate facial changes resulting from bony movement, surgical planning still relies on iterative revisions and educated guesses. To address these issues, we propose a soft-tissue driven framework that can automatically create and verify surgical plans. Our framework consists of a bony planner network that estimates the bony movements required to achieve the desired facial outcome and a facial simulator network that can simulate the possible facial changes resulting from the estimated bony movement plans. By combining these two models, we can verify and determine the final bony movement required for planning. The proposed framework was evaluated using a clinical dataset, and our experimental results demonstrate that the soft-tissue driven approach greatly improves the accuracy and efficacy of surgical planning when compared to the conventional bone-driven approach.
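A minimal sketch of the planner/simulator verification loop described above, assuming two hypothetical PyTorch modules (a planner taking preoperative bone geometry and the desired face, and a simulator taking bone geometry and a movement plan); this is not the paper's released code, and the tensor interfaces are assumptions.

```python
# Hypothetical sketch: plan a bony movement from the desired facial outcome,
# then verify it by simulating the facial change that movement would produce.
import torch

def plan_and_verify(planner: torch.nn.Module,
                    simulator: torch.nn.Module,
                    preop_bone: torch.Tensor,
                    desired_face: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    # Planner network: estimate the bony movement needed for the desired face.
    bony_movement = planner(preop_bone, desired_face)
    # Simulator network: predict the facial change that movement would cause.
    simulated_face = simulator(preop_bone, bony_movement)
    # Verification: how far the simulated face is from the desired outcome.
    residual = torch.nn.functional.mse_loss(simulated_face, desired_face)
    return bony_movement, residual
```

In this scheme the residual acts as the automatic check that the abstract describes: a plan is accepted only if the simulated facial outcome is close enough to the target.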
Multi-class Active Learning: A Hybrid Informative and Representative Criterion Inspired Approach
Fang, Xi, Wang, Zengmao, Tang, Xinyao, Wu, Chen
Labeling each instance in a large dataset is extremely labor- and time-consuming. One way to alleviate this problem is active learning, which aims to discover the most valuable instances for labeling in order to construct a powerful classifier. Considering both informativeness and representativeness provides a promising way to design a practical active learning method. However, most existing active learning methods select instances favoring either informativeness or representativeness. Meanwhile, many are designed for binary classification, so they may yield suboptimal solutions on datasets with multiple classes. In this paper, a multi-class active learning approach based on a hybrid informative and representative criterion is proposed. We combine informativeness and representativeness into one formula, which can be solved under a unified framework. Informativeness is measured by the minimum margin, while representativeness is measured by the maximum mean discrepancy. By minimizing an upper bound on the true risk, we generalize the empirical risk minimization principle to the active learning setting. Simultaneously, our proposed method makes full use of the label information and is designed for multiple classes, so it is not limited to binary classification but also applies to multi-class problems. We conduct experiments on twelve benchmark UCI data sets, and the experimental results demonstrate that the proposed method performs better than several state-of-the-art methods.
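For intuition, the sketch below scores unlabeled instances with a weighted sum of margin-based informativeness and a kernel-based representativeness surrogate. The paper's exact formulation and unified optimization differ; the `predict_proba`-style class probabilities, the RBF kernel, and the weighting parameter are assumptions made for illustration.

```python
# Illustrative hybrid acquisition score (not the paper's algorithm):
# informativeness from the top-2 class-probability margin, representativeness
# from average kernel similarity to the unlabeled pool.
import numpy as np

def rbf_kernel(X: np.ndarray, Y: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    # Pairwise squared Euclidean distances, then Gaussian kernel values.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def hybrid_scores(probs: np.ndarray, X_unlabeled: np.ndarray,
                  X_pool: np.ndarray, alpha: float = 0.5,
                  gamma: float = 1.0) -> np.ndarray:
    # Informativeness: small margin between the two most likely classes,
    # which naturally extends to the multi-class setting.
    sorted_p = np.sort(probs, axis=1)
    margin = sorted_p[:, -1] - sorted_p[:, -2]
    informativeness = 1.0 - margin
    # Representativeness: mean kernel similarity of each candidate to the pool,
    # a crude surrogate for reducing maximum mean discrepancy.
    representativeness = rbf_kernel(X_unlabeled, X_pool, gamma).mean(axis=1)
    return alpha * informativeness + (1.0 - alpha) * representativeness
```

Instances with the highest scores would be queried for labels; the paper instead optimizes both criteria jointly under a single risk-bound objective rather than through a fixed weighted sum.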