AITopics | Chen, Yang

Collaborating Authors

Chen, Yang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

xASTNN: Improved Code Representations for Industrial Practice

Xu, Zhiwei, Zhou, Min, Zhao, Xibin, Chen, Yang, Cheng, Xi, Zhang, Hongyu

arXiv.org Artificial IntelligenceNov-5-2023

The application of deep learning techniques in software engineering becomes increasingly popular. One key problem is developing high-quality and easy-to-use source code representations for code-related tasks. The research community has acquired impressive results in recent years. However, due to the deployment difficulties and performance bottlenecks, seldom these approaches are applied to the industry. In this paper, we present xASTNN, an eXtreme Abstract Syntax Tree (AST)-based Neural Network for source code representation, aiming to push this technique to industrial practice. The proposed xASTNN has three advantages. First, xASTNN is completely based on widely-used ASTs and does not require complicated data pre-processing, making it applicable to various programming languages and practical scenarios. Second, three closely-related designs are proposed to guarantee the effectiveness of xASTNN, including statement subtree sequence for code naturalness, gated recursive unit for syntactical information, and gated recurrent unit for sequential information. Third, a dynamic batching algorithm is introduced to significantly reduce the time complexity of xASTNN. Two code comprehension downstream tasks, code classification and code clone detection, are adopted for evaluation. The results demonstrate that our xASTNN can improve the state-of-the-art while being faster than the baselines.

machine learning, natural language, subtree, (18 more...)

arXiv.org Artificial Intelligence

2303.07104

Country:

Asia > China (0.47)
North America > United States > California > San Francisco County > San Francisco (0.16)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review

Yuan, Mingze, Bao, Peng, Yuan, Jiajia, Shen, Yunhao, Chen, Zifan, Xie, Yi, Zhao, Jie, Chen, Yang, Zhang, Li, Shen, Lin, Dong, Bin

arXiv.org Artificial IntelligenceNov-3-2023

With the rapid development of artificial intelligence, large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support. However, medicine involves multifaceted data modalities and nuanced reasoning skills, presenting challenges for integrating LLMs. This paper provides a comprehensive review on the applications and implications of LLMs in medicine. It begins by examining the fundamental applications of general-purpose and specialized LLMs, demonstrating their utilities in knowledge retrieval, research support, clinical workflow automation, and diagnostic assistance. Recognizing the inherent multimodality of medicine, the review then focuses on multimodal LLMs, investigating their ability to process diverse data types like medical imaging and EHRs to augment diagnostic accuracy. To address LLMs' limitations regarding personalization and complex clinical reasoning, the paper explores the emerging development of LLM-powered autonomous agents for healthcare. Furthermore, it summarizes the evaluation methodologies for assessing LLMs' reliability and safety in medical contexts. Overall, this review offers an extensive analysis on the transformative potential of LLMs in modern medicine. It also highlights the pivotal need for continuous optimizations and ethical oversight before these models can be effectively integrated into clinical practice. Visit https://github.com/mingze-yuan/Awesome-LLM-Healthcare for an accompanying GitHub repository containing latest papers.

large language model, machine learning, preprint arxiv, (21 more...)

arXiv.org Artificial Intelligence

2311.01918

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Momentum Gradient-based Untargeted Attack on Hypergraph Neural Networks

Chen, Yang, Picek, Stjepan, Ye, Zhonglin, Wang, Zhaoyang, Zhao, Haixing

arXiv.org Artificial IntelligenceOct-24-2023

Hypergraph Neural Networks (HGNNs) have been successfully applied in various hypergraph-related tasks due to their excellent higher-order representation capabilities. Recent works have shown that deep learning models are vulnerable to adversarial attacks. Most studies on graph adversarial attacks have focused on Graph Neural Networks (GNNs), and the study of adversarial attacks on HGNNs remains largely unexplored. In this paper, we try to reduce this gap. We design a new HGNNs attack model for the untargeted attack, namely MGHGA, which focuses on modifying node features. We consider the process of HGNNs training and use a surrogate model to implement the attack before hypergraph modeling. Specifically, MGHGA consists of two parts: feature selection and feature modification. We use a momentum gradient mechanism to choose the attack node features in the feature selection module. In the feature modification module, we use two feature generation approaches (direct modification and sign gradient) to enable MGHGA to be employed on discrete and continuous datasets. We conduct extensive experiments on five benchmark datasets to validate the attack performance of MGHGA in the node and the visual object classification tasks. The results show that MGHGA improves performance by an average of 2% compared to the than the baselines.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2310.15656

Country: Asia > China > Qinghai Province (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

A Systematic Evaluation of Large Language Models on Out-of-Distribution Logical Reasoning Tasks

Bao, Qiming, Gendron, Gael, Peng, Alex Yuxuan, Zhong, Wanjun, Tan, Neset, Chen, Yang, Witbrock, Michael, Liu, Jiamou

arXiv.org Artificial IntelligenceOct-18-2023

Large language models (LLMs), such as GPT-3.5 and GPT-4, have greatly advanced the performance of artificial systems on various natural language processing tasks to human-like levels. However, their generalisation and robustness to perform logical reasoning remain under-evaluated. To probe this ability, we propose three new logical reasoning datasets named "ReClor-plus", "LogiQA-plus" and "LogiQAv2-plus", each featuring three subsets: the first with randomly shuffled options, the second with the correct choices replaced by "none of the other options are correct", and a combination of the previous two subsets. We carry out experiments on these datasets with both discriminative and generative LLMs and show that these simple tricks greatly hinder the performance of the language models. Despite their superior performance on the original publicly available datasets, we find that all models struggle to answer our newly constructed datasets. We show that introducing task variations by perturbing a sizable training set can markedly improve the model's generalisation and robustness in logical reasoning tasks. Moreover, applying logic-driven data augmentation for fine-tuning, combined with prompting can enhance the generalisation performance of both discriminative large language models and generative large language models. These results offer insights into assessing and improving the generalisation and robustness of large language models for logical reasoning tasks. We make our source code and data publicly available \url{https://github.com/Strong-AI-Lab/Logical-and-abstract-reasoning}.

large language model, machine learning, natural language, (4 more...)

arXiv.org Artificial Intelligence

2310.0943

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

Add feedback

Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?

Chen, Yang, Hu, Hexiang, Luan, Yi, Sun, Haitian, Changpinyo, Soravit, Ritter, Alan, Chang, Ming-Wei

arXiv.org Artificial IntelligenceOct-17-2023

Pre-trained vision and language models have demonstrated state-of-the-art capabilities over existing tasks involving images and texts, including visual question answering. However, it remains unclear whether these models possess the capability to answer questions that are not only querying visual content but knowledge-intensive and information-seeking. In this study, we introduce InfoSeek, a visual question answering dataset tailored for information-seeking questions that cannot be answered with only common sense knowledge. Using InfoSeek, we analyze various pre-trained visual question answering models and gain insights into their characteristics. Our findings reveal that state-of-the-art pre-trained multi-modal models (e.g., PaLI-X, BLIP2, etc.) face challenges in answering visual information-seeking questions, but fine-tuning on the InfoSeek dataset elicits models to use fine-grained knowledge that was learned during their pre-training. Furthermore, we show that accurate visual entity recognition can be used to improve performance on InfoSeek by retrieving relevant documents, showing a significant space for improvement.

large language model, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2302.11713

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.76)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Enhancing Logical Reasoning of Large Language Models through Logic-Driven Data Augmentation

Bao, Qiming, Peng, Alex Yuxuan, Deng, Zhenyun, Zhong, Wanjun, Gendron, Gael, Pistotti, Timothy, Tan, Neset, Young, Nathan, Chen, Yang, Zhu, Yonghua, Denny, Paul, Witbrock, Michael, Liu, Jiamou

arXiv.org Artificial IntelligenceOct-14-2023

Combining large language models with logical reasoning enhance their capacity to address problems in a robust and reliable manner. Nevertheless, the intricate nature of logical reasoning poses challenges to gathering reliable data from web for building comprehensive training datasets, subsequently affecting the performance on downstream tasks. To address this, we introduce a novel logic-driven data augmentation approach, AMR-LDA. AMR-LDA converts the original text into an Abstract Meaning Representation (AMR) graph, a structured semantic representation that encapsulates the logic structure of the sentence, upon which operations are performed to generate logically modified AMR graphs. The modified AMR graphs are subsequently converted back into texts to create augmented data. Notably, our methodology is architecture-agnostic and enhances generative large language models, such as GPT-3.5 and GPT-4, through prompt augmentation, and fine-tuning discriminative large language models through contrastive learning with logic-driven data augmentation. Empirical evidence underscores the efficacy of our proposed method with improvement in performance across seven downstream tasks, such as logical reasoning reading comprehension, textual entailment, and natural language inference. Furthermore, our method ranked first on the ReClor leaderboard \url{https://eval.ai/web/challenges/challenge-page/503/leaderboard/1347}. The source code and data are publicly available \url{https://github.com/Strong-AI-Lab/Logical-Equivalence-driven-AMR-Data-Augmentation-for-Representation-Learning}.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.12599

Country:

North America > United States > Texas (0.14)
North America > United States > Louisiana (0.14)
Europe > Spain > Canary Islands (0.14)

Genre: Research Report (1.00)

Industry: Education > Assessment & Standards (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma

Fang, Zijie, Liu, Yihan, Wang, Yifeng, Zhang, Xiangyang, Chen, Yang, Cai, Changjing, Lin, Yiyang, Han, Ying, Wang, Zhi, Zeng, Shan, Shen, Hong, Tan, Jun, Zhang, Yongbing

arXiv.org Artificial IntelligenceOct-11-2023

Biomarker detection is an indispensable part in the diagnosis and treatment of low-grade glioma (LGG). However, current LGG biomarker detection methods rely on expensive and complex molecular genetic testing, for which professionals are required to analyze the results, and intra-rater variability is often reported. To overcome these challenges, we propose an interpretable deep learning pipeline, a Multi-Biomarker Histomorphology Discoverer (Multi-Beholder) model based on the multiple instance learning (MIL) framework, to predict the status of five biomarkers in LGG using only hematoxylin and eosin-stained whole slide images and slide-level biomarker status labels. Specifically, by incorporating the one-class classification into the MIL framework, accurate instance pseudo-labeling is realized for instance-level supervision, which greatly complements the slide-level labels and improves the biomarker prediction performance. Multi-Beholder demonstrates superior prediction performance and generalizability for five LGG biomarkers (AUROC=0.6469-0.9735) in two cohorts (n=607) with diverse races and scanning protocols. Moreover, the excellent interpretability of Multi-Beholder allows for discovering the quantitative and qualitative correlations between biomarker status and histomorphology characteristics. Our pipeline not only provides a novel approach for biomarker prediction, enhancing the applicability of molecular treatments for LGG patients but also facilitates the discovery of new mechanisms in molecular functionality and LGG progression.

artificial intelligence, deep learning predict biomarker status, discover related histomorphology characteristic, (1 more...)

arXiv.org Artificial Intelligence

2310.07464

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Automated Bug Generation in the era of Large Language Models

Ibrahimzada, Ali Reza, Chen, Yang, Rong, Ryan, Jabbarvand, Reyhaneh

arXiv.org Artificial IntelligenceOct-3-2023

Bugs are essential in software engineering; many research studies in the past decades have been proposed to detect, localize, and repair bugs in software systems. Effectiveness evaluation of such techniques requires complex bugs, i.e., those that are hard to detect through testing and hard to repair through debugging. From the classic software engineering point of view, a hard-to-repair bug differs from the correct code in multiple locations, making it hard to localize and repair. Hard-to-detect bugs, on the other hand, manifest themselves under specific test inputs and reachability conditions. These two objectives, i.e., generating hard-to-detect and hard-to-repair bugs, are mostly aligned; a bug generation technique can change multiple statements to be covered only under a specific set of inputs. However, these two objectives are conflicting for learning-based techniques: A bug should have a similar code representation to the correct code in the training data to challenge a bug prediction model to distinguish them. The hard-to-repair bug definition remains the same but with a caveat: the more a bug differs from the original code (at multiple locations), the more distant their representations are and easier to be detected. We propose BugFarm, to transform arbitrary code into multiple complex bugs. BugFarm leverages LLMs to mutate code in multiple locations (hard-to-repair). To ensure that multiple modifications do not notably change the code representation, BugFarm analyzes the attention of the underlying model and instructs LLMs to only change the least attended locations (hard-to-detect). Our comprehensive evaluation of 320k+ bugs from over 2.5M mutants generated by BugFarm and two alternative approaches demonstrates our superiority in generating bugs that are hard to detect by learning-based bug prediction approaches and hard to repair by SOTA learning-based program repair technique.

automated bug generation, large language model, natural language, (1 more...)

arXiv.org Artificial Intelligence

2310.02407

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Add feedback

Can Language Models be Instructed to Protect Personal Information?

Chen, Yang, Mendes, Ethan, Das, Sauvik, Xu, Wei, Ritter, Alan

arXiv.org Artificial IntelligenceOct-3-2023

Large multimodal language models have proven transformative in numerous applications. However, these models have been shown to memorize and leak pre-training data, raising serious user privacy and information security concerns. While data leaks should be prevented, it is also crucial to examine the trade-off between the privacy protection and model utility of proposed approaches. In this paper, we introduce PrivQA -- a multimodal benchmark to assess this privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario. We also propose a technique to iteratively self-moderate responses, which significantly improves privacy. However, through a series of red-teaming experiments, we find that adversaries can also easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs. We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections. We release the entire PrivQA dataset at https://llm-access-control.github.io/.

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2310.02224

Country:

Europe (1.00)
North America > United States > Hawaii (0.14)
North America > United States > Louisiana (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)

Add feedback

Dronevision: An Experimental 3D Testbed for Flying Light Specks

Alimohammadzadeh, Hamed, Bernard, Rohit, Chen, Yang, Phan, Trung, Singh, Prashant, Zhu, Shuqin, Culbertson, Heather, Ghandeharizadeh, Shahram

arXiv.org Artificial IntelligenceAug-19-2023

Today's robotic laboratories for drones are housed in a large room. At times, they are the size of a warehouse. These spaces are typically equipped with permanent devices to localize the drones, e.g., Vicon Infrared cameras. Significant time is invested to fine-tune the localization apparatus to compute and control the position of the drones. One may use these laboratories to develop a 3D multimedia system with miniature sized drones configured with light sources. As an alternative, this brave new idea paper envisions shrinking these room-sized laboratories to the size of a cube or cuboid that sits on a desk and costs less than 10K dollars. The resulting Dronevision (DV) will be the size of a 1990s Television. In addition to light sources, its Flying Light Specks (FLSs) will be network-enabled drones with storage and processing capability to implement decentralized algorithms. The DV will include a localization technique to expedite development of 3D displays. It will act as a haptic interface for a user to interact with and manipulate the 3D virtual illuminations. It will empower an experimenter to design, implement, test, debug, and maintain software and hardware that realize novel algorithms in the comfort of their office without having to reserve a laboratory. In addition to enhancing productivity, it will improve safety of the experimenter by minimizing the likelihood of accidents. This paper introduces the concept of a DV, the research agenda one may pursue using this device, and our plans to realize one.

artificial intelligence, flss, human computer interaction, (14 more...)

arXiv.org Artificial Intelligence

2308.10121

Country:

Europe (0.93)
North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry:

Information Technology (0.68)
Transportation > Infrastructure & Services (0.46)
Transportation > Air (0.46)
Energy > Energy Storage (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Hardware (0.93)
(2 more...)

Add feedback