AITopics | Bao, Forrest Sheng

Bao, Forrest Sheng

FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs

Bao, Forrest Sheng, Li, Miaoran, Qu, Renyi, Luo, Ge, Wan, Erana, Tang, Yujia, Fan, Weisi, Tamber, Manveer Singh, Kazi, Suleman, Sourabh, Vivek, Qi, Mike, Tu, Ruixuan, Xu, Chenyu, Gonzales, Matthew, Mendelevitch, Ofer, Ahmad, Amin

arXiv.org Artificial Intelligence | Oct-17-2024

Summarization is one of the most common tasks performed by large language models (LLMs), especially in applications like Retrieval-Augmented Generation (RAG). However, existing evaluations of hallucinations in LLM-generated summaries and evaluations of hallucination detection models both suffer from a lack of diversity and recency in the LLMs and LLM families considered. This paper introduces FaithBench, a summarization hallucination benchmark comprising challenging hallucinations made by 10 modern LLMs from 8 different families, with ground truth annotations by human experts. "Challenging" here means summaries on which popular, state-of-the-art hallucination detection models, including GPT-4o-as-a-judge, disagreed. Our results show that GPT-4o and GPT-3.5-Turbo produce the fewest hallucinations. However, even the best hallucination detection models achieve accuracies near 50% on FaithBench, indicating ample room for future improvement. The repo is https://github.com/vectara/FaithBench

large language model, machine learning, natural language, (20 more...)

2410.13210

Country:

Asia (0.69)
North America > United States > California (0.68)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

Bao, Forrest Sheng, Tu, Ruixuan, Luo, Ge, Yang, Yinfei, Li, Hebi, Qiu, Minghui, He, Youbiao, Chen, Cen

arXiv.org Artificial Intelligence | Nov-26-2023

Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system summary against its corresponding reference can be effectively adapted to assess it against its source document, thereby transforming these metrics into reference-free ones. Experimental results support this hypothesis. After being repurposed reference-freely, the zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B parameters consistently outperforms its original reference-based version across various aspects on the SummEval and Newsroom datasets. It also excels in comparison to most existing reference-free metrics and closely competes with zero-shot summary evaluators based on GPT-3.5.
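
The repurposing idea can be illustrated with a minimal sketch, assuming a BERTScore-style greedy matching over token vectors. The tiny `EMB` table and the helper names are hypothetical stand-ins; a real setup would use contextual embeddings from a pretrained model such as the DeBERTa-large-MNLI model named in the abstract. The only change from the reference-based setting is that the source document is passed where the human-written reference would normally go.

```python
import math

# Hypothetical toy embeddings; real BERTScore uses contextual model outputs.
EMB = {
    "the": (0.1, 0.1, 1.0),
    "cat": (1.0, 0.0, 0.1),
    "sat": (0.0, 1.0, 0.1),
    "on": (0.2, 0.2, 0.9),
    "mat": (0.7, 0.3, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_score(candidate, target):
    """Return BERTScore-style (precision, recall, F1) for two token lists."""
    cand = [EMB[t] for t in candidate]
    targ = [EMB[t] for t in target]
    # Precision: each candidate token is matched to its most similar target token.
    precision = sum(max(cosine(c, t) for t in targ) for c in cand) / len(cand)
    # Recall: each target token is matched to its most similar candidate token.
    recall = sum(max(cosine(t, c) for c in cand) for t in targ) / len(targ)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


document = "the cat sat on the mat".split()
summary = "the cat sat".split()

# Reference-free use: the source document takes the place of the reference.
precision, recall, f1 = greedy_match_score(summary, document)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case every summary token also occurs in the document, so precision is close to 1, while recall is lower because the summary omits part of the document.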

bertscore, large language model, machine learning, (23 more...)

2212.10013

Country:

Asia (0.68)
Europe (0.67)
North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre:

Research Report (0.65)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Circuit Routing Using Monte Carlo Tree Search and Deep Neural Networks

He, Youbiao, Bao, Forrest Sheng

arXiv.org Artificial Intelligence | Jun-24-2020

Circuit routing is a fundamental problem in designing electronic systems such as integrated circuits (ICs) and printed circuit boards (PCBs), which form the hardware of electronics and computers. Like finding paths between pairs of locations, circuit routing generates traces of wires to connect contacts or leads of circuit components. It is challenging because finding paths between dense and massive electronic components involves a very large search space. Existing solutions are either manually designed with domain knowledge or tailored to specific design rules, and are hence difficult to adapt to new problems or design needs. Therefore, a general routing approach is highly desired. In this paper, we model circuit routing as a sequential decision-making problem and solve it by Monte Carlo tree search (MCTS) with deep neural network (DNN) guided rollout. It could be easily extended to routing cases with more routing constraints and optimization goals. Experiments on randomly generated single-layer circuits show the potential to route complex circuits. The proposed approach can solve problems that benchmark methods such as the sequential A* method and Lee's algorithm cannot solve, and can also outperform the vanilla MCTS approach.
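
For context, here is a minimal sketch of Lee's algorithm, one of the benchmark methods the abstract names, applied to nets one after another on a single-layer grid. This is the baseline, not the MCTS approach of the paper. The grid encoding (0 = free, 1 = blocked), the function names, and the example nets are assumptions made for illustration.

```python
from collections import deque


def lee_route(grid, src, dst):
    """Breadth-first wave expansion from src; returns a shortest path or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {src: None}
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            # Backtrace from the target to the source.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def route_sequentially(rows, cols, nets):
    """Route nets in the given order; each routed trace blocks later nets."""
    grid = [[0] * cols for _ in range(rows)]
    # Reserve every pin so that no trace runs over another net's pin.
    for src, dst in nets:
        for r, c in (src, dst):
            grid[r][c] = 1
    routes = []
    for src, dst in nets:
        for r, c in (src, dst):
            grid[r][c] = 0  # free this net's own pins while routing it
        path = lee_route(grid, src, dst)
        for r, c in (path or []) + [src, dst]:
            grid[r][c] = 1  # the trace and the pins become obstacles
        routes.append(path)
    return routes


# Two parallel nets on a 4x4 grid: both can be routed.
easy = route_sequentially(4, 4, [((0, 0), (0, 3)), ((3, 0), (3, 3))])
# Two crossing nets on a 3x3 grid: the second net finds no free path.
hard = route_sequentially(3, 3, [((1, 0), (1, 2)), ((0, 1), (2, 1))])
print(easy)
print(hard)
```

Because each net is routed greedily and then frozen, an early trace can wall off a later net. Limitations of this kind are what motivate treating routing as a sequential decision-making problem.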

deep learning, neural network, routing, (19 more...)

2006.13607

Country: North America > United States (0.94)

Genre: Research Report (1.00)

Industry:

Semiconductors & Electronics (1.00)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Triaging moderate COVID-19 and other viral pneumonias from routine blood tests

Bao, Forrest Sheng, He, Youbiao, Liu, Jie, Chen, Yuanfang, Li, Qian, Zhang, Christina R., Han, Lei, Zhu, Baoli, Ge, Yaorong, Chen, Shi, Xu, Ming, Ouyang, Liu

arXiv.org Machine Learning | May-13-2020

COVID-19 is sweeping the world with deadly consequences. Its contagious nature and clinical similarity to other pneumonias make separating subjects who have contracted COVID-19 from those with non-COVID-19 viral pneumonia both a priority and a challenge. However, COVID-19 testing has been greatly limited by the availability and cost of existing methods, even in developed countries like the US. Intrigued by the wide availability of routine blood tests, we propose to leverage them for COVID-19 testing using the power of machine learning. Two proven-robust machine learning model families, random forests (RFs) and support vector machines (SVMs), are employed to tackle the challenge. Trained on blood data from 208 moderate COVID-19 subjects and 86 subjects with non-COVID-19 moderate viral pneumonia, the best result is obtained with an SVM-based classifier with an accuracy of 84%, a sensitivity of 88%, a specificity of 80%, and a precision of 92%. The results are found explainable from both machine learning and medical perspectives. A privacy-protected web portal is set up to help medical personnel in their practice, and the trained models are released for developers to further build other applications. We hope our results can help the world fight this pandemic and welcome clinical verification of our approach on larger populations.
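
The four metrics reported above are all derived from a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in are made-up illustrative numbers, not data from the study, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        # Share of all subjects classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Share of true positives among all actual positives (recall).
        "sensitivity": tp / (tp + fn),
        # Share of true negatives among all actual negatives.
        "specificity": tn / (tn + fp),
        # Share of true positives among all predicted positives.
        "precision": tp / (tp + fp),
    }


# Illustrative counts only (assumed, not taken from the paper).
metrics = binary_metrics(tp=90, fn=10, tn=40, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, as in a cohort with more positive than negative subjects, accuracy alone can be misleading, which is why sensitivity and specificity are reported alongside it.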

covid-19, health & medicine, immunology, (20 more...)

2005.06546

Country:

North America > United States (1.00)
Asia (0.70)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

RLScheduler: Learn to Schedule HPC Batch Jobs Using Deep Reinforcement Learning

Zhang, Di, Dai, Dong, He, Youbiao, Bao, Forrest Sheng

arXiv.org Artificial Intelligence | Oct-20-2019

We present RLScheduler, a deep reinforcement learning based job scheduler for scheduling independent batch jobs in high-performance computing (HPC) environments. Starting from knowing nothing about scheduling, RLScheduler is able to autonomously learn how to effectively schedule HPC batch jobs, targeting a given optimization goal. This is achieved by deep reinforcement learning with the help of specially designed neural network structures and various optimizations to stabilize and accelerate the learning. Our results show that RLScheduler can outperform existing heuristic scheduling algorithms, including a manually fine-tuned machine learning-based scheduler on the same workload. More importantly, we show that RLScheduler does not blindly over-fit the given workload to achieve such optimization; instead, it learns general rules for scheduling batch jobs, which can be further applied to different workloads and systems to achieve similarly optimized performance. We also demonstrate that RLScheduler is capable of adjusting itself along with changing goals and workloads, making it an attractive solution for future autonomous HPC management.
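
To make "heuristic scheduling algorithms" concrete, here is a minimal sketch that compares two classic priority rules, first-come-first-served and shortest-job-first, by average waiting time. It is a strong simplification, assuming a single resource that runs one job at a time and known runtimes; it is not RLScheduler, and the job list and function name are hypothetical.

```python
def average_wait(jobs, priority):
    """Simulate one resource; jobs are (submit_time, runtime) tuples.

    Whenever the resource is free, the ready job with the smallest
    priority value runs to completion.
    """
    remaining = list(jobs)
    clock = 0
    total_wait = 0
    while remaining:
        ready = [job for job in remaining if job[0] <= clock]
        if not ready:
            # Idle until the next job is submitted.
            clock = min(job[0] for job in remaining)
            continue
        job = min(ready, key=priority)
        total_wait += clock - job[0]
        clock += job[1]
        remaining.remove(job)
    return total_wait / len(jobs)


# Hypothetical workload: (submit_time, runtime).
jobs = [(0, 10), (1, 2), (2, 1)]

fcfs = average_wait(jobs, priority=lambda job: job[0])  # earliest submit first
sjf = average_wait(jobs, priority=lambda job: job[1])   # shortest runtime first
print(f"FCFS average wait: {fcfs:.2f}")
print(f"SJF average wait:  {sjf:.2f}")
```

Each heuristic is a fixed priority function. A learned scheduler can be viewed as replacing that hand-written function with a policy trained toward the chosen optimization goal.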

deep learning, neural network, rlscheduler, (18 more...)

1910.08925

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Accelerating SAT Solving by Common Subclause Elimination

Yan, Yaowei (University of Akron) | Gutierrez, Chris E. (Texas Tech University) | Jn-Charles, Jeriah (Texas Tech University) | Bao, Forrest Sheng (University of Akron) | Zhang, Yuanlin (Texas Tech University)

AAAI Conferences | Mar-6-2015

Boolean SATisfiability (SAT) is an important problem in AI. SAT solvers have been effectively used in important industrial applications including automated planning and verification. In this paper, we present novel algorithms for fast SAT solving by employing two common subclause elimination (CSE) approaches. Our motivation is that modern SAT solving techniques can be more efficient on CSE-processed instances. Empirical study shows that CSE can significantly speed up SAT solving.
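
The abstract does not spell out the two CSE approaches, so the following is only a generic sketch of the underlying idea and not the authors' algorithms: a subclause shared by several clauses is replaced by a fresh variable, and one defining clause is added. Clauses are sets of nonzero integers in the DIMACS style (negative means negated); the function names and the example formula are assumptions.

```python
from itertools import product


def eliminate_subclause(clauses, sub, fresh):
    """Replace the shared subclause `sub` with the fresh variable `fresh`.

    Every clause (sub OR rest) becomes (fresh OR rest), and the defining
    clause (NOT fresh OR sub) is appended, which keeps the formula
    equisatisfiable with the original.
    """
    sub = frozenset(sub)
    result = []
    for clause in clauses:
        clause = frozenset(clause)
        if sub <= clause:
            result.append((clause - sub) | {fresh})
        else:
            result.append(clause)
    result.append(frozenset({-fresh}) | sub)
    return result


def satisfiable(clauses):
    """Brute-force satisfiability check, suitable only for tiny formulas."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


# (x1 v x2 v x3) and (x1 v x2 v x4) share the subclause (x1 v x2).
sat_formula = [{1, 2, 3}, {1, 2, 4}]
unsat_formula = sat_formula + [{-3, -4}, {-1}, {-2}]

sat_after = eliminate_subclause(sat_formula, {1, 2}, fresh=5)
unsat_after = eliminate_subclause(unsat_formula, {1, 2}, fresh=5)

print(satisfiable(sat_formula), satisfiable(sat_after))
print(satisfiable(unsat_formula), satisfiable(unsat_after))
```

The brute-force check is there only to confirm on small inputs that the rewrite preserves satisfiability. Whether such preprocessing speeds up a real solver is an empirical question, which is what the paper studies.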

artificial intelligence, constraint-based reasoning, subclause, (15 more...)

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas (0.15)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Temporally Expressive Planning Based on Answer Set Programming with Constraints

Bao, Forrest Sheng (Texas Tech University) | Zhang, Yuanlin (Texas Tech University)

AAAI Conferences | Jul-21-2012

Recently, a new language, AC(C), was proposed to integrate answer set programming (ASP) and constraint logic programming (CLP). In this paper, we show that temporally expressive planning problems in PDDL2.1 can be translated into AC(C) and solved using AC(C) solvers. Compared with existing approaches, the new approach puts fewer restrictions on the planning problems and is easy to extend with new features like PDDL axioms. It can also leverage the inference engine for AC(C), which has the potential to exploit the best reasoning mechanisms developed in the ASP, SAT and CP communities.

Combining Probabilistic Planning and Logic Programming on Mobile Robots

Zhang, Shiqi (Texas Tech University) | Bao, Forrest Sheng (Texas Tech University) | Sridharan, Mohan (Texas Tech University)

AAAI Conferences | Jul-21-2012

Key challenges to widespread deployment of mobile robots to interact with humans in real-world domains include the ability to: (a) robustly represent and revise domain knowledge; (b) autonomously adapt sensing and processing to the task at hand; and (c) learn from unreliable high-level human feedback. Partially observable Markov decision processes (POMDPs) have been used to plan sensing and navigation in different application domains. It is, however, a challenge to include common sense knowledge obtained from sensory or human inputs in POMDPs. In addition, information extracted from sensory and human inputs may have varying levels of relevance to current and future tasks. On the other hand, although a non-monotonic logic programming paradigm such as Answer Set Programming (ASP) is well-suited for common sense reasoning, it is unable to model the uncertainty in real-world sensing and navigation (Gelfond 2008). This paper presents a hybrid framework that integrates ASP, hierarchical POMDPs (Zhang and Sridharan 2012) and psychophysics principles to address the challenges stated above. Experimental results in simulation and on mobile robots deployed in indoor domains show that the framework results in reliable and efficient operation.

Medical Treatment Conflict Resolving in Answer Set Programming

Bao, Forrest Sheng (Texas Tech University) | Zhang, Zhizheng (Southeast University) | Zhang, Yuanlin (Texas Tech University)

AAAI Conferences | Aug-4-2011

Medical treatment decision making is a good application of knowledge representation and reasoning. We are particularly interested in using it to resolve treatment conflicts, a complicated condition in which two treatments cannot be given simultaneously to a patient with multiple symptoms. The logic system is required to reason on cases with and without treatment conflicts. Thanks to the nonmonotonicity of Answer Set Programming (ASP), we elegantly automate medical treatment conflict resolving on an example problem and show the importance of nonmonotonicity in medical reasoning.

logic programming, treatment conflict, vascular disease, (18 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas (0.18)
Asia > China (0.15)

Industry:

Health & Medicine > Diagnostic Medicine (0.35)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

The AC(C) Language: Integrating Answer Set Programming and Constraint Logic Programming

Bao, Forrest Sheng (Texas Tech University)

AAAI Conferences | Aug-4-2011

Combining Answer Set Programming (ASP) and Constraint Logic Programming (CLP) can create a more powerful language for knowledge representation and reasoning. The language AC(C) is designed to integrate ASP and CLP. Compared with existing integrations of ASP and CSP, AC(C) allows representing user-defined constraints. Such integration provides great power for applications requiring logical reasoning involving constraints, e.g., temporal planning. In AC(C), user-defined and primitive constraints can be solved by a CLP inference engine, while the logical reasoning over those constraints and regular logic literals is solved by an ASP inference engine (i.e., solver). My PhD work includes improving the language AC(C), implementing a faster inference engine for it, and investigating how effectively the new system can be used to solve a challenging application, temporal planning.

artificial intelligence, logic programming, solver, (13 more...)

Sixteenth AAAI/SIGART Doctoral Consortium

Country: North America > United States > Texas (0.16)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
