Du, Min
FACTS About Building Retrieval Augmented Generation-based Chatbots
Akkiraju, Rama, Xu, Anbang, Bora, Deepak, Yu, Tan, An, Lu, Seth, Vishal, Shukla, Aaditya, Gundecha, Pritam, Mehta, Hridhay, Jha, Ashwin, Raj, Prithvi, Balasubramanian, Abhinav, Maram, Murali, Muthusamy, Guru, Annepally, Shivakesh Reddy, Knowles, Sidney, Du, Min, Burnett, Nick, Javiya, Sean, Marannan, Ashok, Kumari, Mamta, Jha, Surbhi, Dereszenski, Ethan, Chakraborty, Anupam, Ranjan, Subhash, Terfai, Amina, Surya, Anoop, Mercer, Tracey, Thanigachalam, Vinodh Kumar, Bar, Tamar, Krishnan, Sanjana, Kilaru, Samy, Jaksic, Jasmine, Algarici, Nave, Liberman, Jacob, Conway, Joey, Nayyar, Sonu, Boitano, Justin
Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like Langchain and Llamaindex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering: fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots, distilled from our experience with three NVIDIA chatbots for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind to provide a holistic view of both the factors involved in and the solutions for building secure enterprise-grade chatbots.
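Since the control points are the core contribution, a toy end-to-end sketch may help fix ideas. The following Python is a minimal illustration under my own assumptions (toy lexical scoring in place of embeddings, a no-op reranker, invented names throughout), not NVIDIA's pipeline:

```python
# Minimal sketch of a few RAG control points named in the abstract: query
# rephrasal, retrieval gated by document access controls, reranking, and
# prompt construction with references. Every body is an illustrative stand-in.
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str
    acl: set  # groups allowed to read this document

def rephrase_query(query: str) -> str:
    # Control point: query rephrasal. In practice an LLM call; a no-op here.
    return query.strip().lower().replace("?", "")

def retrieve(query: str, corpus: list, user_groups: set, k: int = 3) -> list:
    # Control point: retrieval plus access control. Toy term overlap stands
    # in for vector-database similarity search.
    q_terms = set(query.split())
    visible = [d for d in corpus if d.acl & user_groups]
    scored = [(len(q_terms & set(d.text.lower().split())), d) for d in visible]
    return [d for s, d in sorted(scored, key=lambda p: -p[0])[:k] if s > 0]

def rerank(query: str, docs: list) -> list:
    # Control point: reranking, e.g. with a cross-encoder; identity here.
    return docs

def build_prompt(query: str, docs: list) -> str:
    # Control point: prompt design with inline references for citations.
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (f"Answer concisely using only the context; cite sources.\n"
            f"Context:\n{context}\nQuestion: {query}")

corpus = [Doc("hr-1", "Employees accrue vacation days monthly", {"employees"}),
          Doc("fin-1", "Q3 revenue grew year over year", {"finance"})]
q = rephrase_query("How do vacation days accrue?")
print(build_prompt(q, rerank(q, retrieve(q, corpus, user_groups={"employees"}))))
```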
Temporal-Spatial Entropy Balancing for Causal Continuous Treatment-Effect Estimation
Hu, Tao, Zhang, Honglong, Zeng, Fan, Du, Min, Du, XiangKun, Zheng, Yue, Li, Quanqi, Zhang, Mengran, Yang, Dan, Wu, Jihao
In intracity freight transportation, changes in order volume are strongly influenced by temporal and spatial factors, so when designing subsidy and pricing strategies it is crucial to predict the causal effects of those strategies on order volume. Confounding variables distort such causal-effect estimates, and traditional methods control for them from a holistic perspective, which cannot guarantee the precision of causal effects within specific temporal and spatial dimensions. Yet temporal and spatial dimensions are critical in logistics, and this limitation can directly degrade the precision of subsidy and pricing strategies. To address these issues, this study proposes a flexible temporal-spatial grid partitioning technique and, building on it, a continuous entropy balancing method in the temporal-spatial domain, named TS-EBCT (Temporal-Spatial Entropy Balancing for Causal Continuous Treatments). The method has been tested on two simulation datasets and two real datasets, achieving excellent performance on all four. Applying TS-EBCT to the intracity freight transportation field significantly improves the prediction accuracy of causal effects, yielding tangible business benefits for the company's subsidy and pricing strategies.
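The abstract does not spell out TS-EBCT's objective, but its two ingredients can be sketched generically: assign samples to flexible temporal-spatial grid cells, then entropy-balance each cell so the continuous treatment is decorrelated from confounders within it. The moment conditions and log-sum-exp dual below follow standard entropy balancing, under my own simplifications; all names and the toy data are mine, not the paper's:

```python
# Toy temporal-spatial grid partitioning + per-cell entropy balancing for a
# continuous treatment. Not the exact TS-EBCT objective.
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def grid_cells(hours, xs, ys, t_bins=4):
    """Assign each sample a (time, space) cell id.
    Space is split into 2x2 quadrants at the medians in this toy."""
    t_edges = np.linspace(hours.min(), hours.max(), t_bins + 1)[1:-1]
    t_id = np.digitize(hours, t_edges)
    x_id = (xs > np.median(xs)).astype(int)
    y_id = (ys > np.median(ys)).astype(int)
    return t_id * 4 + x_id * 2 + y_id

def entropy_balance(T, X):
    """Weights making centered T uncorrelated with centered X in one cell."""
    Tc = T - T.mean()
    Xc = X - X.mean(axis=0)
    C = np.column_stack([Tc[:, None] * Xc, Tc, Xc])    # moment functions
    res = minimize(lambda lam: logsumexp(C @ lam),      # entropy-balancing dual
                   np.zeros(C.shape[1]), method="BFGS")
    w = np.exp(C @ res.x - logsumexp(C @ res.x))        # normalized weights
    return w

rng = np.random.default_rng(0)
n = 400
hours, xs, ys = rng.uniform(0, 24, n), rng.uniform(size=n), rng.uniform(size=n)
X = rng.normal(size=(n, 2))
T = X @ np.array([0.5, -0.3]) + rng.normal(size=n)      # confounded treatment
cells = grid_cells(hours, xs, ys)
for c in np.unique(cells):
    m = cells == c
    w = entropy_balance(T[m], X[m])
    # the weighted covariance between T and X within this cell is now ~0
```

At the dual optimum the gradient, which equals the weighted moments, vanishes, so the reweighted treatment is uncorrelated with the confounders inside each cell rather than only in aggregate.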
Pseudo Label-Guided Data Fusion and Output Consistency for Semi-Supervised Medical Image Segmentation
Wang, Tao, Chen, Yuanbin, Zhang, Xinlin, Zhou, Yuanbo, Lan, Junlin, Bai, Bizhe, Tan, Tao, Du, Min, Gao, Qinquan, Tong, Tong
Supervised learning algorithms based on Convolutional Neural Networks have become the benchmark for medical image segmentation tasks, but their effectiveness relies heavily on large amounts of labeled data, and annotating medical image datasets is a laborious and time-consuming process. Inspired by semi-supervised algorithms that train on both labeled and unlabeled data, we propose the PLGDF framework, which builds upon the mean teacher network to segment medical images with fewer annotations. We propose a novel pseudo-label utilization scheme that combines labeled and unlabeled data to augment the dataset effectively. Additionally, we enforce consistency between different scales in the decoder module of the segmentation network and propose a loss function suited to evaluating that consistency. Moreover, we apply a sharpening operation to the predicted results, further enhancing segmentation accuracy. Extensive experiments on three publicly available datasets demonstrate that PLGDF substantially improves performance by incorporating unlabeled data, and that it outperforms six state-of-the-art semi-supervised learning methods. The code for this study is available at https://github.com/ortonwang/PLGDF.
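Three of the ingredients above (mean-teacher EMA updates, sharpening, and pseudo-label-guided fusion of labeled and unlabeled samples) can be sketched as follows. This is a hedged PyTorch illustration with invented names and thresholds, not the authors' code; see the linked repository for the real implementation:

```python
# Mean-teacher EMA update, prediction sharpening, and pseudo-label-guided
# fusion of a labeled and an unlabeled image. Shapes assume (B, C, H, W)
# probabilities and (B, H, W) integer labels.
import torch

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    # Teacher weights track an exponential moving average of the student.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1 - alpha)

def sharpen(probs, temperature=0.5):
    # Push soft teacher predictions toward one-hot along the class dim.
    p = probs ** (1.0 / temperature)
    return p / p.sum(dim=1, keepdim=True)

def fuse(labeled_img, label, unlabeled_img, teacher_probs, thresh=0.8):
    # Paste confident unlabeled regions into the labeled sample, so the
    # fused image carries a mix of ground-truth and pseudo labels.
    conf, pseudo = teacher_probs.max(dim=1)
    mask = (conf > thresh).unsqueeze(1).float()
    fused_img = mask * unlabeled_img + (1 - mask) * labeled_img
    fused_lbl = torch.where(mask.squeeze(1).bool(), pseudo, label)
    return fused_img, fused_lbl
```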
Contrastive Credibility Propagation for Reliable Semi-Supervised Learning
Kutt, Brody, Ramteke, Pralay, Mignot, Xavier, Toman, Pamela, Ramanan, Nandini, Chhetri, Sujit Rokka, Huang, Shan, Du, Min, Hewlett, William
A fundamental goal of semi-supervised learning (SSL) is to ensure that the use of unlabeled data results in a classifier that outperforms a baseline trained only on labeled data (a supervised baseline). However, this is often not the case (Oliver et al. 2018). The problem is often overlooked, as SSL algorithms are frequently evaluated only on clean and balanced datasets where the sole experimental variable is the number of given labels. Worse, in the pursuit of maximizing label efficiency, many modern SSL algorithms (Berthelot et al. 2019; Sohn et al. 2020; Zheng et al. 2022; Li, Xiong, and Hoi 2021), among others, rely on a mechanism that directly encourages the marginal distribution of label predictions to be close to the marginal distribution of ground-truth labels (known as distribution alignment). Consequently, such systems necessitate external components like Out-of-Distribution (OOD) detectors to prevent failures, albeit at the cost of increased complexity. Instead of maximizing robustness to any one data variable, we strive to build an SSL algorithm that is robust to all data variables, i.e., one that can match or outperform a supervised baseline. To address this challenge, we first hypothesize that sensitivity to pseudo-label errors is the root cause of all failures. This rationale rests on the simple fact that a hypothetical SSL algorithm consisting of a pseudo-labeler with a rejection option, plus a means to build a classifier, could always match or outperform its supervised baseline if the pseudo-labeler made no mistakes. Such a pseudo-labeler is unrealistic, of course. Instead, we build into our solution the means to work around those inevitable errors.
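The hypothetical pseudo-labeler with a rejection option translates directly into code. The confidence-threshold rule below is a generic stand-in used only to make the thought experiment concrete; it is not the Contrastive Credibility Propagation algorithm itself, whose details the excerpt does not give:

```python
# A pseudo-labeler with a rejection option: keep only unlabeled samples
# whose predicted class probability clears a threshold. If every accepted
# pseudo-label were correct, training on labeled + accepted data could
# only match or beat the supervised baseline.
import numpy as np

def pseudo_label_with_rejection(probs, threshold=0.95):
    """probs: (n, k) predicted class probabilities for unlabeled samples.
    Returns (indices of accepted samples, their pseudo labels)."""
    confidence = probs.max(axis=1)
    accepted = np.flatnonzero(confidence >= threshold)  # reject the rest
    return accepted, probs[accepted].argmax(axis=1)

probs = np.array([[0.97, 0.03], [0.60, 0.40], [0.02, 0.98]])
idx, labels = pseudo_label_with_rejection(probs)
print(idx, labels)   # -> [0 2] [0 1]; the uncertain middle sample is rejected
```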
TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems
Guo, Wenbo, Wang, Lun, Xing, Xinyu, Du, Min, Song, Dawn
A trojan backdoor is a hidden pattern typically implanted in a deep neural network. It is activated, forcing the infected model to behave abnormally, only when an input sample containing a particular trigger is fed to the model. As such, given a deep neural network and clean input samples, it is very challenging to inspect the model and determine whether a trojan backdoor exists. Recently, researchers have designed and developed several pioneering solutions to this acute problem and demonstrated that the proposed techniques have great potential for trojan detection. However, we show that none of these existing techniques completely addresses the problem. On the one hand, they mostly work under unrealistic assumptions (e.g., assuming availability of the contaminated training database). On the other hand, they can neither accurately detect the existence of trojan backdoors nor restore high-fidelity trojan backdoor images, especially when the triggers vary in size, shape, and position. In this work, we propose TABOR, a new trojan detection technique. Conceptually, TABOR formalizes trojan detection as a non-convex optimization problem, in which a trojan backdoor is identified by resolving the optimization through an objective function. Unlike the existing technique that also models trojan detection as an optimization problem, TABOR designs a new objective function, guided by explainable AI techniques as well as heuristics, that steers the optimization toward a trojan backdoor more effectively. In addition, TABOR defines a new metric to measure the quality of an identified trojan backdoor. Using an anomaly detection method, we show that the new metric helps TABOR identify intentionally injected triggers in an infected model and filter out false alarms…
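The optimization view can be sketched as trigger reverse-engineering: optimize a mask and pattern that flip clean inputs to a target label while penalizing implausible triggers. The simple size penalty below stands in for TABOR's explanation-guided objective terms, which this sketch does not reproduce; all hyperparameters are my assumptions:

```python
# Recover a candidate trigger (mask m, pattern p) that flips clean inputs
# to a target label. clean_x has shape (B, C, H, W), values in [0, 1].
import torch
import torch.nn.functional as F

def reverse_trigger(model, clean_x, target, steps=200, lam=0.01):
    mask = torch.zeros(1, 1, *clean_x.shape[2:], requires_grad=True)
    patt = torch.zeros(1, *clean_x.shape[1:], requires_grad=True)
    opt = torch.optim.Adam([mask, patt], lr=0.1)
    tgt = torch.full((clean_x.shape[0],), target, dtype=torch.long)
    for _ in range(steps):
        m = torch.sigmoid(mask)                  # keep the mask in [0, 1]
        stamped = (1 - m) * clean_x + m * torch.tanh(patt)
        # Misclassification loss plus a size penalty on the mask; TABOR's
        # actual objective adds further explanation-guided regularizers.
        loss = F.cross_entropy(model(stamped), tgt) + lam * m.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), torch.tanh(patt).detach()
```

Run once per candidate target label; a label whose recovered trigger is anomalously small yet effective is what the quality metric plus anomaly detection would flag as an injected backdoor.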
Curriculum Adversarial Training
Cai, Qi-Zhi, Du, Min, Liu, Chang, Song, Dawn
Recently, deep learning has been applied to many security-sensitive applications, such as facial authentication, and the existence of adversarial examples hinders such applications. The state-of-the-art defense result shows that adversarial training can produce a model robust to adversarial examples on MNIST, but it fails to achieve high empirical worst-case accuracy on more complex tasks, such as CIFAR-10 and SVHN. In this work, we propose curriculum adversarial training (CAT) to resolve this issue. The basic idea is to train on a curriculum of adversarial examples generated by attacks with a wide range of strengths. With two techniques to mitigate the forgetting and generalization issues, we demonstrate that CAT improves the prior art's empirical worst-case accuracy by a large margin: 25% on CIFAR-10 and 35% on SVHN. At the same time, the model's performance on non-adversarial inputs is comparable to that of state-of-the-art models.
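The curriculum idea reduces to a single knob: the strength of the attack used to generate training-time adversarial examples grows as training progresses. Below is a hedged PyTorch sketch using the number of PGD steps as that knob; the schedule and hyperparameters are placeholders, and the paper's two mitigation techniques for forgetting and generalization are omitted for brevity:

```python
# Curriculum adversarial training: train on adversarial examples whose
# attack strength (PGD steps k) increases over epochs.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step=2/255, k=1):
    # k-step PGD; k is the curriculum's attack-strength knob.
    x_adv = x.clone().detach()
    for _ in range(k):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + step * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project into the eps-ball
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv

def cat_epoch(model, loader, optimizer, attack_strength):
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y, k=attack_strength)
        loss = F.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# A placeholder curriculum: raise the attack strength every few epochs, e.g.
#   for epoch in range(30):
#       cat_epoch(model, loader, opt, attack_strength=1 + epoch // 5)
```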