AITopics | information quality

information quality

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Requirements for Aligned, Dynamic Resolution of Conflicts in Operational Constraints

Jones, Steven J., Wray, Robert E., Laird, John E.

arXiv.org Artificial Intelligence | Nov-19-2025

Deployed, autonomous AI systems must often evaluate multiple plausible courses of action (extended sequences of behavior) in novel or under-specified contexts. Despite extensive training, these systems will inevitably encounter scenarios where no available course of action fully satisfies all operational constraints (e.g., operating procedures, rules, laws, norms, and goals). To achieve goals in accordance with human expectations and values, agents must go beyond their trained policies and instead construct, evaluate, and justify candidate courses of action. These processes require contextual "knowledge" that may lie outside prior (policy) training. This paper characterizes requirements for agent decision making in these contexts. It also identifies the types of knowledge agents require to make decisions robust to agent goals and aligned with human expectations. Drawing on both analysis and empirical case studies, we examine how agents need to integrate normative, pragmatic, and situational understanding to select and then to pursue more aligned courses of action in complex, real-world environments.

artificial intelligence, conflict, machine learning, (14 more...)

2511.10952

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry:

Government > Military > Navy (1.00)
Transportation > Marine (0.69)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

The Real Barrier to LLM Agent Usability is Agentic ROI

Liu, Weiwen, Qin, Jiarui, Huang, Xu, Zeng, Xingshan, Xi, Yunjia, Lin, Jianghao, Wu, Chuhan, Wang, Yasheng, Shang, Lifeng, Tang, Ruiming, Lian, Defu, Yu, Yong, Zhang, Weinan

arXiv.org Artificial Intelligence | May-26-2025

Large Language Model (LLM) agents represent a promising shift in human-AI interaction, moving beyond passive prompt-response systems to autonomous agents capable of reasoning, planning, and goal-directed action. Despite their widespread application in specialized, high-effort tasks like coding and scientific research, we highlight a critical usability gap in high-demand, mass-market applications. This position paper argues that the limited real-world adoption of LLM agents stems not only from gaps in model capabilities, but also from a fundamental tradeoff between the value an agent can provide and the costs incurred during real-world use. Hence, we call for a shift from solely optimizing model performance to a broader, utility-driven perspective: evaluating agents through the lens of the overall agentic return on investment (Agentic ROI). By identifying the key factors that determine Agentic ROI (information quality, agent time, and cost), we posit a zigzag development trajectory in optimizing Agentic ROI: first scaling up to improve the information quality, then scaling down to minimize the time and cost. We outline the roadmap across different development stages to bridge the current usability gaps, aiming to make LLM agents truly scalable, accessible, and effective in real-world contexts.

large language model, machine learning, natural language, (18 more...)

2505.17767

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Certified Mitigation of Worst-Case LLM Copyright Infringement

Zhang, Jingyu, Yu, Jiacan, Marone, Marc, Van Durme, Benjamin, Khashabi, Daniel

arXiv.org Artificial Intelligence | Apr-24-2025

The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement post deployment. This has driven the development of "copyright takedown" methods, post-training approaches aimed at preventing models from generating content substantially similar to copyrighted ones. While current mitigation approaches are somewhat effective for average-case risks, we demonstrate that they overlook worst-case copyright risks exhibited by the existence of long, verbatim quotes from copyrighted sources. We propose BloomScrub, a remarkably simple yet highly effective inference-time approach that provides certified copyright takedown. Our method repeatedly interleaves quote detection with rewriting techniques to transform potentially infringing segments. By leveraging efficient data sketches (Bloom filters), our approach enables scalable copyright screening even for large-scale real-world corpora. When quotes beyond a length threshold cannot be removed, the system can abstain from responding, offering certified risk reduction. Experimental results show that BloomScrub reduces infringement risk, preserves utility, and accommodates different levels of enforcement stringency with adaptive abstention. Our results suggest that lightweight, inference-time methods can be surprisingly effective for copyright prevention.
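
The detect-rewrite-abstain loop described in this abstract can be illustrated with a minimal sketch. This is not the authors' BloomScrub implementation: the class and function names (`BloomFilter`, `index_corpus`, `longest_quote`, `scrub`), the 4-token n-gram size, the thresholds, and the caller-supplied `rewrite` function are all assumptions made for illustration. It only shows how a Bloom filter over corpus n-grams might flag long verbatim runs and how a bounded rewrite loop might fall back to abstention.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests of the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def ngrams(tokens, n):
    """All contiguous n-token windows, joined into strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def index_corpus(documents, n=4):
    """Insert every n-gram of every protected document into one filter."""
    bloom = BloomFilter()
    for doc in documents:
        for gram in ngrams(doc.lower().split(), n):
            bloom.add(gram)
    return bloom


def longest_quote(text, bloom, n=4):
    """Length in tokens of the longest run of consecutive n-gram hits."""
    best = run = 0
    for gram in ngrams(text.lower().split(), n):
        if gram in bloom:
            run += 1
            best = max(best, run + n - 1)
        else:
            run = 0
    return best


def scrub(text, bloom, rewrite, max_quote=6, max_rounds=3, n=4):
    """Alternate detection and rewriting; return None (abstain) on failure."""
    for _ in range(max_rounds):
        if longest_quote(text, bloom, n) <= max_quote:
            return text
        text = rewrite(text)  # hypothetical rewriter supplied by the caller
    return text if longest_quote(text, bloom, n) <= max_quote else None
```

Because a Bloom filter never reports a stored n-gram as absent, a check of this shape can miss no indexed quote; false positives only make it more conservative. In practice `rewrite` would be a paraphrasing model, which this sketch leaves to the caller.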

large language model, machine learning, natural language, (21 more...)

2504.16046

Country:

Asia > Middle East > Iraq (0.05)
North America > United States > New Jersey (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Towards Trustable Language Models: Investigating Information Quality of Large Language Models

Rejeleene, Rick, Xu, Xiaowei, Talburt, John

arXiv.org Artificial Intelligence | Jan-23-2024

Large language models (LLMs) are generating information at a rapid pace, requiring users to increasingly rely on and trust the data. Despite remarkable advances in LLMs, the information they generate is not completely trustworthy, owing to challenges in information quality. Specifically, the integrity of information quality decreases due to unreliable, biased tokenization during LLM pre-training. Moreover, decreased information quality leads to hallucination and fabricated information. Unreliable information can lead to flawed business decisions, which affects economic activity. In this work, we introduce a novel mathematical evaluation of LLM information quality; we furthermore analyze and highlight information quality challenges and scaling laws for systematically scaling language models.

large language model, machine learning, natural language, (17 more...)

2401.13086

Country:

North America > United States > Arkansas > Pulaski County > Little Rock (0.04)
North America > United States > Texas > Lavaca County (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SemantIC: Semantic Interference Cancellation Towards 6G Wireless Communications

Lin, Wensheng, Yan, Yuna, Li, Lixin, Han, Zhu, Matsumoto, Tad

arXiv.org Artificial Intelligence | Oct-19-2023

This letter proposes a novel anti-interference technique, semantic interference cancellation (SemantIC), for enhancing information quality in sixth-generation (6G) wireless networks. SemantIC only requires the receiver to concatenate the channel decoder with a semantic auto-encoder. This constructs a turbo loop which iteratively and alternately eliminates noise in the signal domain and the semantic domain. From the viewpoint of network information theory, the neural network of the semantic auto-encoder stores side information through training and provides it during iterative decoding, as an implementation of the Wyner-Ziv theorem. Simulation results verify the performance improvement achieved by SemantIC without extra channel resource cost.

artificial intelligence, information, machine learning, (17 more...)

2310.12768

Country:

North America > United States > Texas > Harris County > Houston (0.14)
North America > Canada > Ontario > Toronto (0.05)
Asia > South Korea > Seoul > Seoul (0.05)
(4 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

ForestTrav: Accurate, Efficient and Deployable Forest Traversability Estimation for Autonomous Ground Vehicles

Ruetz, Fabio, Lawrance, Nicholas, Hernández, Emili, Borges, Paulo, Peynot, Thierry

arXiv.org Artificial Intelligence | May-22-2023

Autonomous navigation in unstructured vegetated environments remains an open challenge. To successfully operate in these settings, ground vehicles must assess the traversability of the environment and determine which vegetation is pliable enough to push through. In this work, we propose a novel method that combines a high-fidelity and feature-rich 3D voxel representation while leveraging the structural context and sparseness of sparse convolutional neural networks (SCNNs) to assess traversability estimation (TE) in densely vegetated environments. The proposed method is thoroughly evaluated on an accurately-labeled real-world data set that we provide to the community. It is shown to outperform state-of-the-art methods by a significant margin (0.59 vs. 0.39 MCC score at 0.1m voxel resolution) in challenging scenes and to generalize to unseen environments. In addition, the method is economical in the amount of training data and training time required: a model is trained in minutes on a desktop computer. We show that by exploiting the context of the environment, our method can use different feature combinations with only limited performance variations. For example, our approach can be used with lidar-only features, whilst still assessing complex vegetated environments accurately, which was not demonstrated previously in the literature in such environments. In addition, we propose an approach to assess a traversability estimator's sensitivity to information quality and show our method's sensitivity is low.
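
For readers unfamiliar with the metric quoted above, the sketch below assumes that "MCC" denotes the Matthews correlation coefficient, a standard score for binary classification that ranges from -1 to 1. The function name `mcc` and the confusion-matrix counts in the usage note are hypothetical and are not taken from the paper.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventionally reported as 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

With hypothetical voxel counts, `mcc(50, 0, 50, 0)` gives 1.0 for a perfect classifier and `mcc(25, 25, 25, 25)` gives 0.0 for one that is no better than chance, which is why the score stays informative when traversable and non-traversable voxels are imbalanced.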

artificial intelligence, machine learning, representation, (19 more...)

2305.12705

Country: Oceania > Australia > Queensland > Brisbane (0.04)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Growing Cybersecurity Concerns are Threatening Web 3.0

#artificialintelligence | Feb-22-2022, 08:24:24 GMT

Web 3.0 is the generation of the internet in which apps and websites can analyze data like a human with the help of machine learning, big data, and decentralized ledger technologies. Unlike in Web 2.0, data here is decentralized and open, and the web is autonomous and intelligent. Cybersecurity is among the most important concerns in the technology world, and as Web 3.0 develops, more cybersecurity risks will come to light. At present, risks such as information quality, data availability, data confidentiality, and data manipulation are the major concerns.

artificial intelligence, information, machine learning, (11 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.86)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.54)

Amazon.com: Entity Information Life Cycle for Big Data: Master Data Management and Information Integration (9780128005378): John R. Talburt, Yinle Zhou: Books

@machinelearnbot | Oct-29-2016, 22:25:54 GMT

Dr. John R. Talburt is Professor of Information Science at the University of Arkansas at Little Rock (UALR), where he is the Coordinator for the Information Quality Graduate Program and the Executive Director of the UALR Center for Advanced Research in Entity Resolution and Information Quality (ERIQ). He is also the Chief Scientist for Black Oak Partners, LLC, an information quality solutions company. Prior to his appointment at UALR, he was the leader for research and development and product innovation at Acxiom Corporation, a global leader in information management and customer data integration. Professor Talburt holds several patents related to customer data integration, has written numerous articles on information quality and entity resolution, and is the author of Entity Resolution and Information Quality (Morgan Kaufmann, 2011). He also holds the IAIDQ Information Quality Certified Professional (IQCP) credential.

artificial intelligence, data quality, information quality, (12 more...)

Country:

North America > United States > Arkansas (0.31)
North America > United States > Texas > Travis County > Austin (0.08)
Asia > China > Jiangsu Province > Nanjing (0.08)

Industry: Retail > Online (0.40)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.97)
