Interpretable Data Mining of Follicular Thyroid Cancer Ultrasound Features Using Enhanced Association Rules

Zhou, Songlin, Zhou, Tao, Li, Xin, Yau, Stephen Shing-Toung

arXiv.org Artificial Intelligence

Purpose: Thyroid cancer is a common malignancy, and papillary thyroid cancer and follicular thyroid cancer are its two most common types. Follicular thyroid cancer lacks distinctive ultrasound signs and is more difficult to diagnose preoperatively than the more prevalent papillary thyroid cancer, and the clinical studies associated with it are less well established. We aimed to analyze clinical data on follicular thyroid cancer with a novel data mining tool to identify clinical indications that may help in preoperative diagnosis. Methods: We performed a retrospective analysis based on case data collected by the Department of General Surgery of Peking University Third Hospital between 2010 and 2023. Unlike traditional statistical methods, we improved association rule mining, a classical data mining method, and proposed new analytical metrics reflecting the malignant association between clinical indications and cancer, drawing on the idea of the SHAP method from interpretable machine learning. Results: After preprocessing, the dataset contained 1673 cases (counted as nodules rather than patients), of which 1414 were benign and 259 were malignant nodules. Our analysis indicated that, in addition to some common indicators (e.g., irregular or lobulated nodule margins, a halo of uneven thickness, hypoechogenicity), some indicators showed strong malignant associations, such as a nodule-in-nodule pattern, a trabecular pattern, and low TSH scores. In addition, our results suggest that coexisting Hashimoto's thyroiditis may also have a strong malignant association. Conclusion: In the preoperative diagnosis of nodules suspected of follicular thyroid cancer, multiple clinical indications should be considered for a more accurate diagnosis. The diverse malignant associations identified in our study may serve as a reference for clinicians in related fields.


Mining Voter Behaviour and Confidence: A Rule-Based Analysis of the 2022 U.S. Elections

Jubair, Md Al, Arefin, Mohammad Shamsul, Reza, Ahmed Wasif

arXiv.org Artificial Intelligence

This study explores the relationship between voter trust and their experiences during elections by applying a rule-based data mining technique to the 2022 Survey of the Performance of American Elections (SPAE). Using the Apriori algorithm and setting parameters to capture meaningful associations (support >= 3%, confidence >= 60%, and lift > 1.5), the analysis revealed a strong connection between demographic attributes and voting-related challenges, such as registration hurdles, accessibility issues, and queue times. For instance, respondents who indicated that accessing polling stations was "very easy" and who reported moderate confidence were found to be over six times more likely (lift = 6.12) to trust their county's election outcome and experience no registration issues. A further analysis, which adjusted the support threshold to 2%, specifically examined patterns among minority voters. It revealed that 98.16 percent of Black voters who reported easy access to polling locations also had smooth registration experiences. Additionally, those who had high confidence in the vote-counting process were almost two times as likely to identify as Democratic Party supporters. These findings point to the important role that enhancing voting access and offering targeted support can play in building trust in the electoral system, particularly among marginalized communities.
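The thresholds quoted above (support, confidence, lift) are the standard association-rule metrics and can be computed directly. A minimal plain-Python sketch, using hypothetical survey-style attribute tokens as stand-ins for SPAE variables:

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence, and lift for a single
    association rule antecedent -> consequent over a list of
    transactions (frozensets of attribute tokens)."""
    n = len(transactions)
    a, c = frozenset(antecedent), frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if (a | c) <= t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Toy survey-style transactions (attribute names are illustrative)
data = [frozenset(t) for t in [
    {"easy_access", "no_reg_issue", "trusts_count"},
    {"easy_access", "no_reg_issue", "trusts_count"},
    {"easy_access", "trusts_count"},
    {"hard_access", "reg_issue"},
    {"easy_access", "no_reg_issue"},
]]
s, c, l = rule_metrics(data, {"easy_access"}, {"trusts_count"})
```

An Apriori implementation would enumerate candidate itemsets level by level and keep only rules clearing the support, confidence, and lift cutoffs computed as above.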


Hashing for Fast Pattern Set Selection

Karjalainen, Maiju, Miettinen, Pauli

arXiv.org Artificial Intelligence

Pattern set mining, which is the task of finding a good set of patterns instead of all patterns, is a fundamental problem in data mining. Many different definitions of what constitutes a good set have been proposed in recent years. In this paper, we consider the reconstruction error as a proxy measure for the goodness of the set, and concentrate on the adjacent problem of how to find a good set efficiently. We propose a method based on bottom-k hashing for efficiently selecting the set and extend the method for the common case where the patterns might only appear in approximate form in the data. Our approach has applications in tiling databases, Boolean matrix factorization, and redescription mining, among others. We show that our hashing-based approach is significantly faster than the standard greedy algorithm while obtaining almost equally good results in both synthetic and real-world data sets.
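As a rough illustration of the bottom-k idea (a generic sketch, not the authors' selection algorithm), a bottom-k signature keeps the k smallest hash values of a set, which supports cheap similarity estimates between pattern extents:

```python
import hashlib

def bottom_k(items, k=4):
    """Bottom-k signature: the k smallest hash values of a set."""
    hs = sorted(int(hashlib.md5(str(x).encode()).hexdigest(), 16)
                for x in set(items))
    return hs[:k]

def jaccard_estimate(sig_a, sig_b, k=4):
    """Estimate the Jaccard similarity of the underlying sets from
    their bottom-k signatures: the fraction of the k smallest union
    hashes that occur in both signatures."""
    union = sorted(set(sig_a) | set(sig_b))[:k]
    inter = set(sig_a) & set(sig_b)
    return sum(1 for h in union if h in inter) / len(union)
```

Because signatures are tiny and mergeable, coverage-style comparisons between many candidate patterns become cheap, which is the efficiency lever a hashing-based selector exploits.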


Alice and the Caterpillar: A more descriptive null model for assessing data mining results

Preti, Giulia, Morales, Gianmarco De Francisci, Riondato, Matteo

arXiv.org Artificial Intelligence

We introduce novel null models for assessing the results obtained from observed binary transactional and sequence datasets, using statistical hypothesis testing. Our null models maintain more properties of the observed dataset than existing ones. Specifically, they preserve the Bipartite Joint Degree Matrix of the bipartite (multi-)graph corresponding to the dataset, which ensures that the number of caterpillars, i.e., paths of length three, is preserved, in addition to other properties considered by other models. We describe Alice, a suite of Markov chain Monte Carlo algorithms for sampling datasets from our null models, based on a carefully defined set of states and efficient operations to move between them. The results of our experimental evaluation show that Alice mixes fast and scales well, and that our null model finds different significant results than ones previously considered in the literature.
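The caterpillar (length-three path) count that the null model preserves follows from the degree sequence around each edge: every path u1-v1-u2-v2 is determined by its central edge plus one extra neighbor on each side. A minimal sketch on a bipartite edge list:

```python
from collections import defaultdict

def caterpillar_count(edges):
    """Count paths of length three (caterpillars) in a bipartite
    graph given as (left, right) edge pairs. Each path u1-v1-u2-v2
    is counted exactly once via its central edge, giving
    sum over edges (u, v) of (deg(u) - 1) * (deg(v) - 1)."""
    deg_left, deg_right = defaultdict(int), defaultdict(int)
    for u, v in edges:
        deg_left[u] += 1
        deg_right[v] += 1
    return sum((deg_left[u] - 1) * (deg_right[v] - 1) for u, v in edges)
```

Since the formula depends only on the degrees at each edge's endpoints, any sampler that preserves the Bipartite Joint Degree Matrix automatically preserves this count, as the abstract states.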


Identifying and Characterising Higher Order Interactions in Mobility Networks Using Hypergraphs

Sambaturu, Prathyush, Gutierrez, Bernardo, Kraemer, Moritz U. G.

arXiv.org Artificial Intelligence

Human mobility data is crucial for understanding patterns of movement across geographical regions, with applications spanning urban planning [1], transportation systems design [2], infectious disease modeling and control [3, 4], and social dynamics studies [5]. Traditionally, mobility data has been represented using flow networks [6, 7] or colocation matrices [8], where the primary representation is via pairwise interactions. In flow networks, directed edges represent the movement of individuals between two locations; colocation matrices measure the probability that a random individual from one region is colocated with a random individual from another region at the same location. These data types and their pairwise representation structure have been used to identify the spatial scales and regularity of human mobility, but they have inherent limitations in their capacity to capture more complex patterns of human movement involving higher-order interactions between locations, that is, groups of locations that are frequently visited by many individuals within a period of time (e.g., a week) and revisited regularly over time. Higher-order interactions between locations can contain crucial information under certain scenarios.
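One way to move beyond pairwise flows, in the spirit described above, is to treat the set of locations an individual visits within a time window as a hyperedge and count how often each group recurs. A toy sketch (the location names and window choice are illustrative, not from the paper):

```python
from collections import Counter
from itertools import combinations

def higher_order_interactions(trajectories, min_size=3):
    """Count how often each group of locations of size >= min_size
    is visited together within one time window (e.g., a week);
    frequently recurring groups are candidate hyperedges."""
    counts = Counter()
    for visited in trajectories:
        locs = sorted(set(visited))
        for r in range(min_size, len(locs) + 1):
            for group in combinations(locs, r):
                counts[group] += 1
    return counts
```

A pairwise flow network would record only the two-location edges inside each group; the hyperedge counts retain the fact that the whole group co-occurs.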


Meaningless is better: hashing bias-inducing words in LLM prompts improves performance in logical reasoning and statistical learning

Chadimová, Milena, Jurášek, Eduard, Kliegr, Tomáš

arXiv.org Artificial Intelligence

This paper introduces a novel method, referred to as "hashing", which involves masking potentially bias-inducing words in large language models (LLMs) with hash-like meaningless identifiers to reduce cognitive biases and reliance on external knowledge. The method was tested across three sets of experiments involving a total of 490 prompts. Statistical analysis using chi-square tests showed significant improvements in all tested scenarios, which covered Llama, ChatGPT, Copilot, Gemini and Mixtral models. In the first experiment, hashing decreased the fallacy rate in a modified version of the "Linda" problem aimed at evaluating susceptibility to cognitive biases. In the second experiment, it improved LLM results on the frequent itemset extraction task. In the third experiment, we found hashing is also effective when the Linda problem is presented in a tabular format rather than text, indicating that the technique works across various input representations. Overall, the method was shown to improve bias reduction and incorporation of external knowledge. Despite bias reduction, hallucination rates were inconsistently reduced across types of LLM models. These findings suggest that masking bias-inducing terms can improve LLM performance, although its effectiveness is model- and task-dependent.
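The masking step itself can be as simple as replacing each flagged word or phrase with a stable, meaning-free token before the prompt is sent to the model. A minimal sketch of that substitution (the bias-word list is an assumption for illustration, not the paper's):

```python
import hashlib
import re

def hash_words(prompt, bias_words):
    """Replace each bias-inducing word or phrase with a short,
    meaningless hash-derived identifier, so the model cannot key
    on its semantics. Returns the masked prompt and the mapping
    needed to interpret the model's answer afterwards."""
    mapping = {w.lower(): "X" + hashlib.md5(w.lower().encode()).hexdigest()[:6]
               for w in bias_words}
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, bias_words)) + r")\b",
        re.IGNORECASE)
    masked = pattern.sub(lambda m: mapping[m.group(0).lower()], prompt)
    return masked, mapping
```

The mapping is kept so the hashed identifiers in the model's output can be translated back to the original terms.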


RPS: A Generic Reservoir Patterns Sampler

Diop, Lamine, Plantevit, Marc, Soulet, Arnaud

arXiv.org Artificial Intelligence

Efficient learning from streaming data is important for modern data analysis due to the continuous and rapid evolution of data streams. Despite significant advancements in stream pattern mining, challenges persist, particularly in managing complex data streams like sequential and weighted itemsets. While reservoir sampling serves as a fundamental method for randomly selecting fixed-size samples from data streams, its application to such complex patterns remains largely unexplored. In this study, we introduce an approach that harnesses a weighted reservoir to facilitate direct pattern sampling from streaming batch data, thus ensuring scalability and efficiency. We present a generic algorithm capable of addressing temporal biases and handling various pattern types, including sequential, weighted, and unweighted itemsets. Through comprehensive experiments conducted on real-world datasets, we evaluate the effectiveness of our method, showcasing its ability to construct accurate incremental online classifiers for sequential data. Our approach not only enables previously unusable online machine learning models for sequential data to achieve accuracy comparable to offline baselines but also represents significant progress in the development of incremental online sequential itemset classifiers.
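A standard building block for a weighted reservoir is the A-Res scheme of Efraimidis and Spirakis: draw a key u**(1/w) per item and keep the k largest keys. Whether RPS uses exactly this keying is not stated in the abstract, so treat this as a generic sketch:

```python
import heapq
import random

def weighted_reservoir(stream, k, rng=random):
    """A-Res weighted reservoir sampling over a stream of
    (item, weight) pairs: each item gets key = u**(1/w) for a
    uniform u in (0, 1), and the k items with the largest keys
    form a weighted sample without replacement."""
    heap = []  # min-heap of (key, item); root holds the smallest kept key
    for item, weight in stream:
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]
```

Each element is touched once and the reservoir stays at a fixed size k, which is what makes the approach viable on unbounded streams; extending the keying to sequential or weighted itemset patterns is where the paper's contribution lies.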


Geo-FuB: A Method for Constructing an Operator-Function Knowledge Base for Geospatial Code Generation Tasks Using Large Language Models

Hou, Shuyang, Zhao, Anqi, Liang, Jianyuan, Shen, Zhangxiao, Wu, Huayi

arXiv.org Artificial Intelligence

The rise of spatiotemporal data and the need for efficient geospatial modeling have spurred interest in automating these tasks with large language models (LLMs). However, general LLMs often generate errors in geospatial code due to a lack of domain-specific knowledge on functions and operators. To address this, a retrieval-augmented generation (RAG) approach, utilizing an external knowledge base of geospatial functions and operators, is proposed. This study introduces a framework to construct such a knowledge base, leveraging geospatial script semantics. The framework includes: Function Semantic Framework Construction (Geo-FuSE), Frequent Operator Combination Statistics (Geo-FuST), and Semantic Mapping (Geo-FuM). Techniques like Chain-of-Thought, TF-IDF, and the APRIORI algorithm are utilized to derive and align geospatial functions. An example knowledge base, Geo-FuB, built from 154,075 Google Earth Engine scripts, is available on GitHub. Evaluation metrics show a high accuracy, reaching 88.89% overall, with structural and semantic accuracies of 92.03% and 86.79% respectively. Geo-FuB's potential to optimize geospatial code generation through the RAG and fine-tuning paradigms is highlighted.
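Of the techniques listed, TF-IDF is the most mechanical: it weights a function name by how often it appears in a script against how many scripts use it at all. A plain sketch over tokenized scripts (treating extracted function names as tokens is an assumption about the pipeline):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Plain TF-IDF over tokenized documents, e.g., lists of
    function names extracted from geospatial scripts. Returns one
    {token: score} dict per document."""
    n = len(docs)
    df = Counter()                      # document frequency per token
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (tf[t] / len(doc)) * math.log(n / df[t])
                       for t in tf})
    return scores
```

Tokens that appear in every script score zero, so the surviving high-scoring names are the ones that characterize particular operator combinations.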


Explainability of Highly Associated Fuzzy Churn Patterns in Binary Classification

Wang, D. Y. C., Jordanger, Lars Arne, Lin, Jerry Chun-Wei

arXiv.org Artificial Intelligence

Customer churn, particularly in the telecommunications sector, influences both costs and profits. As the explainability of models becomes increasingly important, this study emphasizes not only the explainability of customer churn through machine learning models, but also the importance of identifying multivariate patterns and setting soft bounds for intuitive interpretation. The main objective is to use a machine learning model and fuzzy-set theory with top-k HUIM to identify highly associated patterns of customer churn with intuitive identification, referred to as Highly Associated Fuzzy Churn Patterns (HAFCP). Moreover, this method aids in uncovering association rules among multiple features across low, medium, and high distributions. Such discoveries are instrumental in enhancing the explainability of findings. Experiments show that when the top-5 HAFCPs are included in five datasets, a mixture of performance results is observed, with some showing notable improvements. It becomes clear that high-importance features enhance explanatory power through their distribution and patterns associated with other features. As a result, the study introduces an innovative approach that improves the explainability and effectiveness of customer churn prediction models.
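The "soft bounds" over low, medium, and high distributions can be pictured with ordinary fuzzy membership functions; a minimal triangular-partition sketch (the partition shape and range are illustrative assumptions, not the paper's):

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership: 0 at the feet a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(x, lo=0.0, hi=100.0):
    """Map a crisp feature value to low/medium/high memberships on
    [lo, hi] using three overlapping triangular sets."""
    mid = (lo + hi) / 2.0
    eps = 1e-9  # nudge the outer feet so the endpoints score 1.0
    return {
        "low": triangular(x, lo - eps, lo, mid),
        "medium": triangular(x, lo, mid, hi),
        "high": triangular(x, mid, hi, hi + eps),
    }
```

A value near a bin boundary belongs partly to both bins, which is what lets mined patterns read as "low tenure and high monthly charge" without brittle hard cutoffs.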


Towards Explainable Automated Data Quality Enhancement without Domain Knowledge

Sarr, Djibril

arXiv.org Machine Learning

In the era of big data, ensuring the quality of datasets has become increasingly crucial across various domains. We propose a comprehensive framework designed to automatically assess and rectify data quality issues in any given dataset, regardless of its specific content, focusing on both textual and numerical data. Our primary objective is to address three fundamental types of defects: absence, redundancy, and incoherence. At the heart of our approach lies a rigorous demand for both explainability and interpretability, ensuring that the rationale behind the identification and correction of data anomalies is transparent and understandable. To achieve this, we adopt a hybrid approach that integrates statistical methods with machine learning algorithms. Indeed, by leveraging statistical techniques alongside machine learning, we strike a balance between accuracy and explainability, enabling users to trust and comprehend the assessment process. Acknowledging the challenges associated with automating the data quality assessment process, particularly in terms of time efficiency and accuracy, we adopt a pragmatic strategy, employing resource-intensive algorithms only when necessary, while favoring simpler, more efficient solutions whenever possible. Through a practical analysis conducted on a publicly provided dataset, we illustrate the challenges that arise when trying to enhance data quality while keeping explainability. We demonstrate the effectiveness of our approach in detecting and rectifying missing values, duplicates and typographical errors as well as the challenges remaining to be addressed to achieve similar accuracy on statistical outliers and logic errors under the constraints set in our work.
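Two of the three defect types named above, absence and redundancy, reduce to simple scans over the records; a minimal sketch on dict records (the field names are hypothetical):

```python
def quality_report(rows):
    """Report two of the defect types discussed above on a list of
    dict records: absence (missing values per column) and redundancy
    (exact duplicate rows). Incoherence checks would need per-column
    type or rule models layered on top of this."""
    cols = list(rows[0])
    missing = {c: sum(1 for r in rows if r.get(c) in (None, ""))
               for c in cols}
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"missing": missing, "duplicates": duplicates}
```

This reflects the pragmatic strategy the abstract describes: cheap deterministic passes handle the easy defects, reserving heavier statistical or learned models for outliers and logic errors.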