AITopics | Wellington Region

Community detection, a vital technology for real-world applications, uncovers cohesive node groups (communities) by leveraging both topological and attribute similarities in social networks. However, existing Graph Convolutional Networks (GCNs) trained to maximize modularity often converge to suboptimal solutions. Additionally, directly using human-labeled communities for training can undermine topological cohesiveness by grouping disconnected nodes based solely on node attributes. We address these issues by proposing a novel Topological and Attributive Similarity-based Community detection (TAS-Com) method. TAS-Com introduces a novel loss function that exploits the highly effective and scalable Leiden algorithm to detect community structures with global optimal modularity. Leiden is further utilized to refine human-labeled communities to ensure connectivity within each community, enabling TAS-Com to detect community structures with desirable trade-offs between modularity and compliance with human labels. Experimental results on multiple benchmark networks confirm that TAS-Com can significantly outperform several state-of-the-art algorithms.

community structure, evolutionary algorithm, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.10197

Country:

Europe > Netherlands > South Holland > Leiden (0.47)
Asia > China > Beijing > Beijing (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.49)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

Generalized Phase Pressure Control Enhanced Reinforcement Learning for Traffic Signal Control

Liao, Xiao-Cheng, Mei, Yi, Zhang, Mengjie, Chen, Xiang-Ling

arXiv.org Artificial IntelligenceMar-26-2025

Appropriate traffic state representation is crucial for learning traffic signal control policies. However, most of the current traffic state representations are heuristically designed, with insufficient theoretical support. In this paper, we (1) develop a flexible, efficient, and theoretically grounded method, namely generalized phase pressure (G2P) control, which takes only simple lane features into consideration to decide which phase to be actuated; 2) extend the pressure control theory to a general form for multi-homogeneous-lane road networks based on queueing theory; (3) design a new traffic state representation based on the generalized phase state features from G2P control; and 4) develop a reinforcement learning (RL)-based algorithm template named G2P-XLight, and two RL algorithms, G2P-MPLight and G2P-CoLight, by combining the generalized phase state representation with MPLight and CoLight, two well-performed RL methods for learning traffic signal control policies. Extensive experiments conducted on multiple real-world datasets demonstrate that G2P control outperforms the state-of-the-art (SOTA) heuristic method in the transportation field and other recent human-designed heuristic methods; and that the newly proposed G2P-XLight significantly outperforms SOTA learning-based approaches. Our code is available online.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2503.20205

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

EDCA -- An Evolutionary Data-Centric AutoML Framework for Efficient Pipelines

Simões, Joana, Correia, João

arXiv.org Artificial IntelligenceMar-6-2025

Automated Machine Learning (AutoML) gained popularity due to the increased demand for Machine Learning (ML) specialists, allowing them to apply ML techniques effortlessly and quickly. AutoML implementations use optimisation methods to identify the most effective ML solution for a given dataset, aiming to improve one or more predefined metrics. However, most implementations focus on model selection and hyperparameter tuning. Despite being an important factor in obtaining high-performance ML systems, data quality is usually an overlooked part of AutoML and continues to be a manual and time-consuming task. This work presents EDCA, an Evolutionary Data Centric AutoML framework. In addition to the traditional tasks such as selecting the best models and hyperparameters, EDCA enhances the given data by optimising data processing tasks such as data reduction and cleaning according to the problems' needs. All these steps create an ML pipeline that is optimised by an evolutionary algorithm. To assess its effectiveness, EDCA was compared to FLAML and TPOT, two frameworks at the top of the AutoML benchmarks. The frameworks were evaluated in the same conditions using datasets from AMLB classification benchmarks. EDCA achieved statistically similar results in performance to FLAML and TPOT but used significantly less data to train the final solutions. Moreover, EDCA experimental results reveal that a good performance can be achieved using less data and efficient ML algorithm aspects that align with Green AutoML guidelines

automl framework, edca, pipeline, (14 more...)

arXiv.org Artificial Intelligence

2503.0435

Country:

Europe > Portugal > Coimbra > Coimbra (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

A Scoresheet for Explainable AI

Winikoff, Michael, Thangarajah, John, Rodriguez, Sebastian

arXiv.org Artificial IntelligenceFeb-13-2025

Explainability is important for the transparency of autonomous and intelligent systems and for helping to support the development of appropriate levels of trust. There has been considerable work on developing approaches for explaining systems and there are standards that specify requirements for transparency. However, there is a gap: the standards are too high-level and do not adequately specify requirements for explainability. This paper develops a scoresheet that can be used to specify explainability requirements or to assess the explainability aspects provided for particular applications. The scoresheet is developed by considering the requirements of a range of stakeholders and is applicable to Multiagent Systems as well as other AI technologies. We also provide guidance for how to use the scoresheet and illustrate its generality and usefulness by applying it to a range of applications.

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.09861

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Greater London > London (0.14)
North America > Cuba (0.05)
(17 more...)

Genre: Research Report (0.40)

Industry:

Health & Medicine (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)
Law (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(2 more...)

Add feedback

CapyMOA: Efficient Machine Learning for Data Streams in Python

Gomes, Heitor Murilo, Lee, Anton, Gunasekara, Nuwan, Sun, Yibin, Cassales, Guilherme Weigert, Liu, Justin, Heyden, Marco, Cerqueira, Vitor, Bahri, Maroua, Koh, Yun Sing, Pfahringer, Bernhard, Bifet, Albert

arXiv.org Artificial IntelligenceFeb-11-2025

CapyMOA is an open-source library designed for efficient machine learning on streaming data. It provides a structured framework for real-time learning and evaluation, featuring a flexible data representation. CapyMOA includes an extensible architecture that allows integration with external frameworks such as MOA and PyTorch, facilitating hybrid learning approaches that combine traditional online algorithms with deep learning techniques. By emphasizing adaptability, scalability, and usability, CapyMOA allows researchers and practitioners to tackle dynamic learning challenges across various domains.

artificial intelligence, evaluation, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.07432

Country:

Oceania > New Zealand > North Island > Waikato (0.08)
Africa > Cameroon > Far North Region > Maroua (0.05)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.05)
(5 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Evaluation for Regression Analyses on Evolving Data Streams

Sun, Yibin, Gomes, Heitor Murilo, Pfahringer, Bernhard, Bifet, Albert

arXiv.org Artificial IntelligenceFeb-10-2025

The paper explores the challenges of regression analysis in evolving data streams, an area that remains relatively underexplored compared to classification. We propose a standardized evaluation process for regression and prediction interval tasks in streaming contexts. Additionally, we introduce an innovative drift simulation strategy capable of synthesizing various drift types, including the less-studied incremental drift. Comprehensive experiments with state-of-the-art methods, conducted under the proposed process, validate the effectiveness and robustness of our approach.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.07213

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.84)
Research Report > Experimental Study (0.60)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation

Long, Yunbo, Xu, Liming, Brintrup, Alexandra

arXiv.org Artificial IntelligenceFeb-6-2025

To evaluate the fidelity of synthetic tabular data, numerous metrics have been proposed to assess accuracy and diversity, including both low-order statistics (e.g., Density Estimation and Correlation Score (Zhang et al., 2023), Average Coverage Scores (Zein & Urvoy, 2022)) and high-order statistics (e.g., α-Precision and β-Recall (Alaa et al., 2022)). However, these metrics operate at a high level and fail to evaluate whether synthetic data preserves logical relationships, such as hierarchical or semantic dependencies between features. This highlights the need for a more fine-grained, context-aware evaluation of multivariate dependencies. To address this, we propose three evaluation metrics: Hierarchical Consistency Score (HCS), Multivariate Dependency Index (MDI), and Distributional Similarity Index (DSI). To assess the effectiveness of these metrics in quantifying inter-column relationships, we select five representative tabular data generation methods from different categories for evaluation. Their performance is measured using both existing and our proposed metrics on a real-world dataset rich in logical consistency and dependency constraints. Experimental results validate the effectiveness of our proposed metrics and reveal the limitations of existing approaches in preserving logical relationships in synthetic tabular data. Additionally, we discuss potential pathways to better capture logical constraints within joint distributions, paying the way for future advancements in synthetic tabular data generation.

machine learning, natural language, tabular data, (17 more...)

arXiv.org Artificial Intelligence

2502.04055

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Southeast Asia (0.06)
(49 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Add feedback

LIBRA: Measuring Bias of Large Language Model from a Local Context

Pang, Bo, Qiao, Tingrui, Walker, Caroline, Cunningham, Chris, Koh, Yun Sing

arXiv.org Artificial IntelligenceFeb-1-2025

Large Language Models (LLMs) have significantly advanced natural language processing applications, yet their widespread use raises concerns regarding inherent biases that may reduce utility or harm for particular social groups. Despite the advancement in addressing LLM bias, existing research has two major limitations. First, existing LLM bias evaluation focuses on the U.S. cultural context, making it challenging to reveal stereotypical biases of LLMs toward other cultures, leading to unfair development and use of LLMs. Second, current bias evaluation often assumes models are familiar with the target social groups. When LLMs encounter words beyond their knowledge boundaries that are unfamiliar in their training data, they produce irrelevant results in the local context due to hallucinations and overconfidence, which are not necessarily indicative of inherent bias. This research addresses these limitations with a Local Integrated Bias Recognition and Assessment Framework (LIBRA) for measuring bias using datasets sourced from local corpora without crowdsourcing. Implementing this framework, we develop a dataset comprising over 360,000 test cases in the New Zealand context. Furthermore, we propose the Enhanced Idealized CAT Score (EiCAT), integrating the iCAT score with a beyond knowledge boundary score (bbs) and a distribution divergence-based bias measurement to tackle the challenge of LLMs encountering words beyond knowledge boundaries. Our results show that the BERT family, GPT-2, and Llama-3 models seldom understand local words in different contexts. While Llama-3 exhibits larger bias, it responds better to different cultural contexts. The code and dataset are available at: https://github.com/ipangbo/LIBRA.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.01679

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
South America > Colombia > Meta Department > Villavicencio (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback