Fairness without Sensitive Attributes via Knowledge Sharing
Ni, Hongliang, Han, Lei, Chen, Tong, Sadiq, Shazia, Demartini, Gianluca
While model fairness improvement has been explored previously, existing methods invariably rely on adjusting explicit sensitive attribute values in order to improve model fairness in downstream tasks. However, we observe a trend in which sensitive demographic information becomes inaccessible as public concerns around data privacy grow. In this paper, we propose a confidence-based hierarchical classifier structure called "Reckoner" for reliable fair model learning under the assumption of missing sensitive attributes. We first present results showing that if the dataset contains biased labels or other hidden biases, classifiers significantly widen the bias gap across different demographic groups in the subset with higher prediction confidence. Inspired by these findings, we devise a dual-model system in which a version of the model initialised with a high-confidence data subset learns from a version of the model initialised with a low-confidence data subset, enabling it to avoid biased predictions. Our experimental results show that Reckoner consistently outperforms state-of-the-art baselines on the COMPAS and New Adult datasets in terms of both accuracy and fairness metrics.
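A minimal sketch of the confidence-based split motivating this design (illustrative only, not the authors' implementation; the threshold and the prediction-blending step are assumptions for the example): a base classifier's prediction confidence partitions the training data into high- and low-confidence subsets, and one simple form of knowledge sharing blends the two resulting models' outputs.

```python
# Sketch: confidence-based data split and a dual-model setup (not Reckoner itself).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = LogisticRegression().fit(X_tr, y_tr)
conf = base.predict_proba(X_tr).max(axis=1)       # per-example prediction confidence
thresh = np.median(conf)                          # illustrative threshold choice

high_idx, low_idx = conf >= thresh, conf < thresh # split by confidence
model_high = LogisticRegression().fit(X_tr[high_idx], y_tr[high_idx])
model_low = LogisticRegression().fit(X_tr[low_idx], y_tr[low_idx])

# One simple form of "knowledge sharing": blend the low-confidence model's
# soft predictions into the high-confidence model's output at inference time.
p = 0.5 * model_high.predict_proba(X_te) + 0.5 * model_low.predict_proba(X_te)
y_pred = p.argmax(axis=1)
```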
Hate Speech Detection with Generalizable Target-aware Fairness
Chen, Tong, Wang, Danny, Liang, Xurong, Risius, Marten, Demartini, Gianluca, Yin, Hongzhi
To counter the side effects brought about by the proliferation of social media platforms, hate speech detection (HSD) plays a vital role in halting the dissemination of toxic online posts at an early stage. However, given the ubiquity of topical communities on social media, a trained HSD classifier easily becomes biased towards specific targeted groups (e.g., female and black people), where a high rate of false positive/negative results can significantly impair public trust in the fairness of content moderation mechanisms, and eventually harm the diversity of online society. Although existing fairness-aware HSD methods can smooth out some discrepancies across targeted groups, they are mostly specific to a narrow selection of targets that are assumed to be known and fixed. This inevitably prevents those methods from generalizing to real-world use cases where new targeted groups constantly emerge over time. To tackle this defect, we propose Generalizable target-aware Fairness (GetFair), a new method for fairly classifying posts that contain diverse, and even unseen, targets during inference. To remove the HSD classifier's spurious dependence on target-related features, GetFair trains a series of filter functions in an adversarial pipeline, so as to deceive a discriminator that tries to recover the targeted group from filtered post embeddings. To maintain scalability and generalizability, we innovatively parameterize all filter functions via a hypernetwork that is regularized by the semantic affinity among targets. Taking a target's pretrained word embedding as input, the hypernetwork generates the weights used by each target-specific filter on the fly, without storing dedicated filter parameters. Finally, comparative experiments on two HSD datasets show the advantageous performance of GetFair on out-of-sample targets.
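The hypernetwork idea can be sketched in a few lines. The following is an illustrative toy version (embedding sizes and the linear filter form are assumptions, not the GetFair architecture): a small network maps a target's word embedding to the weights of a per-target filter applied to post embeddings, so no filter parameters are stored per target.

```python
# Sketch: a hypernetwork emitting target-specific filter weights on the fly.
import torch
import torch.nn as nn

d_post, d_target = 128, 300   # assumed embedding dimensions

class FilterHyperNet(nn.Module):
    def __init__(self):
        super().__init__()
        # emits a (d_post x d_post) weight matrix plus a bias for the filter
        self.gen = nn.Linear(d_target, d_post * d_post + d_post)

    def forward(self, target_emb, post_emb):
        params = self.gen(target_emb)
        W = params[: d_post * d_post].view(d_post, d_post)
        b = params[d_post * d_post:]
        # target-specific filter generated on the fly; in the full method an
        # adversarial discriminator would try to recover the target from the output
        return post_emb @ W.T + b

hyper = FilterHyperNet()
post = torch.randn(4, d_post)    # batch of post embeddings
target = torch.randn(d_target)   # pretrained word embedding of one target
filtered = hyper(target, post)   # post embeddings with target cues filtered
print(filtered.shape)            # torch.Size([4, 128])
```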
Identification of Regulatory Requirements Relevant to Business Processes: A Comparative Study on Generative AI, Embedding-based Ranking, Crowd and Expert-driven Methods
Sai, Catherine, Sadiq, Shazia, Han, Lei, Demartini, Gianluca, Rinderle-Ma, Stefanie
Organizations face the challenge of ensuring compliance with an increasing number of requirements from various regulatory documents. Which requirements are relevant depends on aspects such as the geographic location of the organization, its domain, size, and business processes. Considering these contextual factors, relevant documents (e.g., laws, regulations, directives, policies) are identified as a first step, followed by a more detailed analysis of which parts of the identified documents are relevant for which step of a given business process. Nowadays, the identification of regulatory requirements relevant to business processes is mostly done manually by domain and legal experts, placing a tremendous burden on them, especially when the number of regulatory documents is large and they change frequently. Hence, this work examines how legal and domain experts can be assisted in the assessment of relevant requirements. To this end, we compare an embedding-based NLP ranking method, a generative AI method using GPT-4, and a crowdsourced method with the purely manual method of creating relevancy labels by experts. The proposed methods are evaluated on two case studies: an Australian insurance case created with domain experts, and a global banking use case adapted from SAP Signavio's workflow example of an international guideline. A gold standard is created for both BPMN 2.0 processes and matched to real-world textual requirements from multiple regulatory documents. The evaluation and discussion provide insights into the strengths and weaknesses of each method regarding applicability, automation, transparency, and reproducibility, and provide guidelines on which method combinations will maximize benefits for given characteristics such as process usage, impact, and dynamics of an application scenario.
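A minimal sketch of embedding-based relevance ranking of this kind (not the paper's exact pipeline; the example texts are invented, and TF-IDF stands in here for the sentence embeddings the method would use): each regulatory passage is scored against a process-step description by vector similarity, and the top-ranked passages become candidate requirements for expert review.

```python
# Sketch: rank regulatory passages by similarity to a process-step description.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

process_step = "Verify the customer's identity before opening an account"
passages = [
    "The institution must establish the identity of each customer.",
    "Records shall be retained for a period of seven years.",
    "Suspicious transactions must be reported to the regulator.",
]

vec = TfidfVectorizer().fit(passages + [process_step])
sims = cosine_similarity(vec.transform([process_step]),
                         vec.transform(passages))[0]

# Highest-scoring passages are surfaced first for the experts to confirm.
for score, text in sorted(zip(sims, passages), reverse=True):
    print(f"{score:.2f}  {text}")
```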
Data Bias Management
Demartini, Gianluca, Roitero, Kevin, Mizzaro, Stefano
Due to the widespread use of data-powered systems in our everyday lives, concepts like bias and fairness have gained significant attention among researchers and practitioners, in both industry and academia. Such issues typically emerge from the data used to train supervised machine learning systems, which comes with varying levels of quality. With the commercialization and deployment of such systems, which are sometimes delegated to make life-changing decisions, significant efforts are being made towards the identification and removal of possible sources of data bias that may resurface to the final end user or in the decisions being made. In this paper, we present research results that show how bias in data affects end users and where bias originates, and we provide a viewpoint on what we should do about it. We argue that data bias is not something that should necessarily be removed in all cases, and that research attention should instead shift from bias removal towards the identification, measurement, indexing, surfacing, and adapting for bias, which we name bias management.
On the Impact of Data Quality on Image Classification Fairness
Barry, Aki, Han, Lei, Demartini, Gianluca
With the proliferation of algorithmic decision-making, increased scrutiny has been placed on these systems. This paper explores the relationship between the quality of the training data and the overall fairness of the models trained with such data in the context of supervised classification. We measure key fairness metrics across a range of algorithms over multiple image classification datasets. Answering these questions will help guide decision-making on both the data and model selection when taking fairness into account. The contributions that this paper makes are: (i) to provide experimental results over different metrics of fairness across different models and datasets; (ii) to answer questions related to the impact of data quality on fairness (e.g., does label accuracy increase fairness?); and (iii) to provide a starting point and datasets for future research into the impact of data quality on supervised classification fairness.
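To make the kind of metric measured here concrete, a minimal sketch (with invented toy data, not the paper's datasets or code) of two common group fairness gaps:

```python
# Sketch: demographic parity and accuracy gaps across two groups.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

def rate(mask, values):
    return values[mask].mean()

# Demographic parity difference: gap in positive prediction rates.
dp_gap = abs(rate(group == "a", y_pred) - rate(group == "b", y_pred))

# Accuracy gap: gap in prediction accuracy across groups.
acc = (y_true == y_pred)
acc_gap = abs(rate(group == "a", acc) - rate(group == "b", acc))

print(f"demographic parity gap: {dp_gap:.2f}, accuracy gap: {acc_gap:.2f}")
```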
Scaling-Up the Crowd: Micro-Task Pricing Schemes for Worker Retention and Latency Improvement
Difallah, Djellel Eddine (University of Fribourg) | Catasta, Michele (EPFL) | Demartini, Gianluca (University of Fribourg) | Cudré-Mauroux, Philippe (University of Fribourg)
Retaining workers on micro-task crowdsourcing platforms is essential to guarantee the timely completion of batches of Human Intelligence Tasks (HITs). Worker retention is also a necessary condition for the introduction of SLAs on crowdsourcing platforms. In this paper, we introduce novel pricing schemes aimed at improving the retention rate of workers on long batches of similar tasks. We show how increasing or decreasing the monetary reward over time influences the number of tasks a worker is willing to complete in a batch, as well as the overall latency. We compare our new pricing schemes against traditional pricing methods (e.g., a constant reward for all the HITs in a batch) and empirically show how certain schemes effectively function as an incentive for workers to keep working longer on a given batch of HITs. Our experimental results show that the best pricing scheme in terms of worker retention is based on punctual bonuses paid whenever workers reach predefined milestones.
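As a toy illustration of the contrast between schemes (the reward amounts, bonus size, and milestone spacing below are invented for the example, not the paper's experimental values):

```python
# Sketch: total payout under a constant-reward scheme vs. a milestone-bonus scheme.
def constant_reward(n_tasks, reward=0.05):
    # traditional scheme: the same reward for every HIT in the batch
    return n_tasks * reward

def milestone_bonus(n_tasks, reward=0.04, bonus=0.50, every=50):
    # lower base reward, plus a punctual bonus at each predefined milestone
    return n_tasks * reward + (n_tasks // every) * bonus

for n in (40, 100, 200):
    print(f"{n} tasks: constant ${constant_reward(n):.2f}, "
          f"milestone ${milestone_bonus(n):.2f}")
```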
Analyzing Political Trends in the Blogosphere
Demartini, Gianluca (L3S Research Center) | Siersdorfer, Stefan (L3S Research Center) | Chelaru, Sergiu (L3S Research Center) | Nejdl, Wolfgang (L3S Research Center)
In recent years, the blogosphere has become a vital part of the web, covering a variety of points of view and opinions on political and event-related topics such as immigration, election campaigns, or economic developments. Tracking public opinion is usually done by conducting surveys, which incur significant costs both for interviewers and respondents. In this paper, we propose a method for extracting political trends in the blogosphere. To this end, we apply sentiment and time series analysis techniques, in combination with aggregation methods, on blog data to estimate the temporal development of opinions on politicians.
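A minimal sketch of the aggregation step in this spirit (toy data and a simple moving average; the sentiment scores are assumed to come from an upstream sentiment classifier, and none of this is the paper's implementation): per-post sentiment scores are aggregated into a daily time series per politician and then smoothed to expose the trend.

```python
# Sketch: aggregate per-post sentiment into smoothed daily trends per politician.
import pandas as pd

posts = pd.DataFrame({
    "date": pd.to_datetime(["2010-01-01", "2010-01-01", "2010-01-02",
                            "2010-01-02", "2010-01-03"]),
    "politician": ["X", "Y", "X", "X", "Y"],
    "sentiment": [0.8, -0.2, 0.1, 0.4, -0.5],  # assumed classifier outputs
})

daily = (posts.groupby(["politician", "date"])["sentiment"]
              .mean()                   # aggregate opinions per day
              .unstack("politician"))   # one column per politician
trend = daily.rolling(window=2, min_periods=1).mean()  # simple smoothing
print(trend)
```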