AITopics

2211.06502

Country:

Europe > Switzerland > Vaud > Lausanne (0.06)
North America > United States > New York (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.84)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

arXiv.org Artificial Intelligence, Nov-11-2022

Identifying, measuring, and mitigating individual unfairness for supervised learning models and application to credit risk models

Shahsavarifar, Rasoul, Chandran, Jithu, Inchiosa, Mario, Deshpande, Amit, Schlener, Mario, Gossain, Vishal, Elias, Yara, Murali, Vinaya

In the past few years, Artificial Intelligence (AI) has garnered attention from various industries including financial services (FS). AI has made a positive impact in financial services by enhancing productivity and improving risk management. While AI can offer efficient solutions, it has the potential to bring unintended consequences. One such consequence is the pronounced effect of AI-related unfairness and attendant fairness-related harms. These fairness-related harms could involve differential treatment of individuals; for example, unfairly denying a loan to certain individuals or groups of individuals. In this paper, we focus on identifying and mitigating individual unfairness and leveraging some of the recently published techniques in this domain, especially as applicable to the credit adjudication use case. We also investigate the extent to which techniques for achieving individual fairness are effective at achieving group fairness. Our main contribution in this work is functionalizing a two-step training process which involves learning a fair similarity metric from a group sense using a small portion of the raw data and training an individually "fair" classifier using the rest of the data where the sensitive features are excluded. The key characteristic of this two-step technique is its flexibility, i.e., the fair metric obtained in the first step can be used with any other individual fairness algorithm in the second step. Furthermore, we developed a second metric (distinct from the fair similarity metric) to determine how fairly a model is treating similar individuals. We use this metric to compare a "fair" model against its baseline model in terms of their individual fairness value. Finally, some experimental results corresponding to the individual unfairness mitigation techniques are presented.
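The abstract does not define its second metric, so the following is only an illustrative sketch of one common way to score how similarly a model treats similar individuals: a k-nearest-neighbour consistency score. It assumes plain Euclidean distance over non-sensitive features rather than the learned fair similarity metric, and the function name `consistency`, the parameter `k`, and the toy data are all hypothetical. It is not the authors' metric.

```python
import math

def consistency(features, predictions, k=2):
    """1 minus the mean gap between each prediction and the mean
    prediction of its k nearest neighbours (Euclidean distance assumed)."""
    n = len(features)
    total_gap = 0.0
    for i in range(n):
        # Indices of the k closest other individuals.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )[:k]
        neighbour_mean = sum(predictions[j] for j in neighbours) / k
        total_gap += abs(predictions[i] - neighbour_mean)
    return 1.0 - total_gap / n

# Toy data: two tight clusters of applicants (sensitive features excluded).
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
same_within_cluster = [1, 1, 1, 0, 0, 0]   # similar people, same decision
mixed_within_cluster = [1, 0, 1, 0, 1, 0]  # similar people, different decisions

score_fair = consistency(features, same_within_cluster)
score_unfair = consistency(features, mixed_within_cluster)
print(score_fair, score_unfair)
```

A score near 1 means neighbours receive matching predictions; comparing the score of a "fair" model with that of its baseline mirrors the comparison the abstract describes.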

artificial intelligence, inductive learning, machine learning, (18 more...)

2211.06106

Genre: Research Report (0.64)

Industry:

Banking & Finance > Credit (0.64)
Information Technology > Security & Privacy (0.52)
Banking & Finance > Risk Management (0.50)
Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

#artificialintelligence, Nov-10-2022, 01:31:01 GMT

Machine Learning Algorithms: Supervised Learning Tip to Tail

This course takes you through the fundamentals of a machine learning project. Learners will understand and implement supervised learning techniques on real case studies to analyze business case scenarios where decision trees, k-nearest neighbours and support vector machines are optimally used. Learners will also gain skills to contrast the practical consequences of different data preparation steps and describe common production issues in applied ML. To be successful, you should have at least a beginner-level background in Python programming (e.g., be able to read and trace existing code, and be comfortable with conditionals, loops, variables, lists, dictionaries and arrays). You should have a basic understanding of linear algebra (vector notation) and statistics (probability distributions and mean/median/mode).

learner, machine learning algorithm, supervised learning tip

#artificialintelligence

Country: North America > Canada > Alberta (0.11)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.47)
Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)

arXiv.org Artificial Intelligence, Nov-10-2022

CCPrompt: Counterfactual Contrastive Prompt-Tuning for Many-Class Classification

Li, Yang, Xu, Canran, Shen, Tao, Jiang, Jing, Long, Guodong

With the success of the prompt-tuning paradigm in Natural Language Processing (NLP), various prompt templates have been proposed to further stimulate specific knowledge for serving downstream tasks, e.g., machine translation, text generation, relation extraction, and so on. Existing prompt templates are mainly shared among all training samples with the information of task description. However, training samples are quite diverse. The shared task description is unable to stimulate the unique task-related information in each training sample, especially for tasks with a finite label space. To exploit the unique task-related information, we imitate the human decision process which aims to find the contrastive attributes between the objective factual and their potential counterfactuals. Thus, we propose the Counterfactual Contrastive Prompt-Tuning (CCPrompt) approach for many-class classification, e.g., relation classification, topic classification, and entity typing. Compared with simple classification tasks, these tasks have more complex finite label spaces and are more rigorous for prompts. First of all, we prune the finite label space to construct fact-counterfactual pairs. Then, we exploit the contrastive attributes by projecting training instances onto every fact-counterfactual pair. We further set up global prototypes corresponding to all contrastive attributes for selecting valid contrastive attributes as additional tokens in the prompt template. Finally, a simple Siamese representation learning is employed to enhance the robustness of the model. We conduct experiments on relation classification, topic classification, and entity typing tasks in both fully supervised and few-shot settings. The results indicate that our model outperforms former baselines.

computational linguistic, machine learning, natural language, (16 more...)

2211.05987

Country:

North America > United States > Texas > Harris County > Houston (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
(17 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

arXiv.org Artificial Intelligence, Nov-10-2022

The CRINGE Loss: Learning what language not to model

Adolphs, Leonard, Gao, Tianyu, Xu, Jing, Shuster, Kurt, Sukhbaatar, Sainbayar, Weston, Jason

Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples. Growing evidence shows that even with very large amounts of positive training data, issues remain that can be alleviated with relatively small amounts of negative data -- examples of what the model should not do. In this work, we propose a novel procedure to train with such data called the CRINGE loss (ContRastive Iterative Negative GEneration). We show the effectiveness of this approach across three different experiments on the tasks of safe generation, contradiction avoidance, and open-domain dialogue. Our models outperform multiple strong baselines and are conceptually simple, easy to train and implement.

artificial intelligence, machine learning, natural language, (18 more...)

2211.05826

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Philippines (0.04)
Asia > China > Hong Kong (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.30)

Sehanobish, Arijit, Kannan, Kawshik, Abraham, Nabila, Das, Anasuya, Odry, Benjamin

Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks

arXiv.org Artificial Intelligence, Nov-10-2022

Large pretrained Transformer-based language models like BERT and GPT have changed the landscape of Natural Language Processing (NLP). However, fine-tuning such models still requires a large number of training examples for each target task, so annotating multiple datasets and training these models on various downstream tasks becomes time-consuming and expensive. In this work, we propose a simple extension of the Prototypical Networks for few-shot text classification. Our main idea is to replace the class prototypes with Gaussians and introduce a regularization term that encourages the examples to be clustered near the appropriate class centroids. Experimental results show that our method outperforms various strong baselines on 13 public and 4 internal datasets. Furthermore, we use the class distributions as a tool for detecting potential out-of-distribution (OOD) data points during deployment.
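To make the idea of Gaussian class prototypes concrete, here is a minimal sketch that fits a diagonal Gaussian per class from a few support embeddings, classifies a query by highest log-likelihood, and flags a query as possibly out-of-distribution when even the best log-likelihood is low. The diagonal covariance, the variance floor, the threshold of -50, and all names and toy embeddings are assumptions made for illustration; the abstract does not specify the paper's formulation or its regularization term, which is omitted here.

```python
import math

def fit_gaussian_prototypes(support, var_floor=1e-3):
    """support maps a label to a list of embedding vectors.
    Returns label -> (mean, variance), one variance per dimension."""
    prototypes = {}
    for label, vectors in support.items():
        n, dim = len(vectors), len(vectors[0])
        mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
        # Floor the variance so tiny support sets stay numerically stable.
        var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, var_floor)
               for d in range(dim)]
        prototypes[label] = (mean, var)
    return prototypes

def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
               for xi, m, s in zip(x, mean, var))

def classify(x, prototypes):
    """Return the most likely label and its log-likelihood."""
    scores = {label: log_likelihood(x, m, s)
              for label, (m, s) in prototypes.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy 2-D embeddings for two hypothetical classes.
support = {
    "class_a": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "class_b": [[2.0, 2.1], [1.8, 1.9], [2.2, 2.0]],
}
prototypes = fit_gaussian_prototypes(support)

label_near_a, score_near_a = classify([0.1, 0.0], prototypes)
label_near_b, _ = classify([1.9, 2.0], prototypes)
_, score_far = classify([10.0, -10.0], prototypes)

OOD_THRESHOLD = -50.0  # illustrative value, not taken from the paper
print(label_near_a, label_near_b, score_far < OOD_THRESHOLD)
```

Because each class carries its own variance, a point far from every class mean receives a very low likelihood under all of them, which is what makes the class distributions usable as an OOD signal.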

large language model, machine learning, natural language, (22 more...)

2210.13979

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > Richmond County > New York City (0.04)
(9 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.64)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Debiased Self-Training for Semi-Supervised Learning

Chen, Baixu, Jiang, Junguang, Wang, Ximei, Wan, Pengfei, Wang, Jianmin, Long, Mingsheng

Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-intensive to obtain on realistic tasks. To mitigate the requirement for labeled data, self-training is widely used in semi-supervised learning by iteratively assigning pseudo labels to unlabeled samples. Despite its popularity, self-training is widely believed to be unreliable and often leads to training instability. Our experimental studies further reveal that the bias in semi-supervised learning arises from both the problem itself and the inappropriate training with potentially incorrect pseudo labels, which accumulates the error in the iterative self-training process. To reduce the above bias, we propose Debiased Self-Training (DST). First, the generation and utilization of pseudo labels are decoupled by two parameter-independent classifier heads to avoid direct error accumulation. Second, we estimate the worst case of self-training bias, where the pseudo labeling function is accurate on labeled samples, yet makes as many mistakes as possible on unlabeled samples. We then adversarially optimize the representations to improve the quality of pseudo labels by avoiding the worst case. Extensive experiments justify that DST achieves an average improvement of 6.3% against state-of-the-art methods on standard semi-supervised learning benchmark datasets and 18.9% against FixMatch on 13 diverse tasks. Furthermore, DST can be seamlessly adapted to other self-training methods and help stabilize their training and balance performance across classes in both cases of training from scratch and finetuning from pre-trained models.
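For readers unfamiliar with the baseline being debiased, the pseudo-labeling step of ordinary self-training can be sketched as follows. This is a generic confidence-threshold variant, not DST itself; the function name `assign_pseudo_labels`, the threshold of 0.95, and the toy probabilities are assumptions for illustration.

```python
def assign_pseudo_labels(probabilities, threshold=0.95):
    """Return (sample_index, label) pairs for unlabeled samples whose
    most probable class reaches the confidence threshold."""
    selected = []
    for index, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected

# Toy class probabilities predicted by the current model on unlabeled data.
unlabeled_probs = [
    [0.98, 0.01, 0.01],  # confident: receives pseudo label 0
    [0.40, 0.35, 0.25],  # uncertain: skipped in this round
    [0.02, 0.03, 0.95],  # confident: receives pseudo label 2
]
pseudo = assign_pseudo_labels(unlabeled_probs, threshold=0.95)
print(pseudo)
```

In plain self-training, the selected pairs are added to the training set and the model is retrained, so any wrong pseudo label feeds back into the next round; that feedback loop is the error accumulation the abstract refers to.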

artificial intelligence, machine learning, pseudo label, (17 more...)

2202.07136

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Accountable and Explainable Methods for Complex Reasoning over Text

Atanasova, Pepa

A major concern of Machine Learning (ML) models is their opacity. They are deployed in an increasing number of applications where they often operate as black boxes that do not provide explanations for their predictions. Among others, the potential harms associated with the lack of understanding of the models' rationales include privacy violations, adversarial manipulations, and unfair discrimination. As a result, the accountability and transparency of ML models have been posed as critical desiderata by works in policy and law, philosophy, and computer science. In computer science, the decision-making process of ML models has been studied by developing accountability and transparency methods. Accountability methods, such as adversarial attacks and diagnostic datasets, expose vulnerabilities of ML models that could lead to malicious manipulations or systematic faults in their predictions. Transparency methods explain the rationales behind models' predictions, gaining the trust of relevant stakeholders and potentially uncovering mistakes and unfairness in models' decisions. To this end, transparency methods have to meet accountability requirements as well, e.g., being robust and faithful to the underlying rationales of a model. This thesis presents my research that expands our collective knowledge in the areas of accountability and transparency of ML models developed for complex reasoning tasks over text.

information retrieval, large language model, machine learning, (25 more...)

2211.04946

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maryland > Baltimore (0.13)
Asia > China > Hong Kong (0.04)
(46 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(7 more...)

Lingg, Nico, Sarabia, Miguel, Zappella, Luca, Theobald, Barry-John

Contrastive Self-Supervised Learning for Skeleton Representations

Human skeleton point clouds are commonly used to automatically classify and predict the behaviour of others. In this paper, we use a contrastive self-supervised learning method, SimCLR, to learn representations that capture the semantics of skeleton point clouds. This work focuses on systematically evaluating the effects that different algorithmic decisions (including augmentations, dataset partitioning and backbone architecture) have on the learned skeleton representations. To pre-train the representations, we normalise six existing datasets to obtain more than 40 million skeleton frames. We evaluate the quality of the learned representations with three downstream tasks: skeleton reconstruction, motion prediction, and activity classification. Our results demonstrate the importance of 1) combining spatial and temporal augmentations, 2) including additional datasets for encoder training, and 3) using a graph neural network as an encoder.
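SimCLR trains an encoder with the normalized temperature-scaled cross-entropy (NT-Xent) loss, which pulls together the embeddings of two augmented views of the same sample and pushes apart all other pairs in the batch. The sketch below computes that loss on tiny hand-written embeddings; the temperature of 0.5, the function names, and the toy vectors are assumptions for illustration, and the encoder and skeleton augmentations from the paper are not modelled.

```python
import math

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nt_xent(view_a, view_b, temperature=0.5):
    """view_a[i] and view_b[i] embed two augmentations of sample i."""
    z = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of sample i
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        # Numerically stable log-sum-exp over all non-self pairs.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - cosine(z[i], z[positive]) / temperature
    return total / (2 * n)

# Views that agree per sample versus views paired with the wrong sample.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = nt_xent(view_a, aligned_b)
loss_shuffled = nt_xent(view_a, shuffled_b)
print(loss_aligned < loss_shuffled)
```

The loss is lower when the two views of each sample embed close together, which is why the choice of augmentations that define a "view" matters so much for the learned representation.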

artificial intelligence, machine learning, representation, (15 more...)

2211.05304

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Hasanain, Maram, Elsayed, Tamer

Cross-lingual Transfer Learning for Check-worthy Claim Identification over Twitter

Misinformation spread over social media has become an undeniable infodemic. However, not all spreading claims are made equal. If propagated, some claims can be destructive, not only on the individual level, but to organizations and even countries. Detecting claims that should be prioritized for fact-checking is considered the first step in fighting the spread of fake news. With training data limited to a handful of languages, developing supervised models to tackle the problem over lower-resource languages is currently infeasible. Therefore, our work aims to investigate whether we can use existing datasets to train models for predicting worthiness of verification of claims in tweets in other languages. We present a systematic comparative study of six approaches for cross-lingual check-worthiness estimation across pairs of five diverse languages with the help of the Multilingual BERT (mBERT) model. We run our experiments using a state-of-the-art multilingual Twitter dataset. Our results show that for some language pairs, zero-shot cross-lingual transfer is possible and can perform as well as monolingual models that are trained on the target language. We also show that in some languages, this approach outperforms (or at least is comparable to) state-of-the-art models.

large language model, machine learning, natural language, (20 more...)

2211.05087

Country:

North America > United States (0.67)
Europe > Spain > Galicia > Madrid (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)