AITopics

The ubiquity of Machine Learning (ML) models, and more specifically deep neural network (NN) models, in all sorts of applications has become undeniable in recent years. From classifying images [1, 2, 3], detecting objects [4, 1] and performing semantic segmentation [5, 4] to translating from one human language to another [6] and doing sentiment analysis [7], the advances in different subfields of ML can be attributed mostly to the explosion of computing power and its ability to speed up the training process of artificial NNs. Most famously, AlexNet [8] allowed for an impressive jump in performance on the challenging ILSVRC2012 image classification dataset [1], also known as ImageNet, permanently cementing deep convolutional NN (CNN) architectures in the field of computer vision. Since then, architectures have become more refined [9, 10], training procedures have become increasingly complex [11], and performance and robustness have greatly improved as a consequence. In particular, the success of these deep CNN models is related to their ability to handle high-dimensional and complex data such as images or natural language. The impressive performance of NNs on machine learning tasks can be explained by the ability of their flexible architecture to capture meaningful information from various kinds of complex data and by the fact that they may be composed of millions of parameters. However, this poses a major challenge: deciphering the reasoning behind the model's predictions. For instance, typical NN architectures for classification or regression problems incrementally transform the representation of the input data in the so-called latent space (or feature space) and then use this transformed representation to make their predictions, as summarized in Figure 1. Each step of this incremental data processing pipeline (or feature extraction chain) is carried out by a so-called layer, which is mathematically a non-linear function (blue rectangle in Figure 1).

algorithmic bias, artificial intelligence, machine learning, (14 more...)

2210.04491

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Law (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Government (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Kabir, Mohsinul, Ahmed, Tasnim, Hasan, Md. Bakhtiar, Laskar, Md Tahmid Rahman, Joarder, Tarun Kumar, Mahmud, Hasan, Hasan, Kamrul

DEPTWEET: A Typology for Social Media Texts to Detect Depression Severities

Mental health research through data-driven methods has been hindered by a lack of standard typology and scarcity of adequate data. In this study, we leverage the clinical articulation of depression to build a typology for social media texts for detecting the severity of depression. It emulates the standard clinical assessment procedures of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and the Patient Health Questionnaire (PHQ-9) to encompass subtle indications of depressive disorders from tweets. Along with the typology, we present a new dataset of 40191 tweets labeled by expert annotators. Each tweet is labeled as 'non-depressed' or 'depressed'. Moreover, three severity levels are considered for 'depressed' tweets: (1) mild, (2) moderate, and (3) severe. An associated confidence score is provided with each label to validate the quality of annotation. We examine the quality of the dataset by presenting summary statistics while setting strong baseline results using attention-based models like BERT and DistilBERT. Finally, we extensively address the limitations of the study to provide directions for further research.

annotator, data mining, machine learning, (23 more...)

doi: 10.1016/j.chb.2022.107503

2210.05372

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
North America > United States > New York > New York County > New York City (0.04)
(16 more...)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Basterrech, Sebastián, Woźniak, Michal

Tracking changes using Kullback-Leibler divergence for the continual learning

Recently, continual learning has received a lot of attention. One of the significant problems is the occurrence of concept drift, which consists of changing probabilistic characteristics of the incoming data. In the case of the classification task, this phenomenon destabilizes the model's performance and negatively affects the achieved prediction quality. Most current methods apply statistical learning and similarity analysis over the raw data. However, similarity analysis in streaming data remains a complex problem due to time limitations, imprecise values, the need for fast decisions, scalability, etc. This article introduces a novel method for monitoring changes in the probabilistic distribution of multi-dimensional data streams. As a measure of the rapidity of changes, we analyze the popular Kullback-Leibler divergence. During the experimental study, we show how to use this metric to predict the concept drift occurrence and understand its nature. The obtained results encourage further work on the proposed method and its application in real tasks where the prediction of the future appearance of concept drift plays a crucial role, such as predictive maintenance.
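To make the idea of tracking distribution change with the Kullback-Leibler divergence concrete, here is a minimal sketch. It is not the authors' method: the one-dimensional stream, the fixed-width histogram, the add-one smoothing, the window size, and the helper names (histogram, kl_divergence, drift_scores) are all assumptions chosen for illustration.

```python
import math
import random


def histogram(values, edges):
    """Bin values into len(edges) - 1 bins and return smoothed probabilities."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    # Add-one smoothing (an assumption) keeps every bin probability positive,
    # so the logarithm in the KL divergence is always defined.
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def drift_scores(stream, window, edges):
    """Compare each later window against the first (reference) window."""
    reference = histogram(stream[:window], edges)
    scores = []
    for start in range(window, len(stream) - window + 1, window):
        current = histogram(stream[start:start + window], edges)
        scores.append(kl_divergence(current, reference))
    return scores


# Synthetic stream (hypothetical data): the mean shifts from 0 to 3 halfway.
rng = random.Random(0)
stream = [rng.gauss(0.0, 1.0) for _ in range(400)]
stream += [rng.gauss(3.0, 1.0) for _ in range(400)]
edges = [-6 + i for i in range(15)]  # unit-width bins covering [-6, 8]
scores = drift_scores(stream, 200, edges)
print(scores)
```

The first score compares two windows drawn from the same distribution and should stay small, while the later scores compare post-shift windows against the reference and should be much larger. A drift detector would flag a window whose score exceeds a chosen threshold; the threshold itself would have to be tuned per application.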

artificial intelligence, bayesian inference, machine learning, (18 more...)

2210.04865

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Czechia > Moravian-Silesian Region > Ostrava (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea

Yoo, Haneul, Jin, Jiho, Son, Juhee, Bak, JinYeong, Cho, Kyunghyun, Oh, Alice

Historical records in Korea before the 20th century were primarily written in Hanja, an extinct language based on Chinese characters and not understood by modern Korean or Chinese speakers. Historians with expertise in this time period have been analyzing the documents, but that process is very difficult and time-consuming, and language models would significantly speed up the process. Toward building and evaluating language models for Hanja, we release the Hanja Understanding Evaluation dataset consisting of chronological attribution, topic classification, named entity recognition, and summary retrieval tasks. We also present BERT-based models whose training was continued on the two major corpora from the 14th to the 19th centuries: the Annals of the Joseon Dynasty and Diaries of the Royal Secretariats. We compare the models with several baselines on all tasks and show there are significant improvements gained by training on the two corpora. Additionally, we run zero-shot experiments on the Daily Records of the Royal Court and Important Officials (DRRI). The DRRI dataset has not been studied much by historians, and not at all by the NLP community.

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.18653/v1/2022.findings-naacl.140

2210.05112

Country:

Asia > China (0.04)
Asia > South Korea (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)

Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm

Shao, Huiyang, Xu, Qianqian, Yang, Zhiyong, Bao, Shilong, Huang, Qingming

The Partial Area Under the ROC Curve (PAUC), typically including One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC), measures the average performance of a binary classifier within a specific false positive rate and/or true positive rate interval, which is a widely adopted measure when decision constraints must be considered. Consequently, PAUC optimization has naturally attracted increasing attention in the machine learning community within the last few years. Nonetheless, most of the existing methods can only optimize PAUC approximately, leading to inevitable biases that are not controllable. Fortunately, a recent work presents an unbiased formulation of the PAUC optimization problem via distributional robust optimization. However, it is based on the pair-wise formulation of AUC, which suffers from limited scalability w.r.t. sample size and a slow convergence rate, especially for TPAUC. To address this issue, we present a simpler reformulation of the problem in an asymptotically unbiased and instance-wise manner. For both OPAUC and TPAUC, we arrive at a nonconvex strongly concave minimax regularized problem of instance-wise functions. On top of this, we employ an efficient solver that enjoys a linear per-iteration computational complexity w.r.t. the sample size and a time complexity of $O(\epsilon^{-1/3})$ to reach an $\epsilon$-stationary point. Furthermore, we find that the minimax reformulation also facilitates the theoretical analysis of generalization error as a byproduct. Compared with the existing results, we present new error bounds that are much easier to prove and can deal with hypotheses with real-valued outputs. Finally, extensive experiments on several benchmark datasets demonstrate the effectiveness of our method.
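As an illustration of the quantity being optimized, the sketch below computes a normalized empirical OPAUC over the false positive rate range [0, beta] using the pair-wise formulation that the abstract describes as hard to scale. It is not the paper's instance-wise algorithm; the function name one_way_partial_auc, the tie handling, and the toy scores are assumptions.

```python
import math


def one_way_partial_auc(pos_scores, neg_scores, beta):
    """Empirical OPAUC on FPR in [0, beta], normalized to lie in [0, 1].

    Only the k = floor(beta * n_neg) highest-scoring negatives matter,
    because they are the ones that produce false positives at low FPR.
    """
    k = max(1, math.floor(beta * len(neg_scores)))
    hardest_negatives = sorted(neg_scores, reverse=True)[:k]
    correct = 0.0
    # Pair-wise loop: cost grows with n_pos * k, which is the scalability
    # issue that motivates instance-wise reformulations.
    for p in pos_scores:
        for n in hardest_negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5  # count ties as half (a common convention)
    return correct / (len(pos_scores) * k)


# Toy scores (hypothetical data).
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.5, 0.3, 0.2]
print(one_way_partial_auc(positives, negatives, beta=0.5))  # 4 of 6 pairs ranked correctly
print(one_way_partial_auc(positives, negatives, beta=1.0))  # full AUC: 10 of 12 pairs
```

With beta = 0.5 only the two hardest negatives (0.7 and 0.5) are compared against each positive, so the positive scored 0.4 contributes nothing, and the partial AUC is lower than the full AUC on the same data.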

artificial intelligence, machine learning, optimization, (15 more...)

2210.03967

Country:

Asia > China (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Normandy > Seine-Maritime > Rouen (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Using Whole Slide Image Representations from Self-Supervised Contrastive Learning for Melanoma Concordance Regression

Grullon, Sean, Spurrier, Vaughn, Zhao, Jiayi, Chivers, Corey, Jiang, Yang, Motaparthi, Kiran, Bonham, Michael, Ianni, Julianna

Although melanoma occurs more rarely than several other skin cancers, patients' long-term survival rate is extremely low if the diagnosis is missed. Diagnosis is complicated by a high discordance rate among pathologists when distinguishing between melanoma and benign melanocytic lesions. A tool that provides potential concordance information to healthcare providers could help inform diagnostic, prognostic, and therapeutic decision-making for challenging melanoma cases. We present a melanoma concordance regression deep learning model capable of predicting the concordance rate of invasive melanoma or melanoma in-situ from digitized Whole Slide Images (WSIs). The salient features corresponding to melanoma concordance were learned in a self-supervised manner with the contrastive learning method, SimCLR. We trained a SimCLR feature extractor with 83,356 WSI tiles randomly sampled from 10,895 specimens originating from four distinct pathology labs. We trained a separate melanoma concordance regression model on 990 specimens with available concordance ground truth annotations from three pathology labs and tested the model on 211 specimens. We achieved a Root Mean Squared Error (RMSE) of 0.28 +/- 0.01 on the test set. We also investigated the performance of using the predicted concordance rate as a malignancy classifier, and achieved a precision and recall of 0.85 +/- 0.05 and 0.61 +/- 0.06, respectively, on the test set. These results are an important first step for building an artificial intelligence (AI) system capable of predicting the results of consulting a panel of experts and delivering a score based on the degree to which the experts would agree on a particular diagnosis. Such a system could be used to suggest additional testing or other action such as ordering additional stains or genetic tests.

artificial intelligence, machine learning, specimen, (15 more...)

2210.04803

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Western Europe (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FOX News, Oct-9-2022, 14:14:56 GMT

Bray Wyatt makes shocking return at WWE's Extreme Rules PPV

Weeks of teases and vignettes featuring a white rabbit and cryptic messages paid off Saturday night at WWE's Extreme Rules pay-per-view at the Wells Fargo Center in Philadelphia. After Riddle defeated Seth Rollins in the fight pit, WWE announcers Michael Cole and Corey Graves were about to sign off the broadcast when the screen went black and shady characters began to appear in the crowd. "He's got the whole world in his hands," blared over the speakers and characters from Bray Wyatt's Firefly Fun House showed up in the crowd.

bray wyatt make shocking return, wwe, wyatt, (6 more...)

FOX News

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.17)
North America > United States > Nevada > Clark County > Paradise (0.06)

Industry:

Leisure & Entertainment (0.53)
Media > News (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Rosa, Nicholas, Drummond, Tom, Harandi, Mehrtash

A Differentiable Distance Approximation for Fairer Image Classification

arXiv.org Artificial Intelligence, Oct-9-2022

Naïvely trained AI models can be heavily biased. This can be particularly problematic when the biases involve legally or morally protected attributes such as ethnic background, age or gender. Existing solutions to this problem come at the cost of extra computation, unstable adversarial optimisation or have losses on the feature space structure that are disconnected from fairness measures and only loosely generalise to fairness. In this work we propose a differentiable approximation of the variance of demographics, a metric that can be used to measure the bias, or unfairness, in an AI model. Our approximation can be optimised alongside the regular training objective which eliminates the need for any extra models during training and directly improves the fairness of the regularised models. We demonstrate that our approach improves the fairness of AI models in varied task and dataset scenarios, whilst still maintaining a high level of classification accuracy.

accuracy, artificial intelligence, machine learning, (15 more...)

2210.04369

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.47)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

arXiv.org Artificial Intelligence, Oct-9-2022

Improving Fake News Detection of Influential Domain via Domain- and Instance-Level Transfer

Nan, Qiong, Wang, Danding, Zhu, Yongchun, Sheng, Qiang, Shi, Yuhui, Cao, Juan, Li, Jintao

Both real and fake news in various domains, such as politics, health, and entertainment, are spread via online social media every day, necessitating fake news detection for multiple domains. Among them, fake news in specific domains like politics and health has more serious potential negative impacts on the real world (e.g., the infodemic led by COVID-19 misinformation). Previous studies focus on multi-domain fake news detection, by equally mining and modeling the correlation between domains. However, these multi-domain methods suffer from a seesaw problem: the performance of some domains is often improved at the cost of hurting the performance of other domains, which could lead to unsatisfactory performance in specific domains. To address this issue, we propose a Domain- and Instance-level Transfer Framework for Fake News Detection (DITFEND), which could improve the performance of specific target domains. To transfer coarse-grained domain-level knowledge, we train a general model with data of all domains from the meta-learning perspective. To transfer fine-grained instance-level knowledge and adapt the general model to a target domain, we train a language model on the target domain to evaluate the transferability of each data instance in source domains and re-weigh each instance's contribution. Offline experiments on two datasets demonstrate the effectiveness of DITFEND. Online experiments show that DITFEND brings additional improvements over the base models in a real-world scenario.

artificial intelligence, machine learning, natural language, (15 more...)

2209.08902

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.88)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Seshadri, Preethi, Pezeshkpour, Pouya, Singh, Sameer

Quantifying Social Biases Using Templates is Unreliable

arXiv.org Artificial Intelligence, Oct-9-2022

Recently, there has been an increase in efforts to understand how large language models (LLMs) propagate and amplify social biases. Several works have utilized templates for fairness evaluation, which allow researchers to quantify social biases in the absence of test sets with protected attribute labels. While template evaluation can be a convenient and helpful diagnostic tool to understand model deficiencies, it often uses a simplistic and limited set of templates. In this paper, we study whether bias measurements are sensitive to the choice of templates used for benchmarking. Specifically, we investigate the instability of bias measurements by manually modifying templates proposed in previous works in a semantically-preserving manner and measuring bias across these modifications. We find that bias values and resulting conclusions vary considerably across template modifications on four tasks, ranging from an 81% reduction (NLI) to a 162% increase (MLM) in (task-specific) bias measurements. Our results indicate that quantifying fairness in LLMs, as done in current practice, can be brittle and needs to be approached with more care and caution.

large language model, machine learning, natural language, (17 more...)

2210.04337

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(6 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)