AITopics

Existing studies often lack Anomaly detection is a critical task in machine systematic evaluations of how different embeddings learning, with applications ranging from fraud detection perform across diverse anomaly types, raising and content moderation to user behavior questions about their generalization capabilities analysis (Pang et al., 2021). Within natural language in complex, real-world scenarios such as multilingual processing (NLP), anomaly detection has become settings or domain-specific anomalies. Recent increasingly relevant for identifying outliers efforts, such as AD-NLP (Bejan et al., 2023) such as harmful content, phishing attempts, and and NLP-ADBench (Li et al., 2024), have significantly spam reviews. However, while AD tasks in structured advanced anomaly detection in NLP. ADdata (e.g., tabular, time series, graphs) (Steinbuss NLP provides valuable insights into different types and Böhm, 2021; Blázquez-García et al., 2021; of anomalies, while NLP-ADBench expands evaluations Qiao et al., 2024) have achieved significant maturity, to a wide range of algorithms and datasets.

data mining, detection, machine learning, (21 more...)

2501.1196

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Hosen, MD Mehraz, Islam, Md. Hasibul

Aggrotech: Leveraging Deep Learning for Sustainable Tomato Disease Management

Tomato crop health plays a critical role in ensuring agricultural productivity and food security. Timely and accurate detection of diseases affecting tomato plants is vital for effective disease management. In this study, we propose a deep learning-based approach for Tomato Leaf Disease Detection using two well-established convolutional neural networks (CNNs), namely VGG19 and Inception v3. The experiment is conducted on the Tomato Villages Dataset, encompassing images of both healthy tomato leaves and leaves afflicted by various diseases. The VGG19 model is augmented with fully connected layers, while the Inception v3 model is modified to incorporate a global average pooling layer and a dense classification layer. Both models are trained on the prepared dataset, and their performances are evaluated on a separate test set. This research employs VGG19 and Inception v3 models on the Tomato Villages dataset (4525 images) for tomato leaf disease detection. The models' accuracy of 93.93% with dropout layers demonstrates their usefulness for crop health monitoring. The paper suggests a deep learning-based strategy that includes normalization, resizing, dataset preparation, and unique model architectures. During training, VGG19 and Inception v3 serve as feature extractors, with possible data augmentation and fine-tuning. Metrics like accuracy, precision, recall, and F1 score are obtained through evaluation on a test set and offer important insights into the strengths and shortcomings of the model. The method has the potential for practical use in precision agriculture and could help tomato crops prevent illness early on.

artificial intelligence, deep learning, machine learning, (16 more...)

2501.12052

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Asia > Vietnam (0.04)
Asia > India > Rajasthan > Jaipur (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Spratling, Michael W., Schütt, Heiko H.

A margin-based replacement for cross-entropy loss

Cross-entropy (CE) loss is the de-facto standard for training deep neural networks to perform classification. However, CE-trained deep neural networks struggle with robustness and generalisation issues. To alleviate these issues, we propose high error margin (HEM) loss, a variant of multi-class margin loss that overcomes the training issues of other margin-based losses. We evaluate HEM extensively on a range of architectures and datasets. We find that HEM loss is more effective than cross-entropy loss across a wide range of tasks: unknown class rejection, adversarial robustness, learning with imbalanced data, continual learning, and semantic segmentation (a pixel-level classification task). Despite all training hyper-parameters being chosen for CE loss, HEM is inferior to CE only in terms of clean accuracy and this difference is insignificant. We also compare HEM to specialised losses that have previously been proposed to improve performance on specific tasks. LogitNorm, a loss achieving state-of-the-art performance on unknown class rejection, produces similar performance to HEM for this task, but is much poorer for continual learning and semantic segmentation. Logit-adjusted loss, designed for imbalanced data, has superior results to HEM for that task, but performs more poorly on unknown class rejection and semantic segmentation. DICE, a popular loss for semantic segmentation, is inferior to HEM loss on all tasks, including semantic segmentation. Thus, HEM often out-performs specialised losses, and in contrast to them, is a general-purpose replacement for CE loss.

artificial intelligence, machine learning, proceedings, (14 more...)

2501.12191

Country:

Europe > Luxembourg (0.14)
North America > United States (0.14)
Oceania > New Zealand (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Ito, Takumi, van Deemter, Kees, Suzuki, Jun

Reference-free Evaluation Metrics for Text Generation: A Survey

A number of automatic evaluation metrics have been proposed for natural language generation systems. The most common approach to automatic evaluation is the use of a reference-based metric that compares the model's output with gold-standard references written by humans. However, it is expensive to create such references, and for some tasks, such as response generation in dialogue, creating references is not a simple matter. Therefore, various reference-free metrics have been developed in recent years. In this survey, which intends to cover the full breadth of all NLG tasks, we investigate the most commonly used approaches, their application, and their other uses beyond evaluating models. The survey concludes by highlighting some promising directions for future research.

computational linguistic, machine learning, natural language, (15 more...)

2501.12011

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
(27 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Implementation of an Asymmetric Adjusted Activation Function for Class Imbalance Credit Scoring

Li, Xia, Zheng, Hanghang, Tao, Kunpeng, Mao, Mao

Credit scoring is a systematic approach to evaluate a borrower's probability of default (PD) on a bank loan. The data associated with such scenarios are characteristically imbalanced, complicating binary classification owing to the often-underestimated cost of misclassification during the classifier's learning process. Considering the high imbalance ratio (IR) of these datasets, we introduce an innovative yet straightforward optimized activation function by incorporating an IR-dependent asymmetric adjusted factor embedded Sigmoid activation function (ASIG). The embedding of ASIG makes the sensitive margin of the Sigmoid function auto-adjustable, depending on the imbalance nature of the datasets distributed, thereby giving the activation function an asymmetric characteristic that prevents the underrepresentation of the minority class (positive samples) during the classifier's learning process. The experimental results show that the ASIG-embedded-classifier outperforms traditional classifiers on datasets across wide-ranging IRs in the downstream credit-scoring task. The algorithm also shows robustness and stability, even when the IR is ultra-high. Therefore, the algorithm provides a competitive alternative in the financial industry, especially in credit scoring, possessing the ability to effectively process highly imbalanced distribution data.

artificial intelligence, deep learning, machine learning, (19 more...)

2501.12285

Country:

Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > Queensland (0.04)
North America > United States > Florida > Hillsborough County > Tampa (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Banking & Finance > Loans (1.00)
Banking & Finance > Credit (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

CroMe: Multimodal Fake News Detection using Cross-Modal Tri-Transformer and Metric Learning

Choi, Eunjee, Ahn, Junhyun, Piao, XinYu, Kim, Jong-Kook

Multimodal Fake News Detection has received increasing attention recently. Existing methods rely on independently encoded unimodal data and overlook the advantages of capturing intra-modality relationships and integrating inter-modal similarities using advanced techniques. To address these issues, Cross-Modal Tri-Transformer and Metric Learning for Multimodal Fake News Detection (CroMe) is proposed. CroMe utilizes Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (BLIP2) as encoders to capture detailed text, image and combined image-text representations. The metric learning module employs a proxy anchor method to capture intra-modality relationships while the feature fusion module uses a Cross-Modal and Tri-Transformer for effective integration. The final fake news detector processes the fused features through a classifier to predict the authenticity of the content. Experiments on datasets show that CroMe excels in multimodal fake news detection.

detection, machine learning, natural language, (18 more...)

2501.12422

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > Greece (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Towards autonomous photogrammetric forest inventory using a lightweight under-canopy robotic drone

Karjalainen, Väinö, Koivumäki, Niko, Hakala, Teemu, Muhojoki, Jesse, Hyyppä, Eric, George, Anand, Suomalainen, Juha, Honkavaara, Eija

Drones are increasingly used in forestry to capture high-resolution remote sensing data. While operations above the forest canopy are already highly automated, flying inside forests remains challenging, primarily relying on manual piloting. Inside dense forests, reliance on the Global Navigation Satellite System (GNSS) for localization is not feasible. Additionally, the drone must autonomously adjust its flight path to avoid collisions. Recently, advancements in robotics have enabled autonomous drone flights in GNSS-denied obstacle-rich areas. In this article, a step towards autonomous forest data collection is taken by building a prototype of a robotic under-canopy drone utilizing state-of-the-art open-source methods and validating its performance for data collection inside forests. The autonomous flight capability was evaluated through multiple test flights in two boreal forest test sites. The tree parameter estimation capability was studied by conducting diameter at breast height (DBH) estimation using onboard stereo camera data and photogrammetric methods. The prototype conducted flights in selected challenging forest environments, and the experiments showed excellent performance in forest reconstruction with a miniaturized stereoscopic photogrammetric system. The stem detection algorithm managed to identify 79.31 % of the stems. The DBH estimation had a root mean square error (RMSE) of 3.33 cm (12.79 %) and a bias of 1.01 cm (3.87 %) across all trees. For trees with a DBH less than 30 cm, the RMSE was 1.16 cm (5.74 %), and the bias was 0.13 cm (0.64 %). When considering the overall performance in terms of DBH accuracy, autonomy, and forest complexity, the proposed approach was superior compared to methods proposed in the scientific literature. Results provided valuable insights into autonomous forest reconstruction using drones, and several further development topics were proposed.

drone, flight, trajectory, (15 more...)

2501.12073

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Finland (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Air (1.00)
Information Technology > Robotics & Automation (1.00)
Aerospace & Defense > Aircraft (0.90)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Neural Information Processing SystemsJan-20-2025, 03:35:57 GMT

Auslan-Daily: Australian Sign Language Translation for Daily Communication and News

Sign language translation (SLT) aims to convert a continuous sign language video clip into a spoken language. Considering different geographic regions generally have their own native sign languages, it is valuable to establish corresponding SLT datasets to support related communication and research. Auslan, as a sign language specific to Australia, still lacks a dedicated large-scale dataset for SLT.To fill this gap, we curate an Australian Sign Language translation dataset, dubbed Auslan-Daily, which is collected from the Auslan educational TV series and Auslan TV programs. The former involves daily communications among multiple signers in the wild, while the latter comprises sign language videos for up-to-date news, weather forecasts, and documentaries. In particular, Auslan-Daily has two main features: (1) the topics are diverse and signed by multiple signers, and (2) the scenes in our dataset are more complex, e.g., captured in various environments, gesture interference during multi-signers' interactions and various camera positions.

auslan-daily, australian sign language translation, dataset, (5 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.27)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)

Balepur, Nishant, Padmakumar, Vishakh, Yang, Fumeng, Feng, Shi, Rudinger, Rachel, Boyd-Graber, Jordan Lee

Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas

arXiv.org Artificial IntelligenceJan-20-2025

LLMs are tuned to follow instructions (aligned) by learning which of two outputs users prefer for a prompt. However, this preference data format does not convey why users prefer responses that are chosen or rejected, so LLMs trained on these datasets cannot tailor responses to varied user needs. To surface these parameters of personalization, we apply abductive reasoning to preference data, inferring needs and interests of users, i.e. personas, that may prefer each output. We test this idea in two steps: Persona Inference (PI)-abductively inferring personas of users who prefer chosen or rejected outputs-and Persona Tailoring (PT)-training models to tailor responses to personas from PI. We find: 1) LLMs infer personas accurately explaining why different users may prefer both chosen or rejected outputs; 2) Training on preference data augmented with PI personas via PT boosts personalization, enabling models to support user-written personas; and 3) Rejected response personas form harder personalization evaluations, showing PT better aids users with uncommon preferences versus typical alignment methods. We argue for an abductive view of preferences for personalization, asking not only which response is better but when, why, and for whom.

large language model, machine learning, persona, (17 more...)

2501.11549

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Singapore (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
(12 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Environmental Medicine > Snake Bites (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Falconer, Julia R., Frank, Eibe, Polaschek, Devon L. L., Joshi, Chaitanya

Utilising Deep Learning to Elicit Expert Uncertainty

arXiv.org Artificial IntelligenceJan-20-2025

Recent work [ 14 ] has introduced a method for prior elicitation that utilizes records of expert decisions to infer a prior distribution. While this method provides a promising approach to eliciting expert uncertainty, it has only been demonstrated using tabular data, which may not entirely represent the information used by experts to make decisions. In this paper, we demonstrate how analysts can adopt a deep learning approach to utilize the method proposed in [14 ] with the actual information experts use. We provide an overview of deep learning models that can effectively model expert decision-making to elicit distributions that capture expert uncertainty and present an example examining the risk of colon cancer to show in detail how these models can be used.

artificial intelligence, elicit expert uncertainty, machine learning, (12 more...)

2501.11813

Country:

Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)