AITopics

2501.03257

Country: Asia > China (0.47)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Goyal, Harshika, Wajid, Mohammad Saif, Wajid, Mohd Anas, Khanday, Akib Mohi Ud Din, Neshat, Mehdi, Gandomi, Amir

State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects

arXiv.org Artificial IntelligenceJan-1-2025

The rapid advancement of deepfake technologies, specifically designed to create incredibly lifelike facial imagery and video content, has ignited a remarkable level of interest and curiosity across many fields, including forensic analysis, cybersecurity and the innovative creation of digital characters. By harnessing the latest breakthroughs in deep learning methods, such as Generative Adversarial Networks, Variational Autoencoders, Few-Shot Learning Strategies, and Transformers, the outcomes achieved in generating deepfakes have been nothing short of astounding and transformative. Also, the ongoing evolution of detection technologies is being developed to counteract the potential for misuse associated with deepfakes, effectively addressing critical concerns that range from political manipulation to the dissemination of fake news and the ever-growing issue of cyberbullying. This comprehensive review paper meticulously investigates the most recent developments in deepfake generation and detection, including around 400 publications, providing an in-depth analysis of the cutting-edge innovations shaping this rapidly evolving landscape. Starting with a thorough examination of systematic literature review methodologies, we embark on a journey that delves into the complex technical intricacies inherent in the various techniques used for deepfake generation, comprehensively addressing the challenges faced, potential solutions available, and the nuanced details surrounding manipulation formulations. Subsequently, the paper is dedicated to accurately benchmarking leading approaches against prominent datasets, offering thorough assessments of the contributions that have significantly impacted these vital domains. Ultimately, we engage in a thoughtful discussion of the existing challenges, paving the way for continuous advancements in this critical and ever-dynamic study area.

artificial intelligence, detection, machine learning, (14 more...)

2501.01029

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Middle East > UAE (0.45)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview > Innovation (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GV, Hirthik Mathesh, M, Kavin Chakravarthy, S, Sentil Pandi

A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia

arXiv.org Artificial IntelligenceJan-1-2025

Oral cancer constitutes a significant global health concern, resulting in 277,484 fatalities in 2023, with the highest prevalence observed in low- and middle-income nations. Facilitating automation in the detection of possibly malignant and malignant lesions in the oral cavity could result in cost-effective and early disease diagnosis. Establishing an extensive repository of meticulously annotated oral lesions is essential. In this research photos are being collected from global clinical experts, who have been equipped with an annotation tool to generate comprehensive labelling. This research presents a novel approach for integrating bounding box annotations from various doctors. Additionally, Deep Belief Network combined with CAPSNET is employed to develop automated systems that extracted intricate patterns to address this challenging problem. This study evaluated two deep learning-based computer vision methodologies for the automated detection and classification of oral lesions to facilitate the early detection of oral cancer: image classification utilizing CAPSNET. Image classification attained an F1 score of 94.23% for detecting photos with lesions 93.46% for identifying images necessitating referral. Object detection attained an F1 score of 89.34% for identifying lesions for referral. Subsequent performances are documented about classification based on the sort of referral decision. Our preliminary findings indicate that deep learning possesses the capability to address this complex problem.

artificial intelligence, machine learning, survey article, (18 more...)

2501.00876

Country: Asia > India (0.15)

Genre:

Research Report > Promising Solution (0.61)
Overview > Innovation (0.61)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

Anidjar, Or Haim, Marbel, Revital, Yozevitch, Roi

Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages

Approaching Speech-to-Text and Automatic Speech Recognition problems in low-resource languages is notoriously challenging due to the scarcity of validated datasets and the diversity of dialects. Arabic, Russian, and Portuguese exemplify these difficulties, being low-resource languages due to the many dialects of these languages across different continents worldwide. Moreover, the variety of accents and pronunciations of such languages complicate ASR models' success. With the increasing popularity of Deep Learning and Transformers, acoustic models like the renowned Wav2Vec2 have achieved superior performance in the Speech Recognition field compared to state-of-the-art approaches. However, despite Wav2Vec2's improved efficiency over traditional methods, its performance significantly declines for under-represented languages, even though it requires significantly less labeled data. This paper introduces an end-to-end framework that enhances ASR systems fine-tuned on Wav2Vec2 through data augmentation techniques. To validate our framework's effectiveness, we conducted a detailed experimental evaluation using three datasets from Mozilla's Common Voice project in Arabic, Russian, and Portuguese. Additionally, the framework presented in this paper demonstrates robustness to different diacritics. Ultimately, our approach outperforms two previous baseline models, which are the pre-trained Wav2Vec2 and the well-known Whisper ASR model, resulting in an average relative improvement of 33.9\% in Word Error Rate and a 53.2\% relative improvement in Character Error Rate.

artificial intelligence, deep learning, machine learning, (15 more...)

2501.00425

Country:

Asia > Middle East > Israel (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Leaf diseases detection using deep learning methods

Fatimi, El Houcine El

classification and visualization technique, convolutional neural network model, interpretability and explainability, (16 more...)

This study, our main topic is to devlop a new deep-learning approachs for plant leaf disease identification and detection using leaf image datasets. We also discussed the challenges facing current methods of leaf disease detection and how deep learning may be used to overcome these challenges and enhance the accuracy of disease detection. Therefore, we have proposed a novel method for the detection of various leaf diseases in crops, along with the identification and description of an efficient network architecture that encompasses hyperparameters and optimization methods. The effectiveness of different architectures was compared and evaluated to see the best architecture configuration and to create an effective model that can quickly detect leaf disease. In addition to the work done on pre-trained models, we proposed a new model based on CNN, which provides an efficient method for identifying and detecting plant leaf disease. Furthermore, we evaluated the efficacy of our model and compared the results to those of some pre-trained state-of-the-art architectures.

2501.00669

Country:

Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
Asia > India (0.04)
Asia > Bangladesh (0.04)
(32 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Food & Agriculture > Agriculture (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph

Irie, Kazuki

Do autoregressive Transformer language models require explicit positional encodings (PEs)? The answer is "no" as long as they have more than one layer -- they can distinguish sequences with permuted tokens without requiring explicit PEs. This property has been known since early efforts (those contemporary with GPT-2) adopting the Transformer for language modeling. However, this result does not appear to have been well disseminated and was even rediscovered recently. This may be partially due to a sudden growth of the language modeling community after the advent of GPT-2, but perhaps also due to the lack of a clear explanation in prior publications, despite being commonly understood by practitioners in the past. Here we review this long-forgotten explanation why explicit PEs are nonessential for multi-layer autoregressive Transformers (in contrast, one-layer models require PEs to discern order information of their input tokens). We also review the origin of this result, and hope to re-establish it as a common knowledge.

language model, proc, transformer, (12 more...)

2501.00659

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(19 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Low-Rank Adaptation for Foundation Models: A Comprehensive Review

Yang, Menglin, Chen, Jialin, Zhang, Yifei, Liu, Jiahong, Zhang, Jiasheng, Ma, Qiyao, Verma, Harshit, Zhang, Qianru, Zhou, Min, King, Irwin, Ying, Rex

The rapid advancement of foundation modelslarge-scale neural networks trained on diverse, extensive datasetshas revolutionized artificial intelligence, enabling unprecedented advancements across domains such as natural language processing, computer vision, and scientific discovery. However, the substantial parameter count of these models, often reaching billions or trillions, poses significant challenges in adapting them to specific downstream tasks. Low-Rank Adaptation (LoRA) has emerged as a highly promising approach for mitigating these challenges, offering a parameter-efficient mechanism to fine-tune foundation models with minimal computational overhead. This survey provides the first comprehensive review of LoRA techniques beyond large Language Models to general foundation models, including recent techniques foundations, emerging frontiers and applications of low-rank adaptation across multiple domains. Finally, this survey discusses key challenges and future research directions in theoretical understanding, scalability, and robustness. This survey serves as a valuable resource for researchers and practitioners working with efficient foundation model adaptation.

adaptation, arxiv, lora, (15 more...)

2501.00365

Country:

Asia > China > Hong Kong (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Manna, Supriya, Sett, Niladri

A Tale of Two Imperatives: Privacy and Explainability

Deep learning's preponderance across scientific domains has reshaped high-stakes decision-making, making it essential to follow rigorous operational frameworks that include both Right-to-Privacy (RTP) and Right-to-Explanation (RTE). This paper examines the complexities of combining these two requirements. For RTP, we focus on `Differential privacy' (DP), which is considered the current \textit{gold standard} for privacy-preserving machine learning due to its strong quantitative guarantee of privacy. For RTE, we focus on post-hoc explainers: they are the \textit{go-to} option for model auditing as they operate independently of model training. We formally investigate DP models and various commonly-used post-hoc explainers: how to evaluate these explainers subject to RTP, and analyze the intrinsic interactions between DP models and these explainers. Furthermore, our work throws light on how RTP and RTE can be effectively combined in high-stakes applications. Our study concludes by outlining an industrial software pipeline, with the example of a wildly used use-case, that respects both RTP and RTE requirements.

arxiv preprint arxiv, explainer, explanation, (13 more...)

2412.20798

Country:

Europe > Italy > Marche > Ancona Province > Ancona (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Mozambique > Gaza Province > Xai-Xai (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Towards Real-Time 2D Mapping: Harnessing Drones, AI, and Computer Vision for Advanced Insights

Agnur, Bharath Kumar

This paper presents an advanced mapping system that combines drone imagery with machine learning and computer vision to overcome challenges in speed, accuracy, and adaptability across diverse terrains. By automating processes like feature detection, image matching, and stitching, the system produces seamless, high-resolution maps with minimal latency, offering strategic advantages in defense operations. Developed in Python, the system utilizes OpenCV for image processing, NumPy for efficient computations, and Concurrent[dot]futures for parallel execution. ORB (Oriented FAST and Rotated BRIEF) is employed for feature detection, while FLANN (Fast Library for Approximate Nearest Neighbors) ensures accurate keypoint matching. Homography transformations align overlapping images, resulting in distortion-free maps in real time. This automation eliminates manual intervention, enabling live updates essential in rapidly changing environments. Designed for versatility, the system performs reliably under various lighting conditions and rugged terrains, making it highly suitable for aerospace and defense applications. Testing has shown notable improvements in processing speed and accuracy compared to conventional methods, enhancing situational awareness and informed decision-making. This scalable solution leverages cutting-edge technologies to provide actionable, reliable data for mission-critical operations.

computer vision, feature detection, matching, (14 more...)

2412.2021

Country: Asia > India > Telangana > Hyderabad (0.04)

Genre:

Research Report (0.50)
Overview > Innovation (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
(2 more...)

Ezeakunne, Uzoamaka, Eze, Chrisantus, Liu, Xiuwen

Data-Driven Fairness Generalization for Deepfake Detection

Despite the progress made in deepfake detection research, recent studies have shown that biases in the training data for these detectors can result in varying levels of performance across different demographic groups, such as race and gender. These disparities can lead to certain groups being unfairly targeted or excluded. Traditional methods often rely on fair loss functions to address these issues, but they under-perform when applied to unseen datasets, hence, fairness generalization remains a challenge. In this work, we propose a data-driven framework for tackling the fairness generalization problem in deepfake detection by leveraging synthetic datasets and model optimization. Our approach focuses on generating and utilizing synthetic data to enhance fairness across diverse demographic groups. By creating a diverse set of synthetic samples that represent various demographic groups, we ensure that our model is trained on a balanced and representative dataset. This approach allows us to generalize fairness more effectively across different domains. We employ a comprehensive strategy that leverages synthetic data, a loss sharpness-aware optimization pipeline, and a multi-task learning framework to create a more equitable training environment, which helps maintain fairness across both intra-dataset and cross-dataset evaluations. Extensive experiments on benchmark deepfake detection datasets demonstrate the efficacy of our approach, surpassing state-of-the-art approaches in preserving fairness during cross-dataset evaluation. Our results highlight the potential of synthetic datasets in achieving fairness generalization, providing a robust solution for the challenges faced in deepfake detection.

dataset, deepfake detection, detection, (14 more...)

2412.16428

Country:

North America > United States > Oklahoma > Payne County > Stillwater (0.14)
North America > United States > Florida > Leon County > Tallahassee (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)