AITopics

ABSTRACT In the domain of Music Information Retrieval (MIR), Automatic Music Transcription (AMT) emerges as a central challenge, aiming to convert audio signals into symbolic [...]. This review critically evaluates both fully automatic and semi-automatic AMT systems, emphasizing the importance of minimal user intervention and examining various methodologies proposed to date. By addressing the limitations of prior techniques and suggesting avenues for improvement, our objective is to steer future research towards fully automated AMT systems capable of accurately and efficiently translating intricate audio signals into precise symbolic representations. [...] not only synthesizes the latest advancements but also lays out a road-map for overcoming existing challenges in AMT, providing valuable insights for researchers aiming to narrow [...]

[...] Loudness estimation and quantization, instrument recognition, extraction of rhythmic information, time quantization, extraction of velocity and dynamic [...]

Figure 1 (presented in [7]) illustrates the data representations in an AMT system. An AMT system takes an audio waveform as input, computes a time-frequency representation of the audio, outputs a representation of pitches over time in a spectrogram, and generates a typeset music [...]. Previous studies have tackled Automatic Music Transcription (AMT) using two main approaches: Nonnegative Matrix Factorization (NMF) [8] and Neural Networks (NNs) [9] [2].

dataset, music transcription, transcription, (13 more...)

2406.15249

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Germany > Saarland (0.04)
Europe > Germany > Berlin (0.04)

Genre: Overview (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Few-shot Knowledge Graph Relational Reasoning via Subgraph Adaptation

Liu, Haochen, Wang, Song, Chen, Chen, Li, Jundong

Few-shot Knowledge Graph (KG) Relational Reasoning aims to predict unseen triplets (i.e., query triplets) for rare relations in KGs, given only several triplets of these relations as references (i.e., support triplets). This task has gained significant traction due to the widespread use of knowledge graphs in various natural language processing applications. Previous approaches have utilized meta-training methods and manually constructed meta-relation sets to tackle this task. Recent efforts have focused on edge-mask-based methods, which exploit the structure of the contextualized graphs of target triplets (i.e., a subgraph containing relevant triplets in the KG). However, existing edge-mask-based methods are limited: they extract insufficient information from the KG and are highly influenced by spurious information in the KG. To overcome these challenges, we propose SAFER (Subgraph Adaptation for Few-shot Relational Reasoning), a novel approach that effectively adapts the information in contextualized graphs to various subgraphs generated from support and query triplets to perform the prediction. Specifically, SAFER enables the extraction of more comprehensive information from support triplets while minimizing the impact of spurious information when predicting query triplets. Experimental results on three prevalent datasets demonstrate the superiority of our proposed framework SAFER.

graph, information, relation, (17 more...)

2406.15507

Country: North America > United States > Virginia (0.05)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.83)

ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems

Jia, Pengyue, Wang, Yejing, Du, Zhaocheng, Zhao, Xiangyu, Wang, Yichao, Chen, Bo, Wang, Wanyu, Guo, Huifeng, Tang, Ruiming

Deep Recommender Systems (DRS) are increasingly dependent on a large number of feature fields for more precise recommendations. Effective feature selection methods are consequently becoming critical for further enhancing accuracy and optimizing storage efficiency to align with deployment demands. This research area, particularly in the context of DRS, is nascent and faces three core challenges. Firstly, varying experimental setups across research papers often yield unfair comparisons, obscuring practical insights. Secondly, the existing literature's lack of detailed analysis on selection attributes, based on large-scale datasets and a thorough comparison among selection techniques and DRS backbones, restricts the generalizability of findings and impedes deployment on DRS. Lastly, research often focuses on comparing the peak performance achievable by feature selection methods, an approach that is typically computationally infeasible for identifying the optimal hyperparameters and overlooks evaluating the robustness and stability of these methods. To bridge these gaps, this paper presents ERASE, a comprehensive bEnchmaRk for feAture SElection for DRS. ERASE comprises a thorough evaluation of eleven feature selection methods, covering both traditional and deep learning approaches, across four public datasets, private industrial datasets, and a real-world commercial platform, achieving significant enhancement. Our code is available online for ease of reproduction.

dataset, feature selection method, selection method, (11 more...)

2403.1266

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
Asia > China > Hong Kong (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning with 3D rotations, a hitchhiker's guide to SO(3)

Geist, A. René, Frey, Jonas, Zobro, Mikel, Levina, Anna, Martius, Georg

Many settings in machine learning require the selection of a rotation representation. However, choosing a suitable representation from the many available options is challenging. This paper acts as a survey and guide through rotation representations. We walk through their properties that harm or benefit deep learning with gradient-based optimization. By consolidating insights from rotation-based learning, we provide a comprehensive overview of learning functions with rotation representations. We provide guidance on selecting representations based on whether rotations are in the model's input or output and whether the data primarily comprises small angles.
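One representation property of the kind this abstract refers to can be made concrete with a minimal sketch. It assumes unit quaternions in (w, x, y, z) order, which is a convention chosen here and not taken from the paper, and the helper names `quat_to_matrix` and `max_abs_diff` are illustrative. The quaternions q and -q encode the same rotation, so a model that regresses quaternions faces two equally valid targets for every rotation.

```python
import math


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (nested lists)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("the zero quaternion does not encode a rotation")
    # Normalise so that the formula below yields an orthonormal matrix.
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def max_abs_diff(a, b):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# A 90 degree rotation about the z axis, and its negated quaternion.
half_angle = math.radians(90) / 2
q = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
neg_q = tuple(-c for c in q)

# Both quaternions map to the same rotation matrix (double cover of SO(3)).
same_rotation = max_abs_diff(quat_to_matrix(q), quat_to_matrix(neg_q)) < 1e-12
```

Every entry of the matrix is a product of two quaternion components, so flipping the sign of all four components leaves the matrix unchanged. This is only one of the properties such a survey can weigh; the sketch does not reproduce the paper's recommendations.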

representation, rotation, rotation representation, (16 more...)

2404.11735

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre:

Overview (0.68)
Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators

Mahaut, Matéo, Aina, Laura, Czarnowska, Paula, Hardalov, Momchil, Müller, Thomas, Màrquez, Lluís

Large Language Models (LLMs) tend to be unreliable in the factuality of their answers. To address this problem, NLP researchers have proposed a range of techniques to estimate an LLM's confidence over facts. However, due to the lack of a systematic comparison, it is not clear how the different methods compare to one another. To fill this gap, we present a survey and empirical comparison of estimators of factual confidence. We define an experimental framework allowing for fair comparison, covering both fact-verification and question answering. Our experiments across a series of LLMs indicate that trained hidden-state probes provide the most reliable confidence estimates, albeit at the expense of requiring access to weights and training data. We also conduct a deeper assessment of factual confidence by measuring the consistency of model behavior under meaning-preserving variations in the input. We find that the confidence of LLMs is often unstable across semantically equivalent inputs, suggesting that there is much room for improvement of the stability of models' parametric knowledge. Our code is available at https://github.com/amazon-science/factual-confidence-of-llms.

dataset, factual confidence, probe, (14 more...)

2406.13415

Country:

Europe > France (0.29)
Europe > Austria > Vienna (0.14)
Asia > Singapore (0.05)
(10 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Bridging the Gap in Drug Safety Data Analysis: Large Language Models for SQL Query Generation

Painter, Jeffery L., Chalamalasetti, Venkateswara Rao, Kassekert, Raymond, Bate, Andrew

Pharmacovigilance (PV) is essential for drug safety, primarily focusing on adverse event monitoring. Traditionally, accessing safety data required database expertise, limiting broader use. This paper introduces a novel application of Large Language Models (LLMs) to democratize database access for non-technical users. Utilizing OpenAI's GPT-4, we developed a chatbot that generates structured query language (SQL) queries from natural language, bridging the gap between domain knowledge and technical requirements. The proposed application aims for more inclusive and efficient data access, enhancing decision-making in drug safety. By providing LLMs with plain language summaries of expert knowledge, our approach significantly improves query accuracy over methods relying solely on database schemas. The application of LLMs in this context not only optimizes PV data analysis, ensuring timely and precise drug safety reporting -- a crucial component in adverse drug reaction monitoring -- but also promotes safer pharmacological practices and informed decision-making across various data-intensive fields.
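The contrast the abstract draws, between prompting with a database schema alone and prompting with added plain language summaries, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the function `build_sql_prompt`, the prompt wording, the `adverse_events` table, and the note text are hypothetical and are not taken from the paper, and no model API is called.

```python
def build_sql_prompt(question, schema_ddl, context_notes=None):
    """Assemble a text prompt asking a model to write one SQL query.

    context_notes is an optional list of plain-language domain notes;
    when omitted, the prompt contains the schema only.
    """
    parts = [
        "You translate questions into a single SQL query.",
        "Database schema:\n" + schema_ddl.strip(),
    ]
    if context_notes:
        notes = "\n".join("- " + note.strip() for note in context_notes)
        parts.append("Domain notes (plain language):\n" + notes)
    parts.append("Question: " + question.strip())
    parts.append("Return only the SQL query.")
    return "\n\n".join(parts)


# Hypothetical schema and note, invented for this example only.
SCHEMA = (
    "CREATE TABLE adverse_events "
    "(case_id INTEGER, drug_name TEXT, event_term TEXT, report_date TEXT);"
)
NOTES = ["Each row is one reported event; a case can have several rows."]

QUESTION = "How many cases mention headache?"
schema_only = build_sql_prompt(QUESTION, SCHEMA)
with_context = build_sql_prompt(QUESTION, SCHEMA, NOTES)
```

The two prompts differ only in the domain-notes section, which keeps a comparison between the schema-only and context-augmented settings straightforward to set up.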

business context document, llm, query, (15 more...)

2406.1069

Country:

North America > United States > Texas > Collin County > Plano (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Overview > Innovation (0.48)
Research Report > Promising Solution (0.46)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

DRACO: Decentralized Asynchronous Federated Learning over Continuous Row-Stochastic Network Matrices

Jeong, Eunjeong, Kountouris, Marios

Recent advancements in machine learning, networked intelligent systems, and wireless connectivity have paved the way for various innovative applications and use cases across various sectors, including the Internet of Things (IoT), consumer robotics, autonomous transportation, and edge computing. These systems increasingly rely on decentralized learning architectures for processing data where generated, minimizing latency and bandwidth usage while enhancing privacy. However, these benefits come with significant challenges, particularly in terms of ensuring efficient and reliable communication and processing within inherently unstable and diverse network environments. Addressing these challenges requires novel approaches that adapt to the unique demands of decentralized architectures, fostering robust and expandable solutions for real-time data processing and learning. In this work, we consider the problem of communication efficiency in federated learning (FL) [1] and in particular in serverless (fully decentralized) learning settings that operate without a central coordinating server [2-6]. Asynchronous learning, empowering each participant to conduct local training and data transmission at their own pace, is a standard and relevant design choice in decentralized network schemes [7-12]. Asynchronous and decentralized learning have an advantage when used separately from each other, manifesting as adaptability to limited resources and downsized communication overhead. Yet unfortunately, when these two paradigms are combined, their integration poses a greater challenge in achieving a unanimous global consensus, as required for instance in the development of sophisticated navigation algorithms [13]. Decentralized optimization studies in the literature often involve high "synchronization costs" due to the complexity of ensuring consensus.
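As background for the consensus problem mentioned above, the following minimal sketch shows synchronous averaging over a row-stochastic mixing matrix. It is not the DRACO algorithm: the three-node network, the weights in `W`, the helper `mix`, and the initial values are all hypothetical and chosen only to illustrate the term "row-stochastic" from the title.

```python
def mix(weights, values):
    """One consensus round: node i replaces its value by a weighted average
    of all values, using row i of the mixing matrix as weights."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Hypothetical 3-node network. Each row sums to 1 (row-stochastic);
# the columns do not, so the matrix is not doubly stochastic.
W = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
assert all(abs(sum(row) - 1.0) < 1e-12 for row in W)

x0 = [1.0, 2.0, 6.0]  # hypothetical initial local values
x = list(x0)
for _ in range(60):
    x = mix(W, x)

spread = max(x) - min(x)            # shrinks toward 0 as the nodes agree
plain_average = sum(x0) / len(x0)   # the value an exact averaging scheme would reach
```

In this toy case the nodes agree on a common value, but that value is a weighted combination of the initial values and not their plain average, which is one reason row-stochastic mixing needs extra care when the average is the target.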

inequality, learning, proceedings, (14 more...)

2406.13533

Country:

North America > United States (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > France (0.04)

Genre:

Overview > Innovation (0.68)
Research Report > Promising Solution (0.48)

Industry:

Information Technology (0.68)
Education (0.66)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Media Forensics and Deepfake Systematic Survey

CH, Nadeem Jabbar, Saghir, Aqib, Meer, Ayaz Ahmad, Sahi, Salman Ahmad, Hassan, Bilal, Yasir, Siddiqui Muhammad

Deepfake is a generative deep learning algorithm that creates or changes facial features in a very realistic way, making it hard to differentiate the real from the fake features. It can be used to make movies look better as well as to spread false information by imitating famous people. In this paper, many different ways to make a Deepfake are explained, analyzed, and separated categorically. Using Deepfake datasets, models are trained and tested for reliability through experiments. Deepfakes are a type of facial manipulation that allows people to change their entire faces, identities, attributes, and expressions. The trends in the available Deepfake datasets are also discussed, with a focus on how they have changed. Using deep learning, a general Deepfake detection model is made. Moreover, the problems in making and detecting Deepfakes are also mentioned. As a result of this survey, it is expected that the development of new Deepfake-based imaging tools will speed up in the future. This survey gives an in-depth review of methods for manipulating images of faces and various techniques to spot altered face images. Four types of facial manipulation are specifically discussed: attribute manipulation, expression swap, entire face synthesis, and identity swap. Across every manipulation category, we provide information on manipulation techniques, significant benchmarks for technical evaluation of counterfeit detection techniques, available public databases, and a summary of the outcomes of all such analyses. From all of the topics in the survey, we focus on the most recent development of Deepfake, showing its advances and obstacles in detecting fake images.

forensic and deepfake systematic survey, media forensic

doi: 10.46470/03d8ffbd.7351a3bb

2406.13295

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Deep Learning-Based 3D Instance and Semantic Segmentation: A Review

Yasir, Siddiqui Muhammad, Ahn, Hyunsik

The process of segmenting point cloud data into several homogeneous areas, with points in the same region having the same attributes, is known as 3D segmentation. Segmentation is challenging with point cloud data due to substantial redundancy, fluctuating sample density, and lack of apparent organization. The research area has a wide range of robotics applications, including intelligent vehicles, autonomous mapping, and navigation. A number of researchers have introduced various methodologies and algorithms. Deep learning has been successfully applied to a spectrum of 2D vision domains as a prevailing AI method. However, due to the specific problems of processing point clouds with deep neural networks, deep learning on point clouds is still in its initial stages. This study examines many strategies that have been proposed for 3D instance and semantic segmentation and gives a complete assessment of current developments in deep learning-based 3D segmentation. The benefits, drawbacks, and design mechanisms of these approaches are studied and addressed. This study evaluates the competitiveness of various segmentation algorithms on publicly accessible datasets, as well as the most often used pipelines, their advantages and limits, insightful findings, and intriguing future research directions.

point cloud, segmentation, semantic segmentation, (12 more...)

doi: 10.32604/jai.2022.031235

2406.13308

Country:

Asia > South Korea > Busan > Busan (0.05)
North America > United States > Washington > King County > Seattle (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.05)
(20 more...)

Genre:

Overview (0.93)
Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Recent advances in text embedding: A Comprehensive Review of Top-Performing Methods on the MTEB Benchmark

Cao, Hongliu

Text embedding methods have become increasingly popular in both industrial and academic fields due to their critical role in a variety of natural language processing tasks. The significance of universal text embeddings has been further highlighted with the rise of Large Language Model (LLM) applications such as Retrieval-Augmented Systems (RAGs). While previous models have attempted to be general-purpose, they often struggle to generalize across tasks and domains. However, recent advances in the quantity, quality, and diversity of training data, in synthetic data generation from LLMs, and in the use of LLMs as backbones have driven great improvements in the pursuit of universal text embeddings. In this paper, we provide an overview of the recent advances in universal text embedding models, with a focus on the top-performing text embeddings on the Massive Text Embedding Benchmark (MTEB). Through detailed comparison and analysis, we highlight the key contributions and limitations in this area, and propose potentially inspiring future research directions.

arxiv preprint arxiv, dataset, universal text, (14 more...)

2406.01607

Country:

Asia > Middle East > UAE (0.05)
Asia > China > Beijing > Beijing (0.04)
North America > Canada > Quebec (0.04)
(3 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)