AITopics

2410.22492

Country:

Europe > Switzerland (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Machine Learning, Nov-9-2024

Mutual-energy inner product optimization method for constructing feature coordinates and image classification in Machine Learning

Wang, Yuanxiu

As a key task in machine learning, data classification essentially amounts to finding a suitable coordinate system in which to represent the features of samples from different classes. This paper proposes the mutual-energy inner product optimization method for constructing such a feature coordinate system. First, the mutual-energy inner product is defined by analyzing the solution space and eigenfunctions of the partial differential equations that describe a non-uniform membrane. Second, by expressing the mutual-energy inner product as a series of eigenfunctions, it is shown to enhance low-frequency features and suppress high-frequency noise compared with the Euclidean inner product. Then, a mutual-energy inner product optimization model is built to extract data features, and the convexity and concavity properties of its objective function are discussed. Next, by combining this model with the finite element method, a stable and efficient sequential linearization algorithm is constructed to solve it. The algorithm only requires solving linear systems with symmetric positive definite matrices and linear programs with a few constraints, and its vectorized implementation is discussed. Finally, the mutual-energy inner product optimization method is used to construct feature coordinates, and multi-class Gaussian classifiers are trained on the MNIST training set. The Gaussian classifiers achieve good prediction results on the MNIST test set.

artificial intelligence, expression, machine learning, (16 more...)

2411.061

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
South America (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Nov-9-2024

M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework

Chia, Yew Ken, Cheng, Liying, Chan, Hou Pong, Liu, Chaoqun, Song, Maojia, Aljunied, Sharifah Mahani, Poria, Soujanya, Bing, Lidong

The ability to understand and answer questions over documents can be useful in many business and practical applications. However, documents often contain lengthy and diverse multimodal contents such as texts, figures, and tables, which are very time-consuming for humans to read thoroughly. Hence, there is an urgent need to develop effective and automated methods to aid humans in this task. In this work, we introduce M-LongDoc, a benchmark of 851 samples, and an automated framework to evaluate the performance of large multimodal models. We further propose a retrieval-aware tuning approach for efficient and effective multimodal document reading. Compared to existing works, our benchmark consists of more recent and lengthy documents with hundreds of pages, while also requiring open-ended solutions and not just extractive answers. To our knowledge, our training framework is the first to directly address the retrieval setting for multimodal long documents. To enable tuning open-source models, we construct a training corpus in a fully automatic manner for the question-answering task over such documents. Experiments show that our tuning approach achieves a relative improvement of 4.6% for the correctness of model responses, compared to the baseline open-source models. Our data, code, and models are available at https://multimodal-documents.github.io.

benchmark, large language model, machine learning, (22 more...)

2411.06176

Country:

Asia > Singapore (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

arXiv.org Artificial Intelligence, Nov-9-2024

Multimodal Contrastive Learning of Urban Space Representations from POI Data

Wang, Xinglei, Cheng, Tao, Law, Stephen, Zeng, Zichao, Yin, Lu, Liu, Junyuan

Existing methods for learning urban space representations from Point-of-Interest (POI) data face several limitations, including issues with geographical delineation, inadequate spatial information modelling, underutilisation of POI semantic attributes, and computational inefficiencies. To address these issues, we propose CaLLiPer (Contrastive Language-Location Pre-training), a novel representation learning model that directly embeds continuous urban spaces into vector representations that capture the spatial and semantic distribution of the urban environment. This model leverages a multimodal contrastive learning objective, aligning location embeddings with textual POI descriptions, thereby bypassing the need for complex training corpus construction and negative sampling. We validate CaLLiPer's effectiveness by applying it to learning urban space representations in London, UK, where it demonstrates a 5-15% improvement in predictive performance for land use classification and socioeconomic mapping tasks compared to state-of-the-art methods. Visualisations of the learned representations further illustrate our model's advantages in capturing spatial variations in urban semantics with high accuracy and fine resolution. Additionally, CaLLiPer achieves reduced training time, showcasing its efficiency and scalability. This work provides a promising pathway for scalable, semantically rich urban space representation learning that can support the development of geospatial foundation models. The implementation code is available at https://github.com/xlwang233/CaLLiPer.
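To make the idea of aligning location embeddings with text embeddings concrete, here is a minimal sketch of a symmetric, CLIP-style contrastive loss. It is an illustration only: the abstract does not state the exact loss, so the symmetric form, the `temperature` value, and all function names are assumptions, and the encoders that would produce the embeddings are omitted.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the target of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(loc_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Pair i is (loc_emb[i], text_emb[i]); every other item in the batch
    serves as an in-batch negative. The temperature is a hypothetical
    default, not a value taken from the paper.
    """
    loc = [normalize(v) for v in loc_emb]
    txt = [normalize(v) for v in text_emb]
    logits = [[dot(l, t) / temperature for t in txt] for l in loc]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(logits_t))


# Toy batch: two locations and two text descriptions.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                           [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                           [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each location matches its own description and large when the pairs are swapped, which is the behaviour a contrastive objective is meant to reward.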

encoder, representation, urban space representation, (14 more...)

2411.06229

Country:

Europe > United Kingdom > England > Greater London > London (0.24)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Colombia (0.04)
(2 more...)

Genre: Research Report > Promising Solution (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)
(2 more...)

BBC News, Nov-8-2024, 11:55:55 GMT

Fake paramedic guilty of Tinder date rapes

A man who pretended to be a paramedic has been found guilty of raping and sexually assaulting women he met on an online dating website. Jamie Kadolski, 24, of Ladysmith Road, Norwich, was found guilty of committing nine sexual offences over an 18-month period. During the trial at Norwich Crown Court he denied the charges made by four different women, whom he met on Tinder. The court had previously heard how the former ambulance call handler had told the women he was a paramedic and had used stickers to hide his real role on his work ID card. Kadolski worked as a call handler for the East of England Ambulance Service. The prosecution told the jury that he used stickers to hide his more junior role, so he could claim to the women he met that he was a paramedic.

artificial intelligence, fake paramedic guilty, social media, (9 more...)

Country:

Europe > United Kingdom > England (0.57)
North America > United States (0.56)
South America (0.17)
(14 more...)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.37)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.91)

Pollock, Joseph, Shilov, Igor, Dodd, Euodia, de Montjoye, Yves-Alexandre

Free Record-Level Privacy Risk Evaluation Through Artifact-Based Methods

Membership inference attacks (MIAs) are widely used to empirically assess the privacy risks of samples used to train a target machine learning model. State-of-the-art methods, however, require training hundreds of shadow models with the same size and architecture as the target model, solely to evaluate the privacy risk. While one might be able to afford this for small models, the cost often becomes prohibitive for medium and large models. We instead propose a novel approach to identify the at-risk samples using only artifacts available during training, with little to no additional computational overhead. Our method analyzes individual per-sample loss traces and uses them to identify the vulnerable data samples. We demonstrate the effectiveness of our artifact-based approach through experiments on the CIFAR10 dataset, showing high precision in identifying vulnerable samples as determined by a SOTA shadow model-based MIA (LiRA). Impressively, our method reaches the same precision as another SOTA MIA when measured against LiRA, despite being orders of magnitude cheaper. We then show that LT-IQR outperforms alternative loss aggregation methods, perform ablation studies on hyperparameters, and validate the robustness of our method to the target metric. Finally, we study the evolution of the vulnerability score distribution throughout training as a metric for model-level risk assessment.
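The abstract names LT-IQR without defining it. A minimal sketch follows, assuming it denotes the interquartile range of a sample's loss trace (the loss recorded for that sample at each epoch) and assuming that a larger spread marks a more vulnerable sample. Both assumptions, the function names, and the toy traces are illustrative and not taken from the paper.

```python
from statistics import quantiles


def loss_trace_iqr(trace):
    """Interquartile range (Q3 - Q1) of one sample's per-epoch losses."""
    q1, _, q3 = quantiles(trace, n=4, method="inclusive")
    return q3 - q1


def rank_by_score(loss_traces):
    """Return sample ids ordered from highest to lowest IQR score."""
    scores = {sid: loss_trace_iqr(t) for sid, t in loss_traces.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical traces: one loss value per epoch for each training sample.
traces = {
    "easy": [2.0, 0.5, 0.1, 0.05, 0.01],  # loss drops almost at once
    "hard": [2.3, 2.1, 1.9, 0.4, 0.05],   # loss stays high, then collapses
}
ranking = rank_by_score(traces)
```

Because the traces are already produced by ordinary training, a score of this kind needs no shadow models, which is the cost argument the abstract makes.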

inference attack, threshold, vulnerable point, (16 more...)

2411.05743

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science (0.93)

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Huang, Chien-yu, Chen, Wei-Chih, Yang, Shu-wen, Liu, Andy T., Li, Chen-An, Lin, Yu-Xiang, Tseng, Wei-Cheng, Diwan, Anuj, Shih, Yi-Jen, Shi, Jiatong, Chen, William, Chen, Xuanjun, Hsiao, Chi-Yuan, Peng, Puyuan, Wang, Shih-Heng, Kuan, Chun-Yi, Lu, Ke-Han, Chang, Kai-Wei, Yang, Chih-Kai, Ritter-Gutierrez, Fabian, Chuang, Ming To, Huang, Kuan-Po, Arora, Siddhant, Lin, You-Kuan, Yeo, Eunjung, Chang, Kalvin, Chien, Chung-Ming, Choi, Kwanghee, Hsieh, Cheng-Hsiu, Lin, Yi-Cheng, Yu, Chee-En, Chiu, I-Hsiang, Guimarães, Heitor R., Han, Jionghao, Lin, Tzu-Quan, Lin, Tzu-Yuan, Chang, Homu, Chang, Ting-Wu, Chen, Chun Wei, Chen, Shou-Jen, Chen, Yu-Hua, Cheng, Hsi-Chun, Dhawan, Kunal, Fang, Jia-Lin, Fang, Shi-Xin, Chiang, Kuan-Yu Fang, Fu, Chi An, Hsiao, Hsien-Fu, Hsu, Ching Yu, Huang, Shao-Syuan, Wei, Lee Chen, Lin, Hsi-Che, Lin, Hsuan-Hao, Lin, Hsuan-Ting, Lin, Jian-Ren, Liu, Ting-Chun, Lu, Li-Chun, Pai, Tsung-Min, Pasad, Ankita, Kuan, Shih-Yun Shan, Shon, Suwon, Tang, Yuxun, Tsai, Yun-Shao, Wei, Jui-Chiang, Wei, Tzu-Chieh, Wu, Chengxi, Wu, Dien-Ruei, Yang, Chao-Han Huck, Yang, Chieh-Chi, Yip, Jia Qi, Yuan, Shao-Xiang, Noroozi, Vahid, Chen, Zhehuai, Wu, Haibin, Livescu, Karen, Harwath, David, Watanabe, Shinji, Lee, Hung-yi

Multimodal foundation models, such as Gemini and ChatGPT, have revolutionized human-machine interactions by seamlessly integrating various forms of data. Developing a universal spoken language model that comprehends a wide range of natural language instructions is critical for bridging communication gaps and facilitating more intuitive interactions. However, the absence of a comprehensive evaluation benchmark poses a significant challenge. We present Dynamic-SUPERB Phase-2, an open and evolving benchmark for the comprehensive evaluation of instruction-based universal speech models. Building upon the first generation, this second version incorporates 125 new tasks contributed collaboratively by the global research community, expanding the benchmark to a total of 180 tasks, making it the largest benchmark for speech and audio evaluation. While the first generation of Dynamic-SUPERB was limited to classification tasks, Dynamic-SUPERB Phase-2 broadens its evaluation capabilities by introducing a wide array of novel and diverse tasks, including regression and sequence generation, across speech, music, and environmental audio. Evaluation results indicate that none of the models performed well universally. SALMONN-13B excelled in English ASR, while WavLLM demonstrated high accuracy in emotion recognition, but current models still require further innovations to handle a broader range of tasks. We will soon open-source all task data and the evaluation pipeline.

benchmark, large language model, machine learning, (20 more...)

2411.05361

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Taiwan (0.04)
Asia > South Korea > Gyeonggi-do > Suwon (0.04)
(12 more...)

Genre: Research Report (0.63)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Prajapat, Dharmendra, Toshniwal, Durga

Improving Multi-Domain Task-Oriented Dialogue System with Offline Reinforcement Learning

A task-oriented dialogue (TOD) system is designed to accomplish user-defined tasks through dialogue. TOD systems have progressed towards end-to-end modeling by leveraging pre-trained large language models. Fine-tuning pre-trained language models using only supervised learning leads to the exposure bias and token loss problems, and it steers the models away from completing the user's task. To address these issues, we propose a TOD system that uses a unified pre-trained language model, GPT2, as its base model. It is optimized using supervised learning and reinforcement learning (RL). The issues in the TOD system are mitigated using a non-differentiable reward function. The reward is calculated as the weighted sum of the success rate and BLEU evaluation metrics. Including the success rate and BLEU in the reward guides the language model towards completing the user's task while ensuring a coherent and fluent response. Our model is obtained by fine-tuning a pre-trained model at the dialogue-session level, where a session comprises the user utterance, belief state, system act, and system response. Experimental results on MultiWOZ2.1 demonstrate that our model increases the inform rate by 1.60% and the success rate by 3.17% compared to the baseline.
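The reward described above is a weighted sum of two scores. A minimal sketch, assuming both scores are scaled to [0, 1] and that a single weight `alpha` (a hypothetical parameter; the abstract does not give the weights) splits the reward between them:

```python
def dialogue_reward(success_rate, bleu, alpha=0.5):
    """Weighted sum of task success and BLEU, both assumed to lie in [0, 1].

    alpha is a hypothetical weight on the success rate; (1 - alpha) goes
    to BLEU. The weights actually used by the authors are not stated.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * success_rate + (1.0 - alpha) * bleu


# A dialogue that completes the task but is worded unlike the reference.
reward = dialogue_reward(success_rate=1.0, bleu=0.2, alpha=0.75)
```

Since the reward is computed from generated text rather than from model logits, it is non-differentiable and has to be fed to an RL update rather than backpropagated directly.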

large language model, machine learning, reinforcement learning, (20 more...)

2411.0534

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > India > Uttarakhand > Roorkee (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Farfan-Escobedo, Jeanfranco D., Reis, Julio C. Dos

Improved intent classification based on context information using a windows-based approach

Conversational systems have a Natural Language Understanding (NLU) module. One task in this module, known as intent classification, aims to identify what a user is attempting to achieve from an utterance. Previous works use only the current utterance to predict the intent of a given query and do not consider the role of the context (one or a few previous utterances) in the dialog flow. In this work, we propose several approaches to investigate the role of contextual information in the intent classification task. Each approach concatenates the dialogue history with the current utterance. Our intent classification method is based on a convolutional neural network that obtains effective vector representations from BERT to perform accurate intent classification using a window-based approach. Our experiments were carried out on a real-world Brazilian Portuguese corpus with dialog flows provided by the company Wavy Global. Our results show substantial improvements over the baseline of isolated utterances (without context) in three approaches that use the user's utterances and the system's responses from previous messages as dialogue context.
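The concatenation of dialogue history with the current utterance can be sketched as follows. The window size, the separator string, and the function name are assumptions made for illustration; the abstract does not specify them.

```python
def build_windowed_input(history, current, window=2, sep=" [SEP] "):
    """Join the last `window` history turns with the current utterance.

    history: earlier turns (user and system), oldest first.
    window:  number of previous turns to keep; 0 reproduces the
             no-context baseline of classifying the utterance alone.
    """
    # history[-0:] would return the whole list, so handle 0 explicitly.
    context = history[-window:] if window > 0 else []
    return sep.join([*context, current])


history = ["I want to pay a bill", "Which account?", "The savings one"]
text = build_windowed_input(history, "Actually, cancel that", window=2)
```

The resulting string would then be passed to the encoder (BERT in the paper) in place of the isolated utterance.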

machine learning, natural language, utterance, (19 more...)

2411.06022

Country:

South America > Brazil > São Paulo > Campinas (0.05)
South America > Peru > Cusco Department > Cusco Province > Cusco (0.04)
North America > Central America (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

arXiv.org Machine Learning, Nov-8-2024

Towards Harmless Rawlsian Fairness Regardless of Demographic Prior

Wang, Xuanqian, Li, Jing, Tsang, Ivor W., Ong, Yew-Soon

Due to privacy and security concerns, recent advancements in group fairness advocate for model training regardless of demographic information. However, most methods still require prior knowledge of demographics. In this study, we explore the potential for achieving fairness without compromising utility when no prior demographic information is provided with the training set, namely harmless Rawlsian fairness. We ascertain that such a fairness requirement with no prior demographic information essentially pushes the training losses towards a Dirac delta distribution. To this end, we propose a simple but effective method named VFair to minimize the variance of training losses inside the optimal set of empirical losses. This problem is then optimized by a tailored dynamic update approach that operates in both loss and gradient dimensions, directing the model towards relatively fairer solutions while keeping its utility intact. Our experimental findings indicate that regression tasks, which are relatively unexplored in the literature, can achieve significant fairness improvement through VFair regardless of any prior, whereas classification tasks usually do not because of their quantized utility measurements. The implementation of our method is publicly available at https://github.com/wxqpxw/VFair.
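The quantity at the centre of the method is the variance of per-sample training losses. The sketch below computes it and folds it into a simple penalized objective. This is a deliberately simplified stand-in: the abstract describes a constrained problem solved with a dynamic update in loss and gradient space, not a fixed penalty, and the weight `lam` and the function names are hypothetical.

```python
from statistics import fmean, pvariance


def loss_mean_and_variance(losses):
    """Mean and population variance of per-sample training losses."""
    return fmean(losses), pvariance(losses)


def variance_penalized_objective(losses, lam=1.0):
    """Mean loss plus lam times the loss variance.

    A simplified penalty form used for illustration only; it is not the
    dynamic update scheme described in the abstract.
    """
    mean, var = loss_mean_and_variance(losses)
    return mean + lam * var


# Two hypothetical models with the same mean loss (same utility).
uniform_losses = [0.5, 0.5, 0.5, 0.5]  # every sample treated alike
skewed_losses = [0.0, 0.0, 1.0, 1.0]   # half the samples fit badly

uniform_obj = variance_penalized_objective(uniform_losses, lam=2.0)
skewed_obj = variance_penalized_objective(skewed_losses, lam=2.0)
```

With equal mean loss, the model whose losses are concentrated at a single value scores better, which mirrors the Dirac delta target mentioned in the abstract.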

artificial intelligence, data mining, machine learning, (19 more...)

2411.02467

Country:

Asia > Singapore (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California (0.04)
Asia > China (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (0.95)
Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)