AITopics

2410.12883

Country:

Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceDec-3-2024

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

Kirk, Hannah Rose, Whitefield, Alexander, Röttger, Paul, Bean, Andrew, Margatina, Katerina, Ciro, Juan, Mosquera, Rafael, Bartolo, Max, Williams, Adina, He, He, Vidgen, Bertie, Hale, Scott A.

Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people (who) and objectives (to what end) of feedback processes. To navigate these questions, we introduce PRISM, a dataset that maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs. With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts. We target subjective and multicultural perspectives on value-laden and controversial issues, where we expect interpersonal and cross-cultural disagreement. We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.

idiosyncratic variance, individualised human feedback reveal, transition probability, (17 more...)

2404.16019

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Russia (0.14)
(98 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-3-2024

CultureLLM: Incorporating Cultural Differences into Large Language Models

Li, Cheng, Chen, Mengzhou, Wang, Jindong, Sitaram, Sunayana, Xie, Xing

Large language models (LLMs) are reported to be partial to certain cultures owing to the training data dominance from the English corpora. Since multilingual cultural data are often expensive to collect, existing efforts handle this by prompt engineering or culture-specific pre-training. However, they might overlook the knowledge deficiency of low-resource culture and require extensive computing resources. In this paper, we propose CultureLLM, a cost-effective solution to incorporate cultural differences into LLMs. CultureLLM adopts World Value Survey (WVS) as seed data and generates semantically equivalent training data via the proposed semantic data augmentation. Using only 50 seed samples from WVS with augmented data, we fine-tune culture-specific LLMs and one unified model (CultureLLM-One) for 9 cultures covering rich and low-resource languages. Extensive experiments on 60 culture-related datasets demonstrate that CultureLLM significantly outperforms various counterparts such as GPT-3.5 (by 8.1%) and Gemini Pro (by 9.5%) with comparable performance to GPT-4 or even better. Our human study shows that the generated samples are semantically equivalent to the original samples, providing an effective solution for LLMs augmentation. Code is released at https://github.com/Scarelette/CultureLLM.

culturellm, dataset, detection, (13 more...)

2402.10946

Country:

Asia > Middle East > Republic of Türkiye (0.14)
Europe > Portugal (0.04)
Asia > China (0.04)
(36 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Media > News (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningDec-3-2024

The Broader Landscape of Robustness in Algorithmic Statistics

Kamath, Gautam

Mean estimation is one of the most fundamental statistical tasks: given samples from a probability distribution, output an estimate of that distribution's mean. It is the prototypical question in statistical inference, and an important primitive that underlies a variety of more complex procedures (e.g., gradient descent, linear regression, etc.). In other words, understanding mean estimation is a prerequisite for understanding essentially any other inference task. As we will see, even in this basic setting, introducing new constraints or desiderata significantly affects the algorithmic ideas needed to solve the problem, particularly with a focus on computational efficiency. More precisely, mean estimation refers to the following statistical problem.

algorithm, estimation, mean estimation, (17 more...)

arXiv.org Machine Learning

2412.0267

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Tenorio, Victor M., Navarro, Madeline, Rey, Samuel, Segarra, Santiago, Marques, Antonio G.

Structure-Guided Input Graph for GNNs facing Heterophily

Graph Neural Networks (GNNs) have emerged as a promising tool to handle data exhibiting an irregular structure. However, most GNN architectures perform well on homophilic datasets, where the labels of neighboring nodes are likely to be the same. In recent years, an increasing body of work has been devoted to the development of GNN architectures for heterophilic datasets, where labels do not exhibit this low-pass behavior. In this work, we create a new graph in which nodes are connected if they share structural characteristics, meaning a higher chance of sharing their labels, and then use this new graph in the GNN architecture. To do this, we compute the k-nearest neighbors graph according to distances between structural features, which are either (i) role-based, such as degree, or (ii) global, such as centrality measures. Experiments show that the labels are smoother in this newly defined graph and that the performance of GNN architectures improves when using this alternative structure.

artificial intelligence, graph, machine learning, (18 more...)

2412.01757

Country:

North America > United States > Texas (0.06)
North America > United States > Wisconsin (0.05)
Europe > Spain > Galicia > Madrid (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Garg, Mallika, Ghosh, Debashis, Pradhan, Pyari Mohan

ConvMixFormer- A Resource-efficient Convolution Mixer for Transformer-based Dynamic Hand Gesture Recognition

Transformer models have demonstrated remarkable success in many domains such as natural language processing (NLP) and computer vision. With the growing interest in transformer-based architectures, they are now utilized for gesture recognition. So, we also explore and devise a novel ConvMixFormer architecture for dynamic hand gestures. The transformers use quadratic scaling of the attention features with the sequential data, due to which these models are computationally complex and heavy. We have considered this drawback of the transformer and designed a resource-efficient model that replaces the self-attention in the transformer with the simple convolutional layer-based token mixer. The computational cost and the parameters used for the convolution-based mixer are comparatively less than the quadratic self-attention. Convolution-mixer helps the model capture the local spatial features that self-attention struggles to capture due to their sequential processing nature. Further, an efficient gate mechanism is employed instead of a conventional feed-forward network in the transformer to help the model control the flow of features within different stages of the proposed model. This design uses fewer learnable parameters which is nearly half the vanilla transformer that helps in fast and efficient training. The proposed method is evaluated on NVidia Dynamic Hand Gesture and Briareo datasets and our model has achieved state-of-the-art results on single and multimodal inputs. We have also shown the parameter efficiency of the proposed ConvMixFormer model compared to other methods. The source code is available at https://github.com/mallikagarg/ConvMixFormer.

machine learning, natural language, recognition, (18 more...)

2411.07118

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(13 more...)

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

NLLG Quarterly arXiv Report 09/24: What are the most influential current AI Papers?

Leiter, Christoph, Belouadi, Jonas, Chen, Yanran, Zhang, Ran, Larionov, Daniil, Kostikova, Aida, Eger, Steffen

The NLLG (Natural Language Learning & Generation) arXiv reports assist in navigating the rapidly evolving landscape of NLP and AI research across cs.CL, cs.CV, cs.AI, and cs.LG categories. This fourth installment captures a transformative period in AI history - from January 1, 2023, following ChatGPT's debut, through September 30, 2024. Our analysis reveals substantial new developments in the field - with 45% of the top 40 most-cited papers being new entries since our last report eight months ago and offers insights into emerging trends and major breakthroughs, such as novel multimodal architectures, including diffusion and state space models. Natural Language Processing (NLP; cs.CL) remains the dominant main category in the list of our top-40 papers but its dominance is on the decline in favor of Computer vision (cs.CV) and general machine learning (cs.LG). This report also presents novel findings on the integration of generative AI in academic writing, documenting its increasing adoption since 2022 while revealing an intriguing pattern: top-cited papers show notably fewer markers of AI-generated content compared to random samples. Furthermore, we track the evolution of AI-associated language, identifying declining trends in previously common indicators such as "delve".

large language model, machine learning, natural language, (22 more...)

2412.12121

Country:

Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
(6 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Government (0.46)
Leisure & Entertainment (0.35)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Grunewald, Marla, Bensalem, Mounir, Jukan, Admela

Optimizing LoRa for Edge Computing with TinyML Pipeline for Channel Hopping

We propose to integrate long-distance LongRange (LoRa) communication solution for sending the data from IoT to the edge computing system, by taking advantage of its unlicensed nature and the potential for open source implementations that are common in edge computing. We propose a channel hoping optimization model and apply TinyML-based channel hoping model based for LoRa transmissions, as well as experimentally study a fast predictive algorithm to find free channels between edge and IoT devices. In the open source experimental setup that includes LoRa, TinyML and IoT-edge-cloud continuum, we integrate a novel application workflow and cloud-friendly protocol solutions in a case study of plant recommender application that combines concepts of microfarming and urban computing. In a LoRa-optimized edge computing setup, we engineer the application workflow, and apply collaborative filtering and various machine learning algorithms on application data collected to identify and recommend the planting schedule for a specific microfarm in an urban area. In the LoRa experiments, we measure the occurrence of packet loss, RSSI, and SNR, using a random channel hoping scheme to compare with our proposed TinyML method. The results show that it is feasible to use TinyML in microcontrollers for channel hopping, while proving the effectiveness of TinyML in learning to predict the best channel to select for LoRa transmission, and by improving the RSSI by up to 63 %, SNR by up to 44 % in comparison with a random hopping mechanism.

cloud computing, gateway, machine learning, (18 more...)

2412.01609

Country:

South America > Colombia > Antioquia Department > Medellín (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Bas, Maria C., Bolos, Vicente J., Prieto, Alvaro E., Rodriguez-Echeverria, Roberto, Sanchez-Figueroa, Fernando

A multi-criteria decision support system to evaluate the effectiveness of training courses on citizens' employability

This study examines the impact of lifelong learning on the professional lives of employed and unemployed individuals. Lifelong learning is a crucial factor in securing employment or enhancing one's existing career prospects. To achieve this objective, this study proposes the implementation of a multi-criteria decision support system for the evaluation of training courses in accordance with their capacity to enhance the employability of the students. The methodology is delineated in four stages. Firstly, a `working life curve' was defined to provide a quantitative description of an individual's working life. Secondly, an analysis based on K-medoids clustering defined a control group for each individual for comparison. Thirdly, the performance of a course according to each of the four predefined criteria was calculated using a t-test to determine the mean performance value of those who took the course. Ultimately, the unweighted TOPSIS method was used to evaluate the efficacy of the various training courses in relation to the four criteria. This approach effectively addresses the challenge of using extensive datasets within a system while facilitating the application of a multi-criteria unweighted TOPSIS method. The results of the multi-criteria TOPSIS method indicated that training courses related to the professional fields of administration and management, hostel and tourism and community and sociocultural services have positive impact on employability and improving the working conditions of citizens. However, courses that demonstrate the greatest effectiveness in ranking are the least demanded by citizens. The results will help policymakers evaluate the effectiveness of each training course offered by the regional government.

artificial intelligence, decision support system, machine learning, (17 more...)

doi: 10.1007/s10489-024-05967-0

2412.01351

Country:

Europe > Austria > Vienna (0.14)
Europe > Spain > Extremadura (0.05)
South America > Colombia > Santander Department (0.04)
(9 more...)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Government (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Seal, Daniel, Arcucci, Rossella, Rühling-Cachay, Salva, Quilodrán-Casas, César

DYffCast: Regional Precipitation Nowcasting Using IMERG Satellite Data. A case study over South America

Climate change is increasing the frequency of extreme precipitation events, making weather disasters such as flooding and landslides more likely. The ability to accurately nowcast precipitation is therefore becoming more critical for safeguarding society by providing immediate, accurate information to decision makers. Motivated by the recent success of generative models at precipitation nowcasting, this paper: extends the DYffusion framework to this task and evaluates its performance at forecasting IMERG satellite precipitation data up to a 4-hour horizon; modifies the DYffusion framework to improve its ability to model rainfall data; and introduces a novel loss function that combines MSE, MAE and the LPIPS perceptual score. In a quantitative evaluation of forecasts up to a 4-hour horizon, the modified DYffusion framework trained with the novel loss outperforms four competitor models. It has the highest CSI scores for weak, moderate, and heavy rain thresholds and retains an LPIPS score $<$ 0.2 for the entire roll-out, degrading the least as lead-time increases. The proposed nowcasting model demonstrates visually stable and sharp forecasts up to a 2-hour horizon on a heavy rain case study. Code is available at https://github.com/Dseal95/DYffcast.

artificial intelligence, machine learning, precipitation, (17 more...)

2412.02723

Country:

South America > Colombia (0.14)
South America > Peru (0.05)
North America > Central America (0.05)
(5 more...)

Genre: Research Report (0.51)

Industry: Government (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)