AITopics

The dramatic increase in consumption of ultra-processed food has been associated with numerous adverse health effects. Given the public health consequences linked to ultra-processed food consumption, it is highly relevant to build computational models to predict the processing of food products. We created a range of machine learning, deep learning, and NLP models to predict the extent of food processing by integrating the FNDDS dataset of food products and their nutrient profiles with their reported NOVA processing level. Starting with the full nutritional panel of 102 features, we further implemented coarse-graining of features to 65 and 13 nutrients by dropping flavonoids and then by considering the FDA's 13-nutrient panel, respectively. LGBM Classifier and Random Forest emerged as the best models for 102 and 65 nutrients, respectively, with F1-scores of 0.9411 and 0.9345 and MCCs of 0.8691 and 0.8543. For the 13-nutrient panel, Gradient Boost achieved the best F1-score of 0.9284 and MCC of 0.8425. We also implemented NLP-based models, which exhibited state-of-the-art performance.
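The two metrics reported above can be illustrated with a minimal sketch. It assumes a binary labelling (1 = ultra-processed, 0 = not), which is a simplification: the source does not say how the scores were aggregated across NOVA classes. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc_from_counts(tp, fp, fn, tn):
    """Matthews correlation coefficient; defined as 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels (hypothetical, not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
f1 = f1_from_counts(counts[0], counts[1], counts[2])
mcc = mcc_from_counts(*counts)
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside F1 when classes are imbalanced.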

large language model, machine learning, nutrient, (17 more...)

2412.17217

Country:

North America > United States > North Carolina (0.04)
North America > Mexico (0.04)
North America > Guatemala (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Consumer Health (1.00)
Food & Agriculture (1.00)
Education > Health & Safety > School Nutrition (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Machine Learning, Dec-22-2024

Rethinking Cancer Gene Identification through Graph Anomaly Analysis

Zang, Yilong, Ren, Lingfei, Li, Yue, Wang, Zhikang, Selby, David Antony, Wang, Zheng, Vollmer, Sebastian Josef, Yin, Hongzhi, Song, Jiangning, Wu, Junhang

Graph neural networks (GNNs) have shown promise in integrating protein-protein interaction (PPI) networks for identifying cancer genes in recent studies. However, because the biological information in PPI networks is insufficiently modeled, a more faithful depiction of complex protein interaction patterns for cancer genes within the graph structure remains largely unexplored. This study takes a pioneering step toward bridging biological anomalies in protein interactions caused by cancer genes to statistical graph anomalies. We find a unique graph anomaly exhibited by cancer genes, namely weight heterogeneity, which manifests as significantly higher variance in edge weights of cancer gene nodes within the graph. Additionally, from the spectral perspective, we demonstrate that the weight heterogeneity could lead to the "flattening out" of spectral energy, with a concentration towards the extremes of the spectrum. Building on these insights, we propose the HIerarchical-Perspective Graph Neural Network (HIPGNN) that not only determines spectral energy distribution variations from the spectral perspective, but also perceives detailed protein interaction context from the spatial perspective. Extensive experiments are conducted on two reprocessed datasets, STRINGdb and CPDB, and the experimental results demonstrate the superiority of HIPGNN.
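The notion of weight heterogeneity described above can be made concrete with a small sketch that computes, for each node, the variance of the weights on its incident edges. It assumes an undirected weighted edge list and population variance; the abstract does not specify either choice. The node names and weights are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance

def edge_weight_variance(edges):
    """Return {node: population variance of its incident edge weights}.

    `edges` is an iterable of (u, v, weight) for an undirected graph.
    """
    incident = defaultdict(list)
    for u, v, w in edges:
        incident[u].append(w)
        incident[v].append(w)
    return {node: pvariance(ws) for node, ws in incident.items()}

# Hypothetical toy graph: node "G1" has very uneven interaction weights.
edges = [
    ("G1", "A", 0.9),
    ("G1", "B", 0.1),
    ("G1", "C", 0.5),
    ("A", "B", 0.5),
    ("B", "C", 0.5),
    ("A", "C", 0.5),
]
variance = edge_weight_variance(edges)
most_heterogeneous = max(variance, key=variance.get)
```

In this toy graph the node with the most uneven incident weights receives the highest score, which is the kind of signal the abstract attributes to cancer gene nodes.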

artificial intelligence, data mining, machine learning, (17 more...)

2412.17240

Country:

Oceania > Australia > Queensland (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Luong, Kevin, Thielscher, Michael

Hierarchically Gated Experts for Efficient Online Continual Learning

Continual Learning models aim to learn a set of tasks under the constraint that the tasks arrive sequentially with no way to access data from previous tasks. The Online Continual Learning framework poses a further challenge where the tasks are unknown and instead the data arrives as a single stream. Building on existing work, we propose a method for identifying these underlying tasks: the Gated Experts (GE) algorithm, where a dynamically growing set of experts allows for new knowledge to be acquired without catastrophic forgetting. Furthermore, we extend GE to Hierarchically Gated Experts (HGE), a method which is able to efficiently select the best expert for each data sample by organising the experts into a hierarchical structure. On standard Continual Learning benchmarks, GE and HGE are able to achieve results comparable with current methods, with HGE doing so more efficiently.
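The idea of a dynamically growing set of experts fed by a single data stream can be sketched as follows. This is only an illustration under strong assumptions: each "expert" is a running-mean prototype and the "gate" is a distance threshold, neither of which is stated in the abstract. The class name, function name, and threshold are hypothetical and do not describe the GE or HGE algorithms themselves.

```python
import math

class PrototypeExpert:
    """Hypothetical stand-in for an expert: a running mean of its samples."""

    def __init__(self, x):
        self.mean = list(x)
        self.count = 1

    def distance(self, x):
        return math.dist(self.mean, x)

    def update(self, x):
        # Incremental mean update; older samples never need to be revisited.
        self.count += 1
        self.mean = [m + (xi - m) / self.count for m, xi in zip(self.mean, x)]

def route_stream(stream, threshold):
    """Assign each sample to its nearest expert, or spawn a new expert."""
    experts, assignments = [], []
    for x in stream:
        best = min(range(len(experts)),
                   key=lambda i: experts[i].distance(x),
                   default=None)
        if best is None or experts[best].distance(x) > threshold:
            experts.append(PrototypeExpert(x))  # grow the expert set
            best = len(experts) - 1
        else:
            experts[best].update(x)
        assignments.append(best)
    return experts, assignments

# Two interleaved hypothetical "tasks" arriving as one stream.
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0), (5.0, 5.2)]
experts, assignments = route_stream(stream, threshold=1.0)
```

Selecting the nearest expert here scans every expert linearly; organising experts hierarchically, as the abstract describes for HGE, is aimed at making that selection step cheaper.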

artificial intelligence, continual learning, machine learning, (16 more...)

2412.17188

Country:

North America (0.46)
Oceania > Australia (0.28)

Genre:

Instructional Material > Online (0.62)
Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective

Wang, Hankun, Wang, Haoran, Guo, Yiwei, Li, Zhihan, Du, Chenpeng, Chen, Xie, Yu, Kai

Although text-based large language models exhibit human-level writing ability and remarkable intelligence, speech language models (SLMs) still struggle to generate semantically coherent outputs. There are several potential reasons for this performance degradation: (A) speech tokens mainly provide phonetic information rather than semantic information, (B) the length of speech sequences is much longer than that of text sequences, and (C) paralinguistic information, such as prosody, introduces additional complexity and variability. In this paper, we explore the influence of these three key factors separately by transitioning the modality from text to speech in an evolving manner. Our findings reveal that the impact of the three factors varies. Factor A has a relatively minor impact, factor B influences syntactical and semantic modeling more obviously, and factor C exerts the most significant impact, particularly in basic lexical modeling. Based on these findings, we provide insights into the unique challenges of training SLMs and highlight pathways to develop more effective end-to-end SLMs.

large language model, machine learning, natural language, (19 more...)

2412.17048

Country:

Europe > Spain (0.28)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Rathnayake, Charitha, Thilakarathna, P. R. S., Nethmini, Uthpala, Kaur, Rishemjith, Ranathunga, Surangika

Unsupervised Bilingual Lexicon Induction for Low Resource Languages

Bilingual lexicons play a crucial role in various Natural Language Processing tasks. However, many low-resource languages (LRLs) do not have such lexicons and, for the same reason, cannot benefit from supervised Bilingual Lexicon Induction (BLI) techniques. To address this, unsupervised BLI (UBLI) techniques were introduced. A prominent technique in this line is structure-based UBLI. It is an iterative method in which a seed lexicon, initially learned from monolingual embeddings, is iteratively improved. There have been numerous improvements to this core idea; however, they have been evaluated independently of each other. In this paper, we investigate whether using these techniques simultaneously would lead to equal gains. We use the unsupervised version of VecMap, a commonly used structure-based UBLI framework, and carry out a comprehensive set of experiments using the LRL pairs English-Sinhala, English-Tamil, and English-Punjabi. These experiments helped us identify the best combination of the extensions. We also release bilingual dictionaries for English-Sinhala and English-Punjabi.
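The iterative refinement of a seed lexicon described above can be sketched with a deliberately tiny example. It assumes 2-D unit-length embeddings, a mapping restricted to a rotation (the 2-D case of an orthogonal map), and a seed lexicon that is simply given rather than learned from monolingual embeddings. This is not VecMap's implementation; all function names and the toy data are hypothetical.

```python
import math

def fit_rotation(src, tgt, pairs):
    """Least-squares rotation angle aligning src[i] with tgt[j] for (i, j) in pairs."""
    a = sum(src[i][0] * tgt[j][0] + src[i][1] * tgt[j][1] for i, j in pairs)
    b = sum(src[i][0] * tgt[j][1] - src[i][1] * tgt[j][0] for i, j in pairs)
    return math.atan2(b, a)

def rotate(v, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def induce(src, tgt, theta):
    """Pair every source word with the target word of highest dot product."""
    pairs = []
    for i, v in enumerate(src):
        r = rotate(v, theta)
        j = max(range(len(tgt)), key=lambda k: r[0] * tgt[k][0] + r[1] * tgt[k][1])
        pairs.append((i, j))
    return pairs

def self_learn(src, tgt, seed, max_iters=10):
    """Alternate between fitting the mapping and re-inducing the lexicon."""
    pairs, theta = list(seed), 0.0
    for _ in range(max_iters):
        theta = fit_rotation(src, tgt, pairs)
        new_pairs = induce(src, tgt, theta)
        if new_pairs == pairs:  # lexicon stopped changing
            break
        pairs = new_pairs
    return theta, pairs

def unit(deg):
    r = math.radians(deg)
    return (math.cos(r), math.sin(r))

# Hypothetical toy spaces: the target space is the source rotated by 30 degrees.
ANGLES = (0, 50, 120, 200, 290)
SRC = [unit(d) for d in ANGLES]
TGT = [unit(d + 30) for d in ANGLES]
theta, lexicon = self_learn(SRC, TGT, seed=[(0, 0), (1, 1)])
```

Starting from two seed pairs, the loop recovers the rotation and extends the lexicon to all five words; with real embeddings the fit would be a full orthogonal map and the seed would be noisy.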

bilingual lexicon induction, machine learning, natural language, (14 more...)

2412.16894

Country:

Asia (0.46)
Oceania (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Guardian, Dec-21-2024, 20:33:31 GMT

OpenAI whistleblower who died was being considered as witness against company

Balaji worked at OpenAI for nearly four years before quitting in August. He had been well-regarded by colleagues at the San Francisco company, where a co-founder this week called him one of OpenAI's strongest contributors who was essential to developing some of its products. "We are devastated to learn of this incredibly sad news and our hearts go out to Suchir's loved ones during this difficult time," said a statement from OpenAI. Balaji was found dead in his San Francisco apartment on 26 November in what police said "appeared to be a suicide. No evidence of foul play was found during the initial investigation."

balaji, large language model, machine learning, (17 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.49)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.05)
Oceania > Australia (0.05)
(5 more...)

Genre: Research Report (0.35)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Pranto, Rotan Hawlader, Siddique, Shahnewaz

Real-time Bangla Sign Language Translator

The human body communicates through various meaningful gestures, with sign language using hands being a prominent example. Bangla Sign Language Translation (BSLT) aims to bridge communication gaps for the deaf and mute community. Our approach involves using Mediapipe Holistic to gather key points, an LSTM architecture for training, and Computer Vision for real-time sign language detection with an accuracy of 94%. Keywords: Recurrent Neural Network, LSTM, Computer Vision, Bangla font.

artificial intelligence, machine learning, recognition, (14 more...)

2412.16497

Country:

Asia > China > Anhui Province > Hefei (0.05)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
Asia > China > Tianjin Province > Tianjin (0.05)
(8 more...)

Genre: Research Report (0.64)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GME: Improving Universal Multimodal Retrieval by Multimodal LLMs

Zhang, Xin, Zhang, Yanzhao, Xie, Wen, Li, Mingxin, Dai, Ziqi, Long, Dingkun, Xie, Pengjun, Zhang, Meishan, Li, Wenjie, Zhang, Min

Universal Multimodal Retrieval (UMR) aims to enable search across various modalities using a unified model, where queries and candidates can consist of pure text, images, or a combination of both. Previous work has attempted to adopt multimodal large language models (MLLMs) to realize UMR using only text data. However, our preliminary experiments demonstrate that more diverse multimodal training data can further unlock the potential of MLLMs. Despite its effectiveness, the existing multimodal training data is highly imbalanced in terms of modality, which motivates us to develop a training data synthesis pipeline and construct a large-scale, high-quality fused-modal training dataset. Based on the synthetic training data, we develop the General Multimodal Embedder (GME), an MLLM-based dense retriever designed for UMR. Furthermore, we construct a comprehensive UMR Benchmark (UMRB) to evaluate the effectiveness of our approach. Experimental results show that our method achieves state-of-the-art performance among existing UMR methods. Finally, we provide in-depth analyses of model scaling and training strategies, and perform ablation studies on both the model and the synthetic data.
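The retrieval step of a dense retriever can be sketched as ranking candidates by similarity to the query in a shared embedding space. The sketch below assumes the embeddings have already been produced by some encoder and uses cosine similarity as the scoring function; the abstract specifies neither. The vectors and candidate names are hand-made and hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, candidates):
    """Return candidate names sorted from most to least similar to the query."""
    return sorted(candidates,
                  key=lambda name: cosine(query, candidates[name]),
                  reverse=True)

# Hypothetical embeddings for a text, an image, and a fused text+image candidate.
candidates = {
    "text_doc": [1.0, 0.0, 0.0],
    "image_doc": [0.9, 0.1, 0.0],
    "fused_doc": [0.0, 1.0, 0.0],
}
ranking = rank([1.0, 0.0, 0.0], candidates)
```

Because every modality is mapped into the same vector space, one ranking function can serve text, image, and fused candidates alike.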

large language model, machine learning, natural language, (20 more...)

2412.16855

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > United States > Florida > Miami-Dade County > Miami (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(43 more...)

Genre: Research Report > New Finding (0.65)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Autoregressive Speech Synthesis with Next-Distribution Prediction

Zhu, Xinfa, Tian, Wenjie, Xie, Lei

We introduce KALL-E, a novel autoregressive (AR) language modeling approach with next-distribution prediction for text-to-speech (TTS) synthesis. Unlike existing methods, KALL-E directly models and predicts the continuous speech distribution conditioned on text without relying on VAE- or diffusion-based components. Specifically, we use WaveVAE to extract continuous speech distributions from waveforms instead of using discrete speech tokens. A single AR language model predicts these continuous speech distributions from text, with a Kullback-Leibler divergence loss as the constraint. Experimental results show that KALL-E outperforms open-source implementations of YourTTS, VALL-E, NaturalSpeech 2, and CosyVoice in terms of naturalness and speaker similarity in zero-shot TTS scenarios. Moreover, KALL-E demonstrates exceptional zero-shot capabilities in emotion and accent cloning. Importantly, KALL-E presents a more straightforward and effective paradigm for using continuous speech representations in TTS. Audio samples are available at: https://zxf-icpc.github.io/kalle/.
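A Kullback-Leibler divergence loss between a predicted and a target distribution can be sketched in closed form if both are assumed to be Gaussians with diagonal covariance. That distribution family and the direction of the divergence are assumptions made for illustration; the abstract does not state them. The function name and the example values are hypothetical.

```python
import math

def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for diagonal Gaussians P = N(mu_p, sigma_p^2), Q = N(mu_q, sigma_q^2).

    Per dimension: log(sigma_q / sigma_p)
                   + (sigma_p^2 + (mu_p - mu_q)^2) / (2 * sigma_q^2) - 1/2
    """
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total

# Identical distributions give zero divergence.
kl_same = gaussian_kl([0.0, 1.0], [1.0, 2.0], [0.0, 1.0], [1.0, 2.0])
# Unit-variance Gaussians whose means differ by 1 in one dimension.
kl_shifted = gaussian_kl([0.0], [1.0], [1.0], [1.0])
```

The divergence is zero only when the two distributions coincide, so minimising it pulls the predicted mean and standard deviation toward the target ones.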

large language model, machine learning, natural language, (20 more...)

2412.16846

Country:

Europe > Austria > Vienna (0.15)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(18 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.93)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)

Fröhling, Leon, Bernardelle, Pietro, Demartini, Gianluca

SubData: A Python Library to Collect and Combine Datasets for Evaluating LLM Alignment on Downstream Tasks

With the release of ever more capable large language models (LLMs), researchers in NLP and related disciplines have started to explore the usability of LLMs for a wide variety of different annotation tasks. Very recently, a lot of this attention has shifted to tasks that are subjective in nature. Given that the latest generations of LLMs have digested and encoded extensive knowledge about different human subpopulations and individuals, the hope is that these models can be trained, tuned or prompted to align with a wide range of different human perspectives. While researchers already evaluate the success of this alignment via surveys and tests, there is a lack of resources to evaluate the alignment on what oftentimes matters the most in NLP: the actual downstream tasks. To fill this gap we present SubData, a Python library that offers researchers working on topics related to subjectivity in annotation tasks a convenient way of collecting, combining and using a range of suitable datasets.

artificial intelligence, large language model, natural language, (16 more...)

2412.16783

Country: Oceania > Australia > Queensland (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)