AITopics

2410.00083

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

arXiv.org Artificial Intelligence Sep-30-2024

Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges

Liu, Qin, Mo, Wenjie, Tong, Terry, Xu, Jiashu, Wang, Fei, Xiao, Chaowei, Chen, Muhao

The advancement of Large Language Models (LLMs) has significantly impacted various domains, including Web search, healthcare, and software development. However, as these models scale, they become more vulnerable to cybersecurity risks, particularly backdoor attacks. By exploiting the potent memorization capacity of LLMs, adversaries can easily inject backdoors into LLMs by manipulating a small portion of training data, leading to malicious behaviors in downstream applications whenever the hidden backdoor is activated by the pre-defined triggers. Moreover, emerging learning paradigms like instruction tuning and reinforcement learning from human feedback (RLHF) exacerbate these risks as they rely heavily on crowdsourced data and human feedback, which are not fully controlled. In this paper, we present a comprehensive survey of emerging backdoor threats to LLMs that appear during LLM development or inference, and cover recent advances in both defense and detection strategies for mitigating backdoor threats to LLMs. We also outline key challenges in addressing these threats, highlighting areas for future research.

arxiv preprint arxiv, backdoor, language model, (14 more...)

2409.19993

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Nepal (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pham, Duy-Tung, Vu, Thien Trang Nguyen, Nguyen, Tung, Van, Linh Ngo, Nguyen, Duc Anh, Nguyen, Thien Huu

NeuroMax: Enhancing Neural Topic Modeling via Maximizing Mutual Information and Group Topic Regularization

Recent advances in neural topic models have concentrated on two primary directions: the integration of the inference network (encoder) with a pre-trained language model (PLM) and the modeling of the relationship between words and topics in the generative model (decoder). However, the use of large PLMs significantly increases inference costs, making them less practical for situations requiring low inference times. Furthermore, it is crucial to simultaneously model the relationships between topics and words as well as the interrelationships among topics themselves. In this work, we propose a novel framework called NeuroMax (Neural Topic Model with Maximizing Mutual Information with Pretrained Language Model and Group Topic Regularization) to address these challenges. NeuroMax maximizes the mutual information between the topic representation obtained from the encoder in neural topic models and the representation derived from the PLM. Additionally, NeuroMax employs optimal transport to learn the relationships between topics by analyzing how information is transported among them. Experimental results indicate that NeuroMax reduces inference time, generates more coherent topics and topic groups, and produces more representative document embeddings, thereby enhancing performance on downstream tasks.
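
The abstract does not say which mutual-information estimator NeuroMax uses. As an illustration only, the sketch below uses the InfoNCE contrastive loss, a common lower-bound objective for mutual information, and it should not be read as the authors' implementation. The function name `info_nce`, the `temperature` value, and the assumption that topic and PLM vectors have already been projected to the same dimension are all hypothetical.

```python
import math


def info_nce(topic_vecs, plm_vecs, temperature=0.1):
    """Mean InfoNCE loss; row i of both inputs describes the same document.

    Assumption: both representations already share one dimensionality.
    Minimizing this loss maximizes a lower bound on mutual information.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm

    losses = []
    for i, topic in enumerate(topic_vecs):
        # Similarity of this topic vector to every PLM vector in the batch.
        logits = [cosine(topic, plm) / temperature for plm in plm_vecs]
        # Negative log-softmax of the matching (positive) pair, computed stably.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Toy batch: matched pairs should score a lower loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairing gives a loss close to zero, while the shuffled pairing is penalized heavily, which is the behavior a mutual-information objective between encoder and PLM representations would reward.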

information, modeling, topic model, (15 more...)

2409.19749

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon (0.04)
(10 more...)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Hoque, Enamul, Islam, Mohammed Saidul

Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions

Natural language and visualization are two complementary modalities of human communication that play a crucial role in conveying information effectively. While visualizations help people discover trends, patterns, and anomalies in data, natural language descriptions help explain these insights. Thus, combining text with visualizations is a prevalent technique for effectively delivering the core message of the data. Given the rise of natural language generation (NLG), there is a growing interest in automatically creating natural language descriptions for visualizations, which can serve as chart captions, answer questions about charts, or tell data-driven stories. In this survey, we systematically review the state of the art on NLG for visualizations and introduce a taxonomy of the problem. The NLG tasks fall within the domain of Natural Language Interfaces (NLI) for visualization, an area that has garnered significant attention from both the research community and industry. To narrow down the scope of the survey, we primarily concentrate on the research works that focus on text generation for visualizations. To characterize the NLG problem and the design space of proposed solutions, we pose five Wh-questions: why and how NLG tasks are performed for visualizations, what the task inputs and outputs are, and where and when the generated texts are integrated with visualizations. We categorize the solutions used in the surveyed papers based on these "five Wh-questions." Finally, we discuss the key challenges and potential avenues for future research in this domain.

eurographic association and john wiley, proceedings, visualization, (9 more...)

2409.19747

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(12 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Media (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Peled-Cohen, Lotem, Reichart, Roi

A Systematic Review of NLP for Dementia - Tasks, Datasets and Opportunities

The close link between cognitive decline and language has fostered long-standing collaboration between the NLP and medical communities in dementia research. To examine this, we reviewed over 200 papers applying NLP to dementia-related efforts, drawing from medical, technological, and NLP-focused literature. We identify key research areas, including dementia detection, linguistic biomarker extraction, caregiver support, and patient assistance, showing that half of all papers focus solely on dementia detection using clinical data. However, many directions remain unexplored: artificially degraded language models, synthetic data, digital twins, and more. We highlight gaps and opportunities around trust, scientific rigor, applicability, and cross-community collaboration, and showcase the diverse datasets encountered throughout our review: recorded, written, structured, spontaneous, synthetic, clinical, social-media-based, and more. This review aims to inspire more creative approaches to dementia research within the medical and NLP communities.

alzheimer, dementia, detection, (16 more...)

2409.19737

Country:

North America > United States > Wisconsin (0.04)
Asia > Middle East > Israel (0.04)
Asia > China > Heilongjiang Province > Daqing (0.04)
Africa > Middle East > Morocco > Fès-Meknès Region > Fez (0.04)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology > Dementia (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(6 more...)

A Survey on Graph Neural Networks for Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends

Wang, Yucheng, Wu, Min, Li, Xiaoli, Xie, Lihua, Chen, Zhenghua

The prediction of Remaining Useful Life (RUL) is a critical component in the field of Prognostics and Health Management (PHM), which aims to predict the future state of a system to ensure timely maintenance and prevent unexpected failures (Wang, Xu, Li, Ren, Dong, Chen, Du, Wang, Shi and Zhang, 2024f; Karatzinis, Boutalis and Van Vaerenbergh, 2024; Zhang, Yuan, Jiang and Zhao, 2024b). Accurate RUL prediction enables predictive maintenance, which can significantly reduce downtime, improve safety, and optimize the lifecycle management of machinery and equipment. Additionally, effective RUL prediction can enhance decision-making processes, improve resource allocation, and reduce maintenance costs. In recent years, deep learning has become increasingly important in RUL prediction due to its ability to model complex patterns and dependencies, providing more accurate and reliable predictions compared to traditional methods, such as statistical approaches (Si, Wang, Hu and Zhou, 2011) and physics-based models (Lei, Li, Gontarz, Lin, Radkowski and Dybala, 2016; Sikorska, Hodkiewicz and Ma, 2011; Li, Zhang, Li and Si, 2024). Existing studies in RUL prediction have primarily focused on utilizing temporal encoders such as Temporal Convolutional Networks (TCN) (Qiu, Niu, Shang, Gao and Xu, 2023), Gated Recurrent Units (GRU), Convolutional Neural Networks (CNN) (Shang, Xu, Qiu, Gao, Jiang and Yi, 2024), and Long Short-Term Memory (LSTM) networks. These methods have achieved strong performance due to their ability to capture temporal information, which refers to the time-based patterns and sequences within the data, such as trends and periodic behaviors. However, they are not effective at capturing spatial information, which limits their performance in RUL prediction.

information, prediction, rul prediction, (14 more...)

2409.19629

Country:

Asia > Singapore (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Consumer Health (0.66)
Aerospace & Defense (0.46)
Information Technology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dhakal, Prakash, Baral, Daya Sagar

Abstractive Summarization of Low resourced Nepali language using Multilingual Transformers

Automatic text summarization in the Nepali language is an unexplored area in natural language processing (NLP). Although considerable research has been dedicated to extractive summarization, the area of abstractive summarization, especially for low-resource languages such as Nepali, remains largely unexplored. This study explores the use of multilingual transformer models, specifically mBART and mT5, for generating headlines for Nepali news articles through abstractive summarization. The research addresses key challenges associated with summarizing texts in Nepali by first creating a summarization dataset through web scraping from various Nepali news portals. These multilingual models were then fine-tuned using different strategies. The performance of the fine-tuned models was then assessed using ROUGE scores and human evaluation to ensure the generated summaries were coherent and conveyed the original meaning. During the human evaluation, the participants were asked to select the best summary among those generated by the models, based on criteria such as relevance, fluency, conciseness, informativeness, factual accuracy, and coverage. In the ROUGE evaluation, the 4-bit quantized mBART with LoRA model generated better Nepali news headlines than the other models. It was also selected 34.05% of the time during the human evaluation, outperforming all other fine-tuned models created for Nepali news headline generation.
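
For readers unfamiliar with the metric, ROUGE-1 can be sketched as clipped unigram overlap between a reference headline and a generated one. This is a minimal illustration, not the evaluation code used in the study: the function name `rouge_1` is hypothetical, tokenization is plain whitespace splitting (an assumption, with no stemming or language-specific handling), and the English example sentences are made up for demonstration.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """ROUGE-1 precision, recall, and F1 using whitespace tokens (a simplification)."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Clipped overlap: a token counts only as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Five of the six tokens match, so precision, recall, and F1 are all 5/6.
scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
```

Because whitespace splitting makes no assumption about the script, the same sketch runs unchanged on Devanagari text, although a real evaluation would normally rely on an established ROUGE implementation.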

dataset, evaluation, summarization, (15 more...)

2409.19566

Country:

Asia > Nepal (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search

Sun, Linzhuang, Liang, Hao, Wei, Jingxuan, Yu, Bihui, He, Conghui, Zhou, Zenan, Zhang, Wentao

Large Language Models (LLMs) have exhibited exceptional performance across a broad range of tasks and domains. However, they still encounter difficulties in solving mathematical problems due to the rigorous and logical nature of mathematics. Previous studies have employed techniques such as supervised fine-tuning (SFT), prompt engineering, and search-based methods to improve the mathematical problem-solving abilities of LLMs. Despite these efforts, performance remains suboptimal and these methods demand substantial computational resources. To address this issue, we propose a novel approach, BEATS, to enhance mathematical problem-solving abilities. Our method leverages newly designed prompts that guide the model to iteratively rewrite, advance by one step, and generate answers based on previous steps. Additionally, we employ a pruning tree search to optimize search time while achieving strong performance. Furthermore, we introduce a new back-verification technique that uses LLMs to validate the correctness of the generated answers. Notably, our method improves Qwen2-7b-Instruct's score from 36.94 to 61.52 (outperforming GPT-4's 42.5) on the MATH benchmark.

arxiv preprint arxiv, language model, reasoning, (14 more...)

2409.17972

Country:

Asia > China > Guangxi Province > Nanning (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Overview (0.66)
Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

FOX News Sep-28-2024, 10:00:12 GMT

Lionsgate's bold move into AI is about to change filmmaking forever

With this elaborate integration of AI, there is, of course, the fear that AI will take over or replace human talent. However, the recent collaboration between Lionsgate and Runway shows that it has actually been enhancing the process rather than diminishing creativity. Instead of replacing their human counterparts, these technologies are being used as tools to help humans cut down on time for specific tasks, which allows them to focus on the joy of creating. It also enables more creative approaches at a lower cost.

cyberguy, hollywood, lionsgate, (12 more...)

Genre: Overview > Innovation (0.36)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence (1.00)

Abdullah, Abdulhady Abas, Ahmed, Aram Mahmood, Rashid, Tarik, Veisi, Hadi, Rassul, Yassin Hussein, Hassan, Bryar, Fattah, Polla, Ali, Sabat Abdulhameed, Shamsaldin, Ahmed S.

Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods

arXiv.org Artificial Intelligence Sep-28-2024

Speech signal processing is a cornerstone of modern communication technologies, tasked with improving the clarity and comprehensibility of audio data in noisy environments. The primary challenge in this field is the effective separation and recognition of speech from background noise, crucial for applications ranging from voice-activated assistants to automated transcription services. The quality of speech recognition directly impacts user experience and accessibility in technology-driven communication. This review paper explores advanced clustering techniques, particularly focusing on the Kernel Fuzzy C-Means (KFCM) method, to address these challenges. Our findings indicate that KFCM, compared to traditional methods like K-Means (KM) and Fuzzy C-Means (FCM), provides superior performance in handling non-linear and non-stationary noise conditions in speech signals. The most notable outcome of this review is the adaptability of KFCM to various noisy environments, making it a robust choice for speech enhancement applications. Additionally, the paper identifies gaps in current methodologies, such as the need for more dynamic clustering algorithms that can adapt in real time to changing noise conditions without compromising speech recognition quality. Key contributions include a detailed comparative analysis of current clustering algorithms and suggestions for further integrating hybrid models that combine KFCM with neural networks to enhance speech recognition accuracy. Through this review, we advocate for a shift towards more sophisticated, adaptive clustering techniques that can significantly improve speech enhancement and pave the way for more resilient speech processing systems.
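
To make the contrast with K-Means concrete, the sketch below implements plain Fuzzy C-Means on scalar data: each point receives a graded membership in every cluster instead of a single hard label. It illustrates standard FCM only, not the kernel variant (KFCM) the review favors, and it is not taken from the paper. The function name `fuzzy_c_means_1d`, the fuzzifier `m=2.0`, the fixed iteration count, and the toy data are all assumptions for illustration.

```python
def fuzzy_c_means_1d(points, centers, m=2.0, iterations=50, eps=1e-12):
    """Plain Fuzzy C-Means on scalars.

    Returns the final centers and the memberships from the last membership step.
    """
    centers = list(centers)
    memberships = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership step: row j holds point j's degree of belonging to each cluster.
        memberships = []
        for x in points:
            # eps keeps the ratio finite when a point coincides with a center.
            dists = [abs(x - c) + eps for c in centers]
            row = [
                1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center step: weighted mean of the points with weights membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, memberships


# Two well-separated groups of toy values and two rough initial centers.
samples = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
final_centers, final_memberships = fuzzy_c_means_1d(samples, [1.0, 4.0])
```

Each membership row sums to one, so a sample lying between two clusters is shared between them rather than forced into one. A kernel variant would, roughly speaking, replace the absolute distance with a kernel-induced distance, which is the property the review credits for better handling of non-linear noise.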

artificial intelligence, fuzzy c-means, machine learning, (15 more...)

2409.19448

Country:

Asia > Middle East > Iraq > Erbil Governorate > Erbil (0.04)
Asia > Middle East > Iraq > Kurdistan Region (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)