AITopics | Antarctica

Antarctica

WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing

Hu, Chenhui, Cao, Pengfei, Chen, Yubo, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence, Jun-5-2024

Knowledge editing aims to rectify inaccuracies in large language models (LLMs) without costly retraining for outdated or erroneous knowledge. However, current knowledge editing methods primarily focus on single editing, failing to meet the requirements for lifelong editing. This study reveals a performance degradation encountered by knowledge editing in lifelong editing, characterized by toxicity buildup and toxicity flash, with the primary cause identified as pattern unmatch. We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects the editing layer based on the pattern matching degree of editing knowledge across different layers in language models. Experimental results demonstrate that, in lifelong editing, WilKE exhibits an average improvement of 46.2% and 67.8% on editing GPT2-XL and GPT-J relative to state-of-the-art knowledge editing methods.

editing, strength, toxicity, (15 more...)

arXiv.org Artificial Intelligence

2402.10987

Country:

Antarctica (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse

Yang, Wanli, Sun, Fei, Ma, Xinyu, Liu, Xun, Yin, Dawei, Cheng, Xueqi

arXiv.org Artificial Intelligence, Jun-5-2024

Although model editing has shown promise in revising knowledge in Large Language Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this work, we reveal a critical phenomenon: even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. However, benchmarking LLMs after each edit, while necessary to prevent such collapses, is impractically time-consuming and resource-intensive. To mitigate this, we propose using perplexity as a surrogate metric, validated by extensive experiments demonstrating that changes in an edited model's perplexity are strongly correlated with its downstream task performance. We further conduct an in-depth study on sequential editing, a practical setting for real-world scenarios, across various editing methods and LLMs, focusing on hard cases from our previous single-edit studies. The results indicate that nearly all examined editing methods result in model collapse after only a few edits. To facilitate further research, we have utilized GPT-3.5 to develop a new dataset, HardEdit, based on those hard cases. This dataset aims to establish the foundation for pioneering research in reliable model editing and the mechanisms underlying editing-induced model collapse. We hope this work can draw the community's attention to the potential risks inherent in model editing practices.
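
As a rough illustration of the surrogate metric named in this abstract, perplexity is the exponential of the average negative log-likelihood per token. The sketch below assumes per-token natural-log probabilities have already been obtained from some model. The function name `perplexity` and the sample values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative log-probability (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical log-probabilities for the same text before and after an edit.
before_edit = [math.log(0.25)] * 4
after_edit = [math.log(0.01)] * 4

print(perplexity(before_edit))
print(perplexity(after_edit))
```

Under the abstract's claim, a sharp rise in this value after an edit would be a cheap warning sign of degraded downstream performance, checked without running a full benchmark suite.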

dataset, editing, perplexity, (16 more...)

arXiv.org Artificial Intelligence

2402.09656

Country:

Europe > France (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > Canada > Ontario > Toronto (0.04)
(20 more...)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Predicting the Geothermal Gradient in Colombia: a Machine Learning Approach

Mejía-Fragoso, Juan Camilo, Florez, Manuel A., Bernal-Olaya, Rocío

arXiv.org Artificial Intelligence, Jun-5-2024

Accurate determination of the geothermal gradient is critical for assessing the geothermal energy potential of a given region. Of particular interest is the case of Colombia, a country with abundant geothermal resources. A history of active oil and gas exploration and production has left drilled boreholes in different geological settings, providing direct measurements of the geothermal gradient. Unfortunately, large regions of the country where geothermal resources might exist lack such measurements. Indirect geophysical measurements are costly and difficult to perform at regional scales. Computational thermal models could be constructed, but they require very detailed knowledge of the underlying geology and uniform sampling of subsurface temperatures to be well-constrained. We present an alternative approach that leverages recent advances in supervised machine learning and available direct measurements to predict the geothermal gradient in regions where only global-scale geophysical datasets and coarse geological knowledge are available. We find that a Gradient Boosted Regression Tree algorithm yields optimal predictions and extensively validate the trained model. We show that predictions of our model are within 12% accuracy and that independent measurements performed by other authors agree well with our model. Finally, we present a geothermal gradient map for Colombia that highlights regions where further exploration and data collection should be performed.

colombia, geothermal gradient, gradient, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.geothermics.2024.103074

2404.05184

Country:

North America > Panama (0.14)
North America > United States > Texas (0.14)
Antarctica (0.04)
(15 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development (1.00)
Energy > Renewable > Geothermal > Geothermal Resource Type (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)

Symmetry Discovery Beyond Affine Transformations

Shaw, Ben, Magner, Abram, Moon, Kevin R.

arXiv.org Machine Learning, Jun-5-2024

Symmetry detection has been shown to improve various machine learning tasks. In the context of continuous symmetry detection, current state-of-the-art experiments are limited to the detection of affine transformations. Under the manifold assumption, we outline a framework for discovering continuous symmetry in data beyond the affine transformation group. We also provide a similar framework for discovering discrete symmetry. We experimentally compare our method to an existing method known as LieGAN and show that our method is competitive at detecting affine symmetries for large sample sizes and superior to LieGAN for small sample sizes. We also show our method is able to detect continuous symmetries beyond the affine group and is generally more computationally efficient than LieGAN.

estimation, symmetry, vector field, (14 more...)

arXiv.org Machine Learning

2406.03619

Country:

North America > United States > Utah > Cache County > Logan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Albany County > Albany (0.04)
Antarctica (0.04)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Modelling Commonsense Commonalities with Multi-Facet Concept Embeddings

Kteich, Hanane, Li, Na, Chatterjee, Usashi, Bouraoui, Zied, Schockaert, Steven

arXiv.org Artificial Intelligence, Jun-4-2024

Concept embeddings offer a practical and efficient mechanism for injecting commonsense knowledge into downstream tasks. Their core purpose is often not to predict the commonsense properties of concepts themselves, but rather to identify commonalities, i.e. sets of concepts which share some property of interest. Such commonalities are the basis for inductive generalisation, hence high-quality concept embeddings can make learning easier and more robust. Unfortunately, standard embeddings primarily reflect basic taxonomic categories, making them unsuitable for finding commonalities that refer to more specific aspects (e.g. the colour of objects or the materials they are made of). In this paper, we address this limitation by explicitly modelling the different facets of interest when learning concept embeddings. We show that this leads to embeddings which capture a more diverse range of commonsense properties, and consistently improves results in downstream tasks such as ultra-fine entity typing and ontology completion.

computational linguistic, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2403.16984

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
(22 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

nn2poly: An R Package for Converting Neural Networks into Interpretable Polynomials

Morala, Pablo, Cifuentes, Jenny Alexandra, Lillo, Rosa E., Ucar, Iñaki

arXiv.org Machine Learning, Jun-3-2024

The nn2poly package provides the implementation in R of the NN2Poly method to explain and interpret feed-forward neural networks by means of polynomial representations that predict in a manner equivalent to the original network. Through the obtained polynomial coefficients, the effect and importance of each variable and their interactions on the output can be represented. This capability of capturing interactions is a key aspect usually missing from most Explainable Artificial Intelligence (XAI) methods, especially if they rely on expensive computations that can be amplified when used on large neural networks. The package provides integration with the main deep learning framework packages in R (tensorflow and torch), allowing a user-friendly application of the NN2Poly algorithm. Furthermore, nn2poly provides implementation of the required weight constraints to be used during the network training in those same frameworks. Other neural network packages can also be used by including their weights in list format. Polynomials obtained with nn2poly can also be used to predict with new data or be visualized through its own plot method. Simulations are provided exemplifying the usage of the package alongside a comparison with other approaches available in R to interpret neural networks.

coefficient, neural network, polynomial, (14 more...)

arXiv.org Machine Learning

2406.01588

Country:

Europe > Austria > Vienna (0.14)
Europe > Spain > Galicia > Madrid (0.05)
North America > United States > Texas (0.04)
Antarctica (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Long-term foehn reconstruction combining unsupervised and supervised learning

Stauffer, Reto, Zeileis, Achim, Mayr, Georg J.

arXiv.org Machine Learning, Jun-3-2024

Foehn winds, characterized by abrupt temperature increases and wind speed changes, significantly impact regions on the leeward side of mountain ranges, e.g., by spreading wildfires. Understanding how foehn occurrences change under climate change is crucial. Unfortunately, foehn cannot be measured directly but has to be inferred from meteorological measurements employing suitable classification schemes. Hence, this approach is typically limited to specific periods for which the necessary data are available. We present a novel approach for reconstructing historical foehn occurrences using a combination of unsupervised and supervised probabilistic statistical learning methods. We utilize in-situ measurements (available for recent decades) to train an unsupervised learner (finite mixture model) for automatic foehn classification. These labeled data are then linked to reanalysis data (covering longer periods) using a supervised learner (lasso or boosting). This allows past foehn probabilities to be reconstructed based solely on reanalysis data. Applying this method to ERA5 reanalysis data for six stations across Switzerland and Austria achieves accurate hourly reconstructions of north and south foehn occurrence, respectively, dating back to 1940. This paves the way for investigating how seasonal foehn patterns have evolved over the past 83 years, providing valuable insights into climate change impacts on these critical wind events.

foehn, probability, reconstruction, (14 more...)

arXiv.org Machine Learning

2406.01818

Country:

North America > United States > California (0.14)
Europe > Austria > Tyrol > Innsbruck (0.07)
North America > United States > Montana (0.05)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

Li, Siyuan, Wang, Zedong, Liu, Zicheng, Wu, Di, Tan, Cheng, Zheng, Jiangbin, Huang, Yufei, Li, Stan Z.

arXiv.org Artificial Intelligence, Jun-2-2024

Similar to natural language models, pre-trained genome language models are proposed to capture the underlying intricacies within genomes with unsupervised sequence modeling. They have become essential tools for researchers and practitioners in biology. However, the hand-crafted tokenization policies used in these models may not encode the most discriminative patterns from the limited vocabulary of genomic data. In this paper, we introduce VQDNA, a general-purpose framework that renovates genome tokenization from the perspective of genome vocabulary learning. By leveraging vector-quantized codebooks as learnable vocabulary, VQDNA can adaptively tokenize genomes into pattern-aware embeddings in an end-to-end manner. To further push its limits, we propose Hierarchical Residual Quantization (HRQ), where varying scales of codebooks are designed in a hierarchy to enrich the genome vocabulary in a coarse-to-fine manner. Extensive experiments on 32 genome datasets demonstrate VQDNA's superiority and favorable parameter efficiency compared to existing genome language models. Notably, empirical analysis of SARS-CoV-2 mutations reveals the fine-grained pattern awareness and biological significance of learned HRQ vocabulary, highlighting its untapped potential for broader applications in genomics.
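
The codebook lookup that vector quantization relies on can be sketched as a nearest-neighbour search over a small set of vectors. This is generic vector quantization under simplifying assumptions (fixed codebook, squared Euclidean distance), not VQDNA's actual implementation. The function name `quantize` and all values are made up for illustration.

```python
from typing import List, Sequence, Tuple


def quantize(
    vector: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index and entry of the codebook vector closest to `vector`."""

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))
    return index, list(codebook[index])


# Hypothetical 2-D codebook and input embedding.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
embedding = [0.9, 1.2]

index, code = quantize(embedding, codebook)
# The residual is what a second, finer codebook would quantize next
# in a residual quantization scheme.
residual = [v - c for v, c in zip(embedding, code)]
print(index, code, residual)
```

In a learned setting the codebook entries would be trained jointly with the model rather than fixed by hand as they are here.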

codebook, genome, vqdna, (13 more...)

arXiv.org Artificial Intelligence

2405.10812

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States (0.04)
Antarctica (0.04)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Red Arrows pay tribute to Spitfire crash pilot

BBC News, May-29-2024, 11:54:07 GMT

Eleanor Maslin, BBC News. Sqn Ldr Mark Long was described as a "passionate aviator" in a tribute by the RAF. The Red Arrows have shared their "heartfelt condolences" after the death of a pilot when his Spitfire crashed into a Lincolnshire field. Emergency crews were called shortly before 13:20 BST on 25 May to the site near RAF Coningsby where Sqn Ldr Mark Long crashed. Red Arrows team leader Sqn Ldr Jon Bond said he and fellow pilots were supporting Sqn Ldr Long's family "as much as we can". The display team is getting ready to start its 60th anniversary season after returning from winter training in Greece on Saturday. Sqn Ldr Jon Bond said "things can change quickly" when flying. The RAF said a "comprehensive investigation" was now under way to determine the cause of the Spitfire crash. Speaking to BBC Radio Lincolnshire, Sqn Ldr Bond said: "Awful news to come back to on Saturday. Our absolute heartfelt condolences go to Mark's family, all at the BBMF (Battle of Britain Memorial Flight) and all at RAF Coningsby."

artificial intelligence, red arrow pay tribute, social media, (9 more...)

BBC News

Country:

Europe > United Kingdom > England > Lincolnshire (0.50)
Europe > Greece (0.25)
South America (0.16)
(17 more...)

Industry:

Leisure & Entertainment (0.73)
Transportation > Air (0.60)
Government > Military (0.60)
Media > Radio (0.36)

Technology:

Information Technology > Communications > Social Media (0.53)
Information Technology > Artificial Intelligence (0.36)

Unlearning Climate Misinformation in Large Language Models

Fore, Michael, Singh, Simranjit, Lee, Chaehong, Pandey, Amritanshu, Anastasopoulos, Antonios, Stamoulis, Dimitrios

arXiv.org Artificial Intelligence, May-29-2024

Misinformation regarding climate change is a key roadblock in addressing one of the most serious threats to humanity. This paper investigates factual accuracy in large language models (LLMs) regarding climate information. Using true/false labeled Q&A data for fine-tuning and evaluating LLMs on climate-related claims, we compare open-source models, assessing their ability to generate truthful responses to climate change questions. We investigate the detectability of models intentionally poisoned with false climate information, finding that such poisoning may not affect the accuracy of a model's responses in other domains. Furthermore, we compare the effectiveness of unlearning algorithms, fine-tuning, and Retrieval-Augmented Generation (RAG) for factually grounding LLMs on climate change topics. Our evaluation reveals that unlearning algorithms can be effective for nuanced conceptual claims, despite previous findings suggesting their inefficacy in privacy contexts. These insights aim to guide the development of more factually reliable LLMs and highlight the need for additional work to secure LLMs against misinformation attacks.

alignscore, climate change, information, (12 more...)

arXiv.org Artificial Intelligence

2405.19563

Country:

Asia > Japan (0.05)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
(16 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Media > News (1.00)
Energy > Renewable (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
