astrollama
AstroLLaVA: towards the unification of astronomical data and natural language
Zaman, Sharaf, Smith, Michael J., Khetarpal, Pranav, Chakrabarty, Rishabh, Ginolfi, Michele, Huertas-Company, Marc, Jabłońska, Maja, Kruk, Sandor, Lain, Matthieu Le, Méndez, Sergio José Rodríguez, Tanoglidis, Dimitrios
We present AstroLLaVA, a vision-language model for astronomy that enables interaction with astronomical imagery through natural dialogue. By fine-tuning the LLaVA model on a diverse dataset of $\sim$30k images with captions and question-answer pairs sourced from NASA's `Astronomy Picture of the Day', the European Southern Observatory, and the NASA/ESA Hubble Space Telescope, we create a model capable of answering open-ended questions about astronomical concepts depicted visually. Our two-stage fine-tuning process adapts the model to both image captioning and visual question answering in the astronomy domain. We demonstrate AstroLLaVA's performance on an astronomical visual question answering benchmark and release the model weights, code, and training set to encourage further open-source work in this space. Finally, we suggest a roadmap towards general astronomical data alignment with pre-trained language models, and provide an open space for collaboration towards this end for interested researchers.
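To make the interaction mode concrete, the following is a minimal sketch of querying a LLaVA-style checkpoint about an astronomical image with the Hugging Face transformers API. The checkpoint identifier and prompt template are assumptions for illustration, not the confirmed AstroLLaVA release.

```python
# Hedged sketch: natural-language Q&A over an astronomical image with a
# LLaVA-style model. "universeTBD/astrollava" is a placeholder checkpoint
# name; substitute the actual released AstroLLaVA weights.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "universeTBD/astrollava"  # hypothetical identifier
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("apod_example.jpg")  # any local astronomical image
# LLaVA-1.5-style chat template; the model's exact template may differ.
prompt = "USER: <image>\nWhat object is shown, and which visual features support that identification?\nASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(output[0], skip_special_tokens=True))
```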
ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study
Modesitt, Eric, Yang, Ke, Hulsey, Spencer, Zhai, Chengxiang, Kindratenko, Volodymyr
Recent advances in language modeling demonstrate the need for high-quality domain-specific training data, especially for tasks that require specialized knowledge. General-purpose models, while versatile, often lack the depth needed for expert-level tasks because of limited domain-specific information. Domain adaptation training can enhance these models, but it demands substantial, high-quality data. To address this, we propose ORBIT, a cost-efficient methodology for curating massive, high-quality domain-specific datasets from noisy web sources, tailored for training specialist large language models. Using astronomy as a primary case study, we refined the 1.3T-token FineWeb-Edu dataset into a high-quality, 10B-token subset focused on astronomy. Fine-tuning \textsc{LLaMA-3-8B} on a 1B-token astronomy subset improved performance on the MMLU astronomy benchmark from 69\% to 76\% and achieved top results on AstroBench, an astronomy-specific benchmark. Moreover, our model (Orbit-LLaMA) outperformed \textsc{LLaMA-3-8B-base}, with GPT-4o evaluations preferring it in 73\% of cases across 1000 astronomy-specific questions. Additionally, we validated ORBIT's generalizability by applying it to law and medicine, achieving a significant improvement in data quality over an unfiltered baseline. We open-source the ORBIT methodology, including the curated datasets, the codebase, and the resulting model, at \href{https://github.com/ModeEric/ORBIT-Llama}{https://github.com/ModeEric/ORBIT-Llama}.
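The heart of ORBIT is relevance-and-quality filtering of a noisy web corpus. The sketch below is not the authors' pipeline: it substitutes a crude keyword heuristic for their relevance scoring, applied to the FineWeb-Edu dataset in streaming mode, purely to illustrate the shape of the curation step.

```python
# Minimal sketch of domain filtering over FineWeb-Edu (streaming, so the
# full 1.3T-token corpus is never downloaded up front). The keyword
# heuristic stands in for ORBIT's actual relevance/quality scoring.
from datasets import load_dataset

ASTRO_TERMS = {"galaxy", "supernova", "exoplanet", "telescope",
               "cosmology", "nebula", "redshift", "spectroscopy"}

def is_astronomy(example, min_hits=3):
    """Crude relevance test: require several distinct astronomy terms."""
    text = example["text"].lower()
    return sum(term in text for term in ASTRO_TERMS) >= min_hits

stream = load_dataset("HuggingFaceFW/fineweb-edu", split="train", streaming=True)
astro_docs = stream.filter(is_astronomy)

for i, doc in enumerate(astro_docs):
    print(doc["text"][:120].replace("\n", " "), "...")
    if i == 4:  # preview a handful of retained documents
        break
```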
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Perkowski, Ernest, Pan, Rui, Nguyen, Tuan Dung, Ting, Yuan-Sen, Kruk, Sandor, Zhang, Tong, O'Neill, Charlie, Jabłońska, Maja, Sun, Zechang, Smith, Michael J., Liu, Huiling, Schawinski, Kevin, Iyer, Kartheik, Ciucă, Ioana, for UniverseTBD
Building on AstroLLaMA, we introduce AstroLLaMA-Chat, an advanced version of the model. This iteration broadens the training scope to include the introductions and conclusions of papers alongside abstracts, as these sections are often rich in information pivotal for question-answering tasks. We began by downloading all papers submitted to arXiv up to July 2023, including every file accompanying each submission, and refined the data for processing by retaining only files with a ".tex" suffix. Targeted sections were then extracted through a multi-stage pipeline built on comprehensive regex matching; given the diversity of LaTeX formatting conventions, approximately 90% of the samples survived this processing. We subsequently removed specific formatting patterns, comments, and superfluous symbols such as stray newlines to ensure the readability of the training data. Finally, we fine-tuned AstroLLaMA-Chat on a domain-specific dialogue dataset: to generate question-answer pairs, we prompted GPT-4 (OpenAI 2023) to formulate pertinent questions from paragraphs within 300,000 arXiv papers, and tasked GPT-4 with answering these questions by retrieving context-relevant information.
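As a rough illustration of the regex-based extraction step (the actual multi-stage pipeline handles far more LaTeX variants), the sketch below pulls one named section out of a .tex source and strips comments and stray whitespace; the function names are illustrative only.

```python
# Hedged sketch of LaTeX section extraction and cleaning, in the spirit of
# the pipeline described above. Real arXiv sources vary widely, so a single
# pattern like this only covers a subset of papers.
import re

def extract_section(tex: str, name: str) -> str | None:
    """Return the body of \\section{<name>...} up to the next section."""
    pattern = re.compile(
        r"\\section\*?\{" + re.escape(name) + r"[^}]*\}"
        r"(.*?)(?=\\section\*?\{|\\end\{document\})",
        re.DOTALL | re.IGNORECASE,
    )
    match = pattern.search(tex)
    return match.group(1).strip() if match else None

def clean(text: str) -> str:
    text = re.sub(r"(?<!\\)%.*", "", text)  # drop unescaped LaTeX comments
    text = re.sub(r"\s+", " ", text)        # collapse newlines and runs of spaces
    return text.strip()

with open("paper.tex", encoding="utf-8") as f:
    source = f.read()

intro = extract_section(source, "Introduction")
if intro is not None:
    print(clean(intro)[:500])
```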
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Nguyen, Tuan Dung, Ting, Yuan-Sen, Ciucă, Ioana, O'Neill, Charlie, Sun, Ze-Chang, Jabłońska, Maja, Kruk, Sandor, Perkowski, Ernest, Miller, Jack, Li, Jason, Peek, Josh, Iyer, Kartheik, Różański, Tomasz, Khetarpal, Pranav, Zaman, Sharaf, Brodrick, David, Méndez, Sergio J. Rodríguez, Bui, Thang, Goodman, Alyssa, Accomazzi, Alberto, Naiman, Jill, Cranney, Jesse, Schawinski, Kevin, UniverseTBD
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model generates more insightful and scientifically relevant text completions, and yields more useful embeddings, than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
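The perplexity comparison can be reproduced in outline as follows; this is a minimal sketch assuming a Hugging Face causal-LM checkpoint (the identifier is a guess at the released weights) and a single held-out abstract rather than a full evaluation set.

```python
# Hedged sketch: per-document perplexity under a causal language model.
# "UniverseTBD/astrollama" is assumed to be the released checkpoint name.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UniverseTBD/astrollama"  # assumption; verify the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

abstract = "We present deep imaging of ..."  # substitute a held-out abstract
inputs = tokenizer(abstract, return_tensors="pt").to(model.device)

with torch.no_grad():
    # With labels equal to input_ids, the model returns the mean
    # next-token cross-entropy; exponentiating gives perplexity.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity = {math.exp(loss.item()):.2f}")
```

Running the same snippet with a base LLaMA-2 checkpoint on the same text gives the head-to-head comparison the abstract reports.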