Smith, Michael J.
AstroPT: Scaling Large Observation Models for Astronomy
Smith, Michael J., Roberts, Ryan J., Angeloudi, Eirini, Huertas-Company, Marc
This work presents AstroPT, an autoregressive pretrained transformer developed with astronomical use-cases in mind. The AstroPT models presented here have been pretrained on 8.6 million $512 \times 512$ pixel $grz$-band galaxy postage stamp observations from the DESI Legacy Survey DR8. We train a selection of foundation models of increasing size, from 1 million to 2.1 billion parameters, and find that AstroPT follows a similar saturating log-log scaling law to textual models. We also find that the models' performance on downstream tasks, as measured by linear probing, improves with model size up to the model parameter saturation point. We believe that collaborative community development paves the best route towards realising an open source `Large Observation Model' -- a model trained on data taken from the observational sciences at the scale seen in natural language processing. To this end, we release the source code, weights, and dataset for AstroPT under the MIT license, and invite potential collaborators to join us in collectively building and researching these models.
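The abstract does not spell out the linear-probing protocol, but the underlying idea is standard: freeze the pretrained backbone, extract embeddings, and fit a single linear classifier on top, so that downstream accuracy reflects the quality of the learnt representation. A minimal sketch under those assumptions (the embedding array, labels, and dimensions below are hypothetical placeholders, not the AstroPT evaluation code):

```python
# Minimal linear-probing sketch over frozen embeddings (hypothetical data;
# the actual AstroPT evaluation pipeline may differ).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 768))   # stand-in for frozen AstroPT features
labels = rng.integers(0, 2, size=10_000)      # stand-in downstream labels (e.g. morphology)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

# The "probe" is a single linear layer: the backbone stays frozen, so test
# accuracy acts as a proxy for representation quality.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```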
EarthPT: a time series foundation model for Earth Observation
Smith, Michael J., Fleming, Luke, Geach, James E.
We introduce EarthPT -- an Earth Observation (EO) pretrained transformer. EarthPT is a 700 million parameter decoding transformer foundation model trained in an autoregressive self-supervised manner and developed specifically with EO use-cases in mind. We demonstrate that EarthPT is an effective forecaster that can accurately predict pixel-level surface reflectances across the 400-2300 nm range well into the future. For example, forecasts of the evolution of the Normalised Difference Vegetation Index (NDVI) have a typical error of approximately 0.05 (over a natural range of -1 to 1) at the pixel level over a five-month test horizon, outperforming simple phase-folded models based on historical averaging. We also demonstrate that the embeddings learnt by EarthPT hold semantically meaningful information and could be exploited for downstream tasks such as highly granular, dynamic land use classification. Excitingly, we note that the abundance of EO data provides us with -- in theory -- quadrillions of training tokens. Therefore, if we assume that EarthPT follows neural scaling laws akin to those derived for Large Language Models (LLMs), there is currently no data-imposed limit to scaling EarthPT and other similar `Large Observation Models.'
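As a worked illustration of the reported metric, NDVI is computed from near-infrared and red reflectances as $(NIR - Red)/(NIR + Red)$, which is naturally bounded to $[-1, 1]$. Below is a minimal sketch of that computation together with a phase-folded historical-averaging baseline of the kind EarthPT is compared against; the synthetic pixel time series and the monthly folding period are assumptions for illustration only:

```python
import numpy as np

def ndvi(nir, red):
    """Normalised Difference Vegetation Index, bounded to [-1, 1]."""
    return (nir - red) / (nir + red + 1e-9)

# Hypothetical monthly reflectance history for a single pixel (values in [0, 1]).
rng = np.random.default_rng(1)
months = np.arange(60)  # five years of monthly observations
red = 0.10 + 0.05 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 0.01, 60)
nir = 0.40 + 0.10 * np.sin(2 * np.pi * months / 12 + 0.5) + rng.normal(0, 0.01, 60)
history = ndvi(nir, red)

# Phase-folded baseline: forecast each future month as the historical mean of
# that calendar month. EarthPT is reported to outperform this kind of model.
monthly_mean = np.array([history[months % 12 == m].mean() for m in range(12)])
forecast_next_year = monthly_mean  # naive forecast for the next 12 months
print(forecast_next_year.round(3))
```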
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Perkowski, Ernest, Pan, Rui, Nguyen, Tuan Dung, Ting, Yuan-Sen, Kruk, Sandor, Zhang, Tong, O'Neill, Charlie, Jablonska, Maja, Sun, Zechang, Smith, Michael J., Liu, Huiling, Schawinski, Kevin, Iyer, Kartheik, Ciucă, Ioana (for UniverseTBD)
To enhance performance on question-answering tasks, we introduce AstroLLaMA-Chat, an advanced version of AstroLLaMA. This new iteration broadens the training scope to include the introductions and conclusions of papers alongside their abstracts, as these sections are often rich in pivotal information for question-answering tasks. We began by downloading all papers up to July 2023, including all files that accompany an arXiv submission. The data were then refined, retaining only files with the ".tex" suffix. The targeted sections were extracted through a multi-stage, comprehensive regex-matching process. Given the diversity of LaTeX formatting conventions, approximately 90% of the samples survived this processing. Subsequently, we removed specific formatting patterns, comments, and superfluous symbols such as newlines to ensure the readability of the training data. We then fine-tuned AstroLLaMA-Chat on a domain-specific dialogue dataset. To generate question-answer pairs, we used GPT-4 (OpenAI 2023) to formulate pertinent questions from paragraphs within 300,000 arXiv papers, with GPT-4 also tasked with answering these questions by retrieving context-relevant information.
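The extraction and cleaning pipeline is described only at a high level. A minimal sketch of how such a regex-based section extractor and cleaner might look (function names, patterns, and the inline LaTeX sample are illustrative assumptions, not the authors' actual code):

```python
import re

def extract_section(tex: str, name: str) -> str | None:
    """Pull the body of a named section (e.g. 'Introduction', 'Conclusions')
    from raw LaTeX source; returns None if the section is not found."""
    pattern = rf"\\section\*?\{{{name}[^}}]*\}}(.*?)(?=\\section|\\end\{{document\}}|\Z)"
    match = re.search(pattern, tex, flags=re.DOTALL | re.IGNORECASE)
    return match.group(1) if match else None

def clean_latex(text: str) -> str:
    """Strip comments, citations/labels, and superfluous whitespace."""
    text = re.sub(r"(?<!\\)%.*", "", text)                     # LaTeX comments
    text = re.sub(r"\\(label|cite[tp]?)\{[^}]*\}", "", text)   # labels and citations
    text = re.sub(r"\s+", " ", text)                           # collapse newlines/spaces
    return text.strip()

# Toy usage on an inline LaTeX snippet standing in for an arXiv .tex file.
source = r"""
\section{Introduction}
Foundation models are useful. % an inline comment
See \cite{smith2023}.
\section{Data}
...
\end{document}
"""
intro = extract_section(source, "Introduction")
if intro is not None:
    print(clean_latex(intro))
```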
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Smith, Michael J., Geach, James E.
In this review, we explore the historical development and future prospects of artificial intelligence (AI) and deep learning in astronomy. We trace the evolution of connectionism in astronomy through its three waves, from the early use of multilayer perceptrons, to the rise of convolutional and recurrent neural networks, and finally to the current era of unsupervised and generative deep learning methods. With the exponential growth of astronomical data, deep learning techniques offer an unprecedented opportunity to uncover valuable insights and tackle previously intractable problems. As we enter the anticipated fourth wave of astronomical connectionism, we argue for the adoption of GPT-like foundation models fine-tuned for astronomical applications. Such models could harness the wealth of high-quality, multimodal astronomical data to serve state-of-the-art downstream tasks. To keep pace with advancements driven by Big Tech, we propose a collaborative, open-source approach within the astronomy community to develop and maintain these foundation models, fostering a symbiotic relationship between AI and astronomy that capitalizes on the unique strengths of both fields.