AITopics | Antarctica

Football Manager 25 cancelled after two delays

BBC News, Feb-7-2025, 08:17:34 GMT

The latest update in the popular Football Manager series has been cancelled, its makers have announced. Fans of the long-running video game began to speculate about its fate when an update due to be unveiled late last month did not arrive. In a blog post, developer Sports Interactive told players it had made the "difficult decision" to cancel the 2025 edition as it was "too far away from the standards you deserve". It said it would now shift focus to the 2026 version of the game and fans who had preordered the cancelled release could obtain a refund. Football Manager, first launched in 2004, allows fans to step into the shoes of a gaffer and guide a chosen team through a season.

football manager 25, sport interactive, virtual dollhouse, (3 more...)

BBC News

Country:

South America (0.17)
North America > Central America (0.17)
Oceania > Australia (0.07)
(13 more...)

Industry:

Leisure & Entertainment > Sports (0.55)
Leisure & Entertainment > Games > Computer Games (0.40)

Technology: Information Technology > Artificial Intelligence (0.54)

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

Wen, Xueru, Lou, Jie, Lu, Xinyu, Yang, Junjie, Liu, Yanjiang, Lu, Yaojie, Zhang, Debing, XingYu

arXiv.org Artificial Intelligence, Feb-7-2025

As AI capabilities increasingly surpass human proficiency in complex tasks, current alignment techniques, including SFT and RLHF, face fundamental challenges in ensuring reliable oversight. These methods rely on direct human assessment and become untenable when AI outputs exceed human cognitive thresholds. In response to this challenge, we explore two hypotheses: (1) critique of critique can be easier than critique itself, extending the widely accepted observation that verification is easier than generation to the critique domain, as critique itself is a specialized form of generation; (2) this difficulty relationship holds recursively, suggesting that when direct evaluation is infeasible, performing high-order critiques (e.g., critique of critique of critique) offers a more tractable supervision pathway. To examine these hypotheses, we perform Human-Human, Human-AI, and AI-AI experiments across multiple tasks. Our results provide encouraging evidence supporting these hypotheses and suggest that recursive self-critiquing is a promising direction for scalable oversight.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.04675

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Antarctica (0.04)
Africa (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Anderson, Carolyn Jane, Biswas, Joydeep, Boruch-Gruszecki, Aleksander, Cassano, Federico, Feldman, Molly Q, Guha, Arjun, Lucchetti, Francesca, Wu, Zixuan

arXiv.org Artificial Intelligence, Feb-6-2025

Existing benchmarks for frontier models often test specialized, "PhD-level" knowledge that is difficult for non-experts to grasp. In contrast, we present a benchmark based on the NPR Sunday Puzzle Challenge that requires only general knowledge. Our benchmark is challenging for both humans and models; however, correct solutions are easy to verify, and models' mistakes are easy to spot. Our work reveals capability gaps that are not evident in existing benchmarks: OpenAI o1 significantly outperforms other reasoning models that are on par on benchmarks that test specialized knowledge. Furthermore, our analysis of reasoning outputs uncovers new kinds of failures. DeepSeek R1, for instance, often concedes with "I give up" before providing an answer that it knows is wrong. R1 can also be remarkably "uncertain" in its output, and in rare cases it does not "finish thinking," which suggests the need for an inference-time technique to "wrap up" before the context window limit is reached. We also quantify the effectiveness of reasoning longer with R1 and Gemini Thinking to identify the point beyond which more reasoning is unlikely to improve accuracy on our benchmark.

benchmark, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.01584

Country:

South America (0.04)
Oceania > Australia (0.04)
North America > United States > Wyoming > Natrona County > Casper (0.04)
(10 more...)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Scalable Higher Resolution Polar Sea Ice Classification and Freeboard Calculation from ICESat-2 ATL03 Data

Iqrah, Jurdana Masuma, Koo, Younghyun, Wang, Wei, Xie, Hongjie, Prasad, Sushil K.

arXiv.org Artificial Intelligence, Feb-4-2025

NASA's ICESat-2 (IS2) is an Earth-observing satellite that measures high-resolution surface elevation. IS2's ATL07 and ATL10 sea ice elevation and freeboard products consist of 10m-200m segments, each aggregating 150 signal photons from the raw ATL03 (geolocated photon) data. These aggregated products can potentially overestimate local sea surface height and thus underestimate freeboard (sea ice height above the sea surface). To achieve a higher resolution of sea surface height and freeboard information, in this work we utilize a 2m window to resample the ATL03 data. Then, we classify these 2m segments into thick sea ice, thin ice, and open water using deep learning methods (long short-term memory and multi-layer perceptron models). To obtain labeled training data for our deep learning models, we use segmented Sentinel-2 (S2) multi-spectral imagery overlapping with IS2 tracks in space and time to auto-label IS2 data, followed by some manual corrections in the regions of transition between different ice/water types or in cloudy regions. We employ a parallel workflow for this auto-labeling using PySpark to scale, and we achieve a 9-fold data loading and 16.25-fold map-reduce speedup. To train our models, we employ a Horovod-based distributed deep-learning workflow on a DGX A100 8 GPU cluster, achieving a 7.25-fold speedup. Next, we calculate the local sea surface heights based on the open water segments. Finally, we scale the freeboard calculation using the derived local sea level and achieve an 8.54-fold data loading and 15.7-fold map-reduce speedup. Compared with the ATL07 (local sea level) and ATL10 (freeboard) data products, our results show higher resolution and accuracy (96.56%).
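
The resampling and freeboard steps summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the label encoding (2 = open water), and the use of a single track-wide mean sea surface (rather than a locally derived one) are assumptions made for illustration, and the deep learning classification is replaced by labels passed in by hand.

```python
import numpy as np

def resample_2m(along_track_m, height_m, window_m=2.0):
    """Average photon heights inside fixed along-track windows (hypothetical helper)."""
    along_track_m = np.asarray(along_track_m, dtype=float)
    height_m = np.asarray(height_m, dtype=float)
    # Index of the window each photon falls into, counted from the first photon.
    idx = np.floor((along_track_m - along_track_m.min()) / window_m).astype(int)
    n_windows = idx.max() + 1
    sums = np.bincount(idx, weights=height_m, minlength=n_windows)
    counts = np.bincount(idx, minlength=n_windows)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts  # NaN where a window holds no photons

def freeboard(segment_height_m, labels, open_water_label=2):
    """Freeboard = segment height minus the mean height of open-water segments.

    Assumption: one sea surface value for the whole track, a simplification
    of the local sea surface described in the abstract.
    """
    segment_height_m = np.asarray(segment_height_m, dtype=float)
    labels = np.asarray(labels)
    water = (labels == open_water_label) & ~np.isnan(segment_height_m)
    if not water.any():
        raise ValueError("no open-water segments to define the sea surface")
    sea_surface = segment_height_m[water].mean()
    return segment_height_m - sea_surface

# Toy example: six photons falling into three 2 m windows.
along = [0.1, 0.5, 1.9, 2.2, 3.0, 4.5]
heights = [0.0, 0.2, 0.1, 0.5, 0.7, 0.1]
segments = resample_2m(along, heights)
fb = freeboard(segments, labels=[2, 0, 2])  # approximately [0.0, 0.5, 0.0]
```

Windows that contain no photons come back as NaN and are excluded from the sea surface estimate, so gaps in the track do not bias the reference level.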

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.027

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Southern Ocean > Ross Sea (0.06)
Antarctica (0.04)
(4 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (0.86)

Industry:

Information Technology (0.46)
Government > Space Agency (0.35)
Government > Regional Government > North America Government > United States Government (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Comprehensive Analysis on LLM-based Node Classification Algorithms

Wu, Xixi, Shen, Yifei, Ge, Fangzhou, Shan, Caihua, Jiao, Yizhu, Sun, Xiangguo, Cheng, Hong

arXiv.org Artificial Intelligence, Feb-2-2025

Node classification is a fundamental task in graph analysis, with broad applications across various fields. Recent breakthroughs in Large Language Models (LLMs) have enabled LLM-based approaches for this task. Although many studies demonstrate the impressive performance of LLM-based methods, the lack of clear design guidelines may hinder their practical application. In this work, we aim to establish such guidelines through a fair and systematic comparison of these algorithms. As a first step, we developed LLMNodeBed, a comprehensive codebase and testbed for node classification using LLMs. It includes ten datasets, eight LLM-based algorithms, and three learning paradigms, and is designed for easy extension with new methods and datasets. Subsequently, we conducted extensive experiments, training and evaluating over 2,200 models, to determine the key settings (e.g., learning paradigms and homophily) and components (e.g., model size) that affect performance. Our findings uncover eight insights, e.g., (1) LLM-based methods can significantly outperform traditional methods in a semi-supervised setting, while the advantage is marginal in a supervised setting; (2) Graph Foundation Models can beat open-source LLMs but still fall short of strong LLMs like GPT-4o in a zero-shot setting. We hope that the release of LLMNodeBed, along with our insights, will facilitate reproducible research and inspire future studies in this field. Code and datasets are released at https://llmnodebed.github.io/.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.00829

Country:

Oceania > Australia (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > Russia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Revealed: What life on Earth will look like in 2100 - with entire cities plunged underwater and millions of people perishing in the heat

Daily Mail - Science & tech, Feb-1-2025, 09:27:39 GMT

From Snowpiercer to The Day After Tomorrow, countless movies and series have put forward their vision of how climate change might reshape the world. Worryingly, scientists predict that the reality might be far more shocking than anything imagined by a Hollywood studio. Now, artificial intelligence (AI) reveals what this might look like. With Google's ImageFX AI image generator, MailOnline has used the latest scientific research to predict how the world will be in 2100. As greenhouse gas levels continue to increase, scientists predict that entire cities will be plunged under water.

climate change, pollution, scenario, (13 more...)

Daily Mail - Science & tech

Country:

North America > United States > Florida > Palm Beach County > Palm Beach (0.14)
Europe > United Kingdom (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.05)
(13 more...)

Genre: Research Report > New Finding (0.47)

Industry:

Government (0.69)
Law > Environmental Law (0.47)
Health & Medicine (0.47)
Energy > Energy Policy (0.35)

Technology: Information Technology > Artificial Intelligence (0.67)

TabFSBench: Tabular Benchmark for Feature Shifts in Open Environment

Cheng, Zi-Jian, Jia, Zi-Yi, Zhou, Zhi, Guo, Lan-Zhe, Li, Yu-Feng

arXiv.org Artificial Intelligence, Jan-31-2025

Tabular data is widely utilized in various machine learning tasks. Current tabular learning research predominantly focuses on closed environments, while real-world applications often involve open environments, where distribution and feature shifts occur and lead to significant degradation in model performance. Previous research has primarily concentrated on mitigating distribution shifts, whereas feature shifts, a distinctive and unexplored challenge of tabular data, have garnered limited attention. To this end, this paper conducts the first comprehensive study on feature shifts in tabular data and introduces the first tabular feature-shift benchmark (TabFSBench). TabFSBench evaluates the impact of four distinct feature-shift scenarios on four tabular model categories across various datasets and, for the first time, assesses the performance of large language models (LLMs) and tabular LLMs on a tabular benchmark. Our study yields three main observations: (1) most tabular models have limited applicability in feature-shift scenarios; (2) the importance of the shifted feature set has a linear relationship with model performance degradation; (3) model performance in closed environments correlates with feature-shift performance. Future research directions are also explored for each observation. TabFSBench is released for public access and can be used with a few lines of Python code at https://github.com/LAMDASZ-ML/TabFSBench.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.18935

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia > New South Wales (0.04)
Antarctica (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Global sea levels could rise by up to 6.2 FEET by 2100, plunging entire cities underwater - so, is your hometown at risk?

Daily Mail - Science & tech, Jan-27-2025, 10:58:23 GMT

The idea of entire cities being plunged underwater might sound like the plot of the latest science fiction blockbuster. But it could become a reality in just 75 years, according to a terrifying new study. Scientists from Nanyang Technological University (NTU), Singapore, have predicted that global sea levels could rise by a staggering 6.2 feet (1.9 metres) by 2100 if carbon dioxide (CO2) emissions continue to increase. 'The high-end projection of 1.9 metres underscores the need for decision-makers to plan for critical infrastructure accordingly,' said Dr Benjamin Grandey, lead author of the study. If global sea levels were to rise by 6.2ft (1.9 metres), towns and cities around the world could be plunged underwater - including several in the UK.

global sea level, sea level, town and city, (11 more...)

Daily Mail - Science & tech

Country:

Asia > Singapore (0.26)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.06)
North America > United States > Texas (0.06)
(14 more...)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence (0.36)

Physics-Trained Neural Network as Inverse Problem Solver for Potential Fields: An Example of Downward Continuation between Arbitrary Surfaces

Sun, Jing, Li, Lu, Zhang, Liang

arXiv.org Artificial Intelligence, Jan-26-2025

We treat downward continuation as an inverse problem that relies on solving a forward problem defined by the formula for upward continuation, and we propose a new physics-trained deep neural network (DNN)-based solution for this task. We hard-code the upward continuation process into the DNN's learning framework, where the DNN itself learns to act as the inverse problem solver and can perform downward continuation without ever being shown any ground truth data. We test the proposed method on both synthetic magnetic data and real-world magnetic data from West Antarctica. The preliminary results demonstrate its effectiveness through comparison with selected benchmarks, opening future avenues for the combined use of DNNs and established geophysical theories to address broader potential field inverse problems, such as density and geometry modelling.

Introduction: Downward continuation of a potential field, such as a gravity or magnetic field, refers to transferring the data from one observation surface to a lower surface that is closer to the source of the field. The goal is to enhance the resolution of the continued field and amplify the shallow geological signals. Airborne surveys are typically flown at uneven heights, making continuation from these surfaces a common requirement. Downward continuation is a critical task in the processing of potential field data, impacting the success of various downstream analyses, such as revealing the density structure and boundaries of anomalous bodies, especially for detecting and highlighting shallow anomalous sources. Many methods have been developed for the task of downward continuation (e.g.
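
For context, the forward problem named in the abstract, upward continuation, has a simple textbook form when both surfaces are flat, parallel planes: each Fourier component of the field is attenuated by exp(-|k| dz). The sketch below shows only that flat-plane case as an illustration. It is not the authors' method, which handles arbitrary surfaces and embeds the operator in a neural network; the function name and grid parameters are assumptions.

```python
import numpy as np

def upward_continue(field, dx, dy, dz):
    """Upward-continue a gridded potential field by height dz (flat planes only).

    field : 2D array sampled on a regular grid with spacings dx, dy
    dz    : positive continuation height, in the same units as dx and dy
    """
    ny, nx = field.shape
    # Angular wavenumbers along each grid axis.
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=dy)
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    # Attenuate each Fourier component by exp(-|k| dz), then transform back.
    spectrum = np.fft.fft2(field) * np.exp(-k * dz)
    return np.real(np.fft.ifft2(spectrum))

# Toy check: a single harmonic with 4 cycles across a 32-sample grid.
n = 32
x = np.arange(n)
harmonic = np.tile(np.cos(2.0 * np.pi * 4 * x / n), (n, 1))
continued = upward_continue(harmonic, dx=1.0, dy=1.0, dz=2.0)
expected_factor = np.exp(-(2.0 * np.pi * 4 / n) * 2.0)
```

Because the attenuation grows with wavenumber, inverting it (downward continuation) amplifies short-wavelength noise, which is why the task is treated as an ill-posed inverse problem.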

artificial intelligence, continuation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.0519

Country:

Antarctica > West Antarctica (0.26)
Asia > Middle East > Jordan (0.05)
Oceania > Australia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Task Allocation in Customer-led Two-sided Markets with Satellite Constellation Services

Qiao, Jianglin, Cao, Zehong, de Jonge, Dave, Kowalczyk, Ryszard

arXiv.org Artificial Intelligence, Jan-22-2025

Multi-agent systems (MAS) are increasingly applied to complex task allocation in two-sided markets, where agents such as companies and customers interact dynamically. Traditional company-led Stackelberg game models, where companies set service prices, and customers respond, struggle to accommodate diverse and personalised customer demands in emerging markets like crowdsourcing. This paper proposes a customer-led Stackelberg game model for cost-efficient task allocation, where customers initiate tasks as leaders, and companies create their strategies as followers to meet these demands. We prove the existence of Nash Equilibrium for the follower game and Stackelberg Equilibrium for the leader game while discussing their uniqueness under specific conditions, ensuring cost-efficient task allocation and improved market performance. Using the satellite constellation services market as a real-world case, experimental results show a 23% reduction in customer payments and a 6.7-fold increase in company revenues, demonstrating the model's effectiveness in emerging markets.

artificial intelligence, customer, game theory, (17 more...)

arXiv.org Artificial Intelligence

2501.13364

Country:

Europe (0.04)
Africa (0.04)
Oceania > Australia > South Australia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Banking & Finance > Trading (0.86)
Energy (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
