AITopics | Indian Ocean

A multi-modal representation of El Niño Southern Oscillation Diversity

Schlör, Jakob, Strnad, Felix, Capotondi, Antonietta, Goswami, Bedartha

arXiv.org Artificial Intelligence, Jul-21-2023

The El Niño-Southern Oscillation (ENSO), characterized by anomalous sea surface temperature (SST) in the tropical Pacific, exhibits notable diversity in its temporal evolution and spatial distribution of anomalies. The El Niño events of 1982-83 and 1997-98, for instance, recorded exceptionally high sea surface temperature anomaly (SSTA) values in the eastern equatorial Pacific, whereas the El Niño of 2002-03 was notably less extreme and primarily restricted to the central equatorial Pacific (McPhaden, 2004). Despite each being categorized as an El Niño, the 2002-03 event exhibited global climate conditions distinct from those of the earlier two events. In order to describe these event-to-event differences, El Niño events have been categorized as Eastern Pacific (EP) and Central Pacific (CP) types (Capotondi et al., 2020). EP El Niño events typically have their peak SSTA in the Eastern Pacific, exhibit stronger intensities, and show a largely reduced zonal thermocline slope, resulting in the discharge of warm water from the equatorial thermocline. In contrast, CP events show peak SSTA in the Central Pacific and are comparatively weaker, with more limited changes in zonal thermocline slope and reduced warm water discharge (Kug, Jin, and An, 2009; Capotondi, 2013). Despite considerable research, the underlying causes of ENSO diversity remain elusive (Lee and McPhaden, 2010; Capotondi et al., 2015; Capotondi et al., 2020). And while some general circulation models (GCMs) do exhibit ENSO event-to-event differences, their representation of ENSO diversity appears to be model dependent and often differs from observations in intensity, pattern, and duration (Cai et al., 2018). The different types of ENSO events have substantially different downstream impacts on the global climate and dynamics (Strnad et al., 2022).

artificial intelligence, category, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2307.11552

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
Indian Ocean (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Numerical Data Imputation for Multimodal Data Sets: A Probabilistic Nearest-Neighbor Kernel Density Approach

Lalande, Florian, Doya, Kenji

arXiv.org Artificial Intelligence, Jul-10-2023

Numerical data imputation algorithms replace missing values by estimates to leverage incomplete data sets. Current imputation methods seek to minimize the error between the unobserved ground truth and the imputed values. But this strategy can create artifacts leading to poor imputation in the presence of multimodal or complex distributions. To tackle this problem, we introduce the kNN×KDE algorithm: a data imputation method combining nearest neighbor estimation (kNN) and density estimation with Gaussian kernels (KDE). We compare our method with previous data imputation methods using artificial and real-world data with different missing-data scenarios and various missing-data rates, and show that our method can cope with complex original data structure, yields lower data imputation errors, and provides probabilistic estimates with higher likelihood than current methods. We release the code as open source for the community: https://github.com/DeltaFloflo/knnxkde
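
The idea of pairing nearest-neighbor search with Gaussian kernel density estimation can be illustrated with a minimal sketch. This is not the authors' kNN×KDE implementation (their repository is linked above); the function name `knn_kde_impute`, its parameters, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_kde_impute(data, row, missing_col, k=5, bandwidth=0.1, n_samples=100, seed=0):
    """Draw samples for one missing cell from a Gaussian KDE over neighbour values.

    Hypothetical helper: `data` holds complete rows, `row` has NaN at `missing_col`.
    """
    rng = np.random.default_rng(seed)
    observed_cols = [c for c in range(data.shape[1]) if c != missing_col]
    # Distance from the incomplete row to every complete row, on observed columns only.
    dists = np.linalg.norm(data[:, observed_cols] - row[observed_cols], axis=1)
    neighbours = np.argsort(dists)[:k]
    centres = data[neighbours, missing_col]
    # Sampling from a Gaussian KDE: pick a kernel centre uniformly, then add Gaussian noise.
    picks = rng.choice(centres, size=n_samples)
    return picks + rng.normal(0.0, bandwidth, size=n_samples)

# Toy bimodal data: y sits near -1 or +1 regardless of x.
toy_rng = np.random.default_rng(1)
x = toy_rng.uniform(0.0, 1.0, 200)
y = toy_rng.choice([-1.0, 1.0], size=200) + toy_rng.normal(0.0, 0.02, 200)
data = np.column_stack([x, y])

row = np.array([0.5, np.nan])  # y is missing for this row
samples = knn_kde_impute(data, row, missing_col=1, k=10, bandwidth=0.05)
```

On bimodal data like this, averaging the neighbours would give a value between the two modes that never occurs in the data, whereas the sampled values stay close to -1 or +1.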

artificial intelligence, data quality, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.16906

Country:

Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)
Oceania > Australia > Tasmania (0.04)
North America > United States > Wyoming (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Energy (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation

Zhuang, Shengyao, Ren, Houxing, Shou, Linjun, Pei, Jian, Gong, Ming, Zuccon, Guido, Jiang, Daxin

arXiv.org Artificial Intelligence, Jul-7-2023

The Differentiable Search Index (DSI) is an emerging paradigm for information retrieval. Unlike traditional retrieval architectures where index and retrieval are two different and separate components, DSI uses a single transformer model to perform both indexing and retrieval. In this paper, we identify and tackle an important issue of current DSI models: the data distribution mismatch that occurs between the DSI indexing and retrieval processes. Specifically, we argue that, at indexing, current DSI methods learn to build connections between the text of long documents and the identifier of the documents, but then retrieval of document identifiers is based on queries that are commonly much shorter than the indexed documents. This problem is further exacerbated when using DSI for cross-lingual retrieval, where document text and query text are in different languages. To address this fundamental problem of current DSI models, we propose a simple yet effective indexing framework for DSI, called DSI-QG. When indexing, DSI-QG represents documents with a number of potentially relevant queries generated by a query generation model and re-ranked and filtered by a cross-encoder ranker. The presence of these queries at indexing allows the DSI models to connect a document identifier to a set of queries, hence mitigating data distribution mismatches present between the indexing and the retrieval phases. Empirical results on popular mono-lingual and cross-lingual passage retrieval datasets show that DSI-QG significantly outperforms the original DSI model.

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2206.10128

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.05)
Asia > China (0.04)
Oceania > Australia > Queensland (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Military (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

How accurate are existing land cover maps for agriculture in Sub-Saharan Africa?

Kerner, Hannah, Nakalembe, Catherine, Yang, Adam, Zvonkov, Ivan, McWeeny, Ryan, Tseng, Gabriel, Becker-Reshef, Inbal

arXiv.org Artificial Intelligence, Jul-5-2023

Satellite Earth observations (EO) can provide affordable and timely information for assessing crop conditions and food production. Such monitoring systems are essential in Africa, where there is high food insecurity and sparse agricultural statistics. EO-based monitoring systems require accurate cropland maps to provide information about croplands, but there is a lack of data to determine which of the many available land cover maps most accurately identify cropland in African countries. This study provides a quantitative evaluation and intercomparison of 11 publicly available land cover maps to assess their suitability for cropland classification and EO-based agriculture monitoring in Africa using statistically rigorous reference datasets from 8 countries. We hope the results of this study will help users determine the most suitable map for their needs and encourage future work to focus on resolving inconsistencies between maps and improving accuracy in low-accuracy regions.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.02575

Country:

Africa > Sub-Saharan Africa (0.41)
North America > Canada > Quebec > Montreal (0.14)
Africa > Mali (0.06)
(17 more...)

Genre: Research Report > New Finding (1.00)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Selecting Robust Features for Machine Learning Applications using Multidata Causal Discovery

S., Saranya Ganesh, Beucler, Tom, Tam, Frederick Iat-Hin, Gomez, Milton S., Runge, Jakob, Gerhardus, Andreas

arXiv.org Artificial Intelligence, Jun-30-2023

Robust feature selection is vital for creating reliable and interpretable Machine Learning (ML) models. When designing statistical prediction models in cases where domain knowledge is limited and underlying interactions are unknown, choosing the optimal set of features is often difficult. To mitigate this issue, we introduce a Multidata (M) causal feature selection approach that simultaneously processes an ensemble of time series datasets and produces a single set of causal drivers. This approach uses the causal discovery algorithms PC1 or PCMCI that are implemented in the Tigramite Python package. These algorithms utilize conditional independence tests to infer parts of the causal graph. Our causal feature selection approach filters out causally-spurious links before passing the remaining causal features as inputs to ML models (Multiple linear regression, Random Forest) that predict the targets. We apply our framework to the statistical intensity prediction of Western Pacific Tropical Cyclones (TC), for which it is often difficult to accurately choose drivers and their dimensionality reduction (time lags, vertical levels, and area-averaging). Using more stringent significance thresholds in the conditional independence tests helps eliminate spurious causal relationships, thus helping the ML model generalize better to unseen TC cases. M-PC1 with a reduced number of features outperforms M-PCMCI, non-causal ML, and other feature selection methods (lagged correlation, random), even slightly outperforming feature selection based on eXplainable Artificial Intelligence. The optimal causal drivers obtained from our causal feature selection help improve our understanding of underlying relationships and suggest new potential drivers of TC intensification.

artificial intelligence, machine learning, selection, (16 more...)

arXiv.org Artificial Intelligence

2304.05294

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Regularized Multivariate Functional Principal Component Analysis

Haghbin, Hossein, Zhao, Yue, Maadooliat, Mehdi

arXiv.org Machine Learning, Jun-24-2023

Multivariate Functional Principal Component Analysis (MFPCA) is a valuable tool for exploring relationships and identifying shared patterns of variation in multivariate functional data. However, controlling the roughness of the extracted Principal Components (PCs) can be challenging. This paper introduces a novel approach called regularized MFPCA (ReMFPCA) to address this issue and enhance the smoothness and interpretability of the multivariate functional PCs. ReMFPCA incorporates a roughness penalty within a penalized framework, using a parameter vector to regulate the smoothness of each functional variable. The proposed method generates smoothed multivariate functional PCs, providing a concise and interpretable representation of the data. Extensive simulations and real data examples demonstrate the effectiveness of ReMFPCA and its superiority over alternative methods. The proposed approach opens new avenues for analyzing and uncovering relationships in complex multivariate functional datasets.

artificial intelligence, machine learning, refpca, (14 more...)

arXiv.org Machine Learning

2306.1398

Country:

Asia > Middle East > Iran (0.14)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
Indian Ocean > Arabian Gulf (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Neuro-Symbolic Bi-Directional Translation -- Deep Learning Explainability for Climate Tipping Point Research

Ashcraft, Chace, Sleeman, Jennifer, Tang, Caroline, Brett, Jay, Gnanadesikan, Anand

arXiv.org Artificial Intelligence, Jun-19-2023

In recent years, there has been an increase in using deep learning for climate and weather modeling. Though results have been impressive, explainability and interpretability of deep learning models are still a challenge. A third wave of Artificial Intelligence (AI), which includes logic and reasoning, has been described as a way to address these issues. Neuro-symbolic AI is a key component of this integration of logic and reasoning with deep learning. In this work we propose a neuro-symbolic approach called Neuro-Symbolic Question-Answer Program Translator, or NS-QAPT, to address explainability and interpretability for deep learning climate simulation, applied to climate tipping point discovery. The NS-QAPT method includes a bidirectional encoder-decoder architecture that translates between domain-specific questions and executable programs used to direct the climate simulation, acting as a bridge between climate scientists and deep learning models. We show early compelling results of this translation method and introduce a domain-specific language and associated executable programs for a commonly known tipping point, the collapse of the Atlantic Meridional Overturning Circulation (AMOC).

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2306.11161

Country:

North America > United States > Maryland > Prince George's County > Laurel (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Unsupervised Open-domain Keyphrase Generation

Do, Lam Thanh, Akash, Pritom Saha, Chang, Kevin Chen-Chuan

arXiv.org Artificial Intelligence, Jun-19-2023

In this work, we study the problem of unsupervised open-domain keyphrase generation, where the objective is a keyphrase generation model that can be built without using human-labeled data and can perform consistently across domains. To solve this problem, we propose a seq2seq model that consists of two modules, namely the phraseness and informativeness modules, both of which can be built in an unsupervised and open-domain fashion. The phraseness module generates phrases, while the informativeness module guides the generation towards those that represent the core concepts of the text. We thoroughly evaluate our proposed method using eight benchmark datasets from different domains. Results on in-domain datasets show that our approach achieves state-of-the-art results compared with existing unsupervised models, and overall narrows the gap between supervised and unsupervised methods down to about 16%. Furthermore, we demonstrate that our model performs consistently across domains, as it overall surpasses the baselines on out-of-domain datasets.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2306.10755

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Illinois (0.04)
(13 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Transforming Observations of Ocean Temperature with a Deep Convolutional Residual Regressive Neural Network

Larson, Albert, Akanda, Ali Shafqat

arXiv.org Artificial Intelligence, Jun-16-2023

Sea surface temperature (SST) is an essential climate variable that can be measured via ground truth, remote sensing, or hybrid model methodologies. Here, we celebrate SST surveillance progress via the application of a few relevant technological advances from the late 20th and early 21st century. We further develop our existing water cycle observation framework, Flux to Flow (F2F), to fuse AMSR-E and MODIS into a higher resolution product with the goal of capturing gradients and filling cloud gaps that are otherwise unavailable. Our neural network architecture is constrained to a deep convolutional residual regressive neural network. We utilize three snapshots of twelve monthly SST measurements in 2010 as measured by the passive microwave radiometer AMSR-E, the visible and infrared monitoring MODIS instrument, and the in situ Argo dataset ISAS. The performance of the platform and success of this approach is evaluated using the root mean squared error (RMSE) metric. We determine that the 1:1 configuration of input and output data and a large observation region is too challenging for the single compute node and dcrrnn structure as is. When constrained to a single 100 x 100 pixel region and a small training dataset, the algorithm improves from the baseline experiment covering a much larger geography. For next discrete steps, we envision the consideration of a large input range with a very small output range. Furthermore, we see the need to integrate land and sea variables before performing computer vision tasks like those within. Finally, we see parallelization as necessary to overcome the compute obstacles we encountered.
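
For reference, the RMSE between a predicted and a reference SST field can be computed as below. This is a minimal sketch: the function name `rmse`, the use of NaN to mark missing pixels (for example cloud gaps), and the toy arrays are assumptions, not details taken from the paper.

```python
import numpy as np

def rmse(pred, target):
    """Root mean squared error over pixels that are valid in both fields."""
    valid = ~np.isnan(pred) & ~np.isnan(target)
    return float(np.sqrt(np.mean((pred[valid] - target[valid]) ** 2)))

# Toy 2 x 2 fields in degrees Celsius; NaN marks a pixel with no observation.
pred = np.array([[1.0, 2.0], [np.nan, 4.0]])
target = np.array([[1.0, 4.0], [5.0, 4.0]])

error = rmse(pred, target)  # computed over the three valid pixels only
```

Masking invalid pixels before averaging keeps gaps in either field from turning the whole score into NaN.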

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2306.09987

Country:

Southern Ocean (0.04)
Pacific Ocean (0.04)
Indian Ocean > Bay of Bengal (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.46)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Contrastive Loss is All You Need to Recover Analogies as Parallel Lines

Ri, Narutatsu, Lee, Fei-Tzin, Verma, Nakul

arXiv.org Artificial Intelligence, Jun-13-2023

While static word embedding models are known to represent linguistic analogies as parallel lines in high-dimensional space, the underlying mechanism as to why they result in such geometric structures remains obscure. We find that an elementary contrastive-style method employed over distributional information performs competitively with popular word embedding models on analogy recovery tasks, while achieving dramatic speedups in training time. Further, we demonstrate that a contrastive loss is sufficient to create these parallel structures in word embeddings, and establish a precise relationship between the co-occurrence statistics and the geometric structure of the resulting word embeddings.
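
The "parallel lines" picture means that the offset between the two words of one analogy pair roughly equals the offset of another pair, so an analogy can be completed with vector arithmetic. The toy sketch below uses hand-made 3-dimensional vectors (hypothetical values, not trained embeddings) and the standard offset-plus-cosine-similarity rule; it does not reproduce the paper's contrastive method, and `solve_analogy` is an illustrative name.

```python
import numpy as np

# Hypothetical toy vectors: axis 0 ~ gender, axis 1 ~ royalty, axis 2 ~ "is a person".
embeddings = {
    "man": np.array([0.0, 0.0, 1.0]),
    "woman": np.array([1.0, 0.0, 1.0]),
    "king": np.array([0.0, 1.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([0.2, 0.1, -1.0]),
}

def solve_analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = vectors[b] - vectors[a] + vectors[c]
    best_word, best_score = None, -np.inf
    for word, vec in vectors.items():
        if word in (a, b, c):
            continue  # the query words themselves are excluded, as is customary
        score = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

answer = solve_analogy("man", "woman", "king", embeddings)

# The two pair offsets are identical here, i.e. the connecting lines are parallel.
offsets_match = np.allclose(
    embeddings["woman"] - embeddings["man"],
    embeddings["queen"] - embeddings["king"],
)
```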

analogy, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.08221

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California (0.04)
(12 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
