AITopics | typhoon

typhoon

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Developing an Open Conversational Speech Corpus for the Isan Language

Na-Thalang, Adisai, Wittayasakpan, Chanakan, Phatcharoen, Kritsadha, Buakaw, Supakit

arXiv.org Artificial Intelligence | Dec-5-2025

This paper introduces the development of the first open conversational speech dataset for the Isan language, the most widely spoken regional dialect in Thailand. Unlike existing speech corpora that are primarily based on read or scripted speech, this dataset consists of natural speech, thereby capturing authentic linguistic phenomena such as colloquialisms, spontaneous prosody, disfluencies, and frequent code-switching with Central Thai. A key challenge in building this resource lies in the lack of a standardized orthography for Isan. Current writing practices vary considerably, due to the different lexical tones between Thai and Isan. This variability complicates the design of transcription guidelines and poses questions regarding consistency, usability, and linguistic authenticity. To address these issues, we establish practical transcription protocols that balance the need for representational accuracy with the requirements of computational processing. By releasing this dataset as an open resource, we aim to contribute to inclusive AI development, support research on underrepresented languages, and provide a basis for addressing the linguistic and technical challenges inherent in modeling conversational speech.

artificial intelligence, natural language, thatphithakkul, (10 more...)

arXiv.org Artificial Intelligence

2511.21229

Country: Asia > Thailand (0.24)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Speech (0.46)

UK fighters to defend Polish skies after Russian drone incursion

BBC News | Sep-15-2025, 17:53:01 GMT

Fighter jets from the UK will join Nato allies in defending Polish airspace after last week's incursion of Russian drones, the defence secretary has confirmed. RAF Typhoon jets will fly air defence missions over Poland as part of the military alliance's mission to bolster the eastern flank. Other allies including Denmark, Germany and France are already taking part - a jet from the latter was scrambled earlier on Monday in response to another potential incursion by Russian drones. Nato said that alert was quickly over. Tensions have risen across Europe since Poland accused Russia of the incident, which saw 19 drones enter its territory.

defend polish sky, incursion, russian drone, (12 more...)

BBC News

Country:

Europe > Poland (0.52)
Asia > Russia (0.44)
Europe > Ukraine (0.35)
(22 more...)

Industry:

Government > Military > Air Force (0.93)
Government > Regional Government > Europe Government > United Kingdom Government (0.49)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.98)

Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks

Kitamoto, Asanobu, Dzik, Erwan, Faure, Gaspar

arXiv.org Artificial Intelligence | Nov-25-2024

This paper presents the Digital Typhoon Dataset V2, a new version of the longest typhoon satellite image dataset, spanning 40+ years and aimed at benchmarking machine learning models for long-term spatio-temporal data. Dataset V2 adds tropical cyclone data from the southern hemisphere to the northern hemisphere data in Dataset V1. Having data from two hemispheres allows us to ask new research questions about regional differences across basins and hemispheres. We also discuss new developments in representations and tasks of the dataset. We first introduce a self-supervised learning framework for representation learning. Combined with the LSTM model, we discuss performance on intensity forecasting and extra-tropical transition forecasting tasks. We then propose new tasks, such as the typhoon center estimation task. We show that an object detection-based model performs better for stronger typhoons. Finally, we study how machine learning models can generalize across basins and hemispheres, by training the model on the northern hemisphere data and testing it on the southern hemisphere data.
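
The cross-hemisphere evaluation mentioned at the end of this abstract can be pictured as a split on the sign of latitude. The sketch below is only an illustration under stated assumptions: the record layout (`id`, `lat`) and the function name `split_by_hemisphere` are hypothetical and are not taken from the Digital Typhoon Dataset or its tooling.

```python
def split_by_hemisphere(records):
    """Return (train, test): northern records for training, southern for testing.

    Assumption: each record is a dict with a hypothetical "lat" field in degrees,
    positive north of the equator.
    """
    train = [r for r in records if r["lat"] >= 0]
    test = [r for r in records if r["lat"] < 0]
    return train, test


# Hypothetical records, not real dataset entries.
records = [
    {"id": "north-a", "lat": 18.5},
    {"id": "south-a", "lat": -14.2},
    {"id": "north-b", "lat": 25.0},
]
train, test = split_by_hemisphere(records)
print([r["id"] for r in train], [r["id"] for r in test])
```

Keeping the split on a geographic attribute rather than a random shuffle is what makes the test set a measure of generalization to an unseen region.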

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.16421

Country:

Oceania > Australia (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning

Artkaew, Phakphum

arXiv.org Artificial Intelligence | May-28-2024

Commonsense reasoning is an important aspect of natural language understanding, with several benchmarks developed to evaluate it. However, only a few of these benchmarks are available in languages other than English. Developing parallel benchmarks facilitates cross-lingual evaluation, enabling a better understanding of different languages. This research introduces a collection of Winograd Schemas in Thai, a novel dataset designed to evaluate commonsense reasoning capabilities in the context of the Thai language. Through a methodology involving native speakers, professional translators, and thorough validation, the schemas aim to closely reflect Thai language nuances, idioms, and cultural references while maintaining ambiguity and commonsense challenges. We evaluate the performance of popular large language models on this benchmark, revealing their strengths and limitations and providing insights into the current state of the art. Results indicate that while models like GPT-4 and Claude-3-Opus achieve high accuracy in English, their performance drops significantly in Thai, highlighting the need for further advancements in multilingual commonsense reasoning.

benchmark, language model, winograd schema, (13 more...)

arXiv.org Artificial Intelligence

2405.18375

Country:

North America > United States > New York (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Typhoon: Thai Large Language Models

Pipatanakul, Kunat, Jirabovonvisut, Phatrasek, Manakul, Potsawee, Sripaisarnmongkol, Sittipong, Patomwong, Ruangsak, Chokchainant, Pathomporn, Tharnpipitchai, Kasima

arXiv.org Artificial Intelligence | Dec-21-2023

Typhoon is a series of Thai large language models (LLMs) developed specifically for the Thai language. This technical report presents challenges and insights in developing Thai LLMs, including data preparation, pretraining, instruction-tuning, and evaluation. As one of the challenges of low-resource languages is the amount of pretraining data, we apply continual training to transfer existing world knowledge from a strong LLM. To evaluate the Thai knowledge encapsulated in each model from the pretraining stage, we develop ThaiExam, a benchmark based on examinations for high-school students and investment professionals in Thailand. In addition, we fine-tune Typhoon to follow Thai instructions, and we evaluate instruction-tuned models on Thai instruction datasets as well as translation, summarization, and question-answering tasks. Experimental results on a suite of Thai benchmarks show that Typhoon outperforms all open-source Thai language models, and its performance is on par with GPT-3.5 in Thai while having only 7 billion parameters and being 2.62 times more efficient in tokenizing Thai text.

arxiv preprint arxiv, computational linguistic, language model, (14 more...)

arXiv.org Artificial Intelligence

2312.13951

Country:

Asia > Thailand (0.35)
North America > Canada > Ontario > Toronto (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > K-12 Education > Secondary School (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Residual Diffusion Modeling for Km-scale Atmospheric Downscaling

Mardani, Morteza, Brenowitz, Noah, Cohen, Yair, Pathak, Jaideep, Chen, Chieh-Yu, Liu, Cheng-Chin, Vahdat, Arash, Kashinath, Karthik, Kautz, Jan, Pritchard, Mike

arXiv.org Artificial Intelligence | Dec-9-2023

Predictions of weather hazards require expensive km-scale simulations driven by coarser global inputs. Here, a cost-effective stochastic downscaling model is trained from a high-resolution 2-km weather model over Taiwan conditioned on 25-km ERA5 reanalysis. To address the multi-scale machine learning challenges of weather data, we employ a two-step approach, Corrector Diffusion (CorrDiff), where a UNet prediction of the mean is corrected by a diffusion step. Akin to Reynolds decomposition in fluid dynamics, this isolates generative learning to the stochastic scales. CorrDiff exhibits skillful RMSE and CRPS and faithfully recovers spectra and distributions even for extremes. Case studies of coherent weather phenomena reveal appropriate multivariate relationships reminiscent of learnt physics: the collocation of intense rainfall and sharp gradients in fronts, and extreme winds and rainfall bands near the eyewall of typhoons. Downscaling global forecasts successfully retains many of these benefits, foreshadowing the potential of end-to-end, global-to-km-scale machine learning weather predictions.
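
The two-step idea in this abstract, a deterministic mean prediction corrected by a generative model of the residual, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: nearest-neighbour upsampling stands in for the UNet, a single fitted Gaussian stands in for the diffusion step, and the 1-D fields and function names are hypothetical. It is not the CorrDiff implementation.

```python
import random
import statistics


def mean_prediction(coarse, factor):
    """Stand-in for the regression step: repeat each coarse value `factor` times."""
    return [value for value in coarse for _ in range(factor)]


def fit_residual_model(coarse_fields, fine_fields, factor):
    """Fit a Gaussian to the residuals (fine field minus mean prediction).

    Assumption: one Gaussian replaces the diffusion model purely for illustration.
    """
    residuals = []
    for coarse, fine in zip(coarse_fields, fine_fields):
        mean = mean_prediction(coarse, factor)
        residuals.extend(f - m for f, m in zip(fine, mean))
    return statistics.fmean(residuals), statistics.pstdev(residuals)


def downscale(coarse, factor, residual_model, rng):
    """Draw one stochastic sample: mean prediction plus a sampled residual."""
    mu, sigma = residual_model
    return [m + rng.gauss(mu, sigma) for m in mean_prediction(coarse, factor)]


# Toy 1-D fields with made-up values.
coarse_fields = [[1.0, 2.0], [3.0, 1.0]]
fine_fields = [[0.8, 1.2, 2.1, 1.9], [3.3, 2.7, 0.9, 1.1]]
model = fit_residual_model(coarse_fields, fine_fields, factor=2)
sample = downscale([2.0, 4.0], 2, model, random.Random(0))
print(len(sample))
```

The point of the decomposition is that the generative component only has to model what the mean prediction misses, which is typically smaller in magnitude and closer to zero-mean than the full field.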

corrdiff, diffusion model, prediction, (14 more...)

arXiv.org Artificial Intelligence

2309.15214

Country:

Asia > Japan (0.04)
Oceania > New Zealand (0.04)
North America > United States > Hawaii (0.04)
(6 more...)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models

Abdurrahman, Muhammed Shahir, Elezabi, Hashem, Xu, Bruce Changlong

arXiv.org Artificial Intelligence | Mar-27-2023

By exploiting the high level of parallelism enabled by graphics processing units, transformer architectures have enabled tremendous strides forward in the field of natural language processing. In a traditional masked language model, special MASK tokens are used to prompt the model to gather contextual information from surrounding words to restore originally hidden information. In this paper, we explore a task-specific masking framework for pre-trained large language models that enables superior performance on particular downstream tasks on the datasets in the GLUE benchmark. We develop our own masking algorithm, Typhoon, based on token input gradients, and compare this with other standard baselines. We find that Typhoon offers performance competitive with whole-word masking on the MRPC dataset. Our implementation can be found in a public GitHub repository.
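
The abstract says only that the masking is "based on token input gradients" and does not state how gradients become a mask. One plausible reading, sketched below purely as an assumption, is to rank tokens by a precomputed per-token gradient magnitude and mask the highest-ranked fraction. The saliency values, the 15% default ratio, and both function names are hypothetical and are not taken from the paper or its repository.

```python
def select_mask_positions(saliency, mask_ratio=0.15):
    """Pick the positions with the largest saliency scores.

    Assumption: `saliency` holds one precomputed gradient magnitude per token.
    At least one position is chosen for any non-empty input.
    """
    n_mask = max(1, round(len(saliency) * mask_ratio))
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:n_mask])


def apply_mask(tokens, positions, mask_token="[MASK]"):
    """Replace the tokens at the chosen positions with the mask token."""
    chosen = set(positions)
    return [mask_token if i in chosen else tok for i, tok in enumerate(tokens)]


# Made-up tokens and saliency scores for illustration.
tokens = ["the", "film", "was", "surprisingly", "good"]
saliency = [0.1, 0.6, 0.05, 0.9, 0.7]
positions = select_mask_positions(saliency, mask_ratio=0.4)
print(apply_mask(tokens, positions))
```

Compared with uniform random masking, a ranking of this kind concentrates the masked positions on tokens the downstream loss is most sensitive to, which is the intuition behind a task-specific strategy.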

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2303.15619

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

bo_hot's Diary

#artificialintelligence | Dec-26-2021, 16:11:44 GMT

Some of you may remember that earlier this year we conducted an experiment to compare traditional mapping with AI-assisted mapping. Below is our summary of findings and the full report for those who may be interested. We hope this experiment will start the conversation about how we can ethically and responsibly introduce AI-augmented mapping workflows into HOT's work in 2022. In the last 10 years, the use of AI/ML in the geospatial sector has boomed. Private sector, academic and nonprofit organizations alike have been investing significant thought, time and resources into exploring and testing how AI/ML can augment and amplify current GIS workflows.

data quality, experiment, mapper, (10 more...)

#artificialintelligence

Country: Asia > Philippines (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.59)

Semantic-based End-to-End Learning for Typhoon Intensity Prediction

Zahera, Hamada M., Sherif, Mohamed Ahmed, Ngonga, Axel

arXiv.org Machine Learning | Mar-21-2020

Disaster prediction is one of the most critical tasks in disaster surveillance and preparedness. Existing technologies employ different machine learning approaches to predict incoming disasters from historical environmental data. However, for short-term disasters (e.g., earthquakes), historical data alone has a limited prediction capability. Therefore, additional sources of warnings are required for accurate prediction. We consider social media as a supplementary source of knowledge in addition to historical environmental data. However, social media posts (e.g., tweets) are very informal and contain only limited content. To alleviate these limitations, we propose combining semantically enriched word embedding models, which represent entities in tweets, with their semantic representations computed with the traditional word2vec. Moreover, we study the correlation between social media posts and typhoon magnitudes (also called intensities) in terms of the volume and sentiment of tweets. Based on these insights, we propose an end-to-end framework that learns from disaster-related tweets and environmental data to improve typhoon intensity prediction. This paper is an extension of our work originally published in K-CAP 2019 [32]. We extend that work by building our framework with state-of-the-art deep neural models, updating our dataset with new typhoons and their tweets to date, and benchmarking our approach against recent baselines in disaster prediction. Our experimental results show that our approach outperforms the state-of-the-art baselines in terms of F1-score, with improvements (CNN by 12.1% and BiLSTM by 3.1%) compared with our previous experiments.

environmental data, tweet, typhoon, (15 more...)

arXiv.org Machine Learning

2003.13779

Country:

Asia > Japan (0.04)
Asia > Philippines (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.79)

AI For Climate Action

#artificialintelligence | Oct-3-2019, 16:38:23 GMT

Climate action has been the latest buzzword in industry circles since the many Intergovernmental Panel on Climate Change (IPCC) reports and the recent UN Climate Summit in New York City. Greta Thunberg grabbed the headlines, but industrialists are all wondering: How can we move swiftly and effectively to reduce carbon emissions? How can we use AI and other exponential technologies to do the job better, faster and cheaper? As a business strategist and urban planner, I advise companies to focus on cities since they consume 80% of energy and emit 70% of carbon, so we'll win or lose the carbon battle in the cities. Fortunately, cities can move faster than national governments and, as energy buyers, they can directly negotiate energy types and pricing, giving them enormous economic clout.

climate action, exponential technology, recovery, (11 more...)

#artificialintelligence

Country:

North America > United States > New York (0.26)
Oceania > Guam (0.05)
North America > United States > District of Columbia > Washington (0.05)
(3 more...)

Industry:

Government (1.00)
Energy (0.93)

Technology: Information Technology > Artificial Intelligence (1.00)
