Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation
Recent work shows promising results in expanding the capabilities of large language models (LLMs) to directly understand and synthesize speech. However, an LLM-based strategy for modeling spoken dialogs remains elusive, calling for further investigation. This paper introduces an extensive speech-text LLM framework, the Unified Spoken Dialog Model (USDM), designed to generate coherent spoken responses with naturally occurring prosodic features relevant to the given input speech without relying on explicit automatic speech recognition (ASR) or text-to-speech (TTS) systems. We have verified the inclusion of prosody in speech tokens that predominantly contain semantic information and have used this foundation to construct a prosody-infused speech-text model. Additionally, we propose a generalized speech-text pretraining scheme that enhances the capture of cross-modal semantics. To construct USDM, we fine-tune our speech-text model on spoken dialog data using a multi-step spoken dialog template that stimulates the chain-of-reasoning capabilities exhibited by the underlying LLM. Automatic and human evaluations on the DailyTalk dataset demonstrate that our approach effectively generates natural-sounding spoken responses, surpassing previous and cascaded baselines. Our code and checkpoints are available at https://github.com/naverai/usdm.
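To make the multi-step idea concrete, here is a purely hypothetical sketch of what such a chain-of-reasoning dialog template could look like (input speech tokens, then a transcript, then the response text, then response speech tokens). The markers, token names, and format below are illustrative assumptions, not the released USDM template.

```python
# Hypothetical multi-step spoken dialog template: speech tokens -> transcript
# -> response text -> response speech tokens. All markers and token names
# here are assumptions for illustration only.
TEMPLATE = (
    "<speech>{input_speech_tokens}</speech>\n"
    "Transcript: {input_transcript}\n"
    "Response: {response_text}\n"
    "<speech>{response_speech_tokens}</speech>"
)

example = TEMPLATE.format(
    input_speech_tokens="<s_412><s_87>...",   # unit tokens from a speech tokenizer
    input_transcript="How are you doing today?",
    response_text="I'm doing great, thanks for asking!",
    response_speech_tokens="<s_93><s_501>...",
)
print(example)
```

Routing the generation through an intermediate transcript and text response, rather than mapping speech tokens directly to speech tokens, is what lets the underlying LLM apply its text reasoning to the spoken exchange.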
Supplementary Material
The train, test, and validation splits for SST2 [47] and SST5 [47] are used from the source itself, while the validation data for TREC6 [35, 18] is obtained using 10% of the train data. The test data for glue-SST2 [51] is obtained using 5% of the train data. A seed value of 42 is used for the generator argument of torch's random_split function. In Table 1, we summarize the number of classes and the number of instances in each split of the text datasets.
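A minimal sketch of this splitting procedure, assuming the datasets are already loaded as torch-compatible datasets; the 10% validation fraction and the seed of 42 follow the text above, while the function and variable names are placeholders.

```python
import torch
from torch.utils.data import random_split

def split_with_seed(dataset, val_fraction, seed=42):
    # Carve a validation set off the training data, seeding the generator
    # so the split is reproducible across runs.
    val_size = int(len(dataset) * val_fraction)
    train_size = len(dataset) - val_size
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, val_size], generator=generator)

# e.g., 10% of the TREC6 train data becomes validation
# (`trec6_train` is a placeholder for the loaded dataset):
# trec6_train, trec6_val = split_with_seed(trec6_train, val_fraction=0.10)
```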
This new AI tool changes a speaker's accent to American English in real-time - hear for yourself
Krisp, an AI startup known for its noise cancellation and transcription services, is launching a new AI tool that can convert a speaker's accent to American English in real time. The company claims the tool can help native speakers understand non-native English speakers "more easily, without changing [their] natural voice and vocal traits." Krisp is initially rolling out support for converting 17 Indian dialects into US English but plans to expand to Filipino and other accents in the future. The tool is compatible with Zoom, Microsoft Teams, Google Meet, and other meeting platforms; as long as users have access to Krisp's existing desktop app, the tool can "clarify" accents. According to Krisp's website, Indian accents were the first the company chose to work on because people from the region make up a large segment of the global workforce, especially within STEM fields.
VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images Supplementary Materials
Figure 5: t-SNE plots illustrating the effectiveness of random sampling for the majority species in the Fish-10K dataset. Randomly sampled images are shown as blue dots, while the remaining data points are represented by red dots. Image vector representations are obtained from a VGG-19 model pretrained on the ImageNet dataset.

We collected images of three taxonomic groups of organisms: fish, birds, and butterflies, each containing around 10K images. Images for fish (Fish-10K) were curated from the larger image collection FishAIR [1], which contains images from the Great Lakes Invasive Network Project (GLIN) [2]. We created the Fish-10K dataset by randomly sampling 10K images and preprocessing the images to crop them and remove the background. To ensure diversity within Fish-10K, we applied a targeted sampling strategy to the source collection, FishAIR [1]. Specifically, we retained all images of species with fewer than 200 images, treating these as minority or rare classes; random sampling was applied only to the majority species, i.e., those with more than 200 images per class, as sketched below. To assess potential sampling bias among the majority species, we generated feature vectors for each image in Fish-10K using a pretrained VGG-19 model. Our analysis shows that the distribution of sampled images closely mirrors the distribution of images that were not included in the dataset (denoted as "others" in the plot). This suggests that our random sampling approach provides a sufficiently accurate representation of the original distribution for the majority species.
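A minimal sketch of the targeted sampling strategy described above. Here `records` is a hypothetical list of (image_path, species) pairs from the source collection; the 200-image threshold and the 10K target size follow the text, while the function name and seed are assumptions.

```python
import random
from collections import defaultdict

def targeted_sample(records, threshold=200, target_size=10_000, seed=42):
    by_species = defaultdict(list)
    for path, species in records:
        by_species[species].append(path)

    # Keep every image of minority/rare species (fewer than `threshold` images).
    kept = [p for paths in by_species.values() if len(paths) < threshold
            for p in paths]

    # Randomly sample from the majority species to fill the remaining budget.
    majority = [p for paths in by_species.values() if len(paths) >= threshold
                for p in paths]
    rng = random.Random(seed)
    budget = max(target_size - len(kept), 0)
    kept += rng.sample(majority, min(budget, len(majority)))
    return kept
```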
VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images
Images are increasingly becoming the currency for documenting biodiversity on the planet, providing novel opportunities for accelerating scientific discoveries in the field of organismal biology, especially with the advent of large vision-language models (VLMs). We ask if pre-trained VLMs can aid scientists in answering a range of biologically relevant questions without any additional fine-tuning. In this paper, we evaluate the effectiveness of 12 state-of-the-art (SOTA) VLMs in the field of organismal biology using a novel dataset, VLM4Bio, consisting of 469K question-answer pairs involving 30K images from three groups of organisms: fishes, birds, and butterflies, covering five biologically relevant tasks. We also explore the effects of applying prompting techniques and tests for reasoning hallucination on the performance of VLMs, shedding new light on the capabilities of current SOTA VLMs in answering biologically relevant questions using images.
Appendix: No-regret Algorithms for Fair Resource Allocation
We provide a more comprehensive review of the fair machine learning literature in this section. Multiple definitions have been used to quantify the fairness of machine learning algorithms. Hardt et al. [2016] introduced equality of opportunity as a fairness criterion, which ensures that individuals have an equal chance of being correctly classified by machine learning algorithms, regardless of protected attributes such as race or gender. Kleinberg et al. [2017] formalized three different notions of fairness and showed that no algorithm can satisfy all of them simultaneously, demonstrating the inherent trade-offs between competing notions of fairness. Another prevalent fairness criterion is the price of fairness, introduced by Bertsimas et al. [2011], which quantifies how much the aggregate utility is reduced by enforcing fairness.
AutoMix: Automatically Mixing Language Models
Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging these options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs based on the approximate correctness of outputs from a smaller LM. Central to AutoMix are two key technical contributions. First, a few-shot self-verification mechanism estimates the reliability of the smaller LM's outputs without requiring extensive training. Second, since self-verification can be noisy, a POMDP-based router selects an appropriately sized model based on answer confidence. Experiments across five language models and five challenging datasets show that AutoMix consistently surpasses strong baselines, reducing computational cost by over 50% for comparable performance.
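A simplified sketch of confidence-based routing in the spirit of the approach above. The actual system uses few-shot self-verification and a POMDP router; here a plain confidence threshold stands in for both, and `small_lm`, `large_lm`, and `verify` are hypothetical callables.

```python
def route_query(query, small_lm, large_lm, verify, threshold=0.7):
    # Draft an answer with the cheap model, then estimate its correctness.
    draft = small_lm(query)
    confidence = verify(query, draft)  # estimated correctness in [0, 1]
    if confidence >= threshold:
        return draft           # accept the cheap answer
    return large_lm(query)     # escalate to the larger, costlier model
```

The cost savings come from answering easy queries with the small model alone and paying for the large model only when the verifier signals low confidence.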