AITopics

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Energy (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Neural Information Processing Systems | Feb-9-2026, 01:59:23 GMT

5565ab682d6c7f8d9da34ba0919974b0-Paper-Conference.pdf

arxiv preprint arxiv, staircase model, transformer, (12 more...)

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Workflow (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Neural Information Processing Systems | Oct-9-2025, 15:48:06 GMT

Staircase Attention for Recurrent Processing of Sequences

Staircase model: Transformer cores are stacked diagonally, so each step sees one new input chunk.

arxiv preprint arxiv, staircase model, transformer, (12 more...)

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Workflow (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Neural Information Processing Systems | Aug-22-2025, 00:43:05 GMT

Hash Layers For Large Sparse Models

A key component of a mixture-of-experts (MoE) model is the routing (gating) strategy.

artificial intelligence, machine learning, natural language, (15 more...)

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Energy (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

arXiv.org Artificial Intelligence | Aug-6-2024

Are Female Carpenters like Blue Bananas? A Corpus Investigation of Occupation Gender Typicality

Ju, Da, Ulrich, Karen, Williams, Adina

People tend to use language to mention surprising properties of events: for example, when a banana is blue, we are more likely to mention color than when it is yellow. This fact is taken to suggest that yellowness is somehow a typical feature of bananas, and blueness is exceptional. Similar to how a yellow color is typical of bananas, there may also be genders that are typical of occupations. In this work, we explore this question using information theoretic techniques coupled with corpus statistic analysis. In two distinct large corpora, we do not find strong evidence that occupations and gender display the same patterns of mentioning as do bananas and color. Instead, we find that gender mentioning is correlated with femaleness of occupation in particular, suggesting perhaps that woman-dominated occupations are seen as somehow "more gendered" than male-dominated ones, and thereby they encourage more gender mentioning overall.

gender, occupation, pushshift, (15 more...)

2408.02948

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Communications > Social Media (0.83)

arXiv.org Artificial Intelligence | Oct-5-2023

Now It Sounds Like You: Learning Personalized Vocabulary On Device

Wang, Sid, Shenoy, Ashish, Chuang, Pierce, Nguyen, John

In recent years, Federated Learning (FL) has shown significant advancements in its ability to perform various natural language processing (NLP) tasks. This work focuses on applying personalized FL for on-device language modeling. Due to limitations of memory and latency, these models cannot support the complexity of sub-word tokenization or beam search decoding, resulting in the decision to deploy a closed-vocabulary language model. However, closed-vocabulary models are unable to handle out-of-vocabulary (OOV) words belonging to specific users. To address this issue, we propose a novel technique called "OOV expansion" that improves OOV coverage and increases model accuracy while minimizing the impact on memory and latency. This method introduces a personalized "OOV adapter" that effectively transfers knowledge from a central model and learns word embeddings for personalized vocabulary. OOV expansion significantly outperforms standard FL personalization methods on a set of common FL benchmarks.

federated learning, learning, personalization, (14 more...)

2305.03584

Country:

North America > United States > Virginia (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > China (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.34)

WIRED | Jun-29-2023, 11:00:00 GMT

Reddit Is Already on the Rebound

Social media researchers at the Network Contagion Research Institute in Princeton, New Jersey, got a rude awakening early last month. They were roused by 6:30 am phone calls from a colleague warning that Reddit had started blocking the institute's Pushshift service from updating its ongoing archive of every post on the discussion platform. That was a problem for more than just NCRI, because some of Reddit's 50,000 volunteer moderators depend on Pushshift to quickly investigate problem users, and many academics rely on the service. If it went stale, mods, as Reddit calls moderators, would have to work overtime or let more trash content accumulate. Researchers studying online communities would be forced to put projects and doctoral dissertations on ice.

platform, pushshift, reddit, (6 more...)

WIRED

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.26)
Africa > South Africa > Western Cape > Cape Town (0.06)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.32)

#artificialintelligence | Mar-15-2022, 23:40:16 GMT

An AI Twitter bot that only tweets good news, with Python and GPT2

Running AI these days is increasingly simple, thanks to the hard work of open source contributors producing top-notch libraries and research groups opening up their work so others can build on it. One key library doing that is HuggingFace's Transformers library. HuggingFace are a startup building, amongst other NLP-related products, a library and model ecosystem that allows almost anyone to quickly and easily set up AI-powered chat bots that can consume or produce natural language. In this post, I'll demonstrate how I used this library to produce a Twitter bot that tweets only made-up (and slightly quirky) good news. This blog post isn't meant to explain any theory, but for those who aren't familiar, the easiest way to explain this kind of AI is that these models are sophisticated pattern recognition systems. If you feed one enough data, it can build up an ability to recognize the patterns in the English language, to the extent that if you ask it to repeat the pattern, not only will it generate mostly correct English grammar, it might also from time to time generate a coherent sentence!

huggingface, library, news site, (10 more...)

#artificialintelligence

Industry: Media > News (0.36)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.87)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

arXiv.org Artificial Intelligence | Oct-14-2021

SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures

Ung, Megan, Xu, Jing, Boureau, Y-Lan

Current open-domain conversational models can easily be made to talk in inadequate ways. Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an unpleasant experience and may discourage conversation partners from giving feedback in the future. This work proposes SaFeRDialogues, a task and dataset of graceful responses to conversational feedback about safety failures. We collect a dataset of 10k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback. We show how fine-tuning on this dataset results in conversations that human raters deem considerably more likely to lead to a civil conversation, without sacrificing engagingness or general conversational ability.

dataset, dialogpt, recovery, (15 more...)

2110.07518

Country:

North America > Canada > Newfoundland and Labrador > Labrador (0.05)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.47)
Information Technology > Communications > Social Media > Crowdsourcing (0.47)

arXiv.org Artificial Intelligence | Oct-22-2020

Recipes for Safety in Open-domain Chatbots

Xu, Jing, Ju, Da, Li, Margaret, Boureau, Y-Lan, Weston, Jason, Dinan, Emily

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for evaluating them, as well as a novel method to distill safety considerations inside generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find our new techniques are (i) safer than existing models as measured by automatic and human evaluations while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.

classifier, large language model, machine learning, (18 more...)

2010.07079

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Oceania > Australia > Western Australia > Perth (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)