AITopics | saša

Collaborating Authors

saša

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Using Statisticsto Automate Stochastic Optimization

Neural Information Processing SystemsFeb-14-2026, 16:53:23 GMT

Bythe Mark zN !0asN gro waystodeter zN is "close DeterministicIfinaddition zN in (7), vN = 1 N Themisintroducedtomakethe thescaleofaysnonnegative). Ifthenthe (2) are "close " tostationarity.

artificial intelligence, machine learning, pflug, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
North America > United States > Washington > King County > Redmond (0.04)
Europe > Russia (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Using Statistics to Automate Stochastic Optimization

Neural Information Processing SystemsAug-20-2025, 06:19:12 GMT

Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until "progress stops", then drop. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution.

estimator, stationary distribution, variance, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Research Report (0.46)
Instructional Material (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

Add feedback

Self-Aware Safety Augmentation: Leveraging Internal Semantic Understanding to Enhance Safety in Vision-Language Models

Wang, Wanying, Ma, Zeyu, Zheng, Han, Tan, Xin, Chen, Mingang

arXiv.org Artificial IntelligenceJul-30-2025

Large vision-language models (LVLMs) are vulnerable to harmful input compared to their language-only backbones. We investigated this vulnerability by exploring LVLMs internal dynamics, framing their inherent safety understanding in terms of three key capabilities. Specifically, we define these capabilities as safety perception, semantic understanding, and alignment for linguistic expression, and experimentally pinpointed their primary locations within the model architecture. The results indicate that safety perception often emerges before comprehensive semantic understanding, leading to the reduction in safety. Motivated by these findings, we propose \textbf{Self-Aware Safety Augmentation (SASA)}, a technique that projects informative semantic representations from intermediate layers onto earlier safety-oriented layers. This approach leverages the model's inherent semantic understanding to enhance safety recognition without fine-tuning. Then, we employ linear probing to articulate the model's internal semantic comprehension to detect the risk before the generation process. Extensive experiments on various datasets and tasks demonstrate that SASA significantly improves the safety of LVLMs, with minimal impact on the utility.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.21637

Country:

Europe (0.47)
Asia > China (0.15)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
(3 more...)

Add feedback

Large Language Models can be Strong Self-Detoxifiers

Ko, Ching-Yun, Chen, Pin-Yu, Das, Payel, Mroueh, Youssef, Dan, Soham, Kollias, Georgios, Chaudhury, Subhajit, Pedapati, Tejaswini, Daniel, Luca

arXiv.org Artificial IntelligenceOct-4-2024

This paper contains examples that may be considered offensive and inappropriate. Reducing the likelihood of generating harmful and toxic output is an essential task when aligning large language models (LLMs). Existing methods mainly rely on training an external reward model (i.e., another language model) or fine-tuning the LLM using self-generated data to influence the outcome. In this paper, we show that LLMs have the capability of self-detoxification without the use of an additional reward model or re-training. We propose Self-disciplined Autoregressive Sampling (SASA), a lightweight controlled decoding algorithm for toxicity reduction of LLMs. SASA leverages the contextual representations from an LLM to learn linear subspaces characterizing toxic v.s. When auto-completing a response token-by-token, SASA dynamically tracks the margin of the current output to steer the generation away from the toxic subspace, by adjusting the autoregressive sampling strategy. Recent advancements in large language models (LLMs) have dramatically enhanced their capabilities in textual understanding and reasoning (Brown et al., 2020; Kojima et al., 2022). Their capabilities in performing diverse linguistic tasks and producing coherent texts have catalyzed their adoption across a variety of applications (Rae et al., 2021; Hoffmann et al., 2022; Le Scao et al., 2023; Touvron et al., 2023a;b; Achiam et al., 2023). However, with the escalating size of models (Raffel et al., 2020; Brown et al., 2020; Achiam et al., 2023), there is a corresponding increase in the scale of the training datasets required to avert overfitting and to encapsulate extensive world knowledge. These extensive datasets, predominantly derived from internet crawls and merely subjected to basic filtering protocols (Raffel et al., 2020), often harbor biases that are problematic or directly detrimental for many applications and may not inherently align with these desirable attributes (Wallace et al., 2019; Gehman et al., 2020). In fact, it is known that language models trained on such data may not only mimic but also amplify these biases (Bolukbasi et al., 2016; Caliskan et al., 2017; Zhao et al., 2018; Sheng et al., 2019; Gehman et al., 2020; Hartvigsen et al., For example, an "aligned" LLM may be inadvertently or maliciously tricked into generating harmful or toxic output that causes usage violations and safety concerns (Sun et al., 2024).

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.03818

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback

Statistical Adaptive Stochastic Gradient Methods

Zhang, Pengchuan, Lang, Hunter, Liu, Qiang, Xiao, Lin

arXiv.org Machine LearningFeb-24-2020

We propose a statistical adaptive procedure called SALSA for automatically scheduling the learning rate (step size) in stochastic gradient methods. SALSA first uses a smoothed stochastic line-search procedure to gradually increase the learning rate, then automatically switches to a statistical method to decrease the learning rate. The line search procedure ``warms up'' the optimization process, reducing the need for expensive trial and error in setting an initial learning rate. The method for decreasing the learning rate is based on a new statistical test for detecting stationarity when using a constant step size. Unlike in prior work, our test applies to a broad class of stochastic gradient algorithms without modification. The combined method is highly robust and autonomous, and it matches the performance of the best hand-tuned learning rate schedules in our experiments on several deep learning tasks.

learning rate, saša, test accuracy, (14 more...)

arXiv.org Machine Learning

2002.10597

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Washington > King County > Redmond (0.04)
(9 more...)

Genre: Research Report > New Finding (0.87)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)

Add feedback

Using Statistics to Automate Stochastic Optimization

Lang, Hunter, Zhang, Pengchuan, Xiao, Lin

arXiv.org Artificial IntelligenceSep-21-2019

Despite the development of numerous adaptive optimizers, tuning the learning rate of stochastic gradient methods remains a major roadblock to obtaining good practical performance in machine learning. Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until "progress stops," then drop. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution. This test can be performed easily during training, and when it fires, we decrease the learning rate by a constant multiplicative factor. Our experiments on several deep learning tasks demonstrate that this statistical adaptive stochastic approximation (SASA) method can automatically find good learning rate schedules and match the performance of hand-tuned methods using default settings of its parameters. The statistical testing helps to control the variance of this procedure and improves its robustness.

estimator, experiment, statistics, (17 more...)

arXiv.org Artificial Intelligence

1909.09785

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.47)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Sense-Aaware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia

Wu, Zhaohui (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)

AAAI ConferencesMar-6-2015

Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSM) treat all occurrences of a word as the same and build a single vector to represent the meaning of a word, which fails to capture any ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which could account for homonymy and polysemy. The "sense-specific'' prototypes of a word are produced by clustering Wikipedia pages based on both local and global contexts of the word in Wikipedia. Experimental evaluations on semantic relatedness for both isolated words and words in sentential contexts and word sense induction demonstrate its effectiveness.

artificial intelligence, machine learning, natural language, (21 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Industry:

Transportation > Ground (0.46)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback