AITopics | Oceania

Compared to Deep Q-learning, deep policy gradient (PG) methods are often more flexible and applicable to discrete and continuous action problems. However, these methods tend to suffer from high sample complexity and training instability since the gradient may not accurately reflect the policy gain when the policy changes substantially [6].

artificial intelligence, machine learning, virtual policy, (16 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

RRHF (1)

yuanhongyi

Neural Information Processing Systems | Feb-8-2026, 23:56:44 GMT

RRHF can align not only with human preferences but with any preferences. As a large language model, Wombat has the possibility to generate unsafe responses. We also conduct experiments on the IMDB dataset to assess the generation of positive movie reviews. The task expects the model to give positive and fluent movie review completions based on given partial review input texts. RRHF-OP-128 follows the bottommost workflow in Figure 2 in the main text.

large language model, machine learning, natural language, (17 more...)

Country:

Oceania > New Zealand (0.05)
Oceania > Australia > Tasmania (0.05)

Industry:

Media > Film (0.56)
Leisure & Entertainment (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)

75c58d36157505a600e0695ed0b3a22d-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 23:45:04 GMT

complexity, hypernetwork, neural network, (14 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada (0.04)
Africa > Kenya (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

524ef58c2bd075775861234266e5e020-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 23:44:18 GMT

forecasting, neural information processing system, time series forecasting, (13 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

6097d8f3714205740f30debe1166744e-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 23:43:06 GMT

The running time and memory needed for our algorithm to approximate the privacy curve of a DP algorithm composed with itself k times is O( k).

algorithm, artificial intelligence, machine learning, (17 more...)

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Security & Privacy (0.93)

51ae7d9db3423ae96cd6afeb01529819-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 23:28:00 GMT

eqn, experiment, parallel data, (15 more...)

Country: Oceania > Australia (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Entropic Desired Dynamics for Intrinsic Control: Supplemental Material Steven Hansen

Neural Information Processing Systems | Feb-8-2026, 22:46:44 GMT

While this is not close to the state-of-the-art in general (cf. Figure 2 shows the effect of action entropy on exploratory behavior in Montezuma's Revenge. Number of unique avatar positions visited. Full training curves across all 6 Atari games are shown in Figure 1, including the random policy baseline. To ensure this didn't hamper performance, we At each state visited by the agent evaluator during training, the agent's state (consisting of the avatar's The full curves are included for completeness. The compute cluster we performed experiments on is heterogeneous, and has features such as host-sharing, adaptive load-balancing, etc.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)

50d005f92a6c5c9646db4b761da676ba-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 22:39:03 GMT

Failure case 2: Augerino depends on the used parameterisation of invariance. The full GGN approximation in Eq. 5 is in O(NP^2C) for computing N matrix-products. The diagonal GGN approximation would be in O(NPC) and computation of the log-determinant only O(P). Computing the log-determinant can be done efficiently in O(D^3 + G^3) by decomposing the Kronecker factors (Immer et al., 2021a). The last two terms dependent on S come up due to the aggregation of augmentation samples in our approximation, that is, the expectations over a and g in the second line of Eq. 15.

approximation, artificial intelligence, machine learning, (16 more...)

Country: