AITopics | unicorn

Country: North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.65)

Neural Information Processing Systems, Feb-8-2026, 04:15:37 GMT

UniCoRn_with_appendix

Preetam Nandy

A/B testing is a powerful tool because of its design simplicity and ease of setup.

artificial intelligence, experiment, unicorn, (16 more...)

Country: North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.65)

Neural Information Processing Systems, Nov-19-2025, 20:27:59 GMT

Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning

This work lays the information theoretic foundation for COMRL methods, leading to a better understanding of task representation learning in the context of reinforcement learning.

machine learning, natural language, reinforcement learning, (16 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Nov-12-2025

Making LLMs Reliable When It Matters Most: A Five-Layer Architecture for High-Stakes Decisions

Jadad, Alejandro R.

Current large language models (LLMs) excel in verifiable domains where outputs can be checked before action but prove less reliable for high-stakes strategic decisions with uncertain outcomes. This gap, driven by mutually reinforcing cognitive biases in both humans and artificial intelligence (AI) systems, threatens the defensibility of valuations and sustainability of investments in the sector. This report describes a framework emerging from systematic qualitative assessment across 7 frontier-grade LLMs and 3 market-facing venture vignettes under time pressure. Detailed prompting specifying decision partnership and explicitly instructing avoidance of sycophancy, confabulation, solution drift, and nihilism achieved initial partnership state but failed to maintain it under operational pressure. Sustaining protective partnership state required an emergent 7-stage calibration sequence, built upon a 4-stage initialization process, within a 5-layer protection architecture enabling bias self-monitoring, human-AI adversarial challenge, partnership state verification, performance degradation detection, and stakeholder protection. Three discoveries resulted: partnership state is achievable through ordered calibration but requires emergent maintenance protocols; reliability degrades when architectural drift and context exhaustion align; and dissolution discipline prevents costly pursuit of fundamentally wrong directions. Cross-model validation revealed systematic performance differences across LLM architectures. This approach demonstrates that human-AI teams can achieve cognitive partnership capable of preventing avoidable regret in high-stakes decisions, addressing return-on-investment expectations that depend on AI systems supporting consequential decision-making without introducing preventable cognitive traps when verification arrives too late.

large language model, machine learning, natural language, (20 more...)

2511.07669

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Switzerland (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Oct-11-2025, 00:30:55 GMT

8a30aba6514b56d02976f49797f6338a-Paper-Conference.pdf

machine learning, natural language, reinforcement learning, (15 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing Systems, Aug-14-2025, 03:13:04 GMT

32e19424b63cc63077a4031b87fb1010-Supplemental.pdf

artificial intelligence, machine learning, unicorn, (15 more...)

Country: North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Neural Information Processing Systems, Aug-14-2025, 03:13:00 GMT

UniCoRn_with_appendix

Preetam Nandy

artificial intelligence, experiment, unicorn, (16 more...)

Country: North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

arXiv.org Artificial Intelligence, Mar-14-2025

Unicorn: A Universal and Collaborative Reinforcement Learning Approach Towards Generalizable Network-Wide Traffic Signal Control

Zhang, Yifeng, Liu, Yilin, Gong, Ping, Li, Peizhuo, Fan, Mingfeng, Sartoretti, Guillaume

Adaptive traffic signal control (ATSC) is crucial in reducing congestion, maximizing throughput, and improving mobility in rapidly growing urban areas. Recent advancements in parameter-sharing multi-agent reinforcement learning (MARL) have greatly enhanced the scalable and adaptive optimization of complex, dynamic flows in large-scale homogeneous networks. However, the inherent heterogeneity of real-world traffic networks, with their varied intersection topologies and interaction dynamics, poses substantial challenges to achieving scalable and effective ATSC across different traffic scenarios. To address these challenges, we present Unicorn, a universal and collaborative MARL framework designed for efficient and adaptable network-wide ATSC. Specifically, we first propose a unified approach to map the states and actions of intersections with varying topologies into a common structure based on traffic movements. Next, we design a Universal Traffic Representation (UTR) module with a decoder-only network for general feature extraction, enhancing the model's adaptability to diverse traffic scenarios. Additionally, we incorporate an Intersection Specifics Representation (ISR) module, designed to identify key latent vectors that represent the unique intersection's topology and traffic dynamics through variational inference techniques. To further refine these latent representations, we employ a contrastive learning approach in a self-supervised manner, which enables better differentiation of intersection-specific features. Moreover, we integrate the state-action dependencies of neighboring agents into policy optimization, which effectively captures dynamic agent interactions and facilitates efficient regional collaboration. Our results show that Unicorn outperforms other methods across various evaluation metrics, highlighting its potential in complex, dynamic traffic networks.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2503.11488

Country:

North America > United States (0.28)
Europe > Monaco (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence, Dec-29-2024

GPT or BERT: why not both?

Charpentier, Lucas Georges Gabriel, Samuel, David

We present a simple way to merge masked language modeling with causal language modeling. This hybrid training objective results in a model that combines the strengths of both modeling paradigms within a single transformer stack: GPT-BERT can be transparently used like any standard causal or masked language model. We test the pretraining process that enables this flexible behavior on the BabyLM Challenge 2024. The results show that the hybrid pretraining outperforms masked-only or causal-only models. We openly release the models, training corpora and code.

large language model, machine learning, natural language, (19 more...)

2410.24159

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.14)
Asia > Singapore (0.04)
(21 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Oct-29-2024

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Qiu, Jiahao, Lu, Yifu, Zeng, Yifan, Guo, Jiacheng, Geng, Jiayi, Wang, Huazheng, Huang, Kaixuan, Wu, Yue, Wang, Mengdi

Inference-time alignment enhances the performance of large language models without requiring additional training or fine-tuning, but it must balance computational efficiency against output quality. Best-of-N (BoN) sampling, a simple yet powerful approach, generates multiple responses and selects the best one, improving performance at a high computational cost. We propose TreeBoN, a novel framework that integrates a speculative tree-search strategy into BoN sampling. TreeBoN maintains a set of parent nodes, iteratively branching and pruning low-quality responses, thereby reducing computational overhead while maintaining high output quality. Our approach also leverages token-level rewards from Direct Preference Optimization (DPO) to guide tree expansion and prune low-quality paths. We evaluate TreeBoN on the AlpacaFarm, HH-RLHF, UltraFeedback, GSM8K, and TutorEval datasets, demonstrating consistent improvements. Specifically, TreeBoN achieves the highest win rate of 65% on TutorEval and win rates of around 60% on the other datasets, outperforming standard BoN at the same computational cost and showcasing its scalability and alignment efficacy.

large language model, machine learning, natural language, (14 more...)
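
The plain Best-of-N baseline mentioned in the TreeBoN abstract above (generate several responses, keep the highest-scoring one) can be sketched as follows. This is only an illustrative sketch of standard BoN, not TreeBoN's tree search and not the authors' code; `best_of_n`, `generate`, `reward`, and the toy helpers are hypothetical names introduced here, standing in for a real language model and reward model.

```python
import random
from typing import Callable, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical sampler: prompt -> response
    reward: Callable[[str, str], float],  # hypothetical scorer: (prompt, response) -> score
    n: int = 4,
) -> Tuple[str, float]:
    """Sample n responses and return the one with the highest reward."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw n independent candidates; cost grows linearly with n.
    candidates = [generate(prompt) for _ in range(n)]
    # Score every candidate and keep the best (first one wins on ties).
    scored = [(reward(prompt, c), c) for c in candidates]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score


def make_toy_generator(seed: int = 0) -> Callable[[str], str]:
    """Toy stand-in for a language model (assumption, for illustration only)."""
    rng = random.Random(seed)
    endings = ["ok", "good", "better", "excellent"]

    def generate(prompt: str) -> str:
        return f"{prompt} {rng.choice(endings)}"

    return generate


def toy_reward(prompt: str, response: str) -> float:
    """Toy stand-in for a reward model: longer responses score higher."""
    return float(len(response))


if __name__ == "__main__":
    response, score = best_of_n("Answer:", make_toy_generator(), toy_reward, n=4)
    print(response, score)
```

Every candidate here is generated in full before scoring, which is the source of the high computational cost the abstract attributes to standard BoN.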