switchback

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization

Xi, Haocheng, Chen, Yuxiang, Zhao, Kang, Zheng, Kaijun, Chen, Jianfei, Zhu, Jun

arXiv.org Artificial Intelligence

Pretraining transformers is generally time-consuming. Fully quantized training (FQT) is a promising approach to speed up pretraining. However, most FQT methods adopt a quantize-compute-dequantize procedure, which often leads to suboptimal speedup and significant performance degradation when used in transformers due to the high memory access overheads and low-precision computations. In this work, we propose Jetfire, an efficient and accurate INT8 training method specific to transformers. Our method features an INT8 data flow to optimize memory access and a per-block quantization method to maintain the accuracy of pretrained transformers. Extensive experiments demonstrate that our INT8 FQT method achieves comparable accuracy to the FP16 training baseline and outperforms the existing INT8 training works for transformers. Moreover, for a standard transformer block, our method offers an end-to-end training speedup of 1.42x and a 1.49x memory reduction compared to the FP16 baseline.
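The per-block quantization idea in this abstract can be illustrated with a minimal NumPy sketch: each tile of the matrix gets its own INT8 scale, so quantization error stays local to the tile rather than being dominated by the tensor-wide maximum. The block size of 32 and the round-to-nearest scheme are assumptions for illustration, not details from the paper.

```python
import numpy as np

def quantize_per_block(x, block=32):
    """Quantize a 2-D float matrix to INT8 with one scale per (block x block) tile."""
    rows, cols = x.shape
    assert rows % block == 0 and cols % block == 0
    q = np.empty_like(x, dtype=np.int8)
    scales = np.empty((rows // block, cols // block), dtype=np.float32)
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            tile = x[i:i + block, j:j + block]
            # Scale so the tile's max magnitude maps to 127; guard all-zero tiles.
            s = max(np.abs(tile).max() / 127.0, 1e-8)
            scales[i // block, j // block] = s
            q[i:i + block, j:j + block] = np.clip(
                np.round(tile / s), -127, 127).astype(np.int8)
    return q, scales

def dequantize_per_block(q, scales, block=32):
    """Invert the quantization: rescale each INT8 tile by its own scale."""
    x = q.astype(np.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            x[i * block:(i + 1) * block, j * block:(j + 1) * block] *= scales[i, j]
    return x
```

The round trip bounds the per-element error by half the local tile's scale step, which is what makes the per-block scheme more accurate than a single per-tensor scale when magnitudes vary across the matrix.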


Stable and low-precision training for large-scale vision-language models

Wortsman, Mitchell, Dettmers, Tim, Zettlemoyer, Luke, Morcos, Ari, Farhadi, Ali, Schmidt, Ludwig

arXiv.org Artificial Intelligence

We introduce new methods for 1) accelerating and 2) stabilizing training for large language-vision models. 1) For acceleration, we introduce SwitchBack, a linear layer for int8 quantized training which provides a speed-up of 13-25% while matching the performance of bfloat16 training within 0.1 percentage points for the 1B parameter CLIP ViT-Huge -- the largest int8 training to date. Our main focus is int8 as GPU support for float8 is rare, though we also analyze float8 training through simulation. While SwitchBack proves effective for float8, we show that standard techniques are also successful if the network is trained and initialized so that large feature magnitudes are discouraged, which we accomplish via layer-scale initialized with zeros. 2) For stability, we analyze loss spikes and find they consistently occur 1-8 iterations after the squared gradients become under-estimated by their AdamW second moment estimator. As a result, we recommend an AdamW-Adafactor hybrid which avoids loss spikes when training a CLIP ViT-Huge model and outperforms gradient clipping at the scales we test.
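The int8 quantized matmul at the heart of a SwitchBack-style linear layer can be simulated in NumPy as below. This is an illustrative sketch with assumed per-tensor scaling, not the paper's implementation; the actual layer's distinguishing feature (keeping the weight-gradient matmul in 16-bit on the backward pass) is not shown here.

```python
import numpy as np

def int8_matmul_sim(a, w):
    """Simulated int8 matmul: quantize both operands per-tensor to int8,
    accumulate in int32, then rescale the result back to float."""
    sa = max(np.abs(a).max() / 127.0, 1e-8)  # per-tensor scale for activations
    sw = max(np.abs(w).max() / 127.0, 1e-8)  # per-tensor scale for weights
    qa = np.clip(np.round(a / sa), -127, 127).astype(np.int8)
    qw = np.clip(np.round(w / sw), -127, 127).astype(np.int8)
    acc = qa.astype(np.int32) @ qw.astype(np.int32)  # int32 accumulation
    return acc.astype(np.float32) * (sa * sw)
```

On hardware with int8 tensor cores the quantized matmul is what buys the 13-25% speed-up; the simulation above only reproduces the numerics, which is also how the abstract's float8 analysis-by-simulation works in spirit.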


ESA's Solar Orbiter records a mysterious magnetic switchback

Daily Mail - Science & tech

The European Space Agency's Solar Orbiter spacecraft has captured the reversal of the Sun's magnetic field on camera for the first time. These reversals, known as magnetic switchbacks, had previously been hypothesised but never observed directly. The new observation provides a full view of the structure and confirms that magnetic switchbacks have an S-shaped character. ESA hopes the footage will help unravel how switchbacks form and whether their formation mechanism helps accelerate the solar wind.


An Oral History of the 2004 Darpa Grand Challenge

WIRED

On March 13, 2004, a gaggle of engineers and a few thousand spectators congregated outside a California dive bar to watch 15 self-driving cars speed across the Mojave Desert in the first-ever Darpa Grand Challenge. Before the start of the race, which marked the first big push toward a fully autonomous vehicle, the grounds surrounding the bar teemed with sweaty, stressed, sleep-deprived geeks, desperately tinkering with their motley assortment of driverless Frankencars: SUVs, dune buggies, monster trucks, even a motorcycle. After the race, they left behind a vehicular graveyard littered with smashed fence posts, messes of barbed wire, and at least one empty fire extinguisher. What happened in between--the rush out of the starting gate, the switchbacks across the rocky terrain, the many, many crashes--didn't just hint at the possibilities and potential limitations of autonomous vehicles that auto and tech companies are facing and that consumers will experience in the coming years as driverless vehicles swarm the roads. It created the self-driving community as we know it today, the men and women in too-big polo shirts who would go on to dominate an automotive revolution. In 2001, eager to keep soldiers away from harm in combat zones, the US Congress demanded that a third of the military's ground combat vehicles be uncrewed by 2015. But defense industry stalwarts weren't innovating quickly enough on the sensor and computing technologies that would enable autonomous driving.


Faster Optimal and Suboptimal Hierarchical Search

Leighton, Michael J. (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire) | Holte, Robert C. (University of Alberta)

AAAI Conferences

In problem domains for which an informed admissible heuristic function is not available, one attractive approach is hierarchical search. Hierarchical search uses search in an abstracted version of the problem to dynamically generate heuristic values. This paper makes two contributions to hierarchical search. First, we propose a simple modification to the state-of-the-art algorithm Switchback that reduces the number of expansions (and hence the running time) by approximately half, while maintaining its guarantee of optimality. Second, we propose a new algorithm for suboptimal hierarchical search, called Switch. Empirical results suggest that Switch yields faster search than straightforward modifications of Switchback, such as weighting the heuristic or greedy search. The success of Switch illustrates the potential for further research on specifically suboptimal hierarchical search.


Searching Without a Heuristic: Efficient Use of Abstraction

Larsen, Bradford John (University of New Hampshire) | Burns, Ethan (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire) | Holte, Robert (University of Alberta)

AAAI Conferences

In problem domains where an informative heuristic evaluation function is not known or not easily computed, abstraction can be used to derive admissible heuristic values. Optimal path lengths in the abstracted problem are consistent heuristic estimates for the original problem. Pattern databases are the traditional method of creating such heuristics, but they exhaustively compute costs for all abstract states and are thus usually appropriate only when all instances share the same single goal state. Hierarchical heuristic search algorithms address these shortcomings by searching for paths in the abstract space on an as-needed basis. However, existing hierarchical algorithms search less efficiently than pattern database constructors: abstract nodes may be expanded many times during the course of a base-level search. We present a novel hierarchical heuristic search algorithm, called Switchback, that uses an alternating direction of search to avoid abstract node re-expansions. This algorithm is simple to implement and demonstrates superior performance to existing hierarchical heuristic search algorithms on several standard benchmarks.
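The core idea above, deriving admissible heuristic values from shortest paths in an abstract space, can be illustrated with a small sketch. This is not Switchback itself: it precomputes all abstract distances up front (closer in spirit to a pattern database) rather than searching the abstraction on demand with alternating direction, and the grid domain and coarsening factor are assumptions for illustration.

```python
from collections import deque

FACTOR = 4  # assumed coarsening factor: abstraction merges FACTOR x FACTOR cells

def abstract(state):
    """Map a concrete grid cell to its abstract cell."""
    x, y = state
    return (x // FACTOR, y // FACTOR)

def bfs_dists(start, size):
    """Breadth-first distances from `start` on a 4-connected size x size grid."""
    dist = {start: 0}
    q = deque([start])
    while q:
        x, y = q.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                q.append(nxt)
    return dist

def make_hierarchical_heuristic(goal, size):
    """Return h(s) = shortest-path length from abstract(s) to abstract(goal).

    Each concrete move maps to an abstract move or a self-loop, so the
    abstract distance never exceeds the concrete one: h is admissible
    and, as the abstract says, consistent.
    """
    abs_size = (size + FACTOR - 1) // FACTOR
    abs_dist = bfs_dists(abstract(goal), abs_size)
    return lambda s: abs_dist.get(abstract(s), float("inf"))
```

Switchback's contribution is to avoid exactly this exhaustive precomputation: it searches the abstract space lazily, alternating search direction between levels so that abstract nodes are not re-expanded across base-level queries.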