AITopics | footprint

A QLoRA vs Standard Finetuning Experimental Setup Details

Neural Information Processing Systems | Apr-25-2026, 18:36:29 GMT

A.1 Hyperparameters for QLoRA. We do a hyperparameter search for LoRA over the following variables: LoRA dropout {0.0, 0.05, 0.1}, LoRA r {8, 16, 32, 64, 128, 256}, LoRA layers {key+query, all attention layers, all FFN layers, all layers, attention + FFN output layers}. We keep LoRA α fixed and search the learning rate, since LoRA α is always proportional to the learning rate. We find that LoRA dropout 0.05 is useful for small models (7B, 13B), but not for larger models (33B, 65B). Each dot represents a combination of hyperparameters, and for each LoRA r we run 3 random seeds with each hyperparameter combination. The performance of specific LoRA r values appears to be independent of other hyperparameters.
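To make the size of this search space concrete, the following is a minimal sketch that enumerates the listed values as a full Cartesian grid. The excerpt does not say whether the authors ran every combination, so the full grid is an assumption, as are the seed values, the dictionary keys, and the helper name `build_grid`. The learning rate is left out because the excerpt gives no values for it.

```python
from itertools import product

# Search space exactly as listed in the excerpt.
LORA_DROPOUT = [0.0, 0.05, 0.1]
LORA_R = [8, 16, 32, 64, 128, 256]
LORA_LAYERS = [
    "key+query",
    "all attention layers",
    "all FFN layers",
    "all layers",
    "attention + FFN output layers",
]
# Assumption: the excerpt says 3 random seeds but not which ones.
SEEDS = [0, 1, 2]


def build_grid():
    """Return one config dict per (dropout, r, layers, seed) combination."""
    return [
        {"lora_dropout": d, "lora_r": r, "lora_layers": layers, "seed": s}
        for d, r, layers, s in product(LORA_DROPOUT, LORA_R, LORA_LAYERS, SEEDS)
    ]


grid = build_grid()
print(len(grid))  # 3 dropouts * 6 ranks * 5 layer sets * 3 seeds = 270
```

Under these assumptions the grid holds 270 runs per model size before any learning-rate values are added, which gives a rough sense of why each LoRA r appears as a cluster of dots in the plot the excerpt refers to.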

artificial intelligence, machine learning, natural language, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

1feb87871436031bdc0f2beaa62a049b-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 18:36:26 GMT

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications > Social Media (0.93)

Dog walkers find 2,000-year-old footprints on beach in Scotland

The Iron Age human and animal footprints were preserved before high winds destroyed them. Two friends out walking their dogs along the eastern coast of Scotland unexpectedly found an archaeological goldmine. After wind gusts as strong as 55 mph blew away sand on the dunes of a beach near Angus, Ivor Campbell and Jenny Snedden (along with their pooches Ziggy and Juno) spotted the unique indentations in a layer of long-dried clay. The pair contacted a local archaeologist, and researchers from the University of Aberdeen quickly descended on the picturesque seaside locale to preserve the discoveries.

andrew paul, artificial intelligence, footprint, (9 more...)

Popular Science

Country: Europe > United Kingdom > Scotland (0.65)

Technology: Information Technology > Artificial Intelligence (0.51)

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

William Brandon

Neural Information Processing Systems | Feb-17-2026, 00:14:55 GMT

Key-value (KV) caching plays an essential role in accelerating decoding for transformer-based autoregressive large language models (LLMs).

large language model, machine learning, natural language, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

3ffebb08d23c609875d7177ee769a3e9-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-12-2026, 00:13:19 GMT

dataset, suggestion, training time, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Backprop with Approximate Activations for Memory-efficient Network Training

Ayan Chakrabarti, Benjamin Moseley

Neural Information Processing Systems | Feb-11-2026, 20:03:46 GMT

It also stores approximate per-layer copies of activations, at significant memory savings, that are used in the backward pass.

activation, artificial intelligence, machine learning, (19 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout

Neural Information Processing Systems | Feb-9-2026, 05:30:20 GMT

Although significant efforts have been made into tackling statistical data heterogeneity, the diversity in the processing capabilities and network bandwidth of clients, termed as system heterogeneity, has remained largely unexplored.

artificial intelligence, fjord, machine learning, (19 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Virginia (0.04)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

A QLoRA vs Standard Finetuning Experimental Setup Details A.1 Hyperparameters for QL

Neural Information Processing Systems | Feb-8-2026, 18:51:26 GMT

We do a hyperparameter search for LoRA over the following variables: LoRA dropout {0.0, 0.05, ... LoRA α is always proportional to the learning rate. We find that LoRA dropout 0.05 is useful for small models (7B, 13B), but not for larger models (33B, ... We use the same preprocessing of the Super-Natural Instruction dataset as Wang et al. ... RA finetuning experiments outlined in Section 5. This limits the dataset to 9,209 examples. HH-RLHF: This is a human preference dataset about helpfulness and harmlessness.

artificial intelligence, machine learning, natural language, (15 more...)

Technology: