
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Yihang Chen, Tianhao Wu

Neural Information Processing Systems

Network pruning is a method for reducing test-time computational resource requirements with minimal performance degradation. Conventional wisdom of pruning algorithms suggests that: (1) Pruning methods exploit information from training data to find good subnetworks; (2) The architecture of the pruned network is crucial for good performance. In this paper, we conduct sanity checks for the above beliefs on several recent unstructured pruning methods and surprisingly find that: (1) A set of methods which aims to find good subnetworks of the randomly-initialized network (which we call "initial tickets"), hardly exploits any information from the training data; (2) For the pruned networks obtained by these methods, randomly changing the preserved weights in each layer, while keeping the total number of preserved weights unchanged per layer, does not affect the final performance. These findings inspire us to choose a series of simple data-independent prune ratios for each layer, and randomly prune each layer accordingly to get a subnetwork (which we call "random tickets"). Experimental results show that our zero-shot random tickets outperform or attain a similar performance compared to existing "initial tickets". In addition, we identify one existing pruning method that passes our sanity checks. We hybridize the ratios in our random ticket with this method and propose a new method called "hybrid tickets", which achieves further improvement.
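To make the "random ticket" recipe concrete, here is a minimal sketch, assuming PyTorch, of layer-wise random pruning with data-independent keep ratios; the layer names and the specific ratios below are illustrative assumptions, not the schedule proposed in the paper.

```python
# Minimal sketch of a "random ticket": prune each layer with a fixed,
# data-independent keep ratio, choosing the surviving weights uniformly at
# random. Layer names and ratios below are illustrative assumptions.
import torch
import torch.nn as nn

def random_ticket_masks(model: nn.Module, keep_ratios: dict) -> dict:
    """Return a {param_name: 0/1 mask} dict with a random per-layer mask."""
    masks = {}
    for name, param in model.named_parameters():
        if name not in keep_ratios:
            continue  # layers without a specified ratio stay dense
        n = param.numel()
        k = max(1, int(keep_ratios[name] * n))   # weights to keep in this layer
        mask = torch.zeros(n)
        mask[torch.randperm(n)[:k]] = 1.0        # uniform random selection
        masks[name] = mask.view_as(param)
    return masks

# Usage on a toy MLP with hypothetical per-layer keep ratios.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = random_ticket_masks(model, {"0.weight": 0.2, "2.weight": 0.5})
with torch.no_grad():
    for name, param in model.named_parameters():
        if name in masks:
            param.mul_(masks[name])              # apply the sparsity pattern
```

The only ingredient the sketch needs is the per-layer keep ratio; the sparsity pattern within each layer is sampled uniformly at random, which is exactly what makes the ticket data-independent.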


Author Feedback (eae27d77ca20db309e056e3d2dcd7d69-AuthorFeedback.pdf)

Neural Information Processing Systems

We thank all reviewers for taking the time to read the paper and for providing insightful comments and suggestions! To R1: Thank you for appreciating our work! Regarding "results from a fixed learning rate": there may be some misunderstanding. Here are our responses to your questions. We will add these discussions in the next version.



A Experimental Details
Datasets

Neural Information Processing Systems

For the standard EBM, we train on 300,000 simulated QCD jets. For the hybrid model EBM-CLF, we train on 300,000 simulated Standard Model jets (100,000 QCD jets, 100,000 boosted jets originating from the W boson, and 100,000 boosted jets originating from the top quark). For OOD detection test sets, we employ a hypothetical Higgs boson with a mass of 174 GeV (in the decay mode H → hh → (bb̄)(bb̄)), which decays into two lighter Higgs bosons of 80 GeV. All the jet samples are generated with a pipeline of physics simulators. QCD jets are extracted from QCD di-jet events that are generated with MadGraph [4] for LHC 13 TeV, followed by Pythia8 [61] and Delphes [18] for parton shower and fast detector simulation.



A Appendices

Neural Information Processing Systems

A.1 Guarantees on the decrease of the training loss
As the scores are updated, the relative order of the importances is likely to be shuffled, and some connections will be replaced by more important ones. Under certain conditions, we can formally prove that as these replacements happen, the training loss is guaranteed to decrease. Our proof is adapted from [Ramanujan et al., 2020] to consider the case of a fine-tunable W. We suppose that (a) the training loss L is smooth and admits a first-order Taylor development everywhere it is defined, and (b) the learning rate α_W of W is small. We first consider the case where k = 1 in the TopK masking, meaning that only one connection remains (and the other weights are deactivated/masked). The first term is null because of inequalities (6), and the second term is negative because of inequality (7). We note that this proof is not specific to the TopK masking function.
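For concreteness, the following is a minimal sketch, assuming a PyTorch-style implementation, of score-based TopK masking with a straight-through gradient to the scores; the class and variable names are illustrative assumptions, not the authors' code.

```python
# Sketch of TopK masking over learned importance scores, with a
# straight-through estimator so gradients reach the scores. Illustrative only.
import torch
import torch.nn as nn

class TopKMask(torch.autograd.Function):
    @staticmethod
    def forward(ctx, scores, k):
        mask = torch.zeros_like(scores)
        idx = torch.topk(scores.flatten(), k).indices
        mask.view(-1)[idx] = 1.0                 # keep the k highest scores
        return mask

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None                 # straight-through to the scores

class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features, k):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.scores = nn.Parameter(torch.rand(out_features, in_features))
        self.k = k

    def forward(self, x):
        mask = TopKMask.apply(self.scores, self.k)
        return x @ (self.weight * mask).t()      # only the top-k connections fire

# k = 1 keeps a single connection, the first case considered in the proof.
layer = MaskedLinear(4, 3, k=1)
layer(torch.randn(2, 4)).sum().backward()        # gradients reach weight and scores
```

With a fine-tunable W, both the surviving weights and the scores receive gradients, which is the setting the adapted proof covers.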




Instruction Tuning Large Language Models to Understand Electronic Health Records

Neural Information Processing Systems

Large language models (LLMs) have shown impressive capabilities in solving a wide range of tasks based on human instructions. However, developing a conversational AI assistant for electronic health record (EHR) data remains challenging due to (1) the lack of large-scale instruction-following datasets and (2) the limitations of existing model architectures in handling complex and heterogeneous EHR data. In this paper, we introduce MIMIC-Instr, a dataset comprising over 400K open-ended instruction-following examples derived from the MIMIC-IV EHR database. This dataset covers various topics and is suitable for instruction-tuning general-purpose LLMs for diverse clinical use cases. Additionally, we propose Llemr, a general framework that enables LLMs to process and interpret EHRs with complex data structures. Llemr demonstrates competitive performance in answering a wide range of patient-related questions based on EHR data. Furthermore, our evaluations on clinical predictive modeling benchmarks reveal that the fine-tuned Llemr achieves performance comparable to state-of-the-art (SOTA) baselines using curated features. The dataset and code are available at https://github.com/zzachw/llemr.

