AITopics | fsl

fsl

Information about AI from the News, Publications, and Conferences

c91591a8d461c2869b9f535ded3e213e-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 03:48:02 GMT

artificial intelligence, in this section, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

POODLE: Penalizing

Neural Information Processing Systems, Feb-11-2026, 03:47:58 GMT

artificial intelligence, conference on computer vision, machine learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

the tight space constraint, we have done our best to address a majority of each reviewer's questions / comments

Neural Information Processing Systems, Oct-2-2025, 22:07:42 GMT

We thank all the reviewers for their diligence, appreciation of our work, and valuable comments / suggestions. ACM TOG 2019.) that similarly go from lower to higher number of parameters, progressively. "Unsupervised visual representation learning by context prediction", ICCV 2015, which proposed a SSL task The text suggests..) Yes, this is a typo that distorts the meaning. The blue box in Figure 1 just maps the point's indices to balls they are part of, to further compute the ball vectors. An ablation study can certainly be added.

artificial intelligence, machine learning, point cloud, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

1cc8a8ea51cd0adddf5dab504a285915-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-2-2025, 08:57:03 GMT

adjustment, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Unveiling the Role of Learning Rate Schedules via Functional Scaling Laws

Li, Binghui, Chen, Fengling, Huang, Zixun, Wang, Lean, Wu, Lei

arXiv.org Machine Learning, Sep-25-2025

Scaling laws have played a cornerstone role in guiding the training of large language models (LLMs). However, most existing works on scaling laws primarily focus on the final-step loss, overlooking the loss dynamics during the training process and, crucially, the impact of the learning rate schedule (LRS). In this paper, we aim to bridge this gap by studying a teacher-student kernel regression setup trained via online stochastic gradient descent (SGD). Leveraging a novel intrinsic time viewpoint and stochastic differential equation (SDE) modeling of SGD, we introduce the Functional Scaling Law (FSL), which characterizes the evolution of population risk during the training process for general LRSs. Remarkably, the impact of the LRSs is captured through an explicit convolution-type functional term, making their effects fully tractable. To illustrate the utility of FSL, we analyze three widely used LRSs -- constant, exponential decay, and warmup-stable-decay (WSD) -- under both data-limited and compute-limited regimes. We provide theoretical justification for widely adopted empirical practices in LLM pre-training, such as: (i) higher-capacity models are more data- and compute-efficient; (ii) learning rate decay can improve training efficiency; (iii) WSD-like schedules can outperform direct-decay schedules. Lastly, we explore the practical relevance of FSL as a surrogate model for fitting, predicting, and optimizing the loss curves in LLM pre-training, with experiments conducted across model sizes ranging from 0.1B to 1B parameters. We hope our FSL framework can deepen the understanding of LLM pre-training dynamics and provide insights for improving large-scale model training.
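For readers unfamiliar with the three schedule families named in the abstract, the following is a minimal sketch of what constant, exponential-decay, and warmup-stable-decay (WSD) schedules can look like. The function names, the linear warmup and linear decay shapes, and all parameter values are assumptions chosen for illustration; the abstract does not specify the exact parameterization the authors analyze.

```python
def constant_lr(step, peak_lr):
    """Constant schedule: the same learning rate at every step."""
    return peak_lr


def exponential_decay_lr(step, total_steps, peak_lr, final_lr):
    """Geometric interpolation from peak_lr (step 0) to final_lr (step total_steps)."""
    ratio = final_lr / peak_lr
    return peak_lr * ratio ** (step / total_steps)


def wsd_lr(step, total_steps, peak_lr, final_lr, warmup_steps, decay_steps):
    """Warmup-stable-decay: linear warmup, flat plateau, then linear decay.

    The linear shapes are an illustrative assumption; other decay shapes
    are also used in practice.
    """
    stable_end = total_steps - decay_steps
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    if step < stable_end:
        # Stable phase: hold the peak learning rate.
        return peak_lr
    # Decay phase: move linearly from peak_lr to final_lr.
    progress = (step - stable_end) / max(decay_steps, 1)
    return peak_lr + (final_lr - peak_lr) * progress


if __name__ == "__main__":
    # Hypothetical settings, not taken from the paper.
    total, peak, final = 1000, 1e-3, 1e-4
    for s in (0, 25, 50, 500, 800, 900, 1000):
        print(
            s,
            constant_lr(s, peak),
            exponential_decay_lr(s, total, peak, final),
            wsd_lr(s, total, peak, final, warmup_steps=50, decay_steps=200),
        )
```

Printing the three schedules side by side shows the qualitative difference the abstract refers to: exponential decay lowers the rate from the first step, whereas a WSD schedule holds the peak rate for most of training and only decays near the end.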

arxiv preprint arxiv, lrss, proof, (11 more...)

arXiv.org Machine Learning

2509.19189

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

We thank R1 for pointing some expositions issues and the proposed

Neural Information Processing Systems, Aug-20-2025, 04:37:07 GMT

We thank reviewers for detailed and helpful reviews. Table 1 shows the results. If we understand correctly, R2's main concern is that the word embeddings of We believe that it would hardly happen. The reasons are as follows. Second, we can easily assume an FSL scenario in which we have access to the labels of the test set.

exposition issue, fsl, scenario, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.72)

An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification

More, Riddhi, Bradbury, Jeremy S.

arXiv.org Artificial Intelligence, Feb-4-2025

Flaky tests exhibit non-deterministic behavior during execution and they may pass or fail without any changes to the program under test. Detecting and classifying these flaky tests is crucial for maintaining the robustness of automated test suites and ensuring the overall reliability and confidence in the testing. However, flaky test detection and classification is challenging due to the variability in test behavior, which can depend on environmental conditions and subtle code interactions. Large Language Models (LLMs) offer promising approaches to address this challenge, with fine-tuning and few-shot learning (FSL) emerging as viable techniques. With enough data, fine-tuning a pre-trained LLM can achieve high accuracy, making it suitable for organizations with more resources. Alternatively, we introduce FlakyXbert, an FSL approach that employs a Siamese network architecture to train efficiently with limited data. To understand the performance and cost differences between these two methods, we compare fine-tuning on larger datasets with FSL in scenarios restricted by smaller datasets. Our evaluation involves two existing flaky test datasets, FlakyCat and IDoFT. Our results suggest that while fine-tuning can achieve high accuracy, FSL provides a cost-effective approach with competitive accuracy, which is especially beneficial for organizations or projects with limited historical data available for training. These findings underscore the viability of both fine-tuning and FSL in flaky test detection and classification, with each suited to different organizational needs and resource availability.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.02715

Country:

North America > Canada > Ontario > Durham Region > Oshawa (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Federated Split Learning for Human Activity Recognition with Differential Privacy

Ndeko, Josue, Shaon, Shaba, Beal, Aubrey, Sahoo, Avimanyu, Nguyen, Dinh C.

arXiv.org Artificial Intelligence, Nov-9-2024

This paper proposes a novel intelligent human activity recognition (HAR) framework based on a new design of Federated Split Learning (FSL) with Differential Privacy (DP) over edge networks. Our FSL-DP framework leverages both accelerometer and gyroscope data, achieving significant improvements in HAR accuracy. The evaluation includes a detailed comparison between traditional Federated Learning (FL) and our FSL framework, showing that the FSL framework outperforms FL models in both accuracy and loss metrics. Additionally, we examine the privacy-performance trade-off under different data settings in the DP mechanism, highlighting the balance between privacy guarantees and model accuracy. The results also indicate that our FSL framework achieves faster communication times per training round compared to traditional FL, further emphasizing its efficiency and effectiveness. This work provides valuable insight and a novel framework which was tested on a real-life dataset.

activity recognition, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.06263

Country:

North America > United States > Alabama (0.04)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Smart Houses & Appliances (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Data Science (1.00)
(2 more...)

MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering

Guan, Che, Huang, Mengyu, Zhang, Peng

arXiv.org Artificial Intelligence, Mar-27-2024

In today's fast-paced industry, professionals face the challenge of summarizing a large number of documents and extracting vital information from them on a daily basis. These metrics are frequently hidden away in tables and/or their nested hyperlinks. To address this challenge, the approach of Table Question Answering (QA) has been developed to extract the relevant information. However, traditional Table QA training tasks that provide a table and an answer(s) from a gold cell coordinate(s) for a question may not always ensure extracting the accurate answer(s). Recent advancements in Large Language Models (LLMs) have opened up new possibilities for extracting information from tabular data using prompts. In this paper, we introduce the Multi-hop Few-shot Open Rich Table QA (MFORT-QA) approach, which consists of two major steps. The first step involves Few-Shot Learning (FSL), where relevant tables and associated contexts of hyperlinks are retrieved based on a given question. The retrieved content is then used to construct few-shot prompts as inputs to an LLM, such as ChatGPT. To tackle the challenge of answering complex questions, the second step leverages Chain-of-thought (CoT) prompting to decompose the complex question into a sequential chain of questions and reasoning thoughts in a multi-hop manner. Retrieval-Augmented Generation (RAG) enhances this process by retrieving relevant tables and contexts of hyperlinks that are relevant to the resulting reasoning thoughts and questions. These additional contexts are then used to supplement the prompt used in the first step, resulting in more accurate answers from an LLM. Empirical results from OTT-QA demonstrate that our abstractive QA approach significantly improves the accuracy of extractive Table QA methods.
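The first step described in the abstract, turning retrieved tables into a few-shot prompt, can be sketched roughly as follows. This is an illustration only: the function name `build_few_shot_prompt`, the dictionary keys, the prompt layout, and the toy tables are all hypothetical and not taken from the paper. Retrieval and the LLM call are out of scope here.

```python
def build_few_shot_prompt(examples, retrieved_table, question):
    """Assemble a few-shot prompt from worked examples plus the target question.

    examples: list of dicts with hypothetical keys "table", "question", "answer".
    retrieved_table: text of the table (and any hyperlink context) retrieved
        for the target question; how it is retrieved is not modeled here.
    """
    blocks = []
    for ex in examples:
        # Each demonstration shows the model a table, a question, and its answer.
        blocks.append(
            f"Table: {ex['table']}\n"
            f"Question: {ex['question']}\n"
            f"Answer: {ex['answer']}"
        )
    # The final block leaves the answer open for the model to complete.
    blocks.append(
        f"Table: {retrieved_table}\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Toy, made-up data purely for demonstration.
    demos = [
        {
            "table": "item | count\nA | 1\nB | 2",
            "question": "What is the count for B?",
            "answer": "2",
        }
    ]
    prompt = build_few_shot_prompt(
        demos,
        retrieved_table="item | count\nC | 3\nD | 4",
        question="What is the count for D?",
    )
    print(prompt)
```

In the second step described in the abstract, additional retrieved context would be appended to a prompt of this kind; that extension is not shown.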

arxiv, chatgpt, mfort-qa, (16 more...)

arXiv.org Artificial Intelligence

2403.19116

Country: