AITopics | Instructional Material

Instructional Material

Accelerated Training for Matrix-norm Regularization: A Boosting Approach

Neural Information Processing Systems | Feb-16-2024, 06:32:06 GMT

Sparse learning models typically combine a smooth loss with a nonsmooth penalty, such as trace norm. Although recent developments in sparse approximation have offered promising solution methods, current approaches either apply only to matrix-norm constrained problems or provide suboptimal convergence rates. In this paper, we propose a boosting method for regularized learning that guarantees $\epsilon$ accuracy within $O(1/\epsilon)$ iterations. Performance is further accelerated by interlacing boosting with fixed-rank local optimization, exploiting a simpler local objective than previous work. The proposed method yields state-of-the-art performance on large-scale problems.

accelerated training, matrix-norm regularization

Genre: Instructional Material (0.40)

Technology: Information Technology > Artificial Intelligence (0.70)

I would love this to be like an assistant, not the teacher: a voice of the customer perspective of what distance learning students want from an Artificial Intelligence Digital Assistant

Rienties, Bart, Domingue, John, Duttaroy, Subby, Herodotou, Christothea, Tessarolo, Felipe, Whitelock, Denise

arXiv.org Artificial Intelligence | Feb-16-2024

With the release of Generative AI systems such as ChatGPT, an increasing interest in using Artificial Intelligence (AI) has been observed across domains, including higher education. While emerging statistics show the popularity of using AI amongst undergraduate students, little is yet known about students' perceptions regarding AI including self-reported benefits and concerns from their actual usage, in particular in distance learning contexts. Using a two-step, mixed-methods approach, we examined the perceptions of ten online and distance learning students from diverse disciplines regarding the design of a hypothetical AI Digital Assistant (AIDA). In the first step, we captured students' perceptions via interviews, while the second step supported the triangulation of data by enabling students to share, compare, and contrast perceptions with those of peers. All participants agreed on the usefulness of such an AI tool while studying and reported benefits from using it for real-time assistance and query resolution, support for academic tasks, personalisation and accessibility, together with emotional and social support. Students' concerns related to the ethical and social implications of implementing AIDA, data privacy and data use, operational challenges, academic integrity and misuse, and the future of education. Implications for the design of AI-tailored systems are also discussed.

aida, participant, student, (15 more...)

2403.15396

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
Europe > United Kingdom (0.04)
Asia > Singapore (0.04)

Genre:

Instructional Material (0.93)
Questionnaire & Opinion Survey (0.93)
Research Report (0.64)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)
Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

A Regression Mixture Model to understand the effect of the Covid-19 pandemic on Public Transport Ridership

Moreau, Hugues, Côme, Étienne, Samé, Allou, Oukhellou, Latifa

arXiv.org Artificial Intelligence | Feb-16-2024

The Covid-19 pandemic drastically changed urban mobility, both during the height of the pandemic, with government lockdowns, and in the longer term, with the adoption of working-from-home policies. To understand its effects on rail public transport ridership, we propose a dedicated Regression Mixture Model able to perform both the clustering of public transport stations and the segmentation of time periods, while ignoring variations due to additional variables such as the official lockdowns or non-working days. Each cluster is thus defined by a series of segments in which the effect of the exogenous variables is constant. As each segment within a cluster has its own regression coefficients to model the impact of the covariates, we analyze how these coefficients evolve to understand the changes in the cluster. We present the regression mixture model and the parameter estimation using the EM algorithm, before demonstrating the benefits of the model on both simulated and real data. Thanks to a five-year dataset of the ridership in the Paris public transport system, we analyze the impact of the pandemic, not only in terms of the number of travelers but also on the weekly commute. We further analyze the specific changes that the pandemic caused inside each cluster.

dataset, regression, segmentation, (14 more...)

doi: 10.1109/ICDMW60847.2023.00163

2402.12392

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > France > Île-de-France > Val-d'Oise > Roissy (0.04)

Genre:

Research Report > Experimental Study (0.46)
Instructional Material > Course Syllabus & Notes (0.46)
Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Health & Medicine > Epidemiology (0.84)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.60)
Health & Medicine > Therapeutic Area > Immunology (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

FedKit: Enabling Cross-Platform Federated Learning for Android and iOS

He, Sichang, Tang, Beilong, Zhang, Boyan, Shao, Jiaoqi, Ouyang, Xiaomin, Nugraha, Daniel Nata, Luo, Bing

arXiv.org Artificial Intelligence | Feb-16-2024

We present FedKit, a federated learning (FL) system tailored for cross-platform FL research on Android and iOS devices. FedKit pipelines cross-platform FL development by enabling model conversion, hardware-accelerated training, and cross-platform model aggregation. Our FL workflow supports flexible machine learning operations (MLOps) in production, facilitating continuous model delivery and training. We have deployed FedKit in a real-world use case for health data analysis on university campuses, demonstrating its effectiveness. FedKit is open-source at https://github.com/FedCampus/FedKit.

aggregation, backend, enabling cross-platform federated learning, (11 more...)

2402.10464

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.15)
Europe > Germany > Hamburg (0.05)
Asia > China > Jiangsu Province (0.05)
Asia > China > Hong Kong (0.05)

Genre:

Research Report (0.40)
Instructional Material (0.35)

Industry: Information Technology (0.48)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Black Box: a new podcast series about AI and us – trailer

The Guardian | Feb-15-2024, 16:24:44 GMT

In rural Norway, a young woman's boyfriend forgets who she is overnight. In Detroit, a man is arrested for a crime, but he was never there. In a Spanish town, disturbing pictures of young girls have appeared, but no one knows who is behind them. In this new series from the Guardian, we'll explore what it is that connects all these stories: the collision between people and artificial intelligence.

black box, new podcast series, trailer

Country: Europe > Norway (0.36)

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry: Transportation > Air (0.40)

Technology:

Information Technology > Artificial Intelligence (0.82)
Information Technology > Communications > Mobile (0.52)

What's coming up at #AAAI2024?

AIHub | Feb-15-2024, 14:57:14 GMT

The 38th AAAI Conference on Artificial Intelligence (AAAI2024) will take place in Vancouver, and runs from Tuesday 20 February to Tuesday 27 February. Find out about some of the main events that are taking place throughout the conference. The following distinguished invited speakers will be presenting at this year's conference. We will be holding a science communication training session on Wednesday 21 February. This will comprise a one-hour talk (starting at 2pm), and a two-hour informal drop-in session (starting at 3pm).

aaai2024, artificial intelligence

Genre: Instructional Material (0.42)

Technology: Information Technology > Artificial Intelligence (0.91)

QuRating: Selecting High-Quality Data for Training Language Models

Wettig, Alexander, Gupta, Aatmik, Malik, Saumya, Chen, Danqi

arXiv.org Artificial Intelligence | Feb-15-2024

Selecting high-quality pre-training data is important for creating capable language models, but existing methods rely on simple heuristics. We introduce QuRating, a method for selecting pre-training data that captures the abstract qualities of texts which humans intuitively perceive. In this paper, we investigate four qualities: writing style, required expertise, facts & trivia, and educational value. We find that LLMs are able to discern these qualities and observe that they are better at making pairwise judgments of texts than at rating the quality of a text directly. We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B training corpus with quality ratings for each of the four criteria. In our experiments, we select 30B tokens according to the different quality ratings and train 1.3B-parameter language models on the selected data. We find that it is important to balance quality and diversity, as selecting only the highest-rated documents leads to poor results. When we sample using quality ratings as logits over documents, our models achieve lower perplexity and stronger in-context learning performance than baselines. Beyond data selection, we use the quality ratings to construct a training curriculum which improves performance without changing the training dataset. We extensively analyze the quality ratings and discuss their characteristics, biases, and wider implications.
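
To make "quality ratings as logits over documents" concrete, here is a minimal sketch, not the authors' implementation. It assumes each document has one scalar rating and that a temperature parameter trades off quality against diversity. The function names, the temperature parameter, and the Gumbel top-k sampler are illustrative assumptions and are not taken from the abstract.

```python
import math
import random


def softmax(ratings, temperature=1.0):
    """Convert scalar quality ratings into sampling probabilities."""
    scaled = [r / temperature for r in ratings]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_documents(ratings, k, temperature=1.0, seed=0):
    """Draw k distinct document indices, favouring higher-rated documents.

    Gumbel top-k trick (an assumption of this sketch): perturb each logit
    (rating / temperature) with independent Gumbel(0, 1) noise and keep the
    k largest perturbed values. This is equivalent to sampling without
    replacement with probabilities proportional to exp(rating / temperature).
    """
    rng = random.Random(seed)
    perturbed = []
    for index, rating in enumerate(ratings):
        u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        gumbel = -math.log(-math.log(u))
        perturbed.append((rating / temperature + gumbel, index))
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


if __name__ == "__main__":
    ratings = [0.1, 5.0, 2.0, 4.0]  # made-up ratings for four documents
    print(softmax(ratings, temperature=2.0))
    print(sample_documents(ratings, k=2, temperature=2.0))
```

In this sketch a very small temperature reduces to picking only the top-rated documents, while a larger temperature spreads the selection across lower-rated documents, which is one way to picture the balance between quality and diversity that the abstract describes.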

large language model, machine learning, southern asia 10, (22 more...)

2402.09739

Country:

North America > United States > Texas (0.67)
Europe > Russia (0.67)
Asia > Middle East > Republic of Türkiye (0.67)
(41 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Personal (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Retail (1.00)
(40 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What to Do When Your Discrete Optimization Is the Size of a Neural Network?

Silva, Hugo, White, Martha

arXiv.org Artificial Intelligence | Feb-15-2024

Oftentimes, machine learning applications using neural networks involve solving discrete optimization problems, such as in pruning, parameter-isolation-based continual learning and training of binary networks. Still, these discrete problems are combinatorial in nature and are also not amenable to gradient-based optimization. Additionally, classical approaches used in discrete settings do not scale well to large neural networks, forcing scientists and empiricists to rely on alternative methods. Among these, two main distinct sources of top-down information can be used to lead the model to good solutions: (1) extrapolating gradient information from points outside of the solution set, and (2) comparing evaluations between members of a subset of the valid solutions. We take continuation path (CP) methods to represent using purely the former and Monte Carlo (MC) methods to represent the latter, while also noting that some hybrid methods combine the two. The main goal of this work is to compare both approaches. For that purpose, we first overview the two classes while also discussing some of their drawbacks analytically. Then, in the experimental section, we compare their performance, starting with smaller microworld experiments, which allow more fine-grained control of problem variables, and gradually moving towards larger problems, including neural network regression and neural network pruning for image classification, where we additionally compare against magnitude-based pruning.

estimator, experiment, optimization, (15 more...)

2402.10339

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.92)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Ising on the Graph: Task-specific Graph Subsampling via the Ising Model

Bånkestad, Maria, Andersson, Jennifer, Mair, Sebastian, Sjölund, Jens

arXiv.org Artificial Intelligence | Feb-15-2024

Reducing a graph while preserving its overall structure is an important problem with many applications. Typically, the reduction approaches either remove edges (sparsification) or merge nodes (coarsening) in an unsupervised way with no specific downstream task in mind. In this paper, we present an approach for subsampling graph structures using an Ising model defined on either the nodes or edges and learning the external magnetic field of the Ising model using a graph neural network. Our approach is task-specific as it can learn how to reduce a graph for a specific downstream task in an end-to-end fashion. The utilized loss function of the task does not even have to be differentiable.

ising model, magnetic field, sparsity pattern, (12 more...)

2402.10206

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)

MCMC-driven learning

Bouchard-Côté, Alexandre, Campbell, Trevor, Pleiss, Geoff, Surjanovic, Nikola

arXiv.org Artificial Intelligence | Feb-14-2024

This paper is intended to appear as a chapter for the Handbook of Markov Chain Monte Carlo. The goal of this chapter is to unify various problems at the intersection of Markov chain Monte Carlo (MCMC) and machine learning (which includes black-box variational inference, adaptive MCMC, normalizing flow construction and transport-assisted MCMC, surrogate-likelihood MCMC, coreset construction for MCMC with big data, Markov chain gradient descent, Markovian score climbing, and more) within one common framework. By doing so, the theory and methods developed for each may be translated and generalized.

algorithm, gradient, inference, (14 more...)

2402.09598

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(4 more...)

Genre:

Instructional Material (1.00)
Overview (0.92)

Industry: Transportation (0.34)
