AITopics

2303.17765

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)

arXiv.org Artificial Intelligence Jul-15-2023

Machine Learning Meets Mental Training -- A Proof of Concept Applied to Memory Sports

Regnani, Emanuele

"Mens sana in corpore sano" (Juvenal, 100-127 AD). Mental training has long been part of human culture, appearing in many forms, from meditation to particular games and cognitive exercises aimed at various purposes. In the past decades, however, it has lost its cardinal role in the well-roundedness of an individual and become more of a side pursuit, confined to particular hobbies or to specific techniques needed for mental-health purposes. By contrast, recent years have seen exponential investment in and development of artificial intelligence and machine learning technologies, which seem to be successfully tackling increasingly difficult tasks and problems. This work aims to combine the two fields by presenting a practical application of machine learning to a particular form of mental training, the art of memory, in its competitive version called "Memory Sports". On the one hand, such a fusion strives to raise awareness about both realms; on the other, it seeks to encourage research in this mixed field as a way to drive forward the development of this seemingly underestimated sport. The first chapter introduces mental training and its particular branch of Memory Sports, and the second chapter explains the machine learning involved in the project. The third chapter presents two practical implementations of machine learning in Memory Sports, the results of which are discussed in the final chapter, together with several potential directions for future research. As well as stimulating interest in the two fields involved in this research and exploring their points of contact, the aim here is also to investigate potential developments of human-machine collaboration, which are likely to be the focus of the next advances in technology and society overall. Starting to think in this way can help us better prepare for the abrupt changes that might come and even take part in them, so as to drive their aim and scope toward a more responsible, and thus better, outcome.

artificial intelligence, inductive learning, machine learning, (17 more...)

2307.08712

Country:

North America > United States > California (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Chess (0.45)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine Learning Jul-14-2023

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

Zhu, Xiaorui, Qin, Yichen, Wang, Peng

High-dimensional data analysis plays an important role in modern scientific discoveries. There has been extensive work on high-dimensional variable selection and estimation using penalized regressions, such as Lasso (Tibshirani, 1996), SCAD (Fan and Li, 2001), MCP (Zhang et al., 2010), and selection by partitioning solution paths (Liu and Wang, 2018). In recent years, inference for the true regression coefficients and the true model has begun to attract attention. A major challenge of high-dimensional inference is how to quantify the uncertainty of the coefficient estimate, because such uncertainty depends on two components: the uncertainty in parameter estimation given the selected model, and the uncertainty in selecting the model. Both are difficult to estimate and are actively studied. For inference of the regression coefficients, Scheffé (1953) introduces the notion of simultaneous confidence intervals, a sequence of intervals that contains the true coefficients with a given probability. For high-dimensional linear models, Dezeure et al. (2017) and Zhang and Cheng (2017) construct simultaneous confidence intervals using the debiased Lasso approach (van de Geer et al., 2014; Zhang and Zhang, 2014).

artificial intelligence, confidence interval, machine learning, (19 more...)

arXiv.org Machine Learning

2307.07574

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Ayub, Muhammad Sohaib, Ullah, Naimat, Ali, Sarwan, Khan, Imdad Ullah, Awais, Mian Muhammad, Khan, Muhammad Asad, Faizullah, Safiullah

CAMP: A Context-Aware Cricket Players Performance Metric

Cricket is the second most popular sport after soccer in terms of viewership. However, the assessment of individual player performance, a fundamental task in team sports, is currently based primarily on aggregate performance statistics, including average runs and wickets taken. We propose the Context-Aware Metric of player Performance, CAMP, to quantify individual players' contributions toward a cricket match outcome. CAMP employs data mining methods and enables effective data-driven decision-making for selection and drafting, coaching and training, team line-ups, and strategy development. CAMP incorporates the exact context of performance, such as opponents' strengths and specific circumstances of games, such as pressure situations. We empirically evaluate CAMP on data of limited-over cricket matches between 2001 and 2019. In every match, a committee of experts declares one player as the best player, called Man of the Match (MoM). The top two rated players by CAMP match with MoM in 83% of the 961 games. Thus, the CAMP rating of the best player closely matches that of the domain experts. By this measure, CAMP significantly outperforms the current best-known players' contribution measure based on the Duckworth-Lewis-Stern (DLS) method.

artificial intelligence, data mining, machine learning, (17 more...)

doi: 10.1080/01605682.2023.2237530

2307.137

Country:

Asia > Pakistan (0.04)
Asia > India (0.04)
Oceania > New Zealand (0.04)
(7 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Sports > Cricket (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

SALC: Skeleton-Assisted Learning-Based Clustering for Time-Varying Indoor Localization

Hsiao, An-Hung, Shen, Li-Hsiang, Chang, Chen-Yi, Chiu, Chun-Jie, Feng, Kai-Ten

Wireless indoor localization has attracted a significant amount of attention in recent years. Using received signal strength (RSS) obtained from WiFi access points (APs) to establish a fingerprinting database is a widely used method in indoor localization. However, the time-variant problem for indoor positioning systems is not well investigated in the existing literature. Compared to conventional static fingerprinting, a dynamically reconstructed database can adapt to a highly changing environment, which sustains localization accuracy. To deal with the time-varying issue, we propose a skeleton-assisted learning-based clustering localization (SALC) system, including RSS-oriented map-assisted clustering (ROMAC), cluster-based online database establishment (CODE), and cluster-scaled location estimation (CsLE). The SALC scheme jointly considers similarities from the skeleton-based shortest path (SSP) and the time-varying RSS measurements across the reference points (RPs). ROMAC clusters RPs into different feature sets and thereby selects suitable monitor points (MPs) for enhancing location estimation. Moreover, the CODE algorithm aims to establish an adaptive fingerprint database to alleviate the time-varying problem. Finally, CsLE is adopted to acquire the target position by leveraging the benefits of clustering information and estimated signal variations in order to rescale the weights from the weighted k-nearest neighbors (WkNN) method. Both simulation and experimental results demonstrate that the proposed SALC system can effectively reconstruct the fingerprint database with enhanced location estimation accuracy, outperforming the other existing schemes in the open literature.
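
For readers unfamiliar with the WkNN baseline named in this abstract, the following is a minimal sketch of plain weighted k-nearest-neighbors fingerprint localization. It is not the SALC system or its CsLE weight rescaling; the function name `wknn_locate`, the inverse-distance weighting, and the toy fingerprints are assumptions made for illustration only.

```python
import math


def wknn_locate(fingerprints, rss, k=3, eps=1e-9):
    """Estimate a 2-D position with plain weighted k-nearest neighbors.

    fingerprints: list of ((x, y), [rss_ap1, rss_ap2, ...]) reference points.
    rss: measured RSS vector, one value per access point.
    Weights are inverse Euclidean distances in RSS space (an assumed,
    commonly used choice; eps avoids division by zero on exact matches).
    """
    # Rank reference points by RSS-space distance and keep the k closest.
    nearest = sorted(
        (math.dist(fp_rss, rss), pos) for pos, fp_rss in fingerprints
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    # Position estimate is the weighted average of the neighbors' coordinates.
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y


# Hypothetical two-AP fingerprint database with two reference points.
toy_db = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((10.0, 0.0), [-70.0, -40.0]),
]
midpoint = wknn_locate(toy_db, [-55.0, -55.0], k=2)
```

In this toy case the measurement is equally far from both fingerprints in RSS space, so the estimate falls halfway between the two reference points.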

artificial intelligence, database, machine learning, (16 more...)

2307.0765

Country:

North America > United States > California (0.04)
Asia > Taiwan (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Luo, Yiwei, Gligorić, Kristina, Jurafsky, Dan

Othering and low prestige framing of immigrant cuisines in US restaurant reviews and large language models

Identifying and understanding implicit attitudes toward food can help efforts to mitigate social prejudice due to food's pervasive role as a marker of cultural and ethnic identity. Stereotypes about food are a form of microaggression that contribute to harmful public discourse that may in turn perpetuate prejudice toward ethnic groups and negatively impact economic outcomes for restaurants. Through careful linguistic analyses, we evaluate social theories about attitudes toward immigrant cuisine in a large-scale study of framing differences in 2.1M English language Yelp reviews of restaurants in 14 US states. Controlling for factors such as restaurant price and neighborhood racial diversity, we find that immigrant cuisines are more likely to be framed in objectifying and othering terms of authenticity (e.g., authentic, traditional), exoticism (e.g., exotic, different), and prototypicality (e.g., typical, usual), but that non-Western immigrant cuisines (e.g., Indian, Mexican) receive more othering than European cuisines (e.g., French, Italian). We further find that non-Western immigrant cuisines are framed less positively and as lower status, being evaluated in terms of affordability and hygiene. Finally, we show that reviews generated by large language models (LLMs) reproduce many of the same framing tendencies. Our results empirically corroborate social theories of taste and gastronomic stereotyping, and reveal linguistic processes by which such attitudes are reified.

large language model, machine learning, natural language, (18 more...)

2307.07645

Country:

South America (0.04)
North America > Central America (0.04)
Oceania (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government (1.00)
Government > Immigration & Customs (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

Švábenský, Valdemar, Baker, Ryan S., Zambrano, Andrés, Zou, Yishan, Slater, Stefan

Towards Generalizable Detection of Urgency of Discussion Forum Posts

Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when encountering an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at the University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. While previous work on post urgency used only one data set, we evaluate the prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.

artificial intelligence, machine learning, urgency, (15 more...)

doi: 10.5281/zenodo.8115790

2307.07614

Country:

North America > United States > Pennsylvania (0.25)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Online (0.93)
Research Report > Experimental Study (0.68)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)

Using Linear Regression for Iteratively Training Neural Networks

Khadilkar, Harshad

We present a simple linear regression based approach for learning the weights and biases of a neural network, as an alternative to standard gradient based backpropagation. The present work is exploratory in nature, and we restrict the description and experiments to (i) simple feedforward neural networks, (ii) scalar (single output) regression problems, and (iii) invertible activation functions. However, the approach is intended to be extensible to larger, more complex architectures. The key idea is the observation that the input to every neuron in a neural network is a linear combination of the activations of neurons in the previous layer, as well as the parameters (weights and biases) of the layer. If we are able to compute the ideal total input values to every neuron by working backwards from the output, we can formulate the learning problem as a linear least squares problem which iterates between updating the parameters and the activation values. We present an explicit algorithm that implements this idea, and we show that (at least for small problems) the approach is more stable and faster than gradient-based methods.
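
The key idea in this abstract can be illustrated on the smallest possible case. The sketch below is not the paper's algorithm: it assumes a single neuron with scalar input, a tanh activation (chosen here because it is invertible), and a single non-iterative least-squares solve. The name `fit_neuron` and the toy data are hypothetical.

```python
import math


def fit_neuron(xs, ys):
    """Fit w, b in y = tanh(w * x + b) without gradients.

    Step 1: work backwards from the outputs by inverting the activation,
            giving the "ideal" total input z = atanh(y) for each sample.
    Step 2: solve the ordinary least-squares problem z ~ w * x + b
            in closed form (simple linear regression).
    Targets must lie strictly inside (-1, 1) for atanh to be defined.
    """
    zs = [math.atanh(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    w = sxz / sxx
    b = mean_z - w * mean_x
    return w, b


# Toy data generated from known parameters (illustrative values only).
true_w, true_b = 0.8, -0.3
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.tanh(true_w * x + true_b) for x in xs]
w, b = fit_neuron(xs, ys)
```

With noise-free targets a single solve recovers the generating parameters. In a multi-layer network the ideal inputs of hidden neurons are not observed directly, which is why the abstract describes iterating between updating parameters and activation values.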

artificial intelligence, machine learning, neural network, (19 more...)

2307.05189

Country:

North America > United States (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

arXiv.org Machine Learning Jul-13-2023

balance -- a Python package for balancing biased data samples

Sarig, Tal, Galili, Tal, Eilat, Roee

Surveys are an important research tool, providing unique measurements on subjective experiences such as sentiment and opinions that cannot be measured by other means. However, because survey data is collected from a self-selected group of participants, directly inferring insights from it to a population of interest, or training ML models on such data, can lead to erroneous estimates or under-performing models. In this paper we present balance, an open-source Python package by Meta, offering a simple workflow for analyzing and adjusting biased data samples with respect to a population of interest. The balance workflow includes three steps: understanding the initial bias in the data relative to a target we would like to infer, adjusting the data to correct for the bias by producing weights for each unit in the sample based on propensity scores, and evaluating the final biases and the variance inflation after applying the fitted weights. The package provides a simple API that can be used by researchers and data scientists from a wide range of fields on a variety of data. The paper provides the relevant context, methodological background, and presents the package's API.

covariate, target population, variance, (17 more...)

arXiv.org Machine Learning

2307.06024

Country:

Asia > Middle East > Israel > Southern District > Eilat (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)
Workflow (0.86)

Industry:

Government > Voting & Elections (0.67)
Government > Regional Government (0.67)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Shetiya, Suraj, Hasan, Shohedul, Asudeh, Abolfazl, Das, Gautam

Efficient Strongly Polynomial Algorithms for Quantile Regression

arXiv.org Artificial Intelligence Jul-13-2023

Linear Regression is a seminal technique in statistics and machine learning, where the objective is to build linear predictive models between a response (i.e., dependent) variable and one or more predictor (i.e., independent) variables from a given dataset of n instances, where each instance is a set of values of the independent variables and the corresponding value of the dependent variable. One of the classical and widely used approaches is Ordinary Least Squares Regression (OLS), where the objective is to minimize the average squared error between the predicted and actual value of the dependent variable. Another classical approach is Quantile Regression (QR), where the objective is to minimize the average weighted absolute error between the predicted and actual value of the dependent variable. QR (also known as "Median Regression" for the special case of the middle quantile) is less affected by outliers and thus statistically a more robust alternative to OLS [15, 18]. However, while there exist efficient algorithms for OLS, the state-of-the-art algorithms for QR require solving large linear programs with many variables and constraints. They can be solved using interior point methods [24], which are weakly polynomial (i.e., in the arithmetic computation model the running time is polynomial in the number of bits required to represent the rational numbers in the input), or using Simplex-based exterior point methods, which can have exponential time complexity in the worst case [10]. The main focus of our paper is an investigation of the computational complexity of Quantile Regression, and in particular, the design of efficient strongly polynomial algorithms (i.e., in the arithmetic computation model the running time is polynomial in the number of rational numbers in the input) for various special cases of the problem.
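
To make the contrast between the two objectives concrete, here is a small sketch of the squared-error loss and the quantile ("pinball") loss, which is the standard form of the weighted absolute error described above. It fits only a constant predictor by brute-force grid search, so it says nothing about the paper's algorithms; the helper names, the grid, and the toy data are assumptions for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average quantile loss at level tau in (0, 1).

    Under-predictions are weighted by tau and over-predictions by
    (1 - tau); tau = 0.5 gives half the mean absolute error.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        residual = y - p
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared error, the OLS objective."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def best_constant(values, loss, candidates):
    """Return the candidate constant prediction with the smallest loss."""
    return min(candidates, key=lambda c: loss(values, [c] * len(values)))


# Toy sample with one large outlier (illustrative values only).
data = [1.0, 2.0, 3.0, 4.0, 100.0]
grid = [i / 10 for i in range(0, 1001)]  # 0.0, 0.1, ..., 100.0

median_fit = best_constant(data, lambda y, p: pinball_loss(y, p, 0.5), grid)
mean_fit = best_constant(data, mean_squared_error, grid)
```

Here the median-regression fit stays at the sample median (3.0), while the least-squares fit is pulled to the sample mean (22.0) by the single outlier, which is the robustness property the abstract refers to.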

algorithm, hyperplane, vertex, (14 more...)

2307.08706

Country:

North America > United States > Texas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)