AITopics

2210.14664

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.85)

Donhauser, Konstantin, Ruggeri, Nicolo, Stojanovic, Stefan, Yang, Fanny

Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias

arXiv.org Artificial Intelligence, Oct-26-2022

Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator. Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: While a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise. Specifically, for both linear regression and classification with a sparse ground truth, we prove that minimum $\ell_p$-norm and maximum $\ell_p$-margin interpolators achieve fast polynomial rates close to order $1/n$ for $p > 1$ compared to a logarithmic rate for $p = 1$. Finally, we provide preliminary experimental evidence that this trade-off may also play a crucial role in understanding non-linear interpolating models used in practice.
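For orientation, the two estimators named in this abstract can be written out as below. This is a sketch in assumed notation (design matrix $X$ with rows $x_i$, labels $y$, weights $w$), using the standard textbook definitions; the paper itself may normalize or parametrize them differently.

```latex
% Minimum l_p-norm interpolator (regression): fit the noisy labels exactly.
\hat{w}_p \;=\; \arg\min_{w \in \mathbb{R}^d} \|w\|_p
\quad \text{subject to} \quad Xw = y

% Maximum l_p-margin interpolator (classification), with labels y_i \in \{-1, +1\}.
\hat{w}_p \;=\; \arg\max_{\|w\|_p \le 1} \; \min_{1 \le i \le n} \; y_i \langle x_i, w \rangle
```

Here $p = 1$ corresponds to the strongest sparsity-inducing bias, while larger $p$ weakens it, which is the axis along which the abstract contrasts the logarithmic and near-$1/n$ rates.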

artificial intelligence, interpolator, machine learning, (18 more...)

2203.03597

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

arXiv.org Artificial Intelligence, Oct-25-2022

Similarity between Units of Natural Language: The Transition from Coarse to Fine Estimation

Mu, Wenchuan

Capturing the similarities between human language units is crucial for explaining how humans associate different objects, and therefore its computation has received extensive attention, research, and applications. With the ever-increasing amount of information around us, calculating similarity becomes increasingly complex. In many cases, such as legal or medical affairs, measuring similarity requires extra care and precision, as small acts within a language unit can have significant real-world effects. My research goal in this thesis is to develop regression models that account for similarities between language units in a more refined way. Computation of similarity has come a long way, but approaches to debugging the measures are often based on continually fitting human judgment values. To this end, my goal is to develop an algorithm that precisely catches loopholes in a similarity calculation. Furthermore, most methods have vague definitions of the similarities they compute and are often difficult to interpret. The proposed framework addresses both shortcomings. It constantly improves the model by catching different loopholes. In addition, every refinement of the model provides a reasonable explanation. The regression model introduced in this thesis is called progressively refined similarity computation, which combines attack testing with adversarial training. The similarity regression model of this thesis achieves state-of-the-art performance in handling edge cases.

baseline algorithm and evaluation metric, data mining, machine learning, (24 more...)

2210.14275

Genre:

Research Report > New Finding (1.00)
Personal (1.00)
Overview (1.00)
Instructional Material (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(7 more...)

Ranjan, Sidharth, van Schijndel, Marten, Agarwal, Sumeet, Rajkumar, Rajakrishnan

Dual Mechanism Priming Effects in Hindi Word Order

arXiv.org Artificial Intelligence, Oct-25-2022

Word order choices during sentence production can be primed by preceding sentences. In this work, we test the DUAL MECHANISM hypothesis that priming is driven by multiple different sources. Using a Hindi corpus of text productions, we model lexical priming with an n-gram cache model and we capture more abstract syntactic priming with an adaptive neural language model. We permute the preverbal constituents of corpus sentences, and then use a logistic regression model to predict which sentences actually occurred in the corpus against artificially generated meaning-equivalent variants. Our results indicate that lexical priming and lexically-independent syntactic priming affect complementary sets of verb classes. By showing that different priming influences are separable from one another, our results support the hypothesis that multiple different cognitive mechanisms underlie priming.

machine learning, natural language, simulation of human behavior, (20 more...)

2210.13938

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

#artificialintelligence, Oct-24-2022, 11:02:44 GMT

Polynomial Regression in R for Data Science - Detechtor

Create a regressor and call it 'poly_reg'. Assign it the result of the lm() function, as we did in linear regression. The function takes two arguments, the formula and the data, the same way as in linear regression. To transform this from a linear regression into a polynomial regression model, we need to add some polynomial features.
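The idea of "adding polynomial features" can be illustrated outside R as well. The following is a minimal Python sketch, not the tutorial's own code: it expands a single predictor into powers and fits ordinary least squares through the normal equations. The function names (`polynomial_features`, `fit_polynomial`) and the toy data are hypothetical, and the solver assumes the feature matrix has full column rank.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def polynomial_features(x, degree):
    """Turn each value v into the row [1, v, v**2, ..., v**degree]."""
    return [[v ** d for d in range(degree + 1)] for v in x]


def fit_polynomial(x, y, degree):
    """Least-squares coefficients [b0, b1, ..., b_degree] via normal equations."""
    rows = polynomial_features(x, degree)
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data generated from y = 1 + 2x + 3x^2.
levels = [0, 1, 2, 3, 4]
salaries = [1, 6, 17, 34, 57]
print(fit_polynomial(levels, salaries, degree=2))
```

The model stays linear in its coefficients; only the inputs are transformed, which is why the same least-squares machinery used for linear regression still applies.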

dataset, polynomial regression model, regression model, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence, Oct-24-2022, 11:02:41 GMT

Multiple Linear Regression in R for Data Science - Detechtor

We are going to learn how to implement a Multiple Linear Regression model in R. This is a bit more complex than Simple Linear Regression but it's going to be so practical and fun. Multiple Linear Regression is a data science technique that uses several explanatory variables to predict the outcome of a response variable. A Multiple Linear Regression model attempts to model the relationship between two or more explanatory variables (independent variables) and a response variable (dependent variable) by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y.
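As a rough illustration of "fitting a linear equation to observed data" with several explanatory variables, here is a small Python sketch rather than the tutorial's R code. It prepends an intercept column and solves the least-squares normal equations. The name `fit_multiple_regression` and the toy data are hypothetical, and the predictors are assumed not to be perfectly collinear.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(predictors, y):
    """Return [intercept, b1, ..., bk] minimizing the sum of squared residuals.

    predictors: list of observations, each a list of k explanatory values.
    y: list of responses, one per observation.
    """
    rows = [[1.0] + list(p) for p in predictors]  # intercept column first
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, observation):
    """Evaluate the fitted linear equation for one new observation."""
    return coefficients[0] + sum(b * v for b, v in zip(coefficients[1:], observation))


# Hypothetical toy data generated from y = 4 + 2*x1 - 3*x2.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [0, 5, -5, 3, -10]
coefficients = fit_multiple_regression(X, y)
print(coefficients, predict(coefficients, [6, 1]))
```

Each coefficient after the intercept is read as the change in the response for a one-unit change in that predictor with the others held fixed.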

independent variable, linear regression, regression, (14 more...)

Country:

Africa > Kenya > Nairobi City County > Nairobi (0.05)
Africa > Kenya > Mombasa County > Mombasa (0.05)
Africa > Kenya > Kisumu County > Kisumu (0.05)

Genre:

Research Report (0.67)
Instructional Material > Course Syllabus & Notes (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence, Oct-24-2022, 10:02:32 GMT

12 Best Data Analytics Courses in Coursera

Coursera is an E-Learning platform that provides thousands of online courses on various subjects, and it has a wide range of Data Analytics courses too. That's why I thought to share the 12 Best Data Analytics Courses in Coursera with you. So, give a few minutes to this article and find out the Best Data Analytics Courses on Coursera. Now, without any further ado, let's get started. This is one of the most popular Data Analyst Certification programs.

best data analytic course, hour week, specialization program, (12 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

#artificialintelligence, Oct-24-2022, 07:30:27 GMT

Understanding how Traffic Forecasting works part2(Statistics)

Abstract: With accurate and timely traffic forecasting, the impacted traffic conditions can be predicted in advance to guide agencies and residents to respond to changes in traffic patterns appropriately. However, existing works on traffic forecasting have mainly relied on historical traffic patterns and are confined to short-term prediction, under 1 hour, for instance. To better manage future roadway capacity and accommodate social and human impacts, it is crucial to propose a flexible and comprehensive framework to predict physical-aware long-term traffic conditions for public users and transportation agencies. In this paper, the gap of robust long-term traffic forecasting was bridged by taking social media features into consideration. A correlation study and a linear regression model were first implemented to evaluate the significance of the correlation between two time-series data, traffic intensity and Twitter data intensity.
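The correlation-plus-regression step described in this abstract can be sketched in a few lines of Python. This is only an illustration under assumptions: the paper's actual data, preprocessing, and significance tests are not given here, so the function name `correlation_and_fit` and the two toy series are hypothetical.

```python
import math


def correlation_and_fit(series_a, series_b):
    """Return (pearson_r, slope, intercept) for two equal-length series.

    pearson_r measures the linear association between the series;
    slope and intercept describe the least-squares line b = intercept + slope * a.
    """
    n = len(series_a)
    mean_a = sum(series_a) / n
    mean_b = sum(series_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(series_a, series_b))
    var_a = sum((a - mean_a) ** 2 for a in series_a)
    var_b = sum((b - mean_b) ** 2 for b in series_b)
    pearson_r = cov / math.sqrt(var_a * var_b)
    slope = cov / var_a
    intercept = mean_b - slope * mean_a
    return pearson_r, slope, intercept


# Hypothetical hourly intensities: tweet counts and a traffic index.
tweet_intensity = [12, 15, 30, 42, 38, 20]
traffic_intensity = [0.31, 0.35, 0.58, 0.77, 0.70, 0.44]
r, slope, intercept = correlation_and_fit(tweet_intensity, traffic_intensity)
print(f"r={r:.3f}, slope={slope:.4f}, intercept={intercept:.3f}")
```

For real time series, a lagged or detrended comparison would usually be needed before reading much into the coefficient, since shared daily cycles alone can produce a high correlation.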

dynamic spatial-temporal graph neural network, prediction, traffic forecasting work part2, (9 more...)

Industry: Transportation (1.00)

Technology:

Information Technology > Communications > Social Media (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Soni, Sandeep, Bamman, David, Eisenstein, Jacob

Predicting Long-Term Citations from Short-Term Linguistic Influence

arXiv.org Artificial Intelligence, Oct-24-2022

A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of $\textit{future}$ citations: the estimate of linguistic influence from the two years after a paper's publication is correlated with and predictive of its citation count in the following three years. This is demonstrated using an online evaluation with incremental temporal training/test splits, in comparison with a strong baseline that includes predictors for initial citation counts, topics, and lexical features.

computational linguistic, machine learning, natural language, (21 more...)

2210.13628

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maryland > Baltimore (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(12 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

arXiv.org Artificial Intelligence, Oct-24-2022

Subspace Recovery from Heterogeneous Data with Non-isotropic Noise

Duchi, John, Feldman, Vitaly, Hu, Lunjia, Talwar, Kunal

Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. Our data come from $n$ users with user $i$ contributing data samples from a $d$-dimensional distribution with mean $\mu_i$. Our goal is to recover the linear subspace shared by $\mu_1,\ldots,\mu_n$ using the data points from all users, where every data point from user $i$ is formed by adding an independent mean-zero noise vector to $\mu_i$. If we only have one data point from every user, subspace recovery is information-theoretically impossible when the covariance matrices of the noise vectors can be non-spherical, necessitating additional restrictive assumptions in previous work. We avoid these assumptions by leveraging at least two data points from each user, which allows us to design an efficiently-computable estimator under non-spherical and user-dependent noise. We prove an upper bound for the estimation error of our estimator in general scenarios where the number of data points and amount of noise can vary across users, and prove an information-theoretic error lower bound that not only matches the upper bound up to a constant factor, but also holds even for spherical Gaussian noise. This implies that our estimator does not introduce additional estimation error (up to a constant factor) due to irregularity in the noise. We show additional results for a linear regression problem in a similar setup.
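A short calculation suggests why a second data point per user helps. It is a standard identity under the setup stated in the abstract, not the paper's estimator: the symbols $x_{i,1}, x_{i,2}$ for two samples from user $i$, $z_{i,1}, z_{i,2}$ for their noise vectors, and $\Sigma_i$ for the noise covariance are assumed notation, and $\mu_i$ is treated as fixed.

```latex
% Two samples from user i, with independent mean-zero noise:
x_{i,1} = \mu_i + z_{i,1}, \qquad x_{i,2} = \mu_i + z_{i,2}, \qquad
\mathbb{E}[z_{i,j}] = 0, \quad \operatorname{Cov}(z_{i,j}) = \Sigma_i .

% A single sample's second moment is contaminated by the unknown noise covariance:
\mathbb{E}\!\left[ x_{i,1} x_{i,1}^{\top} \right] = \mu_i \mu_i^{\top} + \Sigma_i .

% The cross moment of two independent samples is not:
\mathbb{E}\!\left[ x_{i,1} x_{i,2}^{\top} \right]
  = \mu_i \mu_i^{\top}
  + \mu_i \, \mathbb{E}[z_{i,2}]^{\top}
  + \mathbb{E}[z_{i,1}] \, \mu_i^{\top}
  + \mathbb{E}[z_{i,1}] \, \mathbb{E}[z_{i,2}]^{\top}
  = \mu_i \mu_i^{\top} .
```

With one sample per user, a non-spherical $\Sigma_i$ can tilt the apparent principal directions away from the span of the $\mu_i$; the cross moment removes that term regardless of the shape of $\Sigma_i$.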

artificial intelligence, machine learning, probability, (15 more...)

2210.13497

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)