AITopics

2309.13775

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

arXiv.org Machine Learning Dec-9-2023

Conditional Stochastic Interpolation for Generative Learning

Huang, Ding, Huang, Jian, Li, Ting, Shen, Guohao

We propose a conditional stochastic interpolation (CSI) approach to learning conditional distributions. CSI learns probability flow equations or stochastic differential equations that transport a reference distribution to the target conditional distribution. This is achieved by first learning the drift function and the conditional score function based on conditional stochastic interpolation, which are then used to construct a deterministic process governed by an ordinary differential equation or a diffusion process for conditional sampling. In our proposed CSI model, we incorporate an adaptive diffusion term to address the instability issues arising during the training process. We provide explicit forms of the conditional score function and the drift function in terms of conditional expectations under mild conditions, which naturally lead to a nonparametric regression approach to estimating these functions. Furthermore, we establish non-asymptotic error bounds for learning the target conditional distribution via conditional stochastic interpolation in terms of KL divergence, taking into account the neural network approximation error. We illustrate the application of CSI on image generation using a benchmark image dataset.

artificial intelligence, machine learning, score function, (18 more...)

2312.05579

Country: Asia > China (0.14)

Genre: Research Report (0.63)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Borchers, Conrad, Zhang, Jiayi, Baker, Ryan S., Aleven, Vincent

Using Think-Aloud Data to Understand Relations between Self-Regulation Cycle Characteristics and Student Performance in Intelligent Tutoring Systems

arXiv.org Artificial Intelligence Dec-9-2023

Numerous studies demonstrate the importance of self-regulation during learning by problem-solving. Recent work in learning analytics has largely examined students' use of SRL concerning overall learning gains. Limited research has related SRL to in-the-moment performance differences among learners. The present study investigates SRL behaviors in relationship to learners' moment-by-moment performance while working with intelligent tutoring systems for stoichiometry chemistry. We demonstrate the feasibility of labeling SRL behaviors based on AI-generated think-aloud transcripts, identifying the presence or absence of four SRL categories (processing information, planning, enacting, and realizing errors) in each utterance. Using the SRL codes, we conducted regression analyses to examine how the use of SRL in terms of presence, frequency, cyclical characteristics, and recency relate to student performance on subsequent steps in multi-step problems. A model considering students' SRL cycle characteristics outperformed a model only using in-the-moment SRL assessment. In line with theoretical predictions, students' actions during earlier, process-heavy stages of SRL cycles exhibited lower moment-by-moment correctness during problem-solving than later SRL cycle stages. We discuss system re-design opportunities to add SRL support during stages of processing and paths forward for using machine learning to speed research depending on the assessment of SRL based on transcription of think-aloud data.

correctness, student, utterance, (16 more...)

doi: 10.1145/3636555.3636911

2312.05675

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Tennessee > Shelby County > Memphis (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Understanding (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Badanidiyuru, Ashwinkumar, Ghazi, Badih, Kamath, Pritish, Kumar, Ravi, Leeman, Ethan, Manurangsi, Pasin, Varadarajan, Avinash V, Zhang, Chiyuan

Optimal Unbiased Randomizers for Regression with Label Differential Privacy

arXiv.org Artificial Intelligence Dec-9-2023

We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better label randomizers depending on a privately estimated prior distribution over the labels. We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets, highlighting the importance of reducing bias when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal unbiased randomizers.
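As a point of reference for what an unbiased label randomizer looks like, here is a minimal sketch of the standard Laplace mechanism applied to bounded regression labels. It is a generic baseline, not the optimal randomizer proposed in the paper, and it does not use a privately estimated prior. The function name `laplace_label_randomizer` and the label range `[low, high]` are assumptions made for illustration.

```python
import numpy as np


def laplace_label_randomizer(labels, epsilon, low, high, rng):
    """Release labels with epsilon-label-DP using zero-mean Laplace noise.

    Assumes every label lies in [low, high]; labels outside are clipped.
    Because the noise has mean zero, the output is unbiased for any label
    already inside the range.
    """
    labels = np.clip(np.asarray(labels, dtype=float), low, high)
    # Changing one label moves it by at most (high - low), so this scale
    # gives epsilon-DP with respect to the labels.
    scale = (high - low) / epsilon
    return labels + rng.laplace(loc=0.0, scale=scale, size=labels.shape)


rng = np.random.default_rng(0)
labels = np.full(200_000, 0.3)  # hypothetical labels, all equal to 0.3
noisy = laplace_label_randomizer(labels, epsilon=1.0, low=0.0, high=1.0, rng=rng)

# The average of the randomized labels should sit close to the true label.
empirical_bias = noisy.mean() - 0.3
print(f"empirical bias: {empirical_bias:.4f}")
```

The sketch is unbiased but pays for it with high variance at small epsilon, which is the bias-variance trade-off the abstract refers to.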

mechanism, optimal unbiased randomizer, randomizer, (12 more...)

2312.05659

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

arXiv.org Artificial Intelligence Dec-8-2023

Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts

Ding, Xiruo, Sheng, Zhecheng, Hur, Brian, Chen, Feng, Pakhomov, Serguei V. S., Cohen, Trevor

Foundation models are a current focus of attention in both industry and academia. While they have shown their capabilities in a variety of tasks, in-depth research is required to determine their robustness to distribution shift when used as a basis for supervised machine learning. This is especially important in the context of clinical data, with particular limitations related to data accessibility, lack of pretraining materials, and limited availability of high-quality annotations. In this work, we examine the stability of models based on representations from foundation models under distribution shift. We focus on confounding by provenance, a form of distribution shift that emerges in the context of multi-institutional datasets when there are differences in source-specific language use and class distributions. Using a sampling strategy that synthetically induces varying degrees of distribution shift, we evaluate the extent to which representations from foundation models result in predictions that are inherently robust to confounding by provenance. Additionally, we examine the effectiveness of a straightforward confounding adjustment method inspired by Pearl's conception of backdoor adjustment. Results indicate that while foundation models do show some out-of-the-box robustness to confounding-by-provenance related distribution shifts, this can be considerably improved through adjustment. These findings suggest a need for deliberate adjustment of predictive models using representations from foundation models in the context of source-specific distributional differences.

dataset, foundation model, robustness, (13 more...)

2312.05435

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)

arXiv.org Artificial Intelligence Dec-8-2023

Disentangling CO Chemistry in a Protoplanetary Disk Using Explanatory Machine Learning Techniques

Diop, Amina, Cleeves, Ilse, Anderson, Dana, Pegues, Jamila, Plunkett, Adele

Molecular abundances in protoplanetary disks are highly sensitive to the local physical conditions, including gas temperature, gas density, radiation field, and dust properties. Often multiple factors are intertwined, impacting the abundances of both simple and complex species. We present a new approach to understanding these chemical and physical interdependencies using machine learning. Specifically, we explore the case of CO modeled under the conditions of a generic disk and build an explanatory regression model to study the dependence of CO spatial density on the gas density, gas temperature, cosmic ray ionization rate, X-ray ionization rate, and UV flux. Our findings indicate that combinations of parameters play a surprisingly powerful role in regulating CO compared to any single physical parameter. Moreover, in general, we find the conditions in the disk are destructive toward CO. CO depletion is further enhanced in an increased cosmic ray environment and in disks with higher initial C/O ratios. These dependencies uncovered by our new approach are consistent with previous studies, which are more modeling intensive and computationally expensive. Our work thus shows that machine learning can be a powerful tool not only for creating efficient predictive models, but also for enabling a deeper understanding of complex chemical processes.

abundance, chemistry, predictor, (16 more...)

2312.05254

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > Ohio (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

arXiv.org Artificial Intelligence Dec-8-2023

Soft Frequency Capping for Improved Ad Click Prediction in Yahoo Gemini Native

Aharon, Michal, Kaplan, Yohay, Levy, Rina, Somekh, Oren, Blanc, Ayelet, Eshel, Neetai, Shahar, Avi, Singer, Assaf, Zlotnik, Alex

Yahoo's native advertising (also known as Gemini native) serves billions of ad impressions daily, reaching a yearly run-rate of many hundreds of millions of USD. Driving the Gemini native models that are used to predict both click probability (pCTR) and conversion probability (pCONV) is OFFSET, a feature-enhanced collaborative-filtering (CF) based event prediction algorithm. OFFSET is a one-pass algorithm that updates its model for every new batch of logged data using a stochastic gradient descent (SGD) based approach. Since OFFSET represents its users by their features (i.e., user-less model) due to sparsity issues, rule-based hard frequency capping (HFC) is used to control the number of times a certain user views a certain ad. Moreover, related statistics reveal that user ad fatigue results in a dramatic drop in click-through rate (CTR). Therefore, to improve click prediction accuracy, we propose a soft frequency capping (SFC) approach, where the frequency feature is incorporated into the OFFSET model as a user-ad feature and its weight vector is learned via logistic regression as part of OFFSET training. Online evaluation of the soft frequency capping algorithm via bucket testing showed a significant 7.3% revenue lift. Since then, the frequency feature enhanced model has been pushed to production, serving all traffic, and is generating a hefty revenue lift for Yahoo Gemini native. We also report related statistics that reveal, among other things, that while users' gender does not affect ad fatigue, the latter seems to increase with users' age.

frequency, frequency feature, vector, (17 more...)

doi: 10.1145/3357384.3357801

2312.05052

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Schuler, Alejandro, Li, Yi, van der Laan, Mark

Lassoed Tree Boosting

arXiv.org Machine Learning Dec-8-2023

Gradient boosting performs exceptionally well in most prediction problems and scales well to large datasets. In this paper we prove that a "lassoed" gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation. This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness. We use simulation and real data to confirm our theory and demonstrate empirical performance and scalability on par with standard boosting. Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.

algorithm, artificial intelligence, machine learning, (16 more...)

2205.10697

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

arXiv.org Machine Learning Dec-7-2023

Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy

Bridgman, Wyatt, Balakrishnan, Uma, Jones, Reese, Chen, Jiefu, Wu, Xuqing, Safta, Cosmin, Huang, Yueqin, Khalil, Mohammad

In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of the expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning whereby knowledge gained through similar PCE surrogate construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) which has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.
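The regression route to PCE coefficients mentioned above can be sketched in a few lines. This is a minimal one-dimensional illustration, assuming a single input uniformly distributed on [-1, 1] (for which Legendre polynomials are the usual basis); it does not include the transfer learning strategy the paper proposes. The `forward_model` function is a hypothetical stand-in for an expensive simulation, and `fit_pce` and `eval_pce` are illustrative names.

```python
import numpy as np
from numpy.polynomial import legendre


def forward_model(x):
    """Hypothetical cheap stand-in for an expensive simulation."""
    return 1.0 + 2.0 * x - 0.5 * x**3


def fit_pce(x, y, degree):
    """Estimate Legendre PCE coefficients by ordinary least squares."""
    # Column k of the design matrix is the Legendre polynomial P_k at x.
    design = legendre.legvander(x, degree)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def eval_pce(coeffs, x):
    """Evaluate the fitted surrogate at new input points."""
    return legendre.legval(x, coeffs)


rng = np.random.default_rng(0)
# 40 samples for 4 coefficients: oversampled, as regression-based PCE prefers.
x_train = rng.uniform(-1.0, 1.0, size=40)
y_train = forward_model(x_train)

coeffs = fit_pce(x_train, y_train, degree=3)

x_test = np.linspace(-1.0, 1.0, 5)
max_err = np.max(np.abs(eval_pce(coeffs, x_test) - forward_model(x_test)))
print("coefficients:", coeffs)
print("max surrogate error on test points:", max_err)
```

Each training sample costs one forward model run, so the oversampling visible here (40 runs for 4 coefficients) is exactly what becomes prohibitive when the simulation is expensive.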

artificial intelligence, machine learning, objective function, (18 more...)

2312.04648

Country:

Europe (0.28)
North America > United States > California (0.14)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Energy > Oil & Gas > Upstream (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Barrabes, Miriam, Montserrat, Daniel Mas, Geleta, Margarita, Giro-i-Nieto, Xavier, Ioannidis, Alexander G.

Adversarial Learning for Feature Shift Detection and Correction

arXiv.org Machine Learning Dec-7-2023

Data shift is a phenomenon present in many real-world applications, and while there are multiple methods attempting to detect shifts, the task of localizing and correcting the features that cause such shifts has not been studied in depth. Feature shifts can occur in many datasets, including in multi-sensor data, where some sensors are malfunctioning, or in tabular and structured data, including biomedical, financial, and survey data, where faulty standardization and data processing pipelines can lead to erroneous features. In this work, we explore using the principles of adversarial learning, where the information from several discriminators trained to distinguish between two distributions is used to both detect the corrupted features and fix them in order to remove the distribution shift between datasets. We show that mainstream supervised classifiers, such as random forest or gradient boosting trees, combined with simple iterative heuristics, can localize and correct feature shifts, outperforming current statistical and neural network-based techniques.

artificial intelligence, deep learning, machine learning, (19 more...)

2312.04546

Country:

North America > Montserrat (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)