Goto

Collaborating Authors

 Instructional Material


Monge-Kantorovich Fitting With Sobolev Budgets

arXiv.org Artificial Intelligence

We consider the problem of finding the ``best'' approximation of an $n$-dimensional probability measure $\rho$ using a measure $\nu$ whose support is parametrized by $f : \mathbb{R}^m \to \mathbb{R}^n$ where $m < n$. We quantify the performance of the approximation with the Monge-Kantorovich $p$-cost (also called the Wasserstein $p$-cost) $\mathbb{W}_p^p(\rho, \nu)$, and constrain the complexity of the approximation by bounding the $W^{k,q}$ Sobolev norm of $f$, which acts as a ``budget.'' We may then reformulate the problem as minimizing a functional $\mathscr{J}_p(f)$ under a constraint on the Sobolev budget. We treat general $k \geq 1$ for the Sobolev differentiability order (though $q, m$ are chosen to restrict $W^{k,q}$ to the supercritical regime $k q > m$ to guarantee existence of optimizers). The problem is closely related to (but distinct from) principal curves with length constraints when $m=1, k = 1$ and smoothing splines when $k > 1$. New aspects and challenges arise from the higher order differentiability condition. We study the gradient of $\mathscr{J}_p$, which is given by a vector field along $f$ we call the barycenter field. We use it to construct improvements to a given $f$, which gives a nontrivial (almost) strict monotonicty relation between the functional $\mathscr{J}_p$ and the Sobolev budget. We also provide a natural discretization scheme and establish its consistency. We use this scheme to model a generative learning task; in particular, we demonstrate that adding a constraint like ours as a soft penalty yields substantial improvement in training a GAN to produce images of handwritten digits, with performance competitive with weight-decay.


Diffusion Models to Enhance the Resolution of Microscopy Images: A Tutorial

arXiv.org Artificial Intelligence

Over a century ago, the microscopist Ernst Abbe devised an equation showing how the resolution of an optical microscope is limited by the wavelength of the illumination light [1]. This critical limitation, known as the Abbe's diffraction limit, implies that it is not possible to resolve objects smaller than 200 nanometers using an optical microscope. For scale, the diameter of a DNA molecule is about 2.5 nanometers--approximately one hundred times smaller. Since then, the quest to overcome this limit and to develop techniques for highresolution imaging of cellular and subcellular structures has led to significant advancements in biomedical research [2, 3, 4, 5, 6, 7] paving the way for super-resolution microscopy. The super-resolution techniques that have revolutionized the field include structured illumination microscopy (SIM) [6, 7], stimulated emission depletion (STED) [2, 3], stochastic optical reconstruction microscopy (STORM) [5], and photoactivated localization microscopy (PALM) [4]. However, these techniques require complex and expensive instrumentation, limiting their widespread availability.


Beyond Text-to-Text: An Overview of Multimodal and Generative Artificial Intelligence for Education Using Topic Modeling

arXiv.org Artificial Intelligence

Generative artificial intelligence (GenAI) can reshape education and learning. While large language models (LLMs) like ChatGPT dominate current educational research, multimodal capabilities, such as text-to-speech and text-to-image, are less explored. This study uses topic modeling to map the research landscape of multimodal and generative AI in education. An extensive literature search using Dimensions.ai yielded 4175 articles. Employing a topic modeling approach, latent topics were extracted, resulting in 38 interpretable topics organized into 14 thematic areas. Findings indicate a predominant focus on text-to-text models in educational contexts, with other modalities underexplored, overlooking the broader potential of multimodal approaches. The results suggest a research gap, stressing the importance of more balanced attention across different AI modalities and educational levels. In summary, this research provides an overview of current trends in generative AI for education, underlining opportunities for future exploration of multimodal technologies to fully realize the transformative potential of artificial intelligence in education.


From Passive Watching to Active Learning: Empowering Proactive Participation in Digital Classrooms with AI Video Assistant

arXiv.org Artificial Intelligence

In online education, innovative tools are crucial for enhancing learning outcomes. SAM (Study with AI Mentor) is an advanced platform that integrates educational videos with a context-aware chat interface powered by large language models. SAM encourages students to ask questions and explore unclear concepts in real-time, offering personalized, context-specific assistance, including explanations of formulas, slides, and images. In a crowdsourced user study involving 140 participants, SAM was evaluated through pre- and post-knowledge tests, comparing a group using SAM with a control group. The results demonstrated that SAM users achieved greater knowledge gains, with a 96.8% answer accuracy. Participants also provided positive feedback on SAM's usability and effectiveness. SAM's proactive approach to learning not only enhances learning outcomes but also empowers students to take full ownership of their educational experience, representing a promising future direction for online learning tools.


Lost in the Logic: An Evaluation of Large Language Models' Reasoning Capabilities on LSAT Logic Games

arXiv.org Artificial Intelligence

In this thesis, I evaluate the performance of Large Language Models (LLMs) on the Law School Admissions Test (LSAT), specifically the Logic Games section of the test. I focus on this section because it presents a complex logical reasoning task and thus is a valuable source of data for evaluating how modern, increasingly capable LLMs can handle hard logical reasoning tasks. I construct a dataset of LSAT logic games and their associated metadata, and extensively evaluate LLMs' performance in a Chain-of-Thought prompting setting. Given the weak performance in this setting, I explore other prompting frameworks on a smaller subset of the dataset, adapting ideas from Reflexion to this task. This results in a substantially improved accuracy of 70 percent for GPT-4 and 46 percent for GPT-3.5 on this data subset, highlighting the capacity of LLMs to revise their logical errors, despite initially weak performance. Finally, I analyze the types of logic games that models perform better or worse on, as well as the types of logical errors I observe from human annotation, providing detailed insights on the logical reasoning capabilities of LLMs.


Choose the Final Translation from NMT and LLM hypotheses Using MBR Decoding: HW-TSC's Submission to the WMT24 General MT Shared Task

arXiv.org Artificial Intelligence

This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT24 general machine translation (MT) shared task, where we participate in the English to Chinese (en2zh) language pair. Similar to previous years' work, we use training strategies such as regularized dropout, bidirectional training, data diversification, forward translation, back translation, alternated training, curriculum learning, and transductive ensemble learning to train the neural machine translation (NMT) model based on the deep Transformer-big architecture. The difference is that we also use continue pre-training, supervised fine-tuning, and contrastive preference optimization to train the large language model (LLM) based MT model. By using Minimum Bayesian risk (MBR) decoding to select the final translation from multiple hypotheses for NMT and LLM-based MT models, our submission receives competitive results in the final evaluation.


Robust Training Objectives Improve Embedding-based Retrieval in Industrial Recommendation Systems

arXiv.org Artificial Intelligence

Improving recommendation systems (RS) can greatly enhance the user experience across many domains, such as social media. Many RS utilize embedding-based retrieval (EBR) approaches to retrieve candidates for recommendation. In an EBR system, the embedding quality is key. According to recent literature, self-supervised multitask learning (SSMTL) has showed strong performance on academic benchmarks in embedding learning and resulted in an overall improvement in multiple downstream tasks, demonstrating a larger resilience to the adverse conditions between each downstream task and thereby increased robustness and task generalization ability through the training objective. However, whether or not the success of SSMTL in academia as a robust training objectives translates to large-scale (i.e., over hundreds of million users and interactions in-between) industrial RS still requires verification. Simply adopting academic setups in industrial RS might entail two issues. Firstly, many self-supervised objectives require data augmentations (e.g., embedding masking/corruption) over a large portion of users and items, which is prohibitively expensive in industrial RS. Furthermore, some self-supervised objectives might not align with the recommendation task, which might lead to redundant computational overheads or negative transfer. In light of these two challenges, we evaluate using a robust training objective, specifically SSMTL, through a large-scale friend recommendation system on a social media platform in the tech sector, identifying whether this increase in robustness can work at scale in enhancing retrieval in the production setting. Through online A/B testing with SSMTL-based EBR, we observe statistically significant increases in key metrics in the friend recommendations, with up to 5.45% improvements in new friends made and 1.91% improvements in new friends made with cold-start users.


Evaluating the Performance and Robustness of LLMs in Materials Science Q&A and Property Predictions

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have the potential to revolutionize scientific research, yet their robustness and reliability in domain-specific applications remain insufficiently explored. This study conducts a comprehensive evaluation and robustness analysis of LLMs within the field of materials science, focusing on domain-specific question answering and materials property prediction. Three distinct datasets are used in this study: 1) a set of multiple-choice questions from undergraduate-level materials science courses, 2) a dataset including various steel compositions and yield strengths, and 3) a band gap dataset, containing textual descriptions of material crystal structures and band gap values. The performance of LLMs is assessed using various prompting strategies, including zero-shot chain-of-thought, expert prompting, and few-shot in-context learning. The robustness of these models is tested against various forms of 'noise', ranging from realistic disturbances to intentionally adversarial manipulations, to evaluate their resilience and reliability under real-world conditions. Additionally, the study uncovers unique phenomena of LLMs during predictive tasks, such as mode collapse behavior when the proximity of prompt examples is altered and performance enhancement from train/test mismatch. The findings aim to provide informed skepticism for the broad use of LLMs in materials science and to inspire advancements that enhance their robustness and reliability for practical applications.


Simplify your life with AI automation -- learn it for life with this e-degree

PCWorld

TL;DR: Learn AI and automation with lifetime access to the ChatGPT and Automation E-Degree, packed with expert-led courses for just 24.97 through September 29. Want to get ahead in the AI game? The ChatGPT and Automation E-Degree gives you the tools to master AI and automation with lifetime access to courses for 24.97 designed for hands-on learning. The Mastering ChatGPT and OpenAI for Automation course walks you through the essentials of automating everyday tasks, making it easy to implement AI solutions in your workflow. Whether it's automating email responses or handling customer inquiries, you'll learn how to make AI work for you.


Data Visualization to Evaluate and Facilitate Targeted Data Acquisitions in Support of a Real-time Ocean Forecasting System

arXiv.org Artificial Intelligence

A robust evaluation toolset has been designed for Naval Research Laboratory's Real-Time Ocean Forecasting System RELO with the purpose of facilitating an adaptive sampling strategy and providing a more educated guidance for routing underwater gliders. The major challenges are to integrate into the existing operational system, and provide a bridge between the modeling and operative environments. Visualization is the selected approach and the developed software is divided into 3 packages: The first package is to verify that the glider is actually following the waypoints and to predict the position of the glider for the next cycle's instructions. The second package helps ensure that the delivered waypoints are both useful and feasible. The third package provides the confidence levels for the suggested path. This software's implementation is in Python for portability and modularity to allow easy expansion of new visuals.