A conclusive remark on linguistic theorizing and language modeling

Chesi, Cristiano

arXiv.org Artificial Intelligence

Considering the proliferation of responses to Piantadosi's original paper and the ongoing debate sparked by this special issue of the Italian Journal of Linguistics, it is clear that the discussion has touched a raw nerve in linguistic theorizing. In the original target paper (Chesi, this issue), I illustrated three prototypical (and in many respects, extreme) positions -- the computational, theoretical, and experimental perspectives -- without explicitly endorsing any of them. Instead, I attempted to highlight what I believe are the key weaknesses of each of these prototypical stances, ultimately concluding that formal (i.e., 'generative') linguistics -- more specifically, Minimalism, my theoretical comfort zone -- must adopt practices and tools that are common in both computational and experimental fields. As noted by most respondents, the title and some of the more extreme statements were intended as mild provocations to draw attention to core issues affecting linguistic theorizing. My position -- somewhat obscured behind the 'three-body problem' -- is that any relevant scientific progress is driven by theoretical insight, not by trawling with experimental or computational methods that are cost-inefficient, energy-intensive, and ultimately unsustainable. Moreover, in full agreement with most of the replies, I believe that the success of certain large language models (LLMs), which are based on specific architectural assumptions, does not constitute a refutation of the generative paradigm. On the contrary, it strongly supports several key intuitions that have emerged within the generative linguistic tradition (Rizzi, this issue). However, a concrete problem of 'incommensurability' arises (Hao, this issue), as differing methodologies and specialized jargon (Butt, this issue) often result in circular, unresolved discussions.


'What I Think about When I Type about Talking': Reflections on Text-Entry Acceleration Interfaces

Communications of the ACM

Today's text-entry tools offer a plethora of interface technologies to support users in a variety of situations and with a range of different input methods and devices [16]. Recent hardware developments have enabled remarkable innovations, such as virtual keyboards that allow users to type in thin air or to use their body as a surface for text entry. Similarly, advances in machine learning and natural language processing have enabled high-quality text generation for various purposes, such as summarizing, expanding, and co-authoring. As these technologies rapidly develop, there has been a rush to incorporate them into existing systems, often with little thought for the interactivity problems this may cause. The use of large language models (LLMs) to speed up text generation and improve prediction or completion models is becoming increasingly commonplace, with enormous theoretical efficiency savings [29]; however, how these LLMs are integrated into text-entry interfaces is crucial to realizing their potential.


Reviews: Beyond the Single Neuron Convex Barrier for Neural Network Certification

Neural Information Processing Systems

Originality: The authors propose a novel relaxation (to the best of my knowledge) for networks with ReLU activations that tightens previously proposed relaxations, which ignore the correlations between neurons in the network. The theoretical results are also novel (although unsurprising). However, it would be useful for the authors to better clarify the computational requirements and tightness of k-ReLU relative to DeepPoly and other similar relaxations and bound-propagation methods like [13] and https://arxiv.org/abs/1805.12514. Quality: The theoretical results are accurate (albeit unsurprising) in my opinion. The experimental section is missing several important details: 1) The authors say that experiments are performed on both MNIST and CIFAR-10, but Tables 2/3 only report numbers on MNIST.


Review for NeurIPS paper: Model Class Reliance for Random Forests

Neural Information Processing Systems

This is a relevant and timely paper that has been reviewed by four knowledgeable referees, who also thoroughly considered the authors' response to their initial reviews. Three of these reviewers recommend acceptance, providing detailed suggestions on how to improve this work before its final submission; the fourth reviewer (R3) recommends rejection, a dissenting opinion upheld after discussion with the other referees. R3, in my opinion, correctly points out that if the proposed approach aims to improve runtime with an approximate algorithm, this must be sufficiently demonstrated in experiments against straightforward alternatives (such as retraining-based methods). That was done neither in the original submission nor in the rebuttal.


Review for NeurIPS paper: Curriculum By Smoothing

Neural Information Processing Systems

Weaknesses: - The authors compared their method to the baseline approach only. However, there are plenty of curriculum learning methods that could have been used as relevant state-of-the-art competitors; comparison with such methods is mandatory, in my opinion. - I believe that the non-linearity is typically applied before the pooling operation. Even so, it is not clear why the proposed method works so well.


Reviews: DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Neural Information Processing Systems

In this paper, the authors propose a distributed Newton method for gradient-norm optimization. The method does not impose any specific form on the underlying objective function. The authors present convergence analysis for the method and illustrate the performance of the method on a convex problem (in the main paper). Originality: The topic of the paper, in my opinion, is very interesting. The paper presents an efficient Newton method that is motivated via the optimization of the norm of the gradient.
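The core idea can be sketched on a single machine with the distributed machinery stripped away: take the gradient norm phi(x) = ½‖∇f(x)‖² as the objective and apply Gauss-Newton steps to it. The sketch below (a toy quadratic in plain NumPy, with illustrative names) is an assumption-laden illustration of that principle, not the authors' DINGO algorithm, which additionally handles worker communication and degenerate Hessians via pseudo-inverses.

```python
import numpy as np

# Toy strongly convex quadratic: f(x) = 0.5 x^T A x - b^T x
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad(x):
    return A @ x - b   # ∇f(x)

def hess(x):
    return A           # ∇²f(x), constant for a quadratic

# Gauss-Newton on phi(x) = 0.5 ||∇f(x)||^2:
# ∇phi(x) = H ∇f, and the Gauss-Newton direction solves
# (H^T H) s = H^T ∇f, which for symmetric invertible H
# coincides with the Newton step s = H^{-1} ∇f.
x = np.zeros(2)
for _ in range(5):
    H, g = hess(x), grad(x)
    step = np.linalg.solve(H.T @ H, H.T @ g)
    x = x - step
```

On a quadratic this converges in one step; the point of the gradient-norm formulation, as the abstract notes, is that it imposes no specific form (such as convexity) on the underlying objective.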


Reviews: Adaptive Density Estimation for Generative Models

Neural Information Processing Systems

Summary: The authors propose a hybrid method that combines VAEs with adversarial training and flow-based models. In particular, they derive an explicit density function p(x) where the likelihood can be evaluated, the corresponding components p(x|z) are more flexible than the standard VAE that utilizes diagonal Gaussians, and the generated samples have better quality than a standard VAE. The basic idea of the proposed model is that the VAE is defined between a latent space and an intermediate representation space, and then the representation space is connected with the data space through an invertible non-linear flow. In general, I think the paper is quite well written, but at the same time I believe that there is a lot of compressed information, and the consequence is that in some parts it is not even clear what the authors want to say (see Clarity comments). The proposed idea of the paper seems quite interesting, but at the same time I have some doubts (see Quality comments).
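The density construction rests on the change-of-variables formula that any invertible flow provides: log p(x) = log p_U(f(x)) + log |det df/dx|, where p_U is the density on the intermediate representation space. The sketch below is illustrative only, with a simple elementwise affine flow and a standard normal standing in for the VAE-modeled density on the representation; all names are assumptions, not the authors' code.

```python
import numpy as np

def standard_normal_logpdf(u):
    # log N(u; 0, I), summed over dimensions (stand-in for the
    # density the VAE places on the representation space)
    return -0.5 * np.sum(u**2 + np.log(2 * np.pi), axis=-1)

def affine_flow_forward(x, scale, shift):
    # Invertible elementwise map from data space to representation
    # space: u = scale * x + shift, with scale != 0 per dimension.
    u = scale * x + shift
    log_det = np.sum(np.log(np.abs(scale)))  # log |det du/dx|
    return u, log_det

def log_px(x, scale, shift):
    # Change of variables: log p(x) = log p_U(u) + log |det du/dx|
    u, log_det = affine_flow_forward(x, scale, shift)
    return standard_normal_logpdf(u) + log_det
```

In the paper's setup the flow is a non-linear invertible network rather than this affine map, and p_U is given by the VAE; the decomposition of the log-likelihood is the same.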


Reviews: Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness

Neural Information Processing Systems

Originality: To the best of my knowledge the model of general user retention dynamics and corresponding statements evidencing negative feedback loops are novel contributions to the literature on sequential fairness. The contributions of the paper would be clearer if citations were provided for methods and models introduced in earlier works (for example, I suggest adding citations for the fairness criteria in lines 149-158, for user departure models in lines 197-208, and for the statement in lines 173-174, if applicable). Since the full related work is deferred to the appendix, I see little value in citing [2, 3, 7, 10, 15, 16] without distinguishing between them. More context on what these works do and how they relate to your work would help readers contextualize your contributions; please expand on the discussion of these papers. Quality: The simple and unifying model of sequential decision making presented is very valuable in my opinion.


Review for NeurIPS paper: Parabolic Approximation Line Search for DNNs

Neural Information Processing Systems

There was a ton of discussion about this paper between reviewers and area chairs, multiple reviewers improved their view of the paper based on the author response, and I read through the paper in detail myself. I was still conflicted after reading it, but I am leaning towards recommending acceptance. However, I *implore* the authors to carefully consider the issues brought up by R2 and R3 as well as the issues that I bring up below. I believe that every single one of the issues brought up can be fixed, and I will be extremely disappointed if these issues are not addressed in the final version (it would make me regret recommending acceptance, and probably make me harsher on empirical papers in the future, especially by the same authors). Some specific comments from my read-through of the paper: - I agree with the authors that optimization is a mix of theory and empirical work, and it is completely ok for works to be purely empirical if the experiments are done well.


Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models

Lopez, Eleonora, Sigillo, Luigi, Colonnese, Federica, Panella, Massimo, Comminiello, Danilo

arXiv.org Artificial Intelligence

Generating images from brain waves is gaining increasing attention due to its potential to advance brain-computer interface (BCI) systems by understanding how brain signals encode visual cues. Most of the literature has focused on fMRI-to-Image tasks as fMRI is characterized by high spatial resolution. However, fMRI is an expensive neuroimaging modality and does not allow for real-time BCI. On the other hand, electroencephalography (EEG) is a low-cost, non-invasive, and portable neuroimaging technique, making it an attractive option for future real-time applications. Nevertheless, EEG presents inherent challenges due to its low spatial resolution and susceptibility to noise and artifacts, which makes generating images from EEG more difficult. In this paper, we address these problems with a streamlined framework based on the ControlNet adapter for conditioning a latent diffusion model (LDM) through EEG signals. We conduct experiments and ablation studies on popular benchmarks to demonstrate that the proposed method beats other state-of-the-art models. Unlike these methods, which often require extensive preprocessing, pretraining, different losses, and captioning models, our approach is efficient and straightforward, requiring only minimal preprocessing and a few components. The code is available at https://github.com/LuigiSigillo/GWIT.
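At a schematic level, the ControlNet recipe the framework builds on attaches a trainable copy of a backbone block that ingests the conditioning signal and merges back through a zero-initialized projection, so training starts exactly at the pretrained model. A minimal NumPy sketch of that mechanism follows; the shapes and names are illustrative assumptions, not the released GWIT code, which conditions a latent diffusion U-Net on EEG embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_block(x, W):
    # A frozen layer of the pretrained diffusion backbone
    return np.tanh(W @ x)

def controlled_block(x, cond, W_frozen, W_copy, W_zero):
    # ControlNet-style adapter: a trainable copy of the block
    # processes the conditioning signal (here, a stand-in for EEG
    # features) and is merged back through a zero-initialized
    # projection, leaving the backbone untouched at initialization.
    base = frozen_block(x, W_frozen)
    control = np.tanh(W_copy @ (x + cond))
    return base + W_zero @ control  # W_zero starts at 0

d = 4
W_frozen = rng.normal(size=(d, d))
W_copy = W_frozen.copy()      # adapter initialized from frozen weights
W_zero = np.zeros((d, d))     # zero-initialized projection

x = rng.normal(size=d)
eeg_cond = rng.normal(size=d)

out_controlled = controlled_block(x, eeg_cond, W_frozen, W_copy, W_zero)
out_frozen = frozen_block(x, W_frozen)
```

Because the projection starts at zero, the adapter is inert before training and the model reproduces the pretrained backbone exactly; gradients then gradually open the conditioning pathway.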