5 Reasons to Think Twice Before Using ChatGPT--or Any Chatbot--for Financial Advice

WIRED

As people increasingly rely on AI chatbots for guidance, even on financial matters, a healthy dose of skepticism is critical. I've used ChatGPT to help me build a budget before, and it was genuinely helpful. After I input my monthly salary as well as my standard utilities and recurring expenses, the chatbot drafted a few solid options, and I tweaked them into penny-pinching perfection. "Millions of people turn to ChatGPT with money-related questions, from understanding debt to building budgets and learning financial concepts," says Niko Felix, an OpenAI spokesperson, when reached for comment. "ChatGPT can be a helpful tool for exploring options, preparing questions, and making financial topics easier to understand, but it is not a substitute for licensed financial professionals." OpenAI's Terms of Use state that the AI tool is not meant to replace professional financial advice.


Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks

Neural Information Processing Systems

The convergence of GD and SGD when training mildly parameterized neural networks starting from random initialization is studied. For a broad range of models and loss functions, including the most commonly used square loss and cross entropy loss, we prove an "early stage convergence" result. We show that the loss is decreased by a significant amount in the early stage of the training, and this decrease is fast. Furthermore, for exponential type loss functions, and under some assumptions on the training data, we show global convergence of GD. Instead of relying on extreme over-parameterization, our study is based on a microscopic analysis of the activation patterns for the neurons, which helps us derive more powerful lower bounds for the gradient. The results on activation patterns, which we call "neuron partition", help build intuitions for understanding the behavior of neural networks' training dynamics, and may be of independent interest.


Rethinking the Backward Propagation for Adversarial Transferability

Neural Information Processing Systems

Transfer-based attacks generate adversarial examples on the surrogate model, which can mislead other black-box models without access, making it promising to attack real-world applications. Recently, several works have been proposed to boost adversarial transferability, in which the surrogate model is usually overlooked. In this work, we identify that non-linear layers (e.g.


Stochastic Distributed Optimization under Average Second-order Similarity: Algorithms and Analysis

Neural Information Processing Systems

We study finite-sum distributed optimization problems involving a master node and n − 1 local nodes under the popular δ-similarity and µ-strong convexity conditions. We propose two new algorithms, SVRS and AccSVRS, motivated by previous works. The non-accelerated SVRS method combines the techniques of gradient sliding and variance reduction and achieves a better communication complexity of Õ(n + √n δ/µ) compared to existing non-accelerated algorithms. Applying the framework proposed in Katyusha X [6], we also develop a directly accelerated version named AccSVRS with Õ(n + n^{3/4} √(δ/µ)) communication complexity. In contrast to existing results, our complexity bounds are entirely smoothness-free and exhibit superiority in ill-conditioned cases. Furthermore, we establish a nearly matched lower bound to verify the tightness of our AccSVRS method.





A Variational Perspective on High-Resolution ODEs

Neural Information Processing Systems

We consider unconstrained minimization of smooth convex functions. We propose a novel variational perspective using the forced Euler-Lagrange equation that allows for studying high-resolution ODEs. Through this, we obtain a faster convergence rate for gradient norm minimization using Nesterov's accelerated gradient method. Additionally, we show that Nesterov's method can be interpreted as a rate-matching discretization of an appropriately chosen high-resolution ODE. Finally, using the results from the new variational perspective, we propose a stochastic method for noisy gradients.
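For readers unfamiliar with the method named in this abstract, the following is a minimal sketch of the textbook Nesterov accelerated gradient iteration on a smooth convex function. It is not the paper's high-resolution ODE analysis or its stochastic variant. The function name `nesterov_agm`, the momentum schedule k/(k+3), and the quadratic test problem are assumptions chosen for illustration.

```python
import math

def nesterov_agm(grad, x0, step, iters):
    """Textbook Nesterov accelerated gradient for a smooth convex function.

    grad: callable returning the gradient as a list of floats
    step: step size, assumed to be at most 1/L for an L-smooth objective
    """
    x = list(x0)  # main iterate
    y = list(x0)  # extrapolated (look-ahead) point
    for k in range(iters):
        g = grad(y)
        # Gradient step taken from the extrapolated point
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        # Momentum weight k/(k+3), one common schedule for the convex case
        momentum = k / (k + 3)
        y = [xn + momentum * (xn - xi) for xn, xi in zip(x_next, x)]
        x = x_next
    return x

# Hypothetical test problem: f(x) = 0.5 * (x1^2 + 100 * x2^2), so L = 100.
def quad_grad(x):
    return [x[0], 100.0 * x[1]]

x_final = nesterov_agm(quad_grad, [1.0, 1.0], step=1.0 / 100.0, iters=2000)
grad_norm = math.sqrt(sum(g * g for g in quad_grad(x_final)))
print(grad_norm)
```

The gradient norm printed at the end is the quantity whose convergence rate the abstract discusses. The sketch only shows how that quantity is measured for the classical method.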


Rethinking and Improving Robustness of Convolutional Neural Networks: a Shapley Value-based Approach in Frequency Domain

Neural Information Processing Systems

The existence of adversarial examples poses concerns for the robustness of convolutional neural networks (CNNs), for which a popular hypothesis is the frequency bias phenomenon: CNNs rely more on high-frequency components (HFC) for classification than humans, which causes the brittleness of CNNs. However, most previous works manually select and roughly divide the image frequency spectrum and conduct qualitative analysis. In this work, we introduce the Shapley value, a metric from cooperative game theory, into the frequency domain and propose to quantify the positive (negative) impact of every frequency component of data on CNNs. Based on the Shapley value, we quantify the impact in a fine-grained way and show intriguing instance disparity. Statistically, we investigate adversarial training (AT) and the adversarial attack in the frequency domain. The observations motivate us to perform an in-depth analysis and lead to multiple novel hypotheses about i) the cause of adversarial robustness of the AT model; ii) the fairness problem of AT between different classes in the same dataset; iii) the attack bias on different frequency components. Finally, we propose a Shapley-value guided data augmentation technique for improving the robustness. Experimental results on image classification benchmarks show its effectiveness. The code for this paper is at https://github.com/Ytchen981/CSA
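As a rough illustration of the Shapley value itself, the sketch below computes exact Shapley values for three hypothetical "frequency bands" by averaging each band's marginal contribution over all orderings. The band names and the scoring function `toy_score` are invented for this example and stand in for a real model's output; the paper's actual estimation procedure is not given in the abstract and may differ.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when added to the current coalition
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical value function: a made-up "model score" when only the given
# frequency bands of an image are kept. Not taken from the paper.
def toy_score(bands):
    score = 0.0
    if "low" in bands:
        score += 0.6
    if "mid" in bands:
        score += 0.2
    if "high" in bands:
        score -= 0.1  # this band hurts the score in the toy setup
    if "low" in bands and "mid" in bands:
        score += 0.1  # interaction between two bands
    return score

phi = shapley_values(["low", "mid", "high"], toy_score)
print(phi)
```

A positive value marks a component that helps the score on average and a negative value marks one that hurts it, which is the sense of "positive (negative) impact" in the abstract. Exact enumeration grows factorially with the number of components, so it is only practical for tiny examples like this one.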


FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow

Neural Information Processing Systems

Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning. The key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion, which is prohibitively expensive to run at scale. We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass. We estimate poses by first lifting frame-to-frame optical flow to 3D scene flow via differentiable rendering, preserving locality and shift-equivariance of the image processing backbone. SE(3) camera pose estimation is then performed via a weighted least-squares fit to the scene flow field. This formulation enables us to jointly supervise pose estimation and a generalizable neural scene representation via re-rendering the input video, and thus, train end-to-end and fully self-supervised on real-world video datasets. We demonstrate that our method performs robustly on diverse, real-world video, notably on sequences traditionally challenging to optimization-based pose estimation techniques.
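The weighted least-squares pose fit mentioned above can be illustrated with a simplified 2D analogue. The sketch below fits a rotation angle and translation (an SE(2) transform rather than the SE(3) transform in the abstract) that best maps points to their flowed positions under per-point weights. The function name `weighted_rigid_fit_2d`, the sample points, and the weights are assumptions for illustration; the 3D case typically needs an SVD-based solver instead of the closed-form angle used here.

```python
import math

def weighted_rigid_fit_2d(src, dst, weights):
    """Weighted least-squares rigid fit in 2D.

    Finds theta and t minimizing sum_i w_i * ||R(theta) src_i + t - dst_i||^2.
    """
    w_sum = sum(weights)
    # Weighted centroids of both point sets
    cs = [sum(w * p[d] for w, p in zip(weights, src)) / w_sum for d in range(2)]
    cd = [sum(w * q[d] for w, q in zip(weights, dst)) / w_sum for d in range(2)]
    a = 0.0  # weighted sum of dot products of centered points
    b = 0.0  # weighted sum of 2D cross products of centered points
    for w, p, q in zip(weights, src, dst):
        px, py = p[0] - cs[0], p[1] - cs[1]
        qx, qy = q[0] - cd[0], q[1] - cd[1]
        a += w * (px * qx + py * qy)
        b += w * (px * qy - py * qx)
    theta = math.atan2(b, a)  # optimal rotation angle in closed form
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid
    t = (cd[0] - (c * cs[0] - s * cs[1]), cd[1] - (s * cs[0] + c * cs[1]))
    return theta, t

# Hypothetical data: points, and a flow field induced by a known rigid motion.
theta_true, t_true = 0.3, (1.0, -2.0)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
c0, s0 = math.cos(theta_true), math.sin(theta_true)
flow = [(c0 * x - s0 * y + t_true[0] - x, s0 * x + c0 * y + t_true[1] - y)
        for x, y in src]
dst = [(x + fx, y + fy) for (x, y), (fx, fy) in zip(src, flow)]
weights = [1.0, 0.5, 2.0, 1.5]  # stand-ins for per-point confidence weights

theta_est, t_est = weighted_rigid_fit_2d(src, dst, weights)
print(theta_est, t_est)
```

Because the fit is a closed-form function of the points, flow, and weights, it is differentiable almost everywhere, which is the property that makes this kind of solver usable inside end-to-end training.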