err
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation
Jin, Song, Li, Shuqi, Zhang, Shukun, Yan, Rui
While LLMs have shown great success in financial tasks like stock prediction and question answering, their application in fully automating Equity Research Report generation remains uncharted territory. In this paper, we formulate the Equity Research Report (ERR) Generation task for the first time. To address the data scarcity and the evaluation metrics absence, we present an open-source evaluation benchmark for ERR generation - FinRpt. We frame a Dataset Construction Pipeline that integrates 7 financial data types and produces a high-quality ERR dataset automatically, which could be used for model training and evaluation. We also introduce a comprehensive evaluation system including 11 metrics to assess the generated ERRs. Moreover, we propose a multi-agent framework specifically tailored to address this task, named FinRpt-Gen, and train several LLM-based agents on the proposed datasets using Supervised Fine-Tuning and Reinforcement Learning. Experimental results indicate the data quality and metrics effectiveness of the benchmark FinRpt and the strong performance of FinRpt-Gen, showcasing their potential to drive innovation in the ERR generation field. All code and datasets are publicly available.
- Asia > China > Hubei Province > Wuhan (0.04)
- Asia > Middle East > Saudi Arabia > Mecca Province > Thuwal (0.04)
- Asia > China > Hong Kong (0.04)
- (2 more...)
- Research Report > New Finding (0.91)
- Research Report > Experimental Study (0.81)
- Law (1.00)
- Health & Medicine (1.00)
- Banking & Finance > Trading (1.00)
- Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)
Coresets for Clustering Under Stochastic Noise
Huang, Lingxiao, Li, Zhize, Vishnoi, Nisheeth K., Yang, Runkai, Zhao, Haoyu
We study the problem of constructing coresets for $(k, z)$-clustering when the input dataset is corrupted by stochastic noise drawn from a known distribution. In this setting, evaluating the quality of a coreset is inherently challenging, as the true underlying dataset is unobserved. To address this, we investigate coreset construction using surrogate error metrics that are tractable and provably related to the true clustering cost. We analyze a traditional metric from prior work and introduce a new error metric that more closely aligns with the true cost. Although our metric is defined independently of the noise distribution, it enables approximation guarantees that scale with the noise level. We design a coreset construction algorithm based on this metric and show that, under mild assumptions on the data and noise, enforcing an $\varepsilon$-bound under our metric yields smaller coresets and tighter guarantees on the true clustering cost than those obtained via classical metrics. In particular, we prove that the coreset size can improve by a factor of up to $\mathrm{poly}(k)$, where $n$ is the dataset size. Experiments on real-world datasets support our theoretical findings and demonstrate the practical advantages of our approach.
- Europe > Austria > Vienna (0.14)
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > Singapore (0.04)
- (17 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
- Information Technology > Security & Privacy (0.67)
Extending Prediction-Powered Inference through Conformal Prediction
Csillag, Daniel, Dall'Antonia, Pedro, Struchiner, Claudio José, Goedert, Guilherme Tegoni
Prediction-powered inference is a recent methodology for the safe use of black-box ML models to impute missing data, strengthening inference of statistical parameters. However, many applications require strong properties besides valid inference, such as privacy, robustness or validity under continuous distribution shifts; deriving prediction-powered methods with such guarantees is generally an arduous process, and has to be done case by case. In this paper, we resolve this issue by connecting prediction-powered inference with conformal prediction: by performing imputation through a calibrated conformal set-predictor, we attain validity while achieving additional guarantees in a natural manner. We instantiate our procedure for the inference of means, Z- and M-estimation, as well as e-values and e-value-based procedures. Furthermore, in the case of e-values, ours is the first general prediction-powered procedure that operates off-line. We demonstrate these advantages by applying our method on private and time-series data. Both tasks are nontrivial within the standard prediction-powered framework but become natural under our method.
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Africa > Togo (0.04)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.68)