AILS-NTUA at SemEval-2025 Task 8: Language-to-Code prompting and Error Fixing for Tabular Question Answering
Evangelatos, Andreas, Filandrianos, Giorgos, Lymperaiou, Maria, Voulodimos, Athanasios, Stamou, Giorgos
In this paper, we present our submission to SemEval-2025 Task 8: Question Answering over Tabular Data. This task, evaluated on the DataBench dataset, assesses Large Language Models' (LLMs) ability to answer natural language questions over structured data while addressing topic diversity and table size limitations in previous benchmarks. We propose a system that employs effective LLM prompting to translate natural language queries into executable code, enabling accurate responses, error correction, and interpretability. Our approach ranks first in both subtasks of the competition in the proprietary model category, significantly outperforming the organizer's baseline.
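The abstract describes translating natural language questions into executable code, running it, and correcting errors. A minimal sketch of that loop, assuming a stand-in `fake_llm` in place of a real model API (all names and the retry logic here are illustrative assumptions, not the authors' actual prompts or pipeline):

```python
# Sketch of a language-to-code QA loop: ask an (LLM) code generator for
# executable code over the table, run it, and feed any error back for a
# corrected attempt. Everything here is an illustrative stand-in.

def answer_question(table, question, generate_code, max_attempts=3):
    error = None
    for _ in range(max_attempts):
        # Pass the question, the column names, and the last error (if any).
        code = generate_code(question, list(table), error)
        try:
            scope = {"table": table}
            exec(code, scope)          # generated code must set `result`
            return scope["result"]
        except Exception as exc:       # capture the error for the retry prompt
            error = str(exc)
    raise RuntimeError(f"no executable answer: {error}")

# Stand-in "LLM" for demonstration; a real system would call a model API.
def fake_llm(question, columns, error):
    return "result = max(table['age'])"

table = {"name": ["a", "b"], "age": [30, 41]}
print(answer_question(table, "What is the maximum age?", fake_llm))  # 41
```

Because the model's answer is code rather than free text, the response is both executable (hence checkable) and interpretable, which is the property the abstract highlights.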
Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression
Munteanu, Alexander, Omlor, Simon, Woodruff, David
We improve upon previous oblivious sketching and turnstile streaming results for $\ell_1$ and logistic regression, giving a much smaller sketching dimension achieving $O(1)$-approximation and yielding an efficient optimization problem in the sketch space. Namely, we achieve for any constant $c>0$ a sketching dimension of $\tilde{O}(d^{1+c})$ for $\ell_1$ regression and $\tilde{O}(\mu d^{1+c})$ for logistic regression, where $\mu$ is a standard measure that captures the complexity of compressing the data. For $\ell_1$-regression our sketching dimension is near-linear and improves previous work which either required $\Omega(\log d)$-approximation with this sketching dimension, or required a larger $\operatorname{poly}(d)$ number of rows. Similarly, for logistic regression previous work had worse $\operatorname{poly}(\mu d)$ factors in its sketching dimension. We also give a tradeoff that yields a $1+\varepsilon$ approximation in input sparsity time by increasing the total size to $(d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for $\ell_1$ and to $(\mu d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for logistic regression. Finally, we show that our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.
Oblivious sketching for logistic regression
Munteanu, Alexander, Omlor, Simon, Woodruff, David
What guarantees are possible for solving logistic regression in one pass over a data stream? To answer this question, we present the first data oblivious sketch for logistic regression. Our sketch can be computed in input sparsity time over a turnstile data stream and reduces the size of a $d$-dimensional data set from $n$ to only $\operatorname{poly}(\mu d\log n)$ weighted points, where $\mu$ is a useful parameter which captures the complexity of compressing the data. Solving (weighted) logistic regression on the sketch gives an $O(\log n)$-approximation to the original problem on the full data set. We also show how to obtain an $O(1)$-approximation with slight modifications. Our sketches are fast, simple, easy to implement, and our experiments demonstrate their practicality.
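Both abstracts above rely on the sketch being *data oblivious* and linear, which is what makes turnstile (insert/delete) streaming possible. The paper's sketch is a more structured construction; the following generic CountSketch is only a minimal illustration of that linearity property, with all parameters chosen arbitrarily:

```python
import random

def countsketch(updates, rows, cols, seed=0):
    """Generic CountSketch: each coordinate hashes to one sketch row with a
    random sign. The map is linear in the updates, so it supports turnstile
    streams (a stream of (coordinate, +/- delta) pairs)."""
    rng = random.Random(seed)                      # oblivious: fixed before data
    h = [rng.randrange(rows) for _ in range(cols)]  # row assignment per coord
    s = [rng.choice((-1, 1)) for _ in range(cols)]  # random sign per coord
    sk = [0.0] * rows
    for j, delta in updates:                       # one pass, input-sparsity time
        sk[h[j]] += s[j] * delta
    return sk

# Linearity: sketch(a) + sketch(b) == sketch(a concatenated with b).
a = [(0, 1.0), (3, 2.0)]
b = [(0, -1.0), (2, 5.0)]
sa, sb, sab = countsketch(a, 4, 5), countsketch(b, 4, 5), countsketch(a + b, 4, 5)
assert all(abs(x + y - z) < 1e-9 for x, y, z in zip(sa, sb, sab))
```

Linearity is exactly why deletions in a turnstile stream pose no difficulty: a deletion is just an update with a negative delta.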
Competing Above Their Weight Class With AI: A Case Study
Usually, economies of scale, size, and reputation win the day, but this spunky and highly adaptive ad agency uses AI plus better processes to put pressure on the large ad agencies. I think this will become a trend in many industries, with upstarts starting to scare the incumbent "big dogs." The hourly agency model favors longer timelines and complex hierarchies to generate more billable hours, and as a result, more clients are bringing this work in-house. That creates an excellent opportunity for a smaller agency to apply technology to outmaneuver the big players in the ad business. By combining AI with automated processes to streamline ad creation, the agency is seeing big gains.
Threshold Network Learning in the Presence of Equivalences
This paper applies the theory of Probably Approximately Correct (PAC) learning to multiple output feedforward threshold networks in which the weights conform to certain equivalences. It is shown that the sample size for reliable learning can be bounded above by a formula similar to that required for single output networks with no equivalences. The best previously obtained bounds are improved for all cases.