aggregate
Approaching Human-Level Forecasting with Language Models
Forecasting future events is important for policy and decision making. In this work, we study whether language models (LMs) can forecast at the level of competitive human forecasters. Towards this goal, we develop a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions. To facilitate our study, we collect a large dataset of questions from competitive forecasting platforms. Under a test set published after the knowledge cut-offs of our LMs, we evaluate the end-to-end performance of our system against the aggregates of human forecasts. On average, the system nears the crowd aggregate of competitive forecasters and, in a certain relaxed setting, surpasses it. Our work suggests that using LMs to forecast the future could provide accurate predictions at scale and help to inform institutional decision making.
Auto-Regressive U-Net for Full-Field Prediction of Shrinkage-Induced Damage in Concrete
Gaynutdinova, Liya, Havlásek, Petr, Rokoš, Ondřej, Hendriks, Fleur, Doškář, Martin
This paper introduces a deep learning approach for predicting time-dependent full-field damage in concrete. The study uses an auto-regressive U-Net model to predict the evolution of the scalar damage field in a unit cell given microstructural geometry and evolution of an imposed shrinkage profile. By sequentially using the predicted damage output as input for subsequent predictions, the model facilitates the continuous assessment of damage progression. Complementarily, a convolutional neural network (CNN) utilises the damage estimations to forecast key mechanical properties, including observed shrinkage and residual stiffness. The proposed dual-network architecture demonstrates high computational efficiency and robust predictive performance on the synthesised datasets. The approach reduces the computational load traditionally associated with full-field damage evaluations and is used to gain insights into the relationship between aggregate properties, such as shape, size, and distribution, and the effective shrinkage and reduction in stiffness. Ultimately, this can help to optimize concrete mix designs, leading to improved durability and reduced internal damage.
- Europe > Czechia > Prague (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
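The auto-regressive scheme described in the abstract above, where each predicted damage field becomes the input for the next prediction, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `rollout`, `toy_step`, and the list-of-lists field representation are hypothetical names and choices, and `toy_step` is only a stand-in for a trained U-Net (which would also take the microstructural geometry as input).

```python
from typing import Callable, List, Sequence

Field = List[List[float]]  # 2D scalar damage field, values in [0, 1]


def rollout(step: Callable[[Field, float], Field],
            initial_damage: Field,
            shrinkage_profile: Sequence[float]) -> List[Field]:
    """Apply `step` once per shrinkage increment, feeding each output back in."""
    fields: List[Field] = []
    damage = initial_damage
    for shrinkage in shrinkage_profile:
        # The previous prediction is the input to the next prediction.
        damage = step(damage, shrinkage)
        fields.append(damage)
    return fields


def toy_step(damage: Field, shrinkage: float) -> Field:
    # Hypothetical placeholder for the network: damage grows with the
    # imposed shrinkage and is capped at 1 (fully damaged).
    return [[min(1.0, d + 0.1 * shrinkage) for d in row] for row in damage]


if __name__ == "__main__":
    history = rollout(toy_step, [[0.0, 0.5]], [1.0, 2.0])
    for t, field in enumerate(history, start=1):
        print(t, field)
```

Because every step consumes the previous output, prediction errors can accumulate over long rollouts, which is the usual trade-off of this kind of scheme.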
Autonomous Aggregate Sorting in Construction and Mining via Computer Vision-Aided Robotic Arm Systems
Shawon, Md. Taherul Islam, Li, Yuan, Cai, Yincai, Niu, Junjie, Peng, Ting
Traditional aggregate sorting methods, whether manual or mechanical, often suffer from low precision, limited flexibility, and poor adaptability to diverse material properties such as size, shape, and lithology. To address these limitations, this study presents a computer vision-aided robotic arm system designed for autonomous aggregate sorting in construction and mining applications. The system integrates a six-degree-of-freedom robotic arm, a binocular stereo camera for 3D perception, and a ROS-based control framework. Core techniques include an attention-augmented YOLOv8 model for aggregate detection, stereo matching for 3D localization, Denavit-Hartenberg kinematic modeling for arm motion control, minimum enclosing rectangle analysis for size estimation, and hand-eye calibration for precise coordinate alignment. Experimental validation with four aggregate types achieved an average grasping and sorting success rate of 97.5%, with comparable classification accuracy. Remaining challenges include the reliable handling of small aggregates and texture-based misclassification. Overall, the proposed system demonstrates significant potential to enhance productivity, reduce operational costs, and improve safety in aggregate handling, while providing a scalable framework for advancing smart automation in construction, mining, and recycling industries.
- Asia > China > Shaanxi Province > Xi'an (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Asia > Indonesia (0.04)
- Energy (0.67)
- Water & Waste Management > Solid Waste Management (0.34)
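As a rough illustration of the Denavit-Hartenberg kinematic modeling mentioned in the abstract above, the sketch below chains per-joint homogeneous transforms to obtain an end-effector position. It assumes the classic (distal) DH convention; the abstract does not say which convention or which link parameters the authors use, so the function names and the two-link planar example are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
DHRow = Tuple[float, float, float, float]  # (theta, d, a, alpha)


def dh_transform(theta: float, d: float, a: float, alpha: float) -> Matrix:
    """Homogeneous transform for one joint, classic DH convention."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(params: Sequence[DHRow]) -> Matrix:
    """Chain the joint transforms from the base to the end effector."""
    T: Matrix = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in params:
        T = matmul(T, dh_transform(theta, d, a, alpha))
    return T


def end_effector_position(params: Sequence[DHRow]) -> Tuple[float, float, float]:
    """Translation part of the base-to-end-effector transform."""
    T = forward_kinematics(params)
    return (T[0][3], T[1][3], T[2][3])


if __name__ == "__main__":
    # Hypothetical two-link planar arm with unit link lengths.
    arm = [(math.pi / 2, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
    print(end_effector_position(arm))
```

A six-degree-of-freedom arm would use six such rows, one per joint, with link parameters taken from the manufacturer's specification.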
Real-time deep learning phase imaging flow cytometer reveals blood cell aggregate biomarkers for haematology diagnostics
Delikoyun, Kerem, Chen, Qianyu, Wei, Liu, Myo, Si Ko, Krell, Johannes, Schlegel, Martin, Kuan, Win Sen, Soong, John Tshon Yit, Schneider, Gerhard, da Costa, Clarissa Prazeres, Knolle, Percy A., Renia, Laurent, Cove, Matthew Edward, Lee, Hwee Kuan, Diepold, Klaus, Hayden, Oliver
While analysing rare blood cell aggregates remains challenging in automated haematology, they could markedly advance label-free functional diagnostics. Conventional flow cytometers efficiently perform cell counting with leukocyte differentials but fail to identify aggregates with flagged results, requiring manual reviews. Quantitative phase imaging flow cytometry captures detailed aggregate morphologies, but clinical use is hampered by massive data storage and offline processing. Incorporating "hidden" biomarkers into routine haematology panels would significantly improve diagnostics without flagged results. We present RT-HAD, an end-to-end deep learning-based image and data processing framework for off-axis digital holographic microscopy (DHM), which combines physics-consistent holographic reconstruction and detection, representing each blood cell in a graph to recognize aggregates. RT-HAD processes >30 GB of image data on-the-fly with turnaround time of <1.5 min and error rate of 8.9% in platelet aggregate detection, which matches acceptable laboratory error rates of haematology biomarkers and solves the "big data" challenge for point-of-care diagnostics.
- Asia > Singapore (0.06)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Asia > Japan (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.67)
Language Models for Automated Classification of Brain MRI Reports and Growth Chart Generation
Daniali, Maryam, Karandikar, Shivaram, Zimmerman, Dabriel, Schmitt, J. Eric, Buczek, Matthew J., Jung, Benjamin, Mercedes, Laura, Seidlitz, Jakob, Troiani, Vanessa, Dorfschmidt, Lena, Kafadar, Eren, Williams, Remo, Sotardi, Susan, Vosough, Arastoo, Haag, Scott, Schabdach, Jenna M., Alexander-Bloch, Aaron
Clinically acquired brain MRIs and radiology reports are valuable but underutilized resources due to the challenges of manual analysis and data heterogeneity. We developed fine-tuned language models (LMs) to classify brain MRI reports as normal (reports with limited pathology) or abnormal, fine-tuning BERT, BioBERT, ClinicalBERT, and RadBERT on 44,661 reports. We also explored the reasoning capabilities of a leading LM, Gemini 1.5-Pro, for normal report categorization. Automated image processing and modeling generated brain growth charts from LM-classified normal scans, comparing them to human-derived charts. Fine-tuned LMs achieved high classification performance (F1-Score >97%), with unbalanced training mitigating class imbalance. Performance was robust on out-of-distribution data, with full text outperforming summary (impression) sections. Gemini 1.5-Pro showed a promising categorization performance, especially with clinical inference. LM-derived brain growth charts were nearly identical to human-annotated charts (r = 0.99, p < 2.2e-16). Our LMs offer scalable analysis of radiology reports, enabling automated classification of brain MRIs in large datasets. One application is automated generation of brain growth charts for benchmarking quantitative image features. Further research is needed to address data heterogeneity and optimize LM reasoning.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Recursive Aggregates as Intensional Functions in Answer Set Programming: Semantics and Strong Equivalence
Fandinno, Jorge, Hansen, Zachary
This paper shows that the semantics of programs with aggregates implemented by the solvers clingo and dlv can be characterized as extended First-Order formulas with intensional functions in the logic of Here-and-There. Furthermore, this characterization can be used to study the strong equivalence of programs with aggregates under either semantics. We also present a transformation that reduces the task of checking strong equivalence to reasoning in classical First-Order logic, which serves as a foundation for automating this procedure.
- North America > United States > Nebraska > Douglas County > Omaha (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
Express Yourself: Enabling large-scale public events involving multi-human-swarm interaction for social applications with MOSAIX
Alhafnawi, Merihan, Gomez-Gutierrez, Maca, Hunt, Edmund R., Lemaignan, Severin, O'Dowd, Paul, Hauert, Sabine
Robot swarms have the potential to help groups of people with social tasks, given their ability to scale to large numbers of robots and users. Developing multi-human-swarm interaction is therefore crucial to support multiple people interacting with the swarm simultaneously - which is an area that is scarcely researched, unlike single-human, single-robot or single-human, multi-robot interaction. Moreover, most robots are still confined to laboratory settings. In this paper, we present our work with MOSAIX, a swarm of robot Tiles, that facilitated ideation at a science museum. 63 robots were used as a swarm of smart sticky notes, collecting input from the public and aggregating it based on themes, providing an evolving visualization tool that engaged visitors and fostered their participation. Our contribution lies in creating a large-scale (63 robots and 294 attendees) public event, with a completely decentralized swarm system in real-life settings. We also discuss learnings we obtained that might help future researchers create multi-human-swarm interaction with the public.
- North America > United States > New York > New York County > New York City (0.05)
- Europe > United Kingdom > England > Bristol (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors
Chochlakis, Georgios, Potamianos, Alexandros, Lerman, Kristina, Narayanan, Shrikanth
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs). The knowledge acquired during pre-training is crucial for this few-shot capability, providing the model with task priors. However, recent studies have shown that ICL predominantly relies on retrieving task priors rather than "learning" to perform tasks. This limitation is particularly evident in complex subjective domains such as emotion and morality, where priors significantly influence posterior predictions. In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt. Moreover, we evaluate the posterior bias towards certain annotators by grounding our study in appropriate, quantitative measures of LLM priors. Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead. However, aggregation does not explain the entire gap between ICL and the state of the art, meaning other factors in such tasks also account for the observed phenomena. Finally, by rigorously studying annotator-level labels, we find that it is possible for minority annotators to both better align with LLMs and have their perspectives further amplified.
- Europe (0.14)
- North America > United States > California (0.14)
Neural Decompiling of Tracr Transformers
Thurnherr, Hannes, Riesen, Kaspar
Recently, the transformer architecture has enabled substantial progress in many areas of pattern recognition and machine learning. However, as with other neural network models, there is currently no general method available to explain their inner workings. The present paper represents a first step in this direction. We utilize Transformer Compiler for RASP (Tracr) to generate a large dataset of pairs of transformer weights and corresponding RASP programs. Based on this dataset, we then build and train a model, with the aim of recovering the RASP code from the compiled model. We demonstrate that the simple form of Tracr compiled transformer weights is interpretable for such a decompiler model. In an empirical evaluation, our model achieves exact reproductions on more than 30% of the test objects, while the remaining 70% can generally be reproduced with only a few errors. Additionally, more than 70% of the programs produced by our model are functionally equivalent to the ground truth, and are therefore valid decompilations of the Tracr compiled transformer weights.