LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
The advent of large language models (LLMs) and their adoption by the legal community have given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning--which distinguish between its many forms--correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.
GCAO: Group-driven Clustering via Gravitational Attraction and Optimization
Traditional clustering algorithms often struggle with high-dimensional and non-uniformly distributed data, where low-density boundary samples are easily disturbed by neighboring clusters, leading to unstable and distorted clustering results. To address this issue, we propose a Group-driven Clustering via Gravitational Attraction and Optimization (GCAO) algorithm. GCAO introduces a group-level optimization mechanism that aggregates low-density boundary points into collaboratively moving groups, replacing the traditional point-based contraction process. By combining local density estimation with neighborhood topology, GCAO constructs effective gravitational interactions between groups and their surroundings, enhancing boundary clarity and structural consistency. Using groups as basic motion units, a gravitational contraction strategy ensures globally stable and directionally consistent convergence. Experiments on multiple high-dimensional datasets demonstrate that GCAO outperforms 11 representative clustering methods, achieving average improvements of 37.13%, 52.08%, 44.98%, and 38.81% in NMI, ARI, Homogeneity, and ACC, respectively, while maintaining competitive efficiency and scalability. These results highlight GCAO's superiority in preserving cluster integrity, enhancing boundary separability, and ensuring robust performance on complex data distributions.
CollaFuse: Collaborative Diffusion Models
Allmendinger, Simeon, Zipperling, Domenique, Struppek, Lukas, Kühl, Niklas
In the landscape of generative artificial intelligence, diffusion-based models have emerged as a promising method for generating synthetic images. However, the application of diffusion models poses numerous challenges, particularly concerning data availability, computational requirements, and privacy. Traditional approaches to address these shortcomings, like federated learning, often impose significant computational burdens on individual clients, especially those with constrained resources. In response to these challenges, we introduce a novel approach for distributed collaborative diffusion models inspired by split learning. Our approach facilitates collaborative training of diffusion models while alleviating client computational burdens during image synthesis. This reduced computational burden is achieved by retaining data and computationally inexpensive processes locally at each client while outsourcing the computationally expensive processes to shared, more efficient server resources. Through experiments on the common CelebA dataset, our approach demonstrates enhanced privacy by reducing the necessity for sharing raw data. These capabilities hold significant potential across various application areas, including the design of edge computing solutions. Thus, our work advances distributed machine learning by contributing to the evolution of collaborative diffusion models.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Germany > Bavaria > Upper Franconia > Bayreuth (0.04)
Mind the Gap: Federated Learning Broadens Domain Generalization in Diagnostic AI Models
Arasteh, Soroosh Tayebi, Kuhl, Christiane, Saehn, Marwin-Jonathan, Isfort, Peter, Truhn, Daniel, Nebelung, Sven
Developing robust artificial intelligence (AI) models that generalize well to unseen datasets is challenging and usually requires large and variable datasets, preferably from multiple institutions. In federated learning (FL), a model is trained collaboratively at numerous sites that hold local datasets without exchanging them. So far, the impact of training strategy, i.e., local versus collaborative, on the diagnostic on-domain and off-domain performance of AI models interpreting chest radiographs has not been assessed. Consequently, using 610,000 chest radiographs from five institutions across the globe, we assessed diagnostic performance as a function of training strategy (i.e., local vs. collaborative), network architecture (i.e., convolutional vs. transformer-based), generalization performance (i.e., on-domain vs. off-domain), imaging finding (i.e., cardiomegaly, pleural effusion, pneumonia, atelectasis, consolidation, pneumothorax, and no abnormality), dataset size (i.e., from n=18,000 to 213,921 radiographs), and dataset diversity. Large datasets not only showed minimal performance gains with FL but, in some instances, even exhibited decreases. In contrast, smaller datasets revealed marked improvements. Thus, on-domain performance was mainly driven by training data size. However, off-domain performance leaned more on training diversity. When trained collaboratively across diverse external institutions, AI models consistently surpassed models trained locally for off-domain tasks, emphasizing FL's potential in leveraging data diversity. In conclusion, FL can bolster diagnostic privacy, reproducibility, and off-domain reliability of AI models and, potentially, optimize healthcare outcomes.
- North America > United States (0.14)
- Europe > Spain (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
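The collaborative training described in the federated learning abstract above can be illustrated with a size-weighted parameter average. This is a minimal sketch under stated assumptions: the abstract does not name its aggregation rule, so the FedAvg-style weighting shown here is an assumption, and `federated_average`, `site_weights`, and `site_sizes` are hypothetical names chosen for illustration.

```python
def federated_average(site_weights, site_sizes):
    """Combine per-site parameter vectors into one global vector.

    Each site's parameters are weighted by its local dataset size
    (an assumed FedAvg-style rule, not confirmed by the abstract).
    Only parameters are combined; the local data never leaves a site.
    """
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites holding 1 and 3 samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

Weighting by dataset size means larger sites dominate the average, which is one plausible reason a small site could gain more from collaboration than a large one.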
Inside Amazon's 'dystopian' new robot warehouses: Fears for company's 1.5 million human workers as Amazon employs humanoids to do 'mundane and repetitive' tasks
Humanoid robots that can pick up packages with robotic arms are working alongside human workers in an Amazon warehouse in the U.S., the retail giant has announced. The humanoid robot, called Digit, is under test in a warehouse in Texas; it has arms and legs and can grasp and handle packages like a human worker. Amazon now has 750,000 robots working in facilities around the world, but the move to humanoid robots is new - sparking fears for the future of human workers at the company. The company has denied that it intends to move to 'robot only' warehouses. The bipedal robot is currently being used to shift empty tote boxes in the warehouse: it is five foot nine inches tall, weighs 140lb, and can pick up and carry objects weighing up to 35lb.
Modular Simulation Environment Towards OTN AI-based Solutions
Aleyadeh, Sam, Javadtalab, Abbas, Shami, Abdallah
The current trend toward highly dynamic and virtualized networking infrastructure has made automated networking a critical requirement. Multiple solutions have been proposed to address this, including the most sought-after machine learning (ML)-based solutions. However, the main hurdle when developing Next Generation Networks is the availability of large datasets, especially for 5G and beyond and Optical Transport Networking (OTN) traffic. This need has led researchers to look for viable simulation environments that generate the necessary volume with highly configurable real-life scenarios, but these can be costly to set up and may require subscription-based products and even the purchase of dedicated hardware, depending on the supplier. We aim to address this issue by proposing a modular solution that generates high-volume, high-fidelity datasets and adapts to the user's available resources. These datasets can be used to develop better versions of the aforementioned ML solutions, resulting in higher accuracy and better adaptation to real-life networking traffic.
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > Ontario > Middlesex County > London (0.04)
- Asia > India (0.04)
- Information Technology (1.00)
- Telecommunications (0.69)
- Leisure & Entertainment > Games > Computer Games (0.61)
Learning across Data Owners with Joint Differential Privacy
Huang, Yangsibo, Jiang, Haotian, Liu, Daogao, Mahdian, Mohammad, Mao, Jieming, Mirrokni, Vahab
In this paper, we study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy [Kearns et al., 2018]. In this setting, the model trained for each data owner $j$ uses $j$'s data without privacy consideration and other owners' data with differential privacy guarantees. This setting was initiated in [Jain et al., 2021] with a focus on linear regressions. In this paper, we study this setting for stochastic convex optimization (SCO). We present an algorithm that is a variant of DP-SGD [Song et al., 2013; Abadi et al., 2016] and provide theoretical bounds on its population loss. We compare our algorithm to several baselines and discuss for which parameter setups our algorithm is preferable. We also empirically study joint differential privacy in the multi-class classification problem over two public datasets. Our empirical findings are well connected to the insights from our theoretical results.
- Europe (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
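For readers unfamiliar with DP-SGD, which the joint differential privacy abstract above builds on, the following is a minimal sketch of one generic DP-SGD update: clip each per-example gradient, sum, add Gaussian noise, and average. It is not the paper's variant, whose details the abstract does not give. The names `clip`, `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are assumptions made for illustration.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale a gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD update (illustrative, not the paper's algorithm).

    Gaussian noise with standard deviation noise_multiplier * clip_norm
    is added to each coordinate of the summed, clipped gradients.
    """
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    sigma = noise_multiplier * clip_norm
    n = len(clipped)
    new_w = []
    for i in range(len(w)):
        noisy_sum = sum(g[i] for g in clipped) + rng.gauss(0.0, sigma)
        new_w.append(w[i] - lr * noisy_sum / n)
    return new_w


# Hypothetical gradients for two examples; a seeded RNG keeps the run repeatable.
rng = random.Random(0)
w_next = dp_sgd_step([0.0, 0.0], [[3.0, 4.0], [0.3, 0.4]],
                     lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
print(w_next)
```

Clipping bounds how much any single example can move the model, and that bound is what lets the added noise be calibrated to a privacy guarantee.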
Tray.io Cracks the Code on Hyperautomation for the Enterprise
Tray.io, the leader in low-code automation and integration, announced new capabilities designed to accelerate enterprise hyperautomation initiatives. Tray.io is extending end-to-end connectivity across user types with Connector Builder, where low-code developers can quickly, easily, and visually develop reusable connectors on demand. Additionally, a new Connectivity API experience for developers simplifies the integration of thousands of underlying endpoints into just three API calls. The company also introduced new frameworks and features that give users the ability to build complex integrations across teams with the speed and governance required to increase enterprise velocity at scale. The number of APIs is growing by a conservatively estimated 10% annually and is expected to reach more than 300 million by 2030.
Civic AI Toolkit
This toolkit is for civil society organisations and local authorities who want to empower communities to address the climate crisis, using AI to help manage, maintain and augment civic assets. Organising large-scale community responses can be a messy and complicated task, but AI can help cut through this complexity to coordinate action. Civic AI is a research project exploring where AI can help equip communities with the tools to collectively respond to the climate crisis and achieve the 2050 target of a carbon-neutral economy. What is in the toolkit? The Civic AI toolkit contains three "strategic blueprints" or visual guides that map out the different components (datasets, digital infrastructure, AI models, community contributions, etc.) making up an open public service ecosystem and describe how the different parts interact.