Law
Analyzing Fairness of Classification Machine Learning Model with Structured Dataset
Rashed, Ahmed, Kallich, Abdelkrim, Eltayeb, Mohamed
Machine learning (ML) algorithms have become integral to decision-making in various domains, including healthcare, finance, education, and law enforcement. However, concerns about fairness and bias in these systems pose significant ethical and social challenges. This study investigates the fairness of ML models applied to structured datasets in classification tasks, highlighting the potential for biased predictions to perpetuate systemic inequalities. A publicly available dataset from Kaggle was selected for analysis, offering a realistic scenario for evaluating fairness in machine learning workflows. To assess and mitigate biases, three prominent fairness libraries were employed: Fairlearn by Microsoft, AIF360 by IBM, and the What-If Tool by Google. These libraries provide robust frameworks for analyzing fairness, offering tools to evaluate metrics, visualize results, and implement bias mitigation strategies. The research aims to assess the extent of bias in the ML models, compare the effectiveness of these libraries, and derive actionable insights for practitioners. The findings reveal that each library has unique strengths and limitations in fairness evaluation and mitigation. By systematically comparing their capabilities, this study contributes to the growing field of ML fairness by providing practical guidance for integrating fairness tools into real-world applications. These insights are intended to support the development of more equitable machine learning systems.
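As a rough illustration of the kind of group-fairness metric such libraries report, the sketch below computes a demographic parity difference in plain Python. The function name, the toy predictions, and the choice of metric are assumptions made for illustration only; the abstract does not state which metrics or dataset columns the study used, and this is not the API of any of the three libraries.

```python
from collections import defaultdict


def demographic_parity_difference(y_pred, sensitive):
    """Return (gap, rates): the largest gap in positive-prediction rate
    between any two groups, plus the per-group rates.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_pred) != len(sensitive):
        raise ValueError("y_pred and sensitive must have the same length")
    if not y_pred:
        raise ValueError("at least one prediction is required")

    totals = defaultdict(int)     # number of samples per group
    positives = defaultdict(int)  # number of positive predictions per group
    for pred, group in zip(y_pred, sensitive):
        totals[group] += 1
        positives[group] += int(pred == 1)

    rates = {group: positives[group] / totals[group] for group in totals}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Toy data (assumed): binary predictions and a two-valued sensitive attribute.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(y_pred, groups)
print(f"selection rates: {rates}")
print(f"demographic parity difference: {gap}")
```

A gap of 0 would mean every group receives positive predictions at the same rate; larger values indicate a larger disparity on this one metric, which is only one of several notions of fairness.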
The Three Social Dimensions of Chatbot Technology
The development and deployment of chatbot technology, while spanning decades and employing different techniques, require innovative frameworks to understand and interrogate their functionality and implications. A purely technocentric account of the evolution of chatbot technology does not fully illuminate how conversational systems are embedded in societal dynamics. This study presents a structured examination of chatbots across three societal dimensions, highlighting their roles as objects of scientific research, commercial instruments, and agents of intimate interaction. By furnishing a dimensional framework for the evolution of conversational systems, from laboratories to marketplaces to private lives, this article contributes to the wider scholarly inquiry into chatbot technology and its impact on lived human experiences and dynamics.
P$^2$ Law: Scaling Law for Post-Training After Model Pruning
Chen, Xiaodong, Hu, Yuxuan, Zhang, Xiaokang, Wang, Yanling, Li, Cuiping, Chen, Hong, Zhang, Jing
Pruning has become a widely adopted technique for reducing the hardware requirements of large language models (LLMs). To recover model performance after pruning, post-training is commonly employed to mitigate the resulting performance degradation. While post-training benefits from larger datasets, once the dataset size is already substantial, increasing the training data provides only limited performance gains. To balance post-training cost and model performance, it is necessary to explore the optimal amount of post-training data. Through extensive experiments on the Llama-3 and Qwen-2.5 series models, pruned using various common pruning methods, we uncover the scaling \textbf{Law} for \textbf{P}ost-training after model \textbf{P}runing, referred to as the P$^2$ Law. This law identifies four key factors for predicting the pruned model's post-training loss: the model size before pruning, the number of post-training tokens, the pruning rate, and the model's loss before pruning. Moreover, P$^2$ Law can generalize to larger dataset sizes, larger model sizes, and higher pruning rates, offering valuable insights for the post-training of pruned LLMs.
Semantic Component Analysis: Discovering Patterns in Short Texts Beyond Topics
Eichin, Florian, Schuster, Carolin M., Groh, Georg, Hedderich, Michael A.
Topic modeling is a key method in text analysis, but existing approaches are limited by assuming one topic per document or fail to scale efficiently for large, noisy datasets of short texts. We introduce Semantic Component Analysis (SCA), a novel topic modeling technique that overcomes these limitations by discovering multiple, nuanced semantic components beyond a single topic in short texts, which we accomplish by introducing a decomposition step to the clustering-based topic modeling framework. We evaluate SCA on Twitter datasets in English, Hausa and Chinese. It achieves competitive coherence and diversity compared to BERTopic, while uncovering at least double the semantic components and maintaining a noise rate close to zero. Furthermore, SCA is scalable and effective across languages, including an underrepresented one.
SoK: On Closing the Applicability Gap in Automated Vulnerability Detection
Shereen, Ezzeldin, Ristea, Dan, Vyas, Sanyam, McFadden, Shae, Dwyer, Madeleine, Hicks, Chris, Mavroudis, Vasilios
The frequent discovery of security vulnerabilities in both open-source and proprietary software underscores the urgent need for earlier detection during the development lifecycle. Initiatives such as DARPA's Artificial Intelligence Cyber Challenge (AIxCC) aim to accelerate Automated Vulnerability Detection (AVD), seeking to address this challenge by autonomously analyzing source code to identify vulnerabilities. This paper addresses two primary research questions: (RQ1) How is current AVD research distributed across its core components? (RQ2) What key areas should future research target to bridge the gap in the practical applicability of AVD throughout software development? To answer these questions, we conduct a systematization of 79 AVD articles and 17 empirical studies, analyzing them across five core components: task formulation and granularity, input programming languages and representations, detection approaches and key solutions, evaluation metrics and datasets, and reported performance. Our systematization reveals that the narrow focus of AVD research, mainly on specific tasks and programming languages, limits its practical impact and overlooks broader areas crucial for effective, real-world vulnerability detection. We identify significant challenges, including the need for diversified problem formulations, varied detection granularities, broader language support, better dataset quality, enhanced reproducibility, and increased practical impact. Based on these findings, we identify research directions that will enhance the effectiveness and applicability of AVD solutions in software security.
No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts
Fama, Israel, Bueno, Bárbara, Alcoforado, Alexandre, Ferraz, Thomas Palmeira, Moya, Arnold, Costa, Anna Helena Reali
In a context where the Brazilian judiciary system, the largest in the world, faces a crisis due to the slow processing of millions of cases, it becomes imperative to develop efficient methods for analyzing legal texts. We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. Our approach processes the full text regardless of its length while maintaining reasonable computational overhead. Our experiments demonstrate that uBERT achieves superior performance compared to BERT+LSTM when overlapping input is used and is significantly faster than ULMFiT for processing long legal documents.
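The overlapping-input idea can be sketched as a sliding window over a token sequence, where consecutive chunks share some tokens so that no passage is cut off at a hard boundary. The function name, chunk size, and overlap below are hypothetical choices for illustration; the abstract does not give uBERT's actual chunking parameters or how chunk encodings are passed to the recurrent layer.

```python
def overlapping_chunks(tokens, chunk_size, overlap):
    """Split `tokens` into windows of at most `chunk_size` items, where each
    window repeats the last `overlap` items of the previous one.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not tokens:
        return []

    stride = chunk_size - overlap  # how far the window advances each step
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
        start += stride
    return chunks


# Toy usage (assumed sizes): ten token ids, windows of four, overlap of two.
token_ids = list(range(10))
for chunk in overlapping_chunks(token_ids, chunk_size=4, overlap=2):
    print(chunk)
```

In a hybrid model of the kind described, each window would presumably be encoded separately by the Transformer and the resulting sequence of chunk representations handed to the recurrent network; the overlap trades extra computation for continuity across chunk boundaries.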
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
Watson, William, Cho, Nicole, Srishankar, Nishan, Zeng, Zhen, Cecchi, Lucas, Scott, Daniel, Siddagangappa, Suchetha, Kaur, Rachneet, Balch, Tucker, Veloso, Manuela
Legal contracts in the custody and fund services domain govern critical aspects such as key provider responsibilities, fee schedules, and indemnification rights. However, it is challenging for an off-the-shelf Large Language Model (LLM) to ingest these contracts due to the lengthy unstructured streams of text, limited LLM context windows, and complex legal jargon. To address these challenges, we introduce LAW (Legal Agentic Workflows for Custody and Fund Services Contracts). LAW features a modular design that responds to user queries by orchestrating a suite of domain-specific tools and text agents. Our experiments demonstrate that LAW, by integrating multiple specialized agents and tools, significantly outperforms the baseline. LAW excels particularly in complex tasks such as calculating a contract's termination date, surpassing the baseline by 92.9 percentage points. Furthermore, LAW offers a cost-effective alternative to traditional fine-tuned legal LLMs by leveraging reusable, domain-specific tools.
Cross-Document Event-Keyed Summarization
Walden, William, Kuchmiichuk, Pavlo, Martin, Alexander, Jin, Chihsheng, Cao, Angela, Sun, Claire, Allen, Curisia, White, Aaron Steven
Event-keyed summarization (EKS) requires summarizing a specific event described in a document given the document text and an event representation extracted from it. In this work, we extend EKS to the cross-document setting (CDEKS), in which summaries must synthesize information from accounts of the same event as given by multiple sources. We introduce SEAMUS (Summaries of Events Across Multiple Sources), a high-quality dataset for CDEKS based on an expert reannotation of the FAMUS dataset for cross-document argument extraction. We present a suite of baselines on SEAMUS -- covering both smaller, fine-tuned models, as well as zero- and few-shot prompted LLMs -- along with detailed ablations and a human evaluation study, showing SEAMUS to be a valuable benchmark for this new task.
Now Meta is trying to stop OpenAI's for-profit conversion too
Meta sent a letter to California's attorney general on Thursday urging him to stop OpenAI from converting to a for-profit company, a move that Meta says would be "wrong" and "could lead to a proliferation of similar start-up ventures that are notionally charitable until they are potentially profitable." The letter from Meta Platforms to Attorney General Rob Bonta, first reported on by The Wall Street Journal, comes on the heels of an injunction filed by Elon Musk at the end of November that also asked for OpenAI's conversion to be blocked. Meta argues in its letter, which The Verge has published in full, that OpenAI was able to raise billions of dollars from investors under its original nonprofit mission and now "wants to change its status while retaining all of the benefits that enabled it to reach the point it has today." It goes on to say, "OpenAI should not be allowed to flout the law by taking and reappropriating assets it built as a charity and using them for potentially enormous private gains." The letter also calls upon the attorney general to look into OpenAI's past practices as a nonprofit.
OpenAI whistleblower found dead in San Francisco apartment
OpenAI says its models are "trained on publicly available data". Mr Balaji left the company in August, telling the New York Times he had since been working on personal projects. He grew up in Cupertino, California, before going to study computer science at the University of California, Berkeley. A spokesperson for OpenAI said in a statement cited by CNBC News that it was "devastated to learn of this incredibly sad news today and our hearts go out to Suchir's loved ones during this difficult time". US and Canadian news publishers, including the New York Times, and a group of best-selling writers, including John Grisham, have filed lawsuits claiming the company was illegally using news articles to train its software.