louis
She didn't get an apartment because of an AI-generated score – and sued to help others avoid the same fate
Mary Louis was given a score by an AI-powered tenant screening tool, SafeRent. The software's 11-page report didn't explain how the score was calculated or how it weighed various factors, and it didn't say what the score actually signified. It just displayed Louis's number and determined it was too low. Louis, who works as a security guard, had applied for an apartment in an eastern Massachusetts suburb.
- North America > United States > Massachusetts (0.25)
- North America > United States > Arkansas (0.05)
- Government > Regional Government > North America Government > United States Government (1.00)
- Banking & Finance (0.99)
- Law > Litigation (0.86)
Benchmarking Benchmark Leakage in Large Language Models
Xu, Ruijie, Wang, Zengzhi, Fan, Run-Ze, Liu, Pengfei
Amid the expanding use of pre-training data, benchmark dataset leakage has become an increasingly prominent problem, exacerbated by opaque training processes and the often undisclosed inclusion of supervised data in contemporary Large Language Models (LLMs). This leakage skews benchmark effectiveness and fosters potentially unfair comparisons, impeding the field's healthy development. To address it, we introduce a detection pipeline built on Perplexity and N-gram accuracy, two simple and scalable metrics that gauge a model's prediction precision on a benchmark, to identify potential data leakage. By analyzing 31 LLMs in the context of mathematical reasoning, we reveal substantial instances of training-set, and even test-set, misuse, resulting in potentially unfair comparisons. These findings prompt several recommendations regarding model documentation, benchmark setup, and future evaluations. Notably, we propose a "Benchmark Transparency Card" to encourage clear documentation of benchmark utilization, promoting transparency and the healthy development of LLMs. We have made our leaderboard, pipeline implementation, and model predictions publicly available to foster future research.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- (9 more...)
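The abstract above names two detection signals, Perplexity and N-gram accuracy. As a rough, hypothetical illustration — the `predict_next` model interface and token-id representation below are assumptions, not the paper's actual pipeline — the metrics could be sketched like this:

```python
import math

def ngram_accuracy(predict_next, token_ids, n=5, stride=5):
    """Fraction of n-token continuations the model reproduces exactly.

    predict_next(prefix, n) -> list of n predicted token ids (an assumed
    model interface); token_ids is one benchmark example's token sequence.
    A near-perfect score on held-out test data suggests the example was
    seen during training.
    """
    hits, total = 0, 0
    for start in range(1, len(token_ids) - n, stride):
        pred = predict_next(token_ids[:start], n)
        hits += int(pred == token_ids[start:start + n])
        total += 1
    return hits / total if total else 0.0

def perplexity(token_logprobs):
    """Perplexity from per-token log-probabilities (natural log)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))
```

A model that reproduces held-out benchmark continuations nearly verbatim (n-gram accuracy close to 1, unusually low perplexity) is a candidate for having seen that benchmark during training.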
MultiPoT: Multilingual Program of Thoughts Harnesses Multiple Programming Languages
Luo, Xianzhen, Zhu, Qingfu, Zhang, Zhiming, Qin, Libo, Wang, Xu, Yang, Qing, Xu, Dongliang, Che, Wanxiang
Program of Thoughts (PoT) is an approach characterized by executable intermediate steps, which ensure the accuracy of the numerical calculations in the reasoning process. PoT currently relies primarily on Python, but depending on a single language may yield suboptimal solutions and overlook the potential benefits of other programming languages. In this paper, we conduct comprehensive experiments on the programming languages used in PoT and find that no single language consistently delivers optimal performance across all tasks and models; the effectiveness of each language varies with the specific scenario. Inspired by this, we propose a task- and model-agnostic approach called MultiPoT, which harnesses the strengths and diversity of multiple languages. Experimental results reveal that it significantly outperforms Python Self-Consistency. Furthermore, it achieves comparable or superior performance to the best monolingual PoT in almost all tasks across all models. In particular, MultiPoT achieves more than a 4.6% average improvement on both Starcoder and ChatGPT (gpt-3.5-turbo).
- Asia > China > Heilongjiang Province > Harbin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (4 more...)
- Information Technology > Software > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
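MultiPoT's aggregation step — taking a majority vote over answers produced by programs in different languages — can be sketched as follows. Generating and executing the per-language programs is assumed to happen upstream, and the function name and interface are illustrative, not the paper's code:

```python
from collections import Counter

def multipot_vote(answers_by_language):
    """Majority vote over per-language program outputs.

    answers_by_language maps a language name to the answer its program
    produced (None for programs that failed to execute). Returns the
    most common answer, or None if every program failed.
    """
    counts = Counter(a for a in answers_by_language.values() if a is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Example: three of four programs ran; two agree on 42.
result = multipot_vote({"python": 42, "r": 42, "javascript": 41, "java": None})
```

Voting across languages rather than across samples of one language is what distinguishes this from plain Self-Consistency.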
Dallas County man gets 3 years for $1.2M online romance scam
A Texas man who was part of a romance scam that bilked a Missouri woman out of $1.2 million was sentenced on Tuesday to three years in federal prison and ordered to repay the money. Rotimi Oladimeji, 38, of Richardson, Texas, was sentenced one year after he pleaded guilty to two counts of mail fraud, two counts of wire fraud and one count of conspiracy to commit mail fraud and wire fraud, the U.S. Attorney's office in St. Louis said in a news release. Oladimeji and two others spotted the victim on the "Silver Singles" online dating site, prosecutors said.
- North America > United States > Texas > Dallas County > Richardson (0.27)
- North America > United States > Missouri (0.27)
- Asia > Middle East > UAE (0.07)
- Africa > Nigeria (0.07)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Unsupervised Summarization Re-ranking
Ravaut, Mathieu, Joty, Shafiq, Chen, Nancy
With the rise of task-specific pre-training objectives, abstractive summarization models like PEGASUS offer appealing zero-shot performance on downstream summarization tasks. However, the performance of such unsupervised models still lags significantly behind their supervised counterparts. As in the supervised setup, we notice very high variance in quality among the summary candidates these models produce, yet only one candidate is kept as the summary output. In this paper, we propose to re-rank summary candidates in an unsupervised manner, aiming to close the performance gap between unsupervised and supervised models. Our approach improves unsupervised PEGASUS by up to 7.27% and ChatGPT by up to 6.86% relative mean ROUGE across four widely adopted summarization benchmarks, and achieves relative gains of 7.51% (up to 23.73% from XSum to WikiHow) averaged over 30 zero-shot transfer setups (fine-tuning on one dataset, evaluating on another).
- North America > United States (1.00)
- Europe (1.00)
- Asia > Middle East > Republic of Türkiye (0.92)
- Research Report (1.00)
- Personal (0.93)
- Law (1.00)
- Health & Medicine (1.00)
- Government > Voting & Elections (1.00)
- (4 more...)
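One simple unsupervised re-ranking criterion — not the paper's actual scoring features, just an illustration of the idea — is to prefer the candidate that agrees most with its peers. A minimal sketch using word-overlap (Jaccard) similarity:

```python
def rerank_candidates(candidates):
    """Return the index of the candidate most similar to the others.

    Each candidate summary is scored by its mean Jaccard word overlap
    with the remaining candidates, on the assumption that the consensus
    candidate is the best one. Requires at least two candidates.
    """
    def jaccard(a, b):
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

    def score(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(jaccard(candidates[i], o) for o in others) / len(others)

    return max(range(len(candidates)), key=score)
```

The outlier candidate ("dogs bark loudly" among near-duplicate cat sentences, say) scores lowest and is never selected, which is the behavior consensus re-ranking relies on.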
Internship – Data Engineering and Data Science at Xplor - St. Louis, MO, United States
Take a seat on the rocket ship and join us as a summer intern within our technology department. We're a global team of builders, listeners and problem-solvers who are relentlessly focused on making life simple, so our customers can get back to growing their business, engaging consumers and doing what they love. At Xplor, the Central Technology Team has one main purpose: to enable and complement the business strategies and goals while solving real problems for our customers and users. We have dozens of applications in our everyday-life verticals that all have their technology uniqueness and their individual purpose. We also use some of the latest technology in Microsoft Azure, AWS, and Containers and are constantly looking to find innovative new ways to meet the challenges of running a unique global business.
- Information Technology > Data Science (0.89)
- Information Technology > Artificial Intelligence (0.71)
Hyperparameter Tuning with Python: Boost your machine learning model's performance via hyperparameter tuning: Owen, Louis: 9781803235875: Amazon.com: Books
You'll start with an introduction to hyperparameter tuning and understand why it's important. Next, you'll learn the best methods of hyperparameter tuning for a variety of use cases and specific algorithm types. This book covers not only the usual grid or random search but also other powerful underdog methods. Individual chapters are dedicated to the four main groups of hyperparameter tuning methods: exhaustive search, heuristic search, Bayesian optimization, and multi-fidelity optimization. Later, you will learn about top frameworks like Scikit, Hyperopt, Optuna, NNI, and DEAP to implement hyperparameter tuning.
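As a minimal illustration of the simplest family the book covers, random search over a discrete hyperparameter space fits in a few lines. The objective and search space below are toy assumptions, not the book's code:

```python
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Random search over a hyperparameter space.

    space maps a hyperparameter name to a list of candidate values;
    objective(params) returns a score to maximize. Returns the best
    (params, score) pair seen across n_trials random draws.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Grid search would instead enumerate every combination in `space`; the Bayesian and multi-fidelity methods the book describes replace the uniform random draw with a model of which regions look promising.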
GitHub - jeffheaton/t81_558_deep_learning: Washington University (in St. Louis) Course T81-558: Applications of Deep Neural Networks
The content of this course changes as technology evolves; to keep up to date with changes, follow me on GitHub. Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks that can handle tabular data, images, text, and audio as both input and output. Deep learning allows a neural network to learn hierarchies of information in a way that resembles the function of the human brain. This course will introduce the student to classic neural network structures, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Generative Adversarial Networks (GAN), and reinforcement learning.
Origami mini-robot does gymnastics for a good cause
Despite its small size, this soft robot can manoeuvre on solid ground and through water (pictured). A pea-sized origami robot can fold, unfold and perform a range of acrobatic moves, potentially making it useful for many biomedical applications.
- North America > United States > New York > New York County > New York City (0.09)
- North America > United States > Maryland > Baltimore (0.09)
- Europe > Germany > Saxony > Dresden (0.09)
- Information Technology > Artificial Intelligence > Robots (0.89)
- Information Technology > Artificial Intelligence > Games > Go (0.40)