herring
MultiZebraLogic: A Multilingual Logical Reasoning Benchmark
Bruun, Sofie Helene, Smart, Dan Saattrup
Measuring the full abilities of large language models (LLMs) requires benchmarks representing multiple tasks. We aim to create large, high-quality datasets for comparison of logical reasoning skills across several languages and of suitable difficulty for LLMs of various reasoning ability. We explore multiple ways of increasing difficulty. We generate zebra puzzles in multiple languages, themes, and sizes, including 14 different clue types and 8 red herring types (uninformative clues). We find that puzzle sizes 2x3 and 4x5 are sufficiently challenging for GPT-4o mini (a non-reasoning model) and o3-mini (a reasoning model), respectively. Including 5 red herrings decreases o3-mini puzzle-level accuracy on 4x5 puzzles by 15$\pm$7 %. Scores of o3-mini on 4x5 puzzles are not significantly affected by use of English vs. Danish or the common houses theme vs. the country-specific smoerrebroed theme. We find no correlation between difficulty and the selected clue types. Datasets of 128+1024 puzzles are published as MultiZebraLogic in each of nine Germanic languages for sizes 2x3 and 4x5. We publish code for puzzle generation, designed for adaptability to more languages and themes.
- North America > Canada (0.04)
- Europe > Faroe Islands > Streymoy > Tórshavn (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia
Samir, Farhan, Park, Chan Young, Field, Anjalie, Shwartz, Vered, Tsvetkov, Yulia
To explain social phenomena and identify systematic biases, much research in computational social science focuses on comparative text analyses. These studies often rely on coarse corpus-level statistics or local word-level analyses, mainly in English. We introduce the InfoGap method -- an efficient and reliable approach to locating information gaps and inconsistencies in articles at the fact level, across languages. We evaluate InfoGap by analyzing LGBT people's portrayals, across 2.7K biography pages on English, Russian, and French Wikipedias. We find large discrepancies in factual coverage across the languages. Moreover, our analysis reveals that biographical facts carrying negative connotations are more likely to be highlighted in Russian Wikipedia. Crucially, InfoGap both facilitates large scale analyses, and pinpoints local document- and fact-level information gaps, laying a new foundation for targeted and nuanced comparative language analysis at scale.
- North America > United States (0.28)
- Europe > Ukraine (0.05)
- Europe > Russia (0.04)
What a successful AI team really looks like
As more companies scale AI projects, turning proof-of-concepts into drivers of business transformation, a clearer picture of what it takes to succeed with real-world AI is taking shape. When it comes to AI teams, a broader set of skills is required than previously known, with a particular need for people with experience in operations and in translating AI concepts into business terms and vice versa. In fact, enterprises need blended teams to succeed with AI, says Louise Herring, partner at McKinsey & Co. "If you look at the technical side, the emphasis is increasingly on how we can make sure we have production-ready code and we have elements available for reuse throughout the organization," she says. "But the key area of emphasis that we see first of all is about translators: people who can make the connection between the business and the technical side."
How AI-powered tech landed man in jail with scant evidence
Michael Williams' wife pleaded with him to remember their fishing trips with the grandchildren, how he used to braid her hair, anything to jar him back to his world outside the concrete walls of Cook County Jail. His three daily calls to her had become a lifeline, but when they dwindled to two, then one, then only a few a week, the 65-year-old Williams felt he couldn't go on. He made plans to take his life with a stash of pills he had stockpiled in his dormitory. Williams was jailed last August, accused of killing a young man from the neighborhood who asked him for a ride during a night of unrest over police brutality in May. But the key evidence against Williams didn't come from an eyewitness or an informant; it came from a clip of noiseless security video showing a car driving through an intersection, and a loud bang picked up by a network of surveillance microphones. Prosecutors said technology powered by a secret algorithm that analyzed noises detected by the sensors indicated Williams shot and killed the man. "I kept trying to figure out, how can they get away with using the technology like that against me?" said Williams, speaking publicly for the first time about his ordeal. Williams sat behind bars for nearly a year before a judge dismissed the case against him last month at the request of prosecutors, who said they had insufficient evidence.
- North America > United States > Illinois > Cook County > Chicago (0.07)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- South America (0.04)
Artificial Intelligence Used to Supercharge Battery Development for Electric Vehicles
Using a new machine learning method, a Stanford-led research team has slashed battery testing times – a key barrier to longer-lasting, faster-charging batteries for electric vehicles – by nearly fifteenfold. Battery performance can make or break the electric vehicle experience, from driving range to charging time to the lifetime of the car. Now, artificial intelligence has made dreams like recharging an EV in the time it takes to stop at a gas station a more likely reality, and could help improve other aspects of battery technology. For decades, advances in electric vehicle batteries have been limited by a major bottleneck: evaluation times.
- Transportation > Ground > Road (1.00)
- Transportation > Electric Vehicle (1.00)
Herring, Not Herring: Deep Learning Accelerates Detection and Classification of Underwater Species
Canadian machine learning researchers from the University of Victoria have teamed up with government marine biologists and private remote sensing specialists to develop a system for improved detection and classification of schools of herring. The world's oceans are home to some 200,000 species of sea animals, including over 18,000 species of fish, more than 1,800 sea stars, 816 squids, 93 whales and dolphins and 8,900 clams and other bivalves, according to a 2015 report from the World Register of Marine Species. Ocean fishes come in a variety of shapes, sizes, and colors and live in many different depth and temperature environments. This diverse marine world is, however, under threat. A 2016 United Nations Food and Agriculture Organization World Fisheries and Aquaculture report reveals that 89.5 percent of the world's fish stocks are either fully fished (catches are close to the maximum sustainable yield) or overfished (catches are unsustainable).
- Food & Agriculture > Fishing (0.74)
- Food & Agriculture > Agriculture (0.71)
Researchers use AI to detect schools of herring from acoustic data
Tracking the health of underwater species is critical to understanding the effects of climate change on marine ecosystems. Unfortunately, it's a time-consuming process -- biologists conduct studies with echosounders that use sonar to determine water and object depth, and they manually interpret the resulting 2D echograms. These interpretations are often prone to error and require pricey software like Echoview. Fortunately, a team of research scientists from the University of Victoria in Canada is developing a machine learning method for detecting specific biological targets in acoustic survey data. In a preprint paper ("A Deep Learning based Framework for the Detection of Schools of Herring in Echograms"), they say that their approach -- which they tested on schools of herring -- might measurably improve the accuracy of environmental monitoring.
- Oceania > Australia > Queensland (0.06)
- North America > Canada > British Columbia (0.06)
A Deep Learning-based Framework for the Detection of Schools of Herring in Echograms
Rezvanifar, Alireza, Marques, Tunai Porto, Cote, Melissa, Albu, Alexandra Branzan, Slonimer, Alex, Tolhurst, Thomas, Ersahin, Kaan, Mudge, Todd, Gauthier, Stephane
Tracking the abundance of underwater species is crucial for understanding the effects of climate change on marine ecosystems. Biologists typically monitor underwater sites with echosounders and visualize data as 2D images (echograms); they interpret these data manually or semi-automatically, which is time-consuming and prone to inconsistencies. This paper proposes a deep learning framework for the automatic detection of schools of herring from echograms. Experiments demonstrated that our approach outperforms a traditional machine learning algorithm using hand-crafted features. Our framework could easily be expanded to detect more species of interest to sustainable fisheries.
- North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.15)
- Southern Ocean (0.04)
- South America > Chile (0.04)
Hopes are dim as U.S. and China resume high-stakes trade talks
WASHINGTON – President Donald Trump and China's Xi Jinping have plenty of reasons to call off their trade war. Both face weakening economies that would likely further deteriorate if their conflict escalates. Both are up against a formidable adversary that shows no inclination to yield. Both are tangled in political turmoil -- Trump with impeachment proceedings, Xi with angry protests in Hong Kong. Both, in short, would welcome some good news.
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Foreign Policy (1.00)
- Government > Commerce (1.00)
Predicting Vehicular Travel Times by Modeling Heterogeneous Influences Between Arterial Roads
Achar, Avinash (Tata Consultancy Services) | Sarangan, Venkatesh (Tata Consultancy Services) | Regikumar, Rohith (Tata Consultancy Services) | Sivasubramaniam, Anand (Pennsylvania State University)
The travel time of vehicles in urban settings is a useful and tangible quantity of interest in the context of intelligent transportation systems. We address the problem of travel time prediction in arterial roads using data sampled from probe vehicles. The literature on methods using data input from probe vehicles is limited. The spatio-temporal dependencies captured by existing data-driven approaches are either too detailed or very simplistic. We strike a balance between these approaches to account for varying degrees of influence a given road may experience from its neighbors, while controlling the number of parameters to be learnt. Specifically, we use a NoisyOR conditional probability distribution (CPD) in conjunction with a dynamic Bayesian network (DBN) to model state transitions of various roads. We propose an efficient algorithm to learn model parameters. We also propose an algorithm for predicting travel times on trips of arbitrary durations. Using synthetic and real-world data traces, we demonstrate the superior performance of the proposed method under different traffic conditions.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > Portugal > Porto > Porto (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
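For readers unfamiliar with the NoisyOR CPD named in the travel-time abstract above, the following is a minimal sketch of the textbook noisy-OR form, not the authors' model. It assumes binary states (for example, "congested" or not), one influence probability per neighboring road, and an optional leak term; the function name `noisy_or` and all parameter names are hypothetical. The appeal of this form is that it needs only one parameter per parent rather than a full table over all parent combinations.

```python
from math import prod


def noisy_or(parent_states, link_probs, leak=0.0):
    """Return P(child = 1 | parents) under a standard noisy-OR CPD.

    parent_states: iterable of 0/1 flags, one per parent (hypothetically,
                   whether each neighboring road is congested).
    link_probs:    probability that each active parent alone switches
                   the child on (hypothetical influence strengths).
    leak:          probability the child is on with no active parent.
    """
    parent_states = list(parent_states)
    link_probs = list(link_probs)
    if len(parent_states) != len(link_probs):
        raise ValueError("one link probability is required per parent")

    # The child stays off only if the leak and every active parent
    # independently fail to switch it on.
    p_all_fail = (1.0 - leak) * prod(
        1.0 - q for state, q in zip(parent_states, link_probs) if state
    )
    return 1.0 - p_all_fail


if __name__ == "__main__":
    # Two of three neighbors are active, with different influence strengths.
    print(noisy_or([1, 0, 1], [0.5, 0.9, 0.2]))
```

With three parents, a full conditional probability table would need 2^3 entries, whereas this form needs three link probabilities plus the leak.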