lamppost
Large Language Models and Mathematical Reasoning Failures
This paper investigates the mathematical reasoning capabilities of large language models (LLMs) using 50 newly constructed high-school-level word problems. Unlike prior studies that focus solely on answer correctness, we rigorously analyze both final answers and solution steps to identify reasoning failures. Evaluating eight state-of-the-art models - including Mixtral, Llama, Gemini, GPT-4o, and OpenAI's o1 variants - we find that while newer models (e.g., o3-mini, deepseek-r1) achieve higher accuracy, all models exhibit errors in spatial reasoning, strategic planning, and arithmetic, sometimes producing correct answers through flawed logic. Common failure modes include unwarranted assumptions, over-reliance on numerical patterns, and difficulty translating physical intuition into mathematical steps. Manual analysis reveals that models struggle with problems requiring multi-step deduction or real-world knowledge, despite possessing broad mathematical knowledge. Our results underscore the importance of evaluating reasoning processes, not just answers, and caution against overestimating LLMs' problem-solving proficiency. The study highlights persistent gaps in LLMs' generalization abilities, emphasizing the need for targeted improvements in structured reasoning and constraint handling.
Five ways you might already encounter AI in cities (and not realise it)
You'd probably notice if the car that cut you off or pulled up beside you at a light didn't have a driver. In the UK, self-driving cars are still required by law to have a safety driver at the wheel, so they are difficult to spot. But car companies have been testing automated vehicles on UK roads since at least 2017. Self-driving cars use Artificial Intelligence (AI) technology to steer themselves and navigate around obstacles. This technology is being introduced in many different ways, for example in cameras that detect whether people are speeding or using mobile phones while driving.
- Automobiles & Trucks (0.76)
- Transportation > Ground > Road (0.71)
A Federated Learning-enabled Smart Street Light Monitoring Application: Benefits and Future Challenges
Anand, Diya, Mavromatis, Ioannis, Carnelli, Pietro, Khan, Aftab
Data-enabled cities have recently been accelerated and enhanced with automated learning for improved Smart Cities applications. In an Internet of Things (IoT) ecosystem, data communication is frequently costly, inefficient, unscalable, and insecure. Federated Learning (FL) plays a pivotal role in providing privacy-preserving and communication-efficient Machine Learning (ML) frameworks. In this paper we evaluate the feasibility of FL in the context of a Smart Cities Street Light Monitoring application. FL is evaluated against benchmarks of centralised and (fully) personalised machine learning techniques on the task of classifying lamppost operation. Incorporating FL in such a scenario shows minimal reduction in classification performance, but large improvements in communication cost and privacy preservation. These outcomes strengthen FL's viability and potential for IoT applications.
- Oceania > Australia > New South Wales > Sydney (0.06)
- Europe > United Kingdom > England > Bristol (0.05)
- Europe > United Kingdom > England > South Gloucestershire (0.04)
Divergent thinking and true AI innovation - DataScienceCentral.com
Research and Markets estimated that annual global sales of information technology reached nearly $8.4 trillion in 2021. At that level, IT sales made up just under 9% of total estimated global annual gross domestic product (GDP). Global IT sales tend to grow about 6.6 percent annually. For the sake of argument, let's assume that annual IT sales growth averages 6.6 percent from 2022 through 2030. This assumption includes global GDP growth for the period averaging just over 3.0 percent annually.
Travellers at Gatwick airport will have their cars parked by ROBOTS
Passengers heading to Gatwick airport and leaving their car there will soon have it whisked away by a robot valet. The fleet of droids will put cars closer to one another than is possible with human drivers and therefore be able to fit a third more cars in the same area. A trial is starting in August which will see customers leave their car in a drop-off zone before summoning a robot through a designated app. Military grade GPS will guide the machine to the car where forklift-like equipment will approach the car from the front, slide under the car's body and move it to a specific spot.
- Europe > United Kingdom > England > West Sussex (0.83)
- North America > United States > New York (0.06)
- Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.06)
- Europe > France > Île-de-France > Val-d'Oise > Roissy (0.06)
- Transportation > Infrastructure & Services > Airport (0.99)
- Transportation > Air (0.99)
Singapore to Test Facial Recognition on Lampposts, Stoking Privacy Fears
A spokeswoman for SenseTime, a facial-recognition software company based in both Beijing and Hong Kong, said it was "exploring the situation" and declined further comment. The company counts Singapore's state investor Temasek among its backers following a $600 million funding round which closed on Monday.