 changed


Stuck in the Quicksand of Numeracy, Far from AGI Summit: Evaluating LLMs' Mathematical Competency through Ontology-guided Perturbations

Hong, Pengfei, Ghosal, Deepanway, Majumder, Navonil, Aditya, Somak, Mihalcea, Rada, Poria, Soujanya

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have showcased striking results on existing logical reasoning benchmarks, with some models even surpassing human performance. However, the true depth of their competencies and robustness in mathematical reasoning tasks remains an open question. In response, we develop (i) an ontology of perturbations of maths questions, (ii) a semi-automatic method of perturbation, and (iii) a dataset of perturbed maths questions to probe the limits of LLM capabilities in mathematical reasoning tasks. These controlled perturbations span multiple fine dimensions of the structural and representational aspects of maths questions. Using GPT-4, we generated the MORE dataset by perturbing five randomly selected seed questions from GSM8K. This process was guided by our ontology and involved thorough automatic and manual filtering, yielding a set of 216 maths problems. We conducted a comprehensive evaluation of both closed-source and open-source LLMs on MORE. The results show a significant performance drop across all the models on the perturbed questions. This strongly suggests that current LLMs lack robust mathematical skills and deep reasoning abilities. This research not only identifies multiple gaps in the capabilities of current models, but also highlights multiple potential directions for future development. Our dataset will be made publicly available at https://huggingface.co/datasets/declare-lab/GSM8k_MORE.


3 Lectures That Changed My Data Science Career

#artificialintelligence

There is a lot of excitement around AI. Recently there has been an incredible amount of buzz around the demos of models like ChatGPT and DALL-E 2. As impressive as these systems are, I think it becomes increasingly important to keep a level head and not get carried away in a sea of excitement. The following videos/lectures are more focused on how to think about data science projects and how to attack a problem. I've found these lectures to be highly impactful in my career, and they have enabled me to build effective and practical solutions that fit the exact needs of the companies I've worked for.


'Breath of the Wild' Changed the Way I Play Video Games

WIRED

At a certain point in my gaming life, everything changed. After spending most of my twenties marathoning titles for hours on end, emerging bleary-eyed from all-day gaming stints, my priorities shifted. I can't binge-play now, even if I still hear the call of the console and yearn to be swept up into a game. Moderation is key, but finding a way to unlearn unhealthy gaming habits is tough. Or, at least, it was until The Legend of Zelda: Breath of the Wild.


How Data Has Changed the World of HR

#artificialintelligence

In this "On the Job" segment from Cheddar News, Amin Venjara, General Manager of Data Solutions at ADP, describes the importance of data and how human resources leaders are relying on real-time access to data now more than ever. Venjara offers real-world examples of data's impact on the top challenges faced by organizations today. Businesses big and small have been utilizing the latest tech and innovation to make the new remote and hybrid working environments possible. Speaking with Cheddar News, Amin Venjara (AV) says that relying on quality, accessible data to take action is how today's HR teams are impacting the modern workforce. Q: How does data influence the role of human resources (HR)?


The Turkish Drone That Changed the Nature of Warfare

The New Yorker

A video posted toward the end of February on the Facebook page of Valerii Zaluzhnyi, the commander-in-chief of Ukraine's armed forces, showed grainy aerial footage of a Russian military convoy approaching the city of Kherson. Russia had invaded Ukraine several days earlier, and Kherson, a shipbuilding hub at the mouth of the Dnieper River, was an important strategic site. At the center of the screen, a targeting system locked onto a vehicle in the middle of the convoy; seconds later, the vehicle exploded, and a tower of burning fuel rose into the sky. The Bayraktar TB2 is a flat, gray unmanned aerial vehicle (U.A.V.), with angled wings and a rear propeller.


Top 5 Ways in which Supercomputers have Changed Our Lives

#artificialintelligence

A myriad of developments, including growing volumes of data and the emergence of new content and data-rich applications, have increased our daily usage of artificial intelligence technologies. Amidst the advent of all these technologies, supercomputers have made their mark, taking AI-driven technologies to new highs. Nowadays, they are used in every aspect of our lives, from developing medicines to forecasting weather and playing online games. In this video, we will discuss the different ways in which these supercomputers have changed our lives. The National Weather Service now uses two room-sized supercomputers.


What Is Generative Art? And How Making It Changed My Understanding of the Body

#artificialintelligence

Generative art is anything that couples a code, or set of instructions, with a series of artificial events that can output endless variations, be it baskets, paintings, or NFTs. Before using computers, --a pioneer of generative art and one of the first women to use computers in her practice--created algorithmic art with nothing more than a pencil and a piece of paper. However, generative art before computers is a bit like astronomy before the telescope. Certain fields do not come into their own until a tool that bridges the gap between circumscribed human experience and uncircumscribed human imagination comes along. And that, with the advancement of computers and AI, is very much the case with generative art.


How AI Has Changed The World Of Gaming

#artificialintelligence

AI in video games has come a long way from older games where the software controlled a Pong paddle or the ghosts in Pac-Man. Modern AI can change behaviour and learn from players in real time. It helps create entire worlds based on the player's choices and can even completely personalise a game and interactions with NPCs. Artificial intelligence or AI programmes are used in video games to predict player actions and react accordingly. It allows for more natural reactions and personalised stories and relationships with NPCs.


How AI and IoT has Changed the Sports Betting Industry for the Better

#artificialintelligence

With the coronavirus pandemic causing havoc across the globe, many companies, including sportsbooks, will have delayed responses when you reach out for help. When your smart devices are connected to the internet, you can get responses in real time. For instance, if you have a transaction query, AIoT-powered chatbots will answer your questions in real time. Have you ever wondered why you see certain ads when you're on social media, checking your email, or visiting any other website? Well, the AIoT stores your data whenever you visit a bookie site using your laptop or smartphone.


Covid Has Changed How We Work. With The Rise Of AI, Is Your Job At Risk?

#artificialintelligence

Because of Covid, employers and employees are redefining the way we will all work. We will see much more of a hybrid work environment and a lot more automation, led by AI-based solutions. As employers determine the best way to optimize work, they are embracing AI to get more done with a less-dedicated or hybridized workforce. Undoubtedly, many jobs are at risk from AI. What isn't so obvious is that AI will enhance many jobs.