Collaborating Authors

pineapple


The Showdown Between Elon Musk and Sam Altman

WIRED

The relationship between Sam Altman and Elon Musk goes back to the early days of OpenAI, then a non-profit research lab. But now, the two men find themselves in a very public feud over the billion-dollar AI company. Today on the show, we catalogue their friendship-turned-feud and how the company that started it all still remains core to their beef. Write to us at uncannyvalley@wired.com. You can always listen to this week's podcast through the audio player on this page, but if you want to subscribe for free to get every episode, here's how: If you're on an iPhone or iPad, open the app called Podcasts, or just tap this link.


Integrating LMM Planners and 3D Skill Policies for Generalizable Manipulation

Li, Yuelei, Yan, Ge, Macaluso, Annabella, Ji, Mazeyu, Zou, Xueyan, Wang, Xiaolong

arXiv.org Artificial Intelligence

The recent advancements in visual reasoning capabilities of large multimodal models (LMMs) and the semantic enrichment of 3D feature fields have expanded the horizons of robotic capabilities. These developments hold significant potential for bridging the gap between high-level reasoning from LMMs and low-level control policies utilizing 3D feature fields. In this work, we introduce LMM-3DP, a framework that can integrate LMM planners and 3D skill Policies. Our approach consists of three key perspectives: high-level planning, low-level control, and effective integration. For high-level planning, LMM-3DP supports dynamic scene understanding for environment disturbances, a critic agent with self-feedback, history policy memorization, and reattempts after failures. For low-level control, LMM-3DP utilizes a semantic-aware 3D feature field for accurate manipulation. In aligning high-level and low-level control for robot actions, language embeddings representing the high-level policy are jointly attended with the 3D feature field in the 3D transformer for seamless integration. We extensively evaluate our approach across multiple skills and long-horizon tasks in a real-world kitchen environment. Our results show a significant 1.45x success rate increase in low-level control and an approximate 1.5x improvement in high-level planning accuracy compared to LLM-based baselines. Demo videos and an overview of LMM-3DP are available at https://lmm-3dp-release.github.io.


GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution

Lu, Yining, Yu, Haoping, Khashabi, Daniel

arXiv.org Artificial Intelligence

Augmenting large language models (LLMs) to use external tools enhances their performance across a variety of tasks. However, prior works over-rely on task-specific demonstrations of tool use, which limits their generalizability and incurs high computational cost from making many calls to large-scale LLMs. We introduce GEAR, a computationally efficient query-tool grounding algorithm that generalizes to various tasks requiring tool use while not relying on task-specific demonstrations. GEAR achieves better efficiency by delegating tool grounding to small language models (SLMs) and execution to LLMs, while leveraging semantic and pattern-based evaluation at both the question and answer levels for generalizable tool grounding. We evaluate GEAR on 14 datasets across 6 downstream tasks, demonstrating its strong generalizability to novel tasks, tools, and different SLMs. Despite offering more efficiency, GEAR achieves higher precision in tool grounding compared to prior strategies using LLM prompting, thus improving downstream accuracy at a reduced computational cost. For example, we demonstrate that GEAR-augmented GPT-J and GPT-3 outperform counterpart tool-augmented baselines because of better tool use.
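The division of labour the abstract describes — a cheap model grounds the query to a tool, and only execution involves the expensive LLM — can be illustrated with a minimal sketch. This is not the authors' code: the tool names and descriptions are hypothetical, and simple token overlap stands in for the SLM's semantic and pattern-based scoring.

```python
# Hypothetical tool registry: name -> short description of what it handles.
TOOLS = {
    "calculator": "arithmetic add subtract multiply divide numbers math",
    "search":     "lookup facts who what when where encyclopedia",
    "weather":    "temperature forecast rain sunny climate city",
}

def ground_query(question: str) -> str:
    """Pick the tool whose description best overlaps the question's tokens.

    A stand-in for the cheap grounding step: no LLM call is needed to
    decide which tool the question belongs to.
    """
    q_tokens = set(question.lower().split())
    scores = {name: len(q_tokens & set(desc.split()))
              for name, desc in TOOLS.items()}
    return max(scores, key=scores.get)

print(ground_query("what is 12 multiply 7 in math"))  # -> calculator
```

In the real system, only after this grounding step would the selected tool's input be produced and executed with a large model, which is where the computational savings come from.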


Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Shridhar, Kumar, Macina, Jakub, El-Assady, Mennatallah, Sinha, Tanmay, Kapur, Manu, Sachan, Mrinmaya

arXiv.org Artificial Intelligence

Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generating didactically sound questions is challenging, requiring understanding of the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance, but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning. On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education.


Google's AI Spotlights a Human Cognitive Glitch: Mistaking Fluent Speech for Fluent Thought

#artificialintelligence

When you read a sentence like this one, your past experience tells you that it's written by a thinking, feeling human. And, in this case, there is indeed a human typing these words: [Hi, there!]. But these days, some sentences that appear remarkably humanlike are actually generated by artificial intelligence systems trained on massive amounts of human text. People are so accustomed to assuming that fluent language comes from a thinking, feeling human that evidence to the contrary can be difficult to wrap your head around. How are people likely to navigate this relatively uncharted territory?


Using Machine Intelligence to Prioritise Code Review Requests

Saini, Nishrith, Britto, Ricardo

arXiv.org Artificial Intelligence

Modern Code Review (MCR) is the process of reviewing new code changes that need to be merged with an existing codebase. A developer may receive many code review requests every day, i.e., the review requests need to be prioritised. Manually prioritising review requests is a challenging and time-consuming process. To address this problem, we conducted an industrial case study at Ericsson aimed at developing a tool called Pineapple, which uses a Bayesian Network to prioritise code review requests. To validate our approach/tool, we deployed it in a live software development project at Ericsson, wherein more than 150 developers develop a telecommunication product. We focused on evaluating the predictive performance, feasibility, and usefulness of our approach. The results indicate that Pineapple has competent predictive performance (RMSE = 0.21 and MAE = 0.15). Furthermore, around 82.6% of Pineapple's users believe the tool can support code review request prioritisation by providing reliable results, and around 56.5% of the users believe it helps reduce code review lead time. As future work, we plan to evaluate Pineapple's predictive performance, usefulness, and feasibility through a longitudinal investigation.


What Makes Neural Networks Fragile

#artificialintelligence

What do the images below have in common? Most readers will quickly catch on that they are all seats, as in places to sit. It may have taken you less than a second to recognize this common characteristic. If I heed Andrew Ng's suggestion that anything a human can do in less than a second can be automated by a neural network, then I should be able to create an image classifier that recognizes seats. I could write a standard classifier using off-the-shelf Python libraries.


Machine Learning Basics with the K-Nearest Neighbors Algorithm

#artificialintelligence

The k-nearest neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both classification and regression problems. A supervised machine learning algorithm (as opposed to an unsupervised one) is one that relies on labeled input data to learn a function that produces an appropriate output when given new unlabeled data. Imagine a computer is a child and we are its supervisor (e.g., its parent). We will show the child several different pictures, some of which are pigs and the rest could be pictures of anything (cats, dogs, etc). When we see a pig, we shout "pig!"
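The classification half of KNN is short enough to write out in full: find the k labeled points closest to a query, then take a majority vote among their labels. A minimal sketch, using toy 2D points as stand-ins for image features (the data and labels below are illustrative, not from the article):

```python
import math
from collections import Counter

def knn_classify(points, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Distance from the query to every training point, paired with its label.
    dists = sorted(
        (math.dist(p, query), lbl) for p, lbl in zip(points, labels)
    )
    # Majority vote among the k closest neighbours.
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]

# Toy labeled data: two well-separated clusters.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["pig", "pig", "pig", "other", "other", "other"]

print(knn_classify(points, labels, (1.5, 1.5)))  # -> pig
```

Note there is no training step: KNN simply memorizes the labeled data, and all the work happens at query time, which is why it is often the first supervised algorithm taught.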


SA insurtech startup Pineapple to launch insurance product in activation on Friday [Updated] – Ventureburn

#artificialintelligence

After two years of development, insurtech startup Pineapple is finally set to launch its insurance product, with the startup claiming today in a statement to have made insurance as simple "as snapping a picture". The startup's co-founder Ndabenhle Junior Ngulube told Ventureburn in an email today that the startup will on Friday hold an activation for the brand in Johannesburg, Pretoria and Cape Town, saying more details would be revealed closer to the time. He was able to tell Ventureburn how premiums would be priced. "Once a user has snapped an image of an item they want to insure, we have artificial intelligence (AI) that recognises what the image is and proceeds to place that image in an appropriate category for pricing purposes (if the AI fails to categorise the item, we allow the user to manually select the appropriate category for the item). We then require the user to enter the value of that item. "Based on this value and its associated category -- as well as a few ...
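The pricing flow Ngulube describes — categorise the item automatically, fall back to a manual category if that fails, then price from the item's value and category — can be sketched as below. Everything here is hypothetical: the category names, the rates, and the stub classifier all stand in for Pineapple's actual system, whose details the article does not give.

```python
from typing import Optional

# Illustrative per-month rates by category; not Pineapple's actual figures.
CATEGORY_RATES = {"electronics": 0.004, "jewellery": 0.006, "appliances": 0.003}

def categorise(image_label: str) -> Optional[str]:
    """Stub standing in for the AI classifier: recognised label -> category."""
    known = {"laptop": "electronics", "ring": "jewellery", "fridge": "appliances"}
    return known.get(image_label)

def monthly_premium(image_label: str, item_value: float,
                    manual_category: Optional[str] = None) -> float:
    # Prefer the AI's category; fall back to the user's manual selection.
    category = categorise(image_label) or manual_category
    if category is None:
        raise ValueError("item not categorised; a manual category is required")
    return round(item_value * CATEGORY_RATES[category], 2)

print(monthly_premium("laptop", 15000))  # priced via the AI's category
print(monthly_premium("gadget", 2000, manual_category="electronics"))  # manual fallback
```

The fallback branch mirrors the quote directly: when `categorise` returns nothing, the user-selected category is used instead, and the premium is still a function of value and category.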


What Intelligent Machines Can Do, And What They Can't - InformationWeek

@machinelearnbot

Are killer machines coming to annihilate mankind? Are we headed for a dystopian future where robots are our overlords? Are the Cylons already among us? Are concerns voiced by industry icons such as Elon Musk, who has donated millions to The Future of Life Institute, warranted? Oliver Schabenberger recently added a more measured voice to this debate in this commentary piece that he wrote for InformationWeek, pointing out that machines "are not surpassing us in thinking or learning."