Collaborating Authors

 beth


Explore Theory of Mind: Program-guided adversarial data generation for theory of mind reasoning

arXiv.org Artificial Intelligence

Do large language models (LLMs) have theory of mind? A plethora of papers and benchmarks have been introduced to evaluate if current models have been able to develop this key ability of social intelligence. However, all rely on limited datasets with simple patterns that can potentially lead to problematic blind spots in evaluation and an overestimation of model capabilities. We introduce ExploreToM, the first framework to allow large-scale generation of diverse and challenging theory of mind data for robust training and evaluation. Our approach leverages an A* search over a custom domain-specific language to produce complex story structures and novel, diverse, yet plausible scenarios to stress test the limits of LLMs. Our evaluation reveals that state-of-the-art LLMs, such as Llama-3.1-70B and GPT-4o, show accuracies as low as 0% and 9% on ExploreToM-generated data, highlighting the need for more robust theory of mind evaluation. As our generations are a conceptual superset of prior work, fine-tuning on our data yields a 27-point accuracy improvement on the classic ToMi benchmark (Le et al., 2019). ExploreToM also enables uncovering underlying skills and factors missing for models to show theory of mind, such as unreliable state tracking or data imbalances, which may contribute to models' poor performance on benchmarks.


Roku Breach Hits 567,000 Users

WIRED

After months of delays, the US House of Representatives voted on Friday to extend a controversial warrantless wiretap program for two years. Known as Section 702, the program authorizes the US government to collect the communications of foreigners overseas. But this collection also includes reams of communications from US citizens, which are stored for years and can later be warrantlessly accessed by the FBI, which has heavily abused the program. An amendment that would require investigators to obtain such a warrant failed to pass. A group of US lawmakers on Sunday unveiled a proposal that they hope will become the country's first nationwide privacy law.


Heads-Up Computing

Communications of the ACM

When queried about the larger significance of the Heads-Up vision, the authors reflect on a regular weekday in their lives--eight hours spent in front of a computer and another two hours on the smartphone. Achievements in digital productivity come too often at the cost of being removed from the real world. What wonderful digital technology humans have come to create, perhaps the most significant in the history of our co-evolution with tools. Could computing systems be so well-integrated that they not only support but enhance our experience of physical reality? The ability to straddle both worlds--the digital and non-digital--is increasingly pertinent, and we believe it is time for a paradigm shift. We invite individuals and organizations to join us in our journey to design for more seamless computing support, improving the way future generations live, learn, work and play.


Artificial Intelligence in Horse Racing Prediction

#artificialintelligence

I found Beth to be a well-presented and easy-to-use platform for horse race betting. After I had set the parameters of what I wanted to bet on, regular recommendations came through, allowing me to bet in a more informed and analysed fashion than I otherwise would. I could then use Beth to fully track these bets throughout the day and look on as my profit accumulated.


Japan figuring out how to deliver goods untouched by people amid pandemic

The Japan Times

Getting products from one place to another with as little human contact as possible is becoming an imperative for businesses as retailers, warehouses and transport providers adapt to the coronavirus pandemic, seeking to minimize the risk of infections to their employees and customers. Tsubakimoto Chain Co. is seeing more demand for its sorting and conveyor systems as companies seek ways to move things around, while startup Hacobu sees an opportunity to boost use of its online platform for trucks to exchange information as they load and unload goods at warehouses, a process that's still mostly done on paper. The need for automation is especially acute in Japan, where a labor shortage was already putting pressure on companies to find ways to run their businesses with fewer people. Now, that transition is being spurred on by the pandemic, which has boosted online buying and raised concerns among shoppers about being infected by items delivered to their doors. All told, the market for next-generation logistics systems in Japan is set to more than double to ¥651 billion ($6 billion) through 2025 from 2018, according to Fuji Keizai Co., a Tokyo-based research firm.


Aurèce Vettier: Humans and Machines Beyond Collaboration

#artificialintelligence

They are both engineers and started working together a few years ago, as strategy consultants. In 2016 they co-founded (with a third colleague and friend) a start-up called daco, later acquired by Veepee, with the idea of helping retailers achieve growth through a deep knowledge of their competitors. The start-up leveraged the power of AI and image recognition to gain insight into competitors' strategy, offer, pricing, discount and store network, classifying products and making them comparable. Their working partnership has not been limited to business: a shared interest in art and science pushed them to also pursue an artistic collaboration, which led to the creation of the collective Aurèce Vettier. The duo is based in Paris and investigates the space between real and imaginary. Their interest in expanding the creativity of both humans and machines pushes the concepts of creator and created, process and practice, leaving room for fascinating discussions between art, engineering and a territory still unexplored but capable of surprising.


How Investors Use Artificial Intelligence in the Stock Market - Beth.technology

#artificialintelligence

In episode 3 of Tech Lightning Rounds, Beth Kindig goes directly to the source of artificial intelligence expertise and hosts discussions with technologists who specialize in the field. Interviews are held in a "lightning round" format: rapid interviews with tech experts for immediate depth on each topic. Artificial intelligence makes it possible for machines to learn from experience. To improve their performance, the machines are trained as more data is analyzed. Highly publicized uses for AI include chess matches and self-driving cars.


Amazon's global career site

#artificialintelligence

Beth is a Senior Principal Technologist for Amazon Robotics. Beth has been Founder and CEO of several successful startups, most notably EXOS, Inc., which was venture capital backed and sold to Microsoft in 1996. Since then she has been involved in 30 start-ups in a variety of fields as a founder, investor, or advisor. She was an advisor and investor in Leap Frog and has been involved in entertainment and mobile companies. Beth is an acknowledged expert in VR, AR and the hand-device interface space and has been an expert in support of prior patent litigations.