dessert
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
A Dubai chocolate-inspired dessert has taken S Korea by storm
You must have heard of Dubai chocolate: the sticky, indulgent confectionery filled with pistachio cream, tahini and shreds of knafeh pastry, which has become a global sensation. Now the decadent bar has inspired South Korea's latest dessert craze. The Dubai chewy cookie has been selling like wildfire - and even restaurants that don't usually offer baked goods are trying to get a nibble of the market. Despite its name, the cookie's texture more closely resembles a rice cake, and is made by stuffing pistachio cream and knafeh shreds into a chocolate marshmallow. Shops are selling hundreds of cookies within minutes and the frenzy has sent prices of key ingredients surging, local media reported.
- Asia > South Korea (0.74)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.50)
- North America > United States (0.16)
- (18 more...)
- Leisure & Entertainment (1.00)
- Government > Regional Government (0.98)
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs
Sirdeshmukh, Ved, Deshpande, Kaustubh, Mols, Johannes, Jin, Lifeng, Cardona, Ed-Yeremai, Lee, Dean, Kritz, Jeremy, Primack, Willow, Yue, Summer, Xing, Chen
We present MultiChallenge, a pioneering benchmark evaluating large language models (LLMs) on conducting multi-turn conversations with human users, a crucial yet underexamined capability for their applications. MultiChallenge identifies four categories of challenges in multi-turn conversations that are not only common and realistic among current human-LLM interactions, but are also challenging to all current frontier LLMs. All 4 challenges require accurate instruction-following, context allocation, and in-context reasoning at the same time. We also develop LLM as judge with instance-level rubrics to facilitate an automatic evaluation method with fair agreement with experienced human raters. Despite achieving near-perfect scores on existing multi-turn evaluation benchmarks, all frontier models have less than 50% accuracy on MultiChallenge, with the top-performing Claude 3.5 Sonnet (June 2024) achieving just a 41.4% average accuracy.
- Africa > Middle East > Egypt (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- (13 more...)
- Workflow (1.00)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
- Personal (0.68)
- Media > Film (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Consumer Health (1.00)
- (5 more...)
Transparent Neighborhood Approximation for Text Classifier Explanation
Cai, Yi, Zimek, Arthur, Ntoutsi, Eirini, Wunder, Gerhard
Recent literature highlights the critical role of neighborhood construction in deriving model-agnostic explanations, with a growing trend toward deploying generative models to improve synthetic instance quality, especially for explaining text classifiers. These approaches overcome the challenges in neighborhood construction posed by the unstructured nature of texts, thereby improving the quality of explanations. However, the deployed generators are usually implemented via neural networks and lack inherent explainability, sparking arguments over the transparency of the explanation process itself. To address this limitation while preserving neighborhood quality, this paper introduces a probability-based editing method as an alternative to black-box text generators. This approach generates neighboring texts by implementing manipulations based on in-text contexts. Substituting the generator-based construction process with recursive probability-based editing, the resultant explanation method, XPROB (explainer with probability-based editing), exhibits competitive performance according to the evaluation conducted on two real-world datasets. Additionally, XPROB's fully transparent and more controllable construction process leads to superior stability compared to the generator-based explainers.
- Europe > Germany > Berlin (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (2 more...)
- Government (0.68)
- Education (0.46)
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision
Wu, Haoning, Zhang, Zicheng, Zhang, Erli, Chen, Chaofeng, Liao, Liang, Wang, Annan, Li, Chunyi, Sun, Wenxiu, Yan, Qiong, Zhai, Guangtao, Lin, Weisi
The rapid evolution of Multi-modality Large Language Models (MLLMs) has catalyzed a shift in computer vision from specialized models to general-purpose foundation models. Nevertheless, there is still an inadequacy in assessing the abilities of MLLMs on low-level visual perception and understanding. To address this gap, we present Q-Bench, a holistic benchmark crafted to systematically evaluate potential abilities of MLLMs on three realms: low-level visual perception, low-level visual description, and overall visual quality assessment. a) To evaluate the low-level perception ability, we construct the LLVisionQA dataset, consisting of 2,990 diverse-sourced images, each equipped with a human-asked question focusing on its low-level attributes. We then measure the correctness of MLLMs on answering these questions. b) To examine the description ability of MLLMs on low-level information, we propose the LLDescribe dataset consisting of long expert-labelled golden low-level text descriptions on 499 images, and a GPT-involved comparison pipeline between outputs of MLLMs and the golden descriptions. c) Besides these two tasks, we further measure their visual quality assessment ability to align with human opinion scores. Specifically, we design a softmax-based strategy that enables MLLMs to predict quantifiable quality scores, and evaluate them on various existing image quality assessment (IQA) datasets. Our evaluation across the three abilities confirms that MLLMs possess preliminary low-level visual skills. However, these skills are still unstable and relatively imprecise, indicating the need for specific enhancements on MLLMs towards these abilities. We hope that our benchmark can encourage the research community to delve deeper to discover and enhance these untapped potentials of MLLMs. Project Page: https://q-future.github.io/Q-Bench.
- Transportation > Air (1.00)
- Media (0.67)
- Transportation > Passenger (0.67)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.92)
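The Q-Bench abstract above mentions a softmax-based strategy for turning MLLM outputs into quantifiable quality scores but gives no details. The following is a minimal sketch of one way such a score could be computed, assuming the model exposes raw logits for two opposing answer tokens (hypothetically "good" and "poor"); the function name and the two-token setup are illustrative assumptions, not the authors' confirmed implementation.

```python
import math


def softmax_quality_score(logit_good: float, logit_poor: float) -> float:
    """Two-way softmax over hypothetical 'good' / 'poor' token logits.

    Returns the probability mass assigned to the 'good' token, a value
    in [0, 1] that can be compared against human opinion scores.
    """
    # Subtract the maximum logit before exponentiating for numerical stability.
    m = max(logit_good, logit_poor)
    e_good = math.exp(logit_good - m)
    e_poor = math.exp(logit_poor - m)
    return e_good / (e_good + e_poor)


# Example: a model that slightly prefers "good" for a given image.
score = softmax_quality_score(logit_good=1.2, logit_poor=0.4)
```

A continuous score of this kind can be rank-correlated with human ratings, whereas a hard "good"/"poor" answer only yields a binary label.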
RaTE: a Reproducible automatic Taxonomy Evaluation by Filling the Gap
Gao, Tianjian, Langlais, Phillipe
Taxonomies are an essential knowledge representation, yet most studies on automatic taxonomy construction (ATC) resort to manual evaluation to score proposed algorithms. We argue that automatic taxonomy evaluation (ATE) is just as important as taxonomy construction. We propose RaTE, an automatic label-free taxonomy scoring procedure, which relies on a large pre-trained language model. We apply our evaluation procedure to three state-of-the-art ATC algorithms with which we built seven taxonomies from the Yelp domain, and show that 1) RaTE correlates well with human judgments and 2) artificially degrading a taxonomy leads to decreasing RaTE score.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- (2 more...)
Explaining text classifiers through progressive neighborhood approximation with realistic samples
Cai, Yi, Zimek, Arthur, Ntoutsi, Eirini, Wunder, Gerhard
The importance of neighborhood construction in local explanation methods has already been highlighted in the literature, and several attempts have been made to improve neighborhood quality for high-dimensional data, such as texts, by adopting generative models. Although the generators produce more realistic samples, the intuitive sampling approaches in the existing solutions leave the latent space underexplored. To overcome this problem, our work, focusing on local model-agnostic explanations for text classifiers, proposes a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage interpolation using counterfactuals as landmarks. We explicitly specify the two properties that should be satisfied by generative models, the reconstruction ability and the locality-preserving property, to guide the selection of generators for local explanation methods. Moreover, noticing the opacity of generative models during the study, we propose another method that implements progressive neighborhood approximation with probability-based editing as an alternative to the generator-based solution. The explanation results from both methods consist of word-level and instance-level explanations benefiting from the realistic neighborhood. Through exhaustive experiments, we qualitatively and quantitatively demonstrate the effectiveness of the two proposed methods.
- Europe > Germany > Berlin (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (3 more...)
- Research Report (0.64)
- Overview (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.76)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
At SoftBank cafe in Tokyo, Pepper the robot will take your order
Soon the Japanese capital's trendsetting Shibuya district will boast a cafe staffed by humanoid robots that can recommend perfect desserts for customers. SoftBank Robotics on Tuesday unveiled to the press its directly run Pepper Parlor cafe, where robots take orders, engage in small talk with customers and clean up, among other tasks. Customers place orders through Pepper robots stationed near the entrance. The robots will also help customers decide what dessert to order based on their facial expressions. "Let me recommend a waffle that is perfect for you," a robot told one customer.
- Telecommunications (0.68)
- Information Technology (0.68)
Michael's in Santa Monica still looks like 1979, but it tastes very 2017
Have you been to Michael's lately? Because the Stellas are still on the walls, the Charles Garabedian drawings are still kind of naughty, and the guys at the front bar are still drinking complicated things that involve whiskey more expensive than you can afford. It's all very disco-era until you get out to the tented patio, where it is still pretty late-'70s except that the Robert Graham frieze is as good as anything you've seen at a museum lately and the foliage springs eternal; the seaside California we all wish we still lived in, where the people at the next table are just back from the Venice Biennale and you could probably throw together a gallery exhibit featuring nothing more than the customers' shoes. But that bowl in front of you -- it might contain a bit of chopped summer squash, some cherries, rose geranium-scented cream and crisped grain; a vegetable appetizer that could pass as dessert. The wine in your glass is likely to be an orangey-pink skin-contact white from Slovenia instead of a Napa Sauvignon Blanc, and the bread on the table is dark and profoundly sour.
- North America > United States > California > Los Angeles County > Santa Monica (0.43)
- North America > United States > California > Monterey County > Seaside (0.25)
- Europe > Slovenia (0.25)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)