circular
Visualization Biases MLLM's Decision Making in Network Data Tasks
Brand, Timo, Förster, Henry, Kobourov, Stephen G., Miller, Jacob
We evaluate how visualizations can influence the judgment of MLLMs about the presence or absence of bridges in a network. We show that the inclusion of visualization improves confidence over a structured text-based input that could theoretically be helpful for answering the question. On the other hand, we observe that standard visualization techniques create a strong bias towards accepting or refuting the presence of a bridge, independently of whether or not a bridge actually exists in the network. While our results indicate that the inclusion of visualization techniques can effectively influence the MLLM's judgment without compromising its self-reported confidence, they also imply that practitioners must be careful about allowing users to include visualizations in generative AI applications, so as to avoid undesired hallucinations.
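For readers unfamiliar with the term, a bridge is an edge whose removal disconnects its two endpoints. The following is a minimal sketch of how the ground-truth answer to "does this network contain a bridge?" could be computed. It is not the authors' code: the function name `find_bridges` and the edge-list input format are illustrative assumptions, and the graph is assumed to be simple and undirected.

```python
from collections import defaultdict, deque


def find_bridges(edges):
    """Return the edges whose removal disconnects their endpoints.

    Assumes a simple undirected graph given as (u, v) pairs
    (hypothetical input format, chosen for illustration).
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    def reachable(source, target, skipped_edge):
        # Breadth-first search that ignores the single skipped edge.
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == target:
                return True
            for neighbor in adjacency[node]:
                if {node, neighbor} == skipped_edge:
                    continue
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return False

    # An edge is a bridge if its endpoints are disconnected without it.
    return [(u, v) for u, v in edges if not reachable(u, v, {u, v})]


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
bridges = find_bridges(example_edges)
```

This brute-force check runs one search per edge, which is adequate for small test networks; a linear-time depth-first approach would be preferable for large graphs.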
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Crowdsourced human-based computational approach for tagging peripheral blood smear sample images from Sickle Cell Disease patients using non-expert users
Rubio, José María Buades, Moyà-Alcover, Gabriel, Jaume-i-Capó, Antoni, Petrović, Nataša
Supervised machine learning methods rely on tagged training data [1]. The more tagged training data that is available, the more accurately the model can learn to recognize patterns and generalize to unseen data. Crowdsourcing and Human-Based Computation (HBC) have become an increasingly popular approach for acquiring training labels in machine learning classification tasks, as they can be a cost-effective way to share the labeling effort among a large number of annotators. This approach can be particularly useful in cases where expert labeling is expensive or not feasible, or where a large amount of labeled data is needed to train a machine learning model [2]. There exist various tactics for human users to contribute their problem-solving skills [3]. These include altruistic contribution, which appeals to the altruistic nature of individuals willing to contribute their time and skills to solve problems for the common good [4-6], and gamification, which creates engaging and fun video games that incorporate problem-solving tasks [7-9].
- Europe > Spain > Balearic Islands > Mallorca > Palma (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Iowa (0.04)
- Health & Medicine > Therapeutic Area > Hematology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
- Leisure & Entertainment > Games > Computer Games (0.68)
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Wei, Fangyun, Chen, Xi, Luo, Lin
Despite their sophisticated capabilities, large language models (LLMs) encounter a major hurdle in effective assessment. This paper first revisits the prevalent evaluation method, multiple choice question answering (MCQA), which allows for straightforward accuracy measurement. Through a comprehensive evaluation of 24 models across 11 benchmarks, we highlight several potential drawbacks of MCQA, for instance, the inconsistency between the MCQA evaluation and the generation of open-ended responses in practical scenarios. In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge. Each LLM receives an Elo rating thereafter. This system is designed to mirror real-world usage, and for this purpose, we have compiled a new benchmark called "Real-world questions" (RWQ), comprising 20,772 authentic user inquiries. Additionally, we thoroughly analyze the characteristics of our system and compare it with prior leaderboards like AlpacaEval and MT-Bench. Our analysis reveals the stability of our RWQ-Elo system, the feasibility of registering new models, and its potential to reshape LLM leaderboards.
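As a rough illustration of how a two-player Elo rating is updated after a judged comparison, here is a minimal sketch of the standard Elo formula. It is not taken from the paper: the K-factor of 32, the starting rating of 1000, and the function names are assumptions for illustration, and the RWQ-Elo system may use different settings.

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A wins, 0.0 if A loses, and 0.5 for a tie.
    k is an assumed K-factor, not a value reported in the abstract.
    """
    expected_a = expected_score(rating_a, rating_b)
    expected_b = 1.0 - expected_a
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - expected_b)
    return new_a, new_b


# Two models start at an assumed rating of 1000; the judge prefers model A.
model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
```

With equal starting ratings the expected score is 0.5, so the winner gains half the K-factor and the loser gives up the same amount; the sum of the two ratings is unchanged.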
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia (0.04)
- Media > Music (0.67)
- Leisure & Entertainment > Games (0.55)
Peggy Smedley Show: Textiles Go Circular
Only about 15% of used clothes and other textiles in the United States get reused or recycled. The rest head straight to the landfill or incinerator. Peggy talks about how to address this, citing research in a new report about how to facilitate a circular economy for textiles. She also discusses the biggest hurdles facing the textile industry, what can be done to address them, and actions that businesses can take now to move to a more circular economy. (5/17/22 - 771)
Turkey's National Artificial Intelligence Strategy has been Published
The National Artificial Intelligence Strategy (2021-2025) was prepared by the Digital Transformation Office of the Presidency and the Ministry of Industry and Technology, with input from other stakeholders, to set a roadmap for work in the field of artificial intelligence ("AI") in Turkey. Within this scope, Circular numbered 2021/18 on the National Artificial Intelligence Strategy ("Circular") was published in the Official Gazette dated 20 August 2021 and numbered 31574, and the National Artificial Intelligence Strategy Document ("Strategy") was published on the Digital Transformation Office of the Presidency's website on 24 August 2021. It has been decided to establish a "National Artificial Intelligence Strategy Steering Committee" ("Steering Committee"), with the participation of the Head of the Digital Transformation Office of the Presidency and the Deputy Minister of the Ministry of Industry and Technology, in order to develop national policies, promote the use of artificial intelligence technologies, and monitor their application. The Steering Committee will convene at least once every three months and may form sub-committees and advisory and working groups when it deems necessary. The stated vision is to generate value on a global scale with an agile and sustainable AI ecosystem for Turkey.