Spotify adds 'Verified' badges to distinguish human artists from AI
Spotify is introducing a 'Verified' badge to help users identify when artists on its platform are human, not AI-generated. The world's most-used music streaming service said the 'Verified by Spotify' text and green checkmark icon would appear next to artist names when they meet defined standards demonstrating authenticity. This could include having linked social accounts on their artist profile, consistent listener activity or other signals of a real artist behind the profile, the company said, such as merchandise or concert dates. In its blog post, Spotify said more than 99% of the artists listeners actively search for will be verified, representing hundreds of thousands of artists. It said the process would prioritise acts with important contributions to music culture and history, rather than content farms, with the platform rolling out verification and badges over the coming weeks.
Gone Fishing: Neural Active Learning with Fisher Embeddings
There is an increasing need for effective active learning algorithms that are compatible with deep neural networks. This paper motivates and revisits a classic, Fisher-based active selection objective, and proposes BAIT, a practical, tractable, and high-performing algorithm that makes it viable for use with neural models. BAIT draws inspiration from the theoretical analysis of maximum likelihood estimators (MLE) for parametric models. It selects batches of samples by optimizing a bound on the MLE error in terms of the Fisher information, which we show can be implemented efficiently at scale by exploiting linear-algebraic structure especially amenable to execution on modern hardware. Our experiments demonstrate that BAIT outperforms the previous state of the art on both classification and regression problems, and is flexible enough to be used with a variety of model architectures.
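To make the idea of Fisher-based batch selection concrete, here is a minimal sketch, not the authors' implementation. It assumes a linear-Gaussian model, so that the Fisher information of a sample x is simply the outer product x x^T, and it assumes the objective tr(M^{-1} I_pool), where M is the regularised information of the selected batch and I_pool is the average information of the pool. The function name `greedy_fisher_batch` and the `ridge` parameter are hypothetical, and the selection is forward-greedy only.

```python
import numpy as np


def greedy_fisher_batch(X_pool, batch_size, ridge=1.0):
    """Greedily pick rows of X_pool that shrink tr(M^{-1} I_pool) the most.

    Illustrative assumption: linear-Gaussian model, so the Fisher
    information contributed by a sample x is the outer product x x^T.
    """
    n, d = X_pool.shape
    fisher_pool = X_pool.T @ X_pool / n   # average Fisher information of the pool
    M_inv = np.eye(d) / ridge             # inverse of the regularised batch information
    selected = []
    for _ in range(batch_size):
        # Row i of V is (M^{-1} x_i)^T, because M_inv is symmetric.
        V = X_pool @ M_inv
        # Sherman-Morrison: adding x x^T to M lowers the objective by
        #   x^T M^{-1} I_pool M^{-1} x / (1 + x^T M^{-1} x).
        numer = np.einsum("ij,jk,ik->i", V, fisher_pool, V)
        denom = 1.0 + np.einsum("ij,ij->i", V, X_pool)
        gain = numer / denom
        if selected:
            gain[selected] = -np.inf      # never pick the same sample twice
        best = int(np.argmax(gain))
        selected.append(best)
        v = V[best]
        # Rank-one update of the inverse instead of re-inverting M.
        M_inv = M_inv - np.outer(v, v) / (1.0 + v @ X_pool[best])
    return selected
```

Calling `greedy_fisher_batch(X, 5)` on an array of shape `(n, d)` returns five distinct row indices. The rank-one inverse update is the kind of linear-algebraic structure that keeps each greedy step to a few matrix products, which is why this style of selection maps well onto modern hardware.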
Reflections on the Reproducibility of Commercial LLM Performance in Empirical Software Engineering Studies
Angermeir, Florian, Amougou, Maximilian, Kreitz, Mark, Bauer, Andreas, Linhuber, Matthias, Fucci, Davide, C., Fabiola Moyón, Mendez, Daniel, Gorschek, Tony
Large Language Models (LLMs) have gained remarkable interest in industry and academia. The increasing interest in LLMs in academia is also reflected in the number of publications on this topic in recent years. For instance, 78 of the roughly 425 publications at ICSE 2024 alone performed experiments with LLMs. Conducting empirical studies with LLMs remains challenging and raises questions on how to achieve reproducible results, for both researchers and practitioners. One important step towards excelling in empirical research on LLMs and their application is to first understand to what extent current research results are reproducible and what factors may impede reproducibility. This investigation is within the scope of our work. We contribute an analysis of the reproducibility of LLM-centric studies, provide insights into the factors impeding reproducibility, and discuss suggestions on how to improve the current state. In particular, we studied the 85 articles describing LLM-centric studies published at ICSE 2024 and ASE 2024. Of the 85 articles, 18 provided research artefacts and used OpenAI models. We attempted to replicate those 18 studies. Of the 18 studies, only five were sufficiently complete and executable. We were unable to fully reproduce the results of any of the five studies. Two studies seemed to be partially reproducible, and three studies did not seem to be reproducible. Our results highlight not only the need for stricter research artefact evaluations but also for more robust study designs to ensure the reproducible value of future publications.
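The counts reported in the abstract form a funnel, and a few lines of arithmetic show how quickly it narrows. This is only a sketch over the numbers quoted above; the stage labels and the helper `share` are illustrative names, not taken from the paper.

```python
# Counts as reported in the abstract; labels are paraphrased for illustration.
funnel = [
    ("LLM-centric articles at ICSE 2024 and ASE 2024", 85),
    ("provided artefacts and used OpenAI models", 18),
    ("sufficiently complete and executable", 5),
    ("fully reproduced", 0),
]


def share(part, whole):
    """Percentage of `whole` that `part` represents, to one decimal place."""
    return round(100 * part / whole, 1)


# Compare each stage with the one before it.
for (label, count), (_, previous) in zip(funnel[1:], funnel[:-1]):
    print(f"{label}: {count} of {previous} ({share(count, previous)}%)")
```

Each stage is expressed relative to the previous one, so the percentages describe attrition at that step and not a share of the original 85 articles.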
#ICML2025 outstanding position paper: Interview with Jaeho Kim on addressing the problems with conference reviewing
At this year's International Conference on Machine Learning (ICML2025), Jaeho Kim, Yunseok Lee and Seulki Lee won an outstanding position paper award for their work Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards. We hear from Jaeho about the problems they were trying to address, and their proposed author feedback mechanism and reviewer reward system. Our position paper addresses the problems plaguing current AI conference peer review systems, while also raising questions about the future direction of peer review. The imminent problem with the current peer review system in AI conferences is the exponential growth in paper submissions driven by increasing interest in AI. To put this in numbers, NeurIPS received over 30,000 submissions this year, while ICLR saw a 59.8% increase in submissions in just one year.
Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence
Baldwin, Julian D, Dinh, Christina, Mukerji, Arjun, Sanghavi, Neil, Gombar, Saurabh
The growing use of large language models (LLMs) for biomedical question answering raises concerns about the accuracy and evidentiary support of their responses. To address this, we present Answered with Evidence, a framework for evaluating whether LLM-generated answers are grounded in scientific literature. We analyzed thousands of physician-submitted questions using a comparative pipeline that included: (1) Alexandria, formerly known as the Atropos Evidence Library, a retrieval-augmented generation (RAG) system based on novel observational studies, and (2) two PubMed-based retrieval-augmented systems (System and Perplexity). We found that PubMed-based systems provided evidence-supported answers for approximately 44% of questions, while the novel evidence source did so for about 50%. Combined, these sources enabled reliable answers to over 70% of biomedical queries. As LLMs become increasingly capable of summarizing scientific content, maximizing their value will require systems that can accurately retrieve both published and custom-generated evidence--or generate such evidence in real time.
Entrepreneur 'humiliated' after London Tech Week turns her and baby away
An entrepreneur has told how she was left feeling "humiliated" after being turned away from London Tech Week, an annual corporate event, because she was with her baby daughter. Davina Schonle was prevented from entering the event on Monday after travelling for three hours with her eight-month-old and had to cancel meetings with potential suppliers to her tech startup. Schonle told TheBusinessDesk.com that as she went to the entrance with her daughter in her pram: "I was asked if I was a VIP. I was then told I wasn't allowed in with a baby. I went to get my badge, but was then taken over to the organisers from Informa, who told me they weren't insured. But they asked again if I was a VIP or speaker, and later another lady came over and twisted my badge around to see, clearly checking to see if I was a VIP."
Rotten Tomatoes further dilutes its utility with 'Verified Hot' badge
Rotten Tomatoes just added a new "Verified Hot" badge that indicates an overall positive user score that will join the "Certified Fresh" badge for critic scores. To qualify for this designation, a movie or show needs to have a Verified Audience Score of 90 percent or higher. Finally, the dregs will be slapped with a "Stale" badge, which is for any show or movie that falls beneath 60 percent. Rotten Tomatoes is trying to get around review bombing here by mandating that user reviews be from people who actually saw the movie in question. There are a couple of little problems with this. It verifies that a consumer saw the movie via the ticketing firm Fandango, and there are plenty of other ticketing firms out there, including, you know, the theater cashier.
Meta changes its labels for AI-generated images after complaints from photographers
Meta is updating its "Made with AI" labels after widespread complaints from photographers that the company was mistakenly flagging non-AI-generated content. In an update, the company said that it will change the wording to "AI info" because the current labels "weren't always aligned with people's expectations and didn't always provide enough context." The company introduced the "Made with AI" labels earlier this year after criticism from the Oversight Board about its "manipulated media" policy. Meta said that, like many of its peers, it would rely on "industry standard" signals to determine when generative AI had been used to create an image. However, it wasn't long before photographers began noticing that Facebook and Instagram were applying the badge on images that hadn't actually been created with AI.
How gamification took over the world
For some, this phenomenon leads to an interest in flow states and immersion. For others, it's simply a reason to play more games. For a handful of consultants, startup gurus, and game designers in the late 2000s, it became the key to unlocking our true human potential. In her 2010 TED Talk, "Gaming Can Make a Better World," the game designer Jane McGonigal called this engaged state "blissful productivity." "There's a reason why the average World of Warcraft gamer plays for 22 hours a week," she said.