unique challenge
Causal Attribution of Model Performance Gaps in Medical Imaging Under Distribution Shifts
Gordaliza, Pedro M., Molchanova, Nataliia, Banus, Jaume, Sanchez, Thomas, Cuadra, Meritxell Bach
Deep learning models for medical image segmentation suffer significant performance drops due to distribution shifts, but the causal mechanisms behind these drops remain poorly understood. We extend causal attribution frameworks to high-dimensional segmentation tasks, quantifying how acquisition protocols and annotation variability independently contribute to performance degradation. We model the data-generating process through a causal graph and employ Shapley values to fairly attribute performance changes to individual mechanisms. Our framework addresses unique challenges in medical imaging: high-dimensional outputs, limited samples, and complex mechanism interactions. Validation on multiple sclerosis (MS) lesion segmentation across 4 centers and 7 annotators reveals context-dependent failure modes: annotation protocol shifts dominate when crossing annotators (7.4% $\pm$ 8.9% DSC attribution), while acquisition shifts dominate when crossing imaging centers (6.5% $\pm$ 9.1%). This mechanism-specific quantification enables practitioners to prioritize targeted interventions based on deployment context.
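As a rough illustration of the Shapley-based attribution the abstract describes, the sketch below splits a total performance drop between two shift mechanisms by averaging each mechanism's marginal contribution over all orderings. The function name `shapley_attribution`, the mechanism labels, and the toy DSC drops are assumptions made for illustration; they are not the authors' implementation or results.

```python
from itertools import permutations
from math import factorial


def shapley_attribution(mechanisms, value):
    """Exact Shapley values for a small set of mechanisms.

    `value` maps a frozenset of shifted mechanisms to the performance
    drop observed when only those mechanisms are shifted.
    """
    totals = {m: 0.0 for m in mechanisms}
    for order in permutations(mechanisms):
        shifted = frozenset()
        for m in order:
            with_m = shifted | {m}
            # Marginal contribution of m given the mechanisms already shifted.
            totals[m] += value(with_m) - value(shifted)
            shifted = with_m
    n_orders = factorial(len(mechanisms))
    return {m: total / n_orders for m, total in totals.items()}


# Hypothetical DSC drops for each combination of shifts (illustrative only).
toy_drop = {
    frozenset(): 0.00,
    frozenset({"acquisition"}): 0.06,
    frozenset({"annotation"}): 0.04,
    frozenset({"acquisition", "annotation"}): 0.12,
}

attribution = shapley_attribution(
    ["acquisition", "annotation"], toy_drop.__getitem__
)
print(attribution)
```

The attributions sum to the total drop when both mechanisms shift (the efficiency property), which is what makes the split "fair" in the Shapley sense. Exact enumeration is only practical for a handful of mechanisms, since the number of orderings grows factorially.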
Generative AI for Education (GAIED): Advances, Opportunities, and Challenges
Denny, Paul, Gulwani, Sumit, Heffernan, Neil T., Käser, Tanja, Moore, Steven, Rafferty, Anna N., Singla, Adish
This survey article has grown out of the GAIED (pronounced "guide") workshop organized by the authors at the NeurIPS 2023 conference. We organized the GAIED workshop as part of a community-building effort to bring together researchers, educators, and practitioners to explore the potential of generative AI for enhancing education. This article aims to provide an overview of the workshop activities and highlight several future research directions in the area of GAIED.
Language Detection for Transliterated Content
S, Selva Kumar, Khan, Afifah Khan Mohammed Ajmal, Manjeshwar, Chirag, Banday, Imadh Ajaz
In the contemporary digital era, the Internet functions as an unparalleled catalyst, dismantling geographical and linguistic barriers, an effect particularly evident in texting. This evolution facilitates global communication, transcending physical distances and fostering dynamic cultural exchange. A notable trend is the widespread use of transliteration, where the English alphabet is employed to convey messages in native languages, posing a unique challenge for language technology in accurately detecting the source language. This paper addresses this challenge through a dataset of phone text messages in Hindi and Russian transliterated into English, utilizing BERT for language classification and the Google Translate API for transliteration conversion. The research pioneers innovative approaches to identify and convert transliterated text, navigating challenges in the diverse linguistic landscape of digital communication. Emphasizing the pivotal role of comprehensive datasets for training Large Language Models (LLMs) like BERT, our model showcases exceptional proficiency in accurately identifying and classifying languages from transliterated text. With a validation accuracy of 99%, our model's robust performance underscores its reliability. The comprehensive exploration of transliteration dynamics, supported by innovative approaches and cutting-edge technologies like BERT, positions our research at the forefront of addressing unique challenges in the linguistic landscape of digital communication. Beyond contributing to language identification and transliteration capabilities, this work holds promise for applications in content moderation, analytics, and fostering a globally connected community engaged in meaningful dialogue.
Minds of machines: The great AI consciousness conundrum
Chalmers was an eminently sensible choice to speak about AI consciousness. He'd earned his PhD in philosophy at an Indiana University AI lab, where he and his computer scientist colleagues spent their breaks debating whether machines might one day have minds. In his 1996 book, The Conscious Mind, he spent an entire chapter arguing that artificial consciousness was possible. If he had been able to interact with systems like LaMDA and ChatGPT back in the '90s, before anyone knew how such a thing might work, he would have thought there was a good chance they were conscious, Chalmers says. But when he stood before a crowd of NeurIPS attendees in a cavernous New Orleans convention hall, clad in his trademark leather jacket, he offered a different assessment.
Growing and Serving Large Open-domain Knowledge Graphs
Ilyas, Ihab F., Lacerda, JP, Li, Yunyao, Minhas, Umar Farooq, Mousavi, Ali, Pound, Jeffrey, Rekatsinas, Theodoros, Sumanth, Chiraag
Applications of large open-domain knowledge graphs (KGs) to real-world problems pose many unique challenges. In this paper, we present extensions to Saga, our platform for continuous construction and serving of knowledge at scale. In particular, we describe a pipeline for training knowledge graph embeddings that powers key capabilities such as fact ranking, fact verification, a related entities service, and support for entity linking. We then describe how our platform, including graph embeddings, can be leveraged to create a Semantic Annotation service that links unstructured Web documents to entities in our KG. Semantic annotation of the Web effectively expands our knowledge graph with edges to open-domain Web content, which can be used in various search and ranking problems. Next, we leverage annotated Web documents to drive Open-domain Knowledge Extraction. This targeted extraction framework identifies important coverage issues in the KG, then finds relevant data sources for target entities on the Web and extracts missing information to enrich the KG. Finally, we describe adaptations to our knowledge platform needed to construct and serve private personal knowledge on-device. This includes private incremental KG construction, cross-device knowledge sync, and global knowledge enrichment.
Regulating AI: 3 experts explain why it's difficult to do and important to get right
From fake photos of Donald Trump being arrested by New York City police officers to a chatbot describing a very-much-alive computer scientist as having died tragically, the ability of the new generation of generative artificial intelligence systems to create convincing but fictional text and images is setting off alarms about fraud and misinformation on steroids. Indeed, a group of artificial intelligence researchers and industry figures urged the industry on March 29, 2023, to pause further training of the latest AI technologies or, barring that, for governments to "impose a moratorium." These technologies – image generators like DALL-E, Midjourney and Stable Diffusion, and text generators like Bard, ChatGPT, Chinchilla and LLaMA – are now available to millions of people and don't require technical knowledge to use. Given the potential for widespread harm as technology companies roll out these AI systems and test them on the public, policymakers are faced with the task of determining whether and how to regulate the emerging technology. The Conversation asked three experts on technology policy to explain why regulating AI is such a challenge – and why it's so important to get it right.
Machine Learning Engineering for Edge AI: Challenges and Best Practices
Machine learning engineering is the field of developing, implementing, and maintaining machine learning systems. It involves the application of engineering principles to the design, development, and deployment of machine learning models, algorithms, and applications. The primary focus of ML engineering is to build scalable and efficient machine learning systems that can process large volumes of data and generate accurate predictions. It involves various tasks such as data preparation, model development, model training, model deployment, and model monitoring. ML engineering requires a combination of skills in computer science, mathematics, statistics, and domain-specific knowledge.
Time Series Transformation for Deep Learning
As the field of deep learning continues to evolve and expand, time series data is becoming increasingly important in a variety of applications, including finance, medicine, and manufacturing. However, working with time series data presents unique challenges, including the need for proper transformation in order to effectively train deep learning models. In this article, we'll explore the ins and outs of time series transformation for deep learning and how it can help you achieve better results. Time series data is a sequence of observations recorded over a period of time, such as stock prices, weather patterns, or sensor readings. In many cases, time series data contains important information that can be used to make predictions or identify patterns, but working with this type of data presents unique challenges.
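The snippet does not say which transformations the article covers. As one common example, and purely as an assumption, the sketch below turns a univariate series into fixed-length input windows paired with the value to predict, plus a simple standardization step. The helper names `make_windows` and `standardize` are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (input window, target) pairs.

    Each input holds `window` consecutive values; the target is the
    value `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


def standardize(series):
    """Rescale a series to zero mean and unit variance."""
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    std = variance ** 0.5 or 1.0  # fall back to 1.0 for a constant series
    return [(x - mean) / std for x in series]


readings = [1.0, 2.0, 3.0, 4.0, 5.0]
X, y = make_windows(readings, window=3)
print(X)  # [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
print(y)  # [4.0, 5.0]
```

In practice the mean and standard deviation would usually be computed on the training portion only and then reused for validation and test data, so that no information from the future leaks into training.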
AI Copywriting: A Unique Way of Creating Content for Your Business
If you're looking to write content for your business, you're going to need to find a way to get it in front of your audience. You can't just publish a bunch of random articles and expect your audience to find them. You need to put in the time, effort, and resources that go into creating and distributing content in order to make it successful. If you're looking for a solid guide to writing content for your business, this article is for you. We'll also talk about how AI can make your content writing even more effective.
Top 3 Web Scraping Challenges Solved by AI in 2022
Web scraping has transformed many business processes, but it also poses many technical challenges. The end-to-end process of collecting web data is shown in Figure 1. As the number of pages and the complexity of the websites to be scraped increase, each of these steps faces unique challenges. Artificial intelligence methods help web scraping overcome the unique challenges of each step. In this article, we will introduce you to the top 3 ways that AI enables web scraping to overcome technical challenges, which may be helpful as your web scraping needs scale up and become more complicated.