behavioral test
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > Netherlands (0.04)
- (3 more...)
- Leisure & Entertainment > Games (1.00)
- Health & Medicine (0.93)
Automatic Differential Diagnosis using Transformer-Based Multi-Label Sequence Classification
Sadi, Abu Adnan, Khan, Mohammad Ashrafuzzaman, Saber, Lubaba Binte
As the field of artificial intelligence progresses, assistive technologies are becoming more widely used across all industries. The healthcare industry is no different, with numerous studies being done to develop assistive tools for healthcare professionals. Automatic diagnostic systems are one such beneficial tool that can assist with a variety of tasks, including collecting patient information, analyzing test results, and diagnosing patients. However, the idea of developing systems that can provide a differential diagnosis has been largely overlooked in most of these research studies. In this study, we propose a transformer-based approach for providing differential diagnoses based on a patient's age, sex, medical history, and symptoms. We use the DDXPlus dataset, which provides differential diagnosis information for patients based on 49 disease types. Firstly, we propose a method to process the tabular patient data from the dataset and engineer them into patient reports to make them suitable for our research. In addition, we introduce two data modification modules to diversify the training data and consequently improve the robustness of the models. We approach the task as a multi-label classification problem and conduct extensive experiments using four transformer models. All the models displayed promising results by achieving over 97% F1 score on the held-out test set. Moreover, we design additional behavioral tests to get a broader understanding of the models. In particular, for one of our test cases, we prepared a custom test set of 100 samples with the assistance of a doctor. The results on the custom set showed that our proposed data modification modules improved the model's generalization capabilities. We hope our findings will provide future researchers with valuable insights and inspire them to develop reliable systems for automatic differential diagnosis.
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- North America > Dominican Republic (0.04)
- Europe > Bulgaria > Varna Province > Varna (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
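The differential diagnosis abstract above describes two steps that a short sketch can make concrete: turning a tabular patient record into a text report, and scoring multi-label predictions with F1. The following is only an illustration under stated assumptions. The field names (`age`, `sex`, `history`, `symptoms`), the report template, and the choice of micro-averaged F1 are hypothetical and are not taken from the paper or from DDXPlus.

```python
from typing import Any, Dict, List, Set


def record_to_report(record: Dict[str, Any]) -> str:
    """Flatten one tabular patient record into a short free-text report.

    The field names and the sentence template are hypothetical; the abstract
    does not specify how the authors build their reports.
    """
    history = ", ".join(record["history"]) or "none reported"
    symptoms = ", ".join(record["symptoms"]) or "none reported"
    return (
        f"Age: {record['age']}. Sex: {record['sex']}. "
        f"History: {history}. Symptoms: {symptoms}."
    )


def micro_f1(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F1 over label sets, one set of diagnoses per patient.

    Micro-averaging is an assumption; the abstract only reports "F1 score".
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # diagnoses predicted and present in the gold set
        fp += len(p - g)   # diagnoses predicted but absent from the gold set
        fn += len(g - p)   # gold diagnoses the model missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    patient = {"age": 42, "sex": "F", "history": ["asthma"],
               "symptoms": ["cough", "fever"]}
    print(record_to_report(patient))
    print(micro_f1([{"a", "b"}, {"c"}], [{"a"}, {"c", "d"}]))
```

Treating each patient's differential as a set of labels is what makes the task multi-label: a prediction can be partly right, and the F1 score credits the overlap rather than requiring an exact match.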
Can AI and humans genuinely communicate?
Can AI and humans genuinely communicate? In this article, after giving some background and motivating my proposal (sections 1 to 3), I explore a way to answer this question that I call the "mental-behavioral methodology" (sections 4 and 5). This methodology has three steps. First, spell out what mental capacities are sufficient for human communication (as opposed to communication more generally). Second, spell out the experimental paradigms required to test whether a behavior exhibits these capacities. Third, apply or adapt these paradigms to test whether an AI displays the relevant behaviors. If the first two steps are successfully completed, and if the AI passes the tests with human-like results, this constitutes evidence that this AI and humans can genuinely communicate. This mental-behavioral methodology has the advantage that we don't need to understand the workings of black-box algorithms, such as standard deep neural networks. This is comparable to the fact that we don't need to understand how human brains work to know that humans can genuinely communicate. This methodology also has its disadvantages, and I will discuss some of them (section 6).
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (4 more...)
- Health & Medicine > Therapeutic Area > Neurology (0.67)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)
UKP-SQuARE: An Interactive Tool for Teaching Question Answering
Fang, Haishuo, Puerto, Haritz, Gurevych, Iryna
The exponential growth of question answering (QA) has made it an indispensable topic in any Natural Language Processing (NLP) course. Additionally, the breadth of QA derived from this exponential growth makes it an ideal scenario for teaching related NLP topics such as information retrieval, explainability, and adversarial attacks among others. In this paper, we introduce UKP-SQuARE as a platform for QA education. This platform provides an interactive environment where students can run, compare, and analyze various QA models from different perspectives, such as general behavior, explainability, and robustness. Therefore, students can get a first-hand experience in different QA techniques during the class. Thanks to this, we propose a learner-centered approach for QA education in which students proactively learn theoretical concepts and acquire problem-solving skills through interactive exploration, experimentation, and practical assignments, rather than solely relying on traditional lectures. To evaluate the effectiveness of UKP-SQuARE in teaching scenarios, we adopted it in a postgraduate NLP course and surveyed the students after the course. Their positive feedback shows the platform's effectiveness in their course and invites a wider adoption.
- Questionnaire & Opinion Survey (1.00)
- Research Report (0.64)
- Instructional Material > Course Syllabus & Notes (0.46)
- Government (1.00)
- Education > Curriculum (0.68)
- Information Technology > Security & Privacy (0.52)
- Education > Educational Setting (0.46)
Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex
Lovering, Charles, Forde, Jessica Zosa, Konidaris, George, Pavlick, Ellie, Littman, Michael L.
AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex. While researchers and game commentators have suggested that AlphaZero uses concepts that humans consider important, it is unclear how these concepts are captured in the network. We investigate AlphaZero's internal representations in the game of Hex using two evaluation techniques from natural language processing (NLP): model probing and behavioral tests. In doing so, we introduce new evaluation tools to the RL community, and illustrate how evaluations other than task performance can be used to provide a more complete picture of a model's strengths and weaknesses. Our analyses in the game of Hex reveal interesting patterns and generate some testable hypotheses about how such models learn in general. For example, we find that MCTS discovers concepts before the neural network learns to encode them. We also find that concepts related to short-term end-game planning are best encoded in the final layers of the model, whereas concepts related to long-term planning are encoded in the middle layers of the model.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > Netherlands (0.04)
- (3 more...)
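Model probing, one of the two evaluation techniques named in the AlphaZero abstract above, trains a simple classifier on frozen network activations to check whether a concept can be read out of them. The sketch below assumes a linear (logistic regression) probe and uses toy activation vectors. The probe type, the hyperparameters, and the numbers are all illustrative assumptions and do not come from the paper.

```python
import math
from typing import List, Sequence


def train_linear_probe(
    activations: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 200,
    lr: float = 0.5,
) -> List[float]:
    """Fit logistic regression by batch gradient descent.

    Returns the weights with the bias stored as the last element.
    """
    dim = len(activations[0])
    weights = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(activations, labels):
            # zip stops at len(x), so the trailing bias is added separately.
            z = sum(w * v for w, v in zip(weights, x)) + weights[-1]
            error = 1.0 / (1.0 + math.exp(-z)) - y
            for i, v in enumerate(x):
                grad[i] += error * v
            grad[-1] += error
        weights = [w - lr * g / len(labels) for w, g in zip(weights, grad)]
    return weights


def probe_accuracy(
    weights: Sequence[float],
    activations: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> float:
    """Fraction of examples whose concept label the probe predicts correctly."""
    correct = 0
    for x, y in zip(activations, labels):
        z = sum(w * v for w, v in zip(weights, x)) + weights[-1]
        correct += int((z > 0) == bool(y))
    return correct / len(labels)


if __name__ == "__main__":
    # Toy "layer activations"; label 1 means the concept is present.
    train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    train_y = [1, 1, 0, 0]
    probe = train_linear_probe(train_x, train_y)
    # Evaluate on held-out points, as a real probing study would.
    print(probe_accuracy(probe, [[0.8, 0.2], [0.2, 0.8]], [1, 0]))
```

Running the same probe on activations from different layers or training checkpoints, and comparing held-out accuracy, is one way to ask where and when a concept becomes decodable.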
What the Rat Brain Tells Us About Yours - Issue 47: Consciousness
A little more than a decade ago, Mike Mendl developed a new test for gauging a laboratory rat's level of happiness. Mendl, an animal welfare researcher in the veterinary school at the University of Bristol in England, was looking for an objective way to tell whether animals in captivity were suffering. Specifically, he wanted to be able to measure whether, and how much, disruptions in lab rats' routines--being placed in an unfamiliar cage, say, or experiencing a change in the light/dark cycle of the room in which they were housed--were bumming them out. He and his colleagues explicitly drew on an extensive literature in psychology that describes how people with mood disorders such as depression process information and make decisions: They tend to focus on and recall more negative events and to judge ambiguous things in a more negative way. You might say that they tend to see the proverbial glass as half-empty rather than half-full. "We thought that it's easier to measure cognitive things than emotional ones, so we devised a test that would give us some indication of how animals responded under ambiguity," Mendl says.
- Europe > United Kingdom > England (0.24)
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > Massachusetts > Hampshire County > Northampton (0.04)
- Africa > Mozambique (0.04)
- Research Report > New Finding (0.47)
- Research Report > Experimental Study (0.47)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Letters to the Editor
Cronin, Matthew R., Firschein, Oscar, Ogasawara, Gary, Rich, Elaine
As a communication scholar, I am well aware that many traditionalists view the respective disciplines of [...] My recently completed master's thesis argues against this view. Many concepts from the field of communication have been used by artificial intelligence researchers and scholars in the development of AI. The central argument of my perspective is that artificial intelligence is [...] This latest computer revolution has taken shape only within the past five years. [...] These two revolutions have been operating independently with limited success, instead of together with potentially phenomenal success. The multimedia revolution has successfully broken into the marketplace on all levels, but lacks the key component (symbolic reasoning) needed for [...] potential to provide the current multimedia [...] By transcending traditional [...] The workshops on Artificial Intelligence and Statistics have broadened the flow of information between the two fields and encouraged interdisciplinary work. General Chair: R.W. Oldford (U. Waterloo); Program Chair: P. Cheeseman [...] Sponsors: Soc. for A.I. and Stats., Int'l Ass. for Stat. [...]
- North America > Canada > Saskatchewan (0.25)
- North America > United States > California > Santa Clara County > San Jose (0.05)
- Leisure & Entertainment (0.69)
- Government > Regional Government > North America Government > United States Government (0.49)
- Government > Space Agency (0.30)
- Information Technology > Artificial Intelligence > Games (0.47)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (0.32)