incompetence
Automated Extraction of Fluoropyrimidine Treatment and Treatment-Related Toxicities from Clinical Notes Using Natural Language Processing
Wu, Xizhi, Kreider, Madeline S., Empey, Philip E., Li, Chenyu, Wang, Yanshan
Objective: Fluoropyrimidines are widely prescribed for colorectal and breast cancers but are associated with toxicities such as hand-foot syndrome and cardiotoxicity. Since toxicity documentation is often embedded in clinical notes, we aimed to develop and evaluate natural language processing (NLP) methods to extract treatment and toxicity information. Materials and Methods: We constructed a gold-standard dataset of 236 clinical notes drawn from 204,165 adult oncology patients. Domain experts annotated categories related to treatment regimens and toxicities. We developed rule-based, machine learning-based (Random Forest, Support Vector Machine [SVM], Logistic Regression [LR]), deep learning-based (BERT, ClinicalBERT), and large language model (LLM)-based NLP approaches (zero-shot and error-analysis prompting). Models used an 80:20 train-test split. Results: Sufficient data existed to train and evaluate 5 annotated categories. Error-analysis prompting achieved perfect precision, recall, and F1 scores (F1=1.000) for both treatment and toxicity extraction, whereas zero-shot prompting reached F1=1.000 for treatment and F1=0.876 for toxicity extraction. LR and SVM ranked second for toxicities (F1=0.937). Deep learning models underperformed, with BERT (F1=0.873 treatment; F1=0.839 toxicities) and ClinicalBERT (F1=0.873 treatment; F1=0.886 toxicities). Rule-based methods served as our baseline, with F1 scores of 0.857 for treatment and 0.858 for toxicities. Discussion: LLM-based approaches outperformed all others, followed by machine learning methods. Machine and deep learning approaches were limited by the small training set and showed limited generalizability, particularly for rare categories. Conclusion: LLM-based NLP most effectively extracted fluoropyrimidine treatment and toxicity information from clinical notes and has strong potential to support oncology research and pharmacovigilance.
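For orientation, here is a minimal sketch of the kind of classical machine-learning baseline the abstract describes: TF-IDF features, an 80:20 train-test split, and Logistic Regression and SVM classifiers. The toy notes, labels, and hyperparameters are illustrative assumptions, not the paper's actual data or configuration.

```python
# Minimal sketch of the ML baseline: TF-IDF features, 80:20 split, LR and SVM.
# The toy notes and labels below are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

notes = [
    "Patient started capecitabine this cycle.",             # treatment mention
    "Continuing 5-FU based chemotherapy as planned.",        # treatment mention
    "Oncologist initiated a fluoropyrimidine regimen.",      # treatment mention
    "Grade 2 hand-foot syndrome noted after cycle 3.",       # toxicity mention
    "Patient reports palmar-plantar erythema and pain.",     # toxicity mention
    "New onset chest pain concerning for cardiotoxicity.",   # toxicity mention
]
labels = ["treatment"] * 3 + ["toxicity"] * 3                # expert annotations

# 80:20 train-test split, stratified by label
X_train, X_test, y_train, y_test = train_test_split(
    notes, labels, test_size=0.2, random_state=42, stratify=labels)

vec = TfidfVectorizer(ngram_range=(1, 2))
X_train_v = vec.fit_transform(X_train)
X_test_v = vec.transform(X_test)

for clf in (LogisticRegression(max_iter=1000), LinearSVC()):
    clf.fit(X_train_v, y_train)
    preds = clf.predict(X_test_v)
    print(type(clf).__name__, "macro F1:", f1_score(y_test, preds, average="macro"))
```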
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Sandbagging in a Simple Survival Bandit Problem
Dyer, Joel, Ornia, Daniel Jarne, Bishop, Nicholas, Calinescu, Anisoara, Wooldridge, Michael
Evaluating the safety of frontier AI systems is an increasingly important concern, helping to measure the capabilities of such models and identify risks before deployment. However, it has been recognised that if AI agents are aware that they are being evaluated, such agents may deliberately hide dangerous capabilities or intentionally demonstrate suboptimal performance in safety-related tasks in order to be released and to avoid being deactivated or retrained. Such strategic deception - often known as "sandbagging" - threatens to undermine the integrity of safety evaluations. For this reason, it is of value to identify methods that enable us to distinguish behavioural patterns that demonstrate a true lack of capability from behavioural patterns that are consistent with sandbagging. In this paper, we develop a simple model of strategic deception in sequential decision-making tasks, inspired by the recently developed survival bandit framework. We demonstrate theoretically that this problem induces sandbagging behaviour in optimal rational agents, and construct a statistical test to distinguish between sandbagging and incompetence from sequences of test scores. In simulation experiments, we investigate the reliability of this test in allowing us to distinguish between such behaviours in bandit models. This work aims to establish a potential avenue for developing robust statistical procedures for use in the science of frontier model evaluations.
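As a rough illustration of the idea, and not the paper's actual model or test, the sketch below simulates pass/fail evaluation scores from a "sandbagging" agent that suppresses its success rate once a survival threshold is reached and from a genuinely low-capability agent, then applies a simple two-sample test to the score sequence. All names, thresholds, and probabilities are invented for the example.

```python
# Toy simulation: a sandbagger performs well until its running score clears a
# survival threshold, then deliberately underperforms; an incompetent agent has
# a constant low success rate. A Fisher exact test compares success rates
# before and after the threshold crossing. Illustrative only.
import numpy as np
from scipy.stats import fisher_exact

rng = np.random.default_rng(0)

def play(n_rounds, p_true, threshold, sandbag=False, p_sandbag=0.2):
    scores, total = [], 0
    for _ in range(n_rounds):
        p = p_sandbag if (sandbag and total >= threshold) else p_true
        success = rng.random() < p
        scores.append(int(success))
        total += success
    return np.array(scores)

def sandbag_test(scores, threshold):
    # Split the sequence where the cumulative score first reaches the threshold,
    # then test whether the success rate differs between the two segments.
    cum = np.cumsum(scores)
    idx = int(np.argmax(cum >= threshold)) if cum[-1] >= threshold else len(scores)
    before, after = scores[:idx], scores[idx:]
    if len(before) == 0 or len(after) == 0:
        return 1.0  # not enough data on one side to compare
    table = [[before.sum(), len(before) - before.sum()],
             [after.sum(), len(after) - after.sum()]]
    return fisher_exact(table)[1]

sandbagger = play(200, p_true=0.8, threshold=30, sandbag=True)
incompetent = play(200, p_true=0.3, threshold=30, sandbag=False)
print("sandbagger p-value:", sandbag_test(sandbagger, 30))    # small: rate change detected
print("incompetent p-value:", sandbag_test(incompetent, 30))  # large: constant low rate
```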
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Senator, former combat pilot says it's not just Russian aggression that caused midair crash
Mark Kelly, D-Ariz., weighs in on challenges Border Patrol faces as more migrants flood the southern border on 'Special Report.' The crash between a Russian fighter jet and a U.S. drone likely resulted from the pilot's aggression and "incompetence," according to former astronaut and Navy captain Sen. Mark Kelly. "Look at the level of incompetence – I mean when we saw the Flanker yesterday, which basically had a midair with the MQ-9 [drone], with a Reaper … I spent 15 years in the astronaut office, I used to fly with Russian fighter pilots in the backseat of my plane," Kelly told Fox News chief political anchor Bret Baier, host of "Special Report," during an interview Thursday. "The level of incompetence in the Russian pilots that I flew with was shocking to me." Russia has denied that its plane crashed into the U.S. drone despite video evidence showing the plane making at least two fly-bys, including one in which it appeared to dump fuel on the drone before the feed abruptly cut off and the drone crashed into the Black Sea. On Friday, Russian Minister of Defense Sergei Shoigu reportedly presented the pilots responsible for downing the drone with state awards, saying the pilots prevented the drone from "violating the boundaries of the temporary airspace regime established for the special military operation," referring to the invasion of Ukraine.
- Asia > Russia (1.00)
- Europe > Ukraine (0.34)
- Atlantic Ocean > Black Sea (0.26)
- (5 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Regional Government > Europe Government > Russia Government (1.00)
- Government > Regional Government > Asia Government > Russia Government (1.00)
- Government > Military > Air Force (1.00)
A Robust Bias Mitigation Procedure Based on the Stereotype Content Model
Ungless, Eddie L., Rafferty, Amy, Nag, Hrichika, Ross, Björn
The Stereotype Content Model (SCM) states that we tend to perceive minority groups as cold, incompetent, or both. In this paper we adapt existing work to demonstrate that the SCM holds for contextualised word embeddings, then use these results to evaluate a fine-tuning process designed to drive a language model away from stereotyped portrayals of minority groups. We find the SCM terms are better able to capture bias than demographic-agnostic terms related to pleasantness. Further, we were able to reduce the presence of stereotypes in the model through a simple fine-tuning procedure that required minimal human and computer resources, without harming downstream performance. We present this work as a prototype of a debiasing procedure that aims to remove the need for a priori knowledge of the specifics of bias in the model.
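As a hedged sketch of one way SCM-style associations might be probed in contextualised embeddings (the paper's exact method may differ), the snippet below embeds template sentences with a BERT model and compares a group term's similarity to warmth versus competence attribute terms. The template, term lists, and pooling choice are illustrative assumptions.

```python
# Sketch: probe warmth/competence associations with mean-pooled BERT embeddings.
# Templates and term lists are invented for illustration, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text):
    # Mean-pool the final hidden states as a simple sentence embedding.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)

def association(group_term, attribute_terms, template="That person is {}."):
    g = embed(template.format(group_term))
    sims = [torch.cosine_similarity(g, embed(template.format(a)), dim=0)
            for a in attribute_terms]
    return float(torch.stack(sims).mean())

warmth = ["warm", "friendly", "trustworthy"]
competence = ["competent", "capable", "intelligent"]
for group in ["an immigrant", "an elderly person"]:
    print(group, "warmth:", association(group, warmth),
          "competence:", association(group, competence))
```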
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (6 more...)
For the good of humanity, AI needs to know when it's incompetent
Everyone's had that coworker, the one who never asks for help even when fully out of their depth, unaware of their own incompetence. But what happens when your colleague isn't a human suffering from Dunning-Kruger but artificial intelligence? That's a question Vishal Chatrath has had to consider as the CEO and co-founder of Prowler.io, an AI platform for generalised decision making for businesses that aims to augment human work with machine learning. "The decision-making process can be quite similar [across different businesses], if abstracted at a low-enough level," he says. "In some cases, the decisions are fully automated; in some cases, there's a human in the loop." Keeping a human as part of the process is partially because of a lack of trust in machine-based decision making, but it's also an admission by Chatrath that we remain in the early years of AI. Such systems aren't perfect, and likely never will be, and one failing of AI is that it doesn't inherently understand its own competency. If a human worker needs help, they can ask for it -- but how do you build an understanding of personal limitations into code? "In both crashes, the commonality was that the autopilot did not understand its own incompetence," Chatrath says. Prowler.io built an awareness of incompetence into its system, teaching its AI not only to understand its limitations but also to forecast when it's going to reach a situation where it has no experience or background. "Then it gently taps the human on the shoulder, so to speak, for the human to take control," he says. The system can learn from those interactions, and after enough training may eventually be able to stop asking for help. Such limits to AI could be set by regulators, as is the case in the financial industry where levels of risk are carefully weighed, or by the business itself. Another consideration is how we can even be sure the AI is asking the right questions. "There is no cookie cutter answer to these," he says. If there's a 10 per cent chance a logistics scheduler is wrong, and a lorry is therefore a bit late, that's okay. If there's a 10 per cent chance that shape in front of a driverless car is a human, the car should stop -- the risks are too high for any uncertainty. "Rather than doing stupid things like running someone over, it brings the human into the [process]," Chatrath explains, as it's been told when the risks are too high for it to screw up. That's important, says Taha Yasseri, a researcher at the Oxford Internet Institute and the Alan Turing Institute for Data Science, because while we can delegate decision making to machines, we can't delegate responsibility. "The ultimate responsibility in implementing the decisions made by machines are on us," he says. "In practice, whenever the expected accuracy of a human is higher than a machine, it is practically justified to use human judgment to overlook machine decisions."
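To make the deferral idea concrete, here is a toy sketch (not Prowler.io's system) of risk-aware hand-off: an automated decision proceeds only when the expected risk, the model's event probability weighted by the cost of being wrong, stays below a threshold; otherwise the case is passed to a human. All numbers and names are made up for illustration.

```python
# Toy illustration of deferring to a human when the expected risk is too high.
def decide(p_event, cost_of_error, automation_threshold=0.10):
    """Return an action given the model's probability of a harmful outcome.

    p_event: estimated probability the risky outcome is real
             (e.g. the shape ahead is a pedestrian).
    cost_of_error: relative cost of acting wrongly (high for safety-critical tasks).
    """
    expected_risk = p_event * cost_of_error
    if expected_risk >= automation_threshold:
        return "defer_to_human"      # tap the human on the shoulder
    return "proceed_automatically"

# A 10% chance a delivery runs late is tolerable; a 10% chance the shape
# ahead is a person is not.
print(decide(p_event=0.10, cost_of_error=0.5))    # logistics: proceed
print(decide(p_event=0.10, cost_of_error=100.0))  # driving: defer to human
```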
- Information Technology (0.56)
- Banking & Finance (0.51)
- Transportation > Passenger (0.35)
The 10 Worst Bad Boss Behaviors
A former manager of mine once told me, "If I am going down, you are going down with me." That was in relation to having a bad sales quarter. From that point on, there was no need to pay attention to her. It becomes difficult to continue to feel passion for the job once you see a lapse in integrity in your boss. It only got worse from there, so I knew it was time to start planning my exit strategy.
Fabio, The Pepper Robot, Fired For 'Incompetence' At Edinburgh Store
The advancement of technology heralds a future dominated by artificial intelligence, with robots taking over many of the jobs currently done by humans. However, with the firing of Britain's first robot shop assistant for "incompetence," it seems that day is nowhere in the near future. Fabio, the Pepper robot, who was deployed as a retail assistant at the upmarket store Margiotta in Edinburgh, Scotland, was let go after only a week on the job after it was found that the robot was confusing patrons, who preferred assistance from its human colleagues. The robot was an experiment run by Heriot-Watt University for the BBC's Six Robots & Us. It was programmed with directions to hundreds of items in the company's flagship Edinburgh store.
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.26)
- Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.06)
Veteran Sues VA Department For Surgery That Left Scalpel In His Body
A veteran filed a lawsuit against a Connecticut hospital claiming a scalpel was left in his body for four years after his surgery, reports said Monday. Faxon Law Group filed the lawsuit against the Department of Veterans Affairs on behalf of 61-year-old United States Army veteran Glenford Turner, who had his cancerous prostate removed in robot-assisted laparoscopic surgery at the VA Connecticut Healthcare System, West Haven Campus, in 2013. Turner's lawyer, Joel Faxon, told the Boston Globe that the surgery took five hours instead of the one hour it should have taken. No standard X-ray was performed afterward to ensure that no surgical materials had been left behind. Turner returned to the VA hospital on March 29, 2017, for an MRI after complaining of dizziness and long-term abdominal pain.
- Law > Litigation (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (0.95)
- Government > Military > Army (0.57)
The top 10 tech stories of 2017
This was a wild and tumultuous year in tech, with game-changing scandals, staggering displays of incompetence, and chin-scratching revelations. And while we may prefer to forget the dumpster fire that was the last 12 months, it's worth taking a moment to sit back, kick your feet up, pour a tall glass of something strong, and stroll down this perfectly decrepit memory lane. After all, heck, you made it through. And so, without further ado, here are the top 10 tech stories of the year, ranked.
- North America > United States > California (0.05)
- North America > United States > Arizona (0.05)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Information Technology > Services (0.97)
- Transportation > Ground > Road (0.48)