secondary model
Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images
Ding, Yuzhen, Holmes, Jason M., Feng, Hongying, Li, Baoxin, McGee, Lisa A., Rwigema, Jean-Claude M., Vora, Sujay A., Ma, Daniel J., Foote, Robert L., Patel, Samir H., Liu, Wei
In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D-on-board imaging(OBI) unavailable. But tumor visibility is constrained due to the projection of patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment room with 3D-OBI such as cone beam CT(CBCT), the field of view(FOV) of CBCT is limited with unnecessarily high imaging dose, thus unfavorable for pediatric patients. A solution to this dilemma is to reconstruct 3D CT from kV images obtained at the treatment position. Here, we propose a dual-models framework built with hierarchical ViT blocks. Unlike a proof-of-concept approach, our framework considers kV images as the solo input and can synthesize accurate, full-size 3D CT in real time(within milliseconds). We demonstrate the feasibility of the proposed approach on 10 patients with head and neck (H&N) cancer using image quality(MAE: <45HU), dosimetrical accuracy(Gamma passing rate (2%/2mm/10%)>97%) and patient position uncertainty(shift error: <0.4mm). The proposed framework can generate accurate 3D CT faithfully mirroring real-time patient position, thus significantly improving patient setup accuracy, keeping imaging dose minimum, and maintaining treatment veracity.
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- South America > Peru > Lima Department > Lima Province > Lima (0.04)
- North America > United States > Minnesota > Olmsted County > Rochester (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Tandem Transformers for Inference Efficient LLMs
S, Aishwarya P, Nair, Pranav Ajit, Samaga, Yashas, Boyd, Toby, Kumar, Sanjiv, Jain, Prateek, Netrapalli, Praneeth
The autoregressive nature of conventional large language models (LLMs) inherently limits inference speed, as tokens are generated sequentially. While speculative and parallel decoding techniques attempt to mitigate this, they face limitations: either relying on less accurate smaller models for generation or failing to fully leverage the base LLM's representations. We introduce a novel architecture, Tandem transformers, to address these issues. This architecture uniquely combines (1) a small autoregressive model and (2) a large model operating in block mode (processing multiple tokens simultaneously). The small model's predictive accuracy is substantially enhanced by granting it attention to the large model's richer representations. On the PaLM2 pretraining dataset, a tandem of PaLM2-Bison and PaLM2-Gecko demonstrates a 3.3% improvement in next-token prediction accuracy over a standalone PaLM2-Gecko, offering a 1.16x speedup compared to a PaLM2-Otter model with comparable downstream performance. We further incorporate the tandem model within the speculative decoding (SPEED) framework where the large model validates tokens from the small model. This ensures that the Tandem of PaLM2-Bison and PaLM2-Gecko achieves substantial speedup (around 1.14x faster than using vanilla PaLM2-Gecko in SPEED) while maintaining identical downstream task accuracy.
- Asia > India (0.05)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (4 more...)
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models
Mogridge, Rhiannon, Close, George, Sutherland, Robert, Hain, Thomas, Barker, Jon, Goetze, Stefan, Ragni, Anton
Neural networks have been successfully used for non-intrusive speech intelligibility prediction. Recently, the use of feature representations sourced from intermediate layers of pre-trained self-supervised and weakly-supervised models has been found to be particularly useful for this task. This work combines the use of Whisper ASR decoder layer representations as neural network input features with an exemplar-based, psychologically motivated model of human memory to predict human intelligibility ratings for hearing-aid users. Substantial performance improvement over an established intrusive HASPI baseline system is found, including on enhancement systems and listeners unseen in the training data, with a root mean squared error of 25.3 compared with the baseline of 28.7.
- Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
- Europe > France (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
- Health & Medicine > Therapeutic Area > Otolaryngology (0.38)
- Health & Medicine > Health Care Technology (0.38)
Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment Framework
Pisano, Matthew, Ly, Peter, Sanders, Abraham, Yao, Bingsheng, Wang, Dakuo, Strzalkowski, Tomek, Si, Mei
Modern Large language models (LLMs) can still generate responses that may not be aligned with human expectations or values. While many weight-based alignment methods have been proposed, many of them still leave models vulnerable to attacks when used on their own. To help mitigate this issue, we introduce Bergeron, a framework designed to improve the robustness of LLMs against adversarial attacks. Bergeron employs a two-tiered architecture. Here, a secondary LLM serves as a simulated conscience that safeguards a primary LLM. We do this by monitoring for and correcting potentially harmful text within both the prompt inputs and the generated outputs of the primary LLM. Empirical evaluation shows that Bergeron can improve the alignment and robustness of several popular LLMs without costly fine-tuning. It aids both open-source and black-box LLMs by complementing and reinforcing their existing alignment training.
- Oceania > Australia (0.04)
- North America > Canada (0.04)
- Europe > Russia (0.04)
- (2 more...)
- Government > Military (1.00)
- Information Technology > Security & Privacy (0.85)
Generalised learning of time-series: Ornstein-Uhlenbeck processes
Süzen, Mehmet, Yegenoglu, Alper
In machine learning, statistics, econometrics and statistical physics, $k$-fold cross-validation (CV) is used as a standard approach in quantifying the generalization performance of a statistical model. Applying this approach directly to time series models is avoided by practitioners due to intrinsic nature of serial correlations in the ordered data due to implications like absurdity of using future data to predict past and non-stationarity issues. In this work, we propose a technique called {\it reconstructive cross validation} ($rCV$) that avoids all these issues enabling generalized learning in time-series as a meta-algorithm. In $rCV$, data points in the test fold, randomly selected points from the time series, are first removed. Then, a secondary time series model or a technique is used in reconstructing the removed points from the test fold, i.e., imputation or smoothing. Thereafter, the primary model is build using new dataset coming from the secondary model or a technique. The performance of the primary model on the test set by computing the deviations from the originally removed and out-of-sample (OSS) data are evaluated simultaneously. This amounts to reconstruction and prediction errors. By this procedure serial correlations and data order is retained and $k$-fold cross-validation is reached generically. If reconstruction model uses a technique whereby the existing data points retained exactly, such as Gaussian process regression, the reconstruction itself will not result in information loss from non-reconstructed portion of the original data points. We have applied $rCV$ to estimate the general performance of the model build on simulated Ornstein-Uhlenbeck process. We have shown an approach to build a time-series learning curves utilizing $rCV$.
Densely Connected Convolutional Networks and Signal Quality Analysis to Detect Atrial Fibrillation Using Short Single-Lead ECG Recordings
Rubin, Jonathan, Parvaneh, Saman, Rahman, Asif, Conroy, Bryan, Babaeizadeh, Saeed
The development of new technology such as wearables that record high-quality single channel ECG, provides an opportunity for ECG screening in a larger population, especially for atrial fibrillation screening. The main goal of this study is to develop an automatic classification algorithm for normal sinus rhythm (NSR), atrial fibrillation (AF), other rhythms (O), and noise from a single channel short ECG segment (9-60 seconds). For this purpose, signal quality index (SQI) along with dense convolutional neural networks was used. Two convolutional neural network (CNN) models (main model that accepts 15 seconds ECG and secondary model that processes 9 seconds shorter ECG) were trained using the training data set. If the recording is determined to be of low quality by SQI, it is immediately classified as noisy. Otherwise, it is transformed to a time-frequency representation and classified with the CNN as NSR, AF, O, or noise. At the final step, a feature-based post-processing algorithm classifies the rhythm as either NSR or O in case the CNN model's discrimination between the two is indeterminate. The best result achieved at the official phase of the PhysioNet/CinC challenge on the blind test set was 0.80 (F1 for NSR, AF, and O were 0.90, 0.80, and 0.70, respectively).
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)