lct
Mitigating hallucinations and omissions in LLMs for invertible problems: An application to hardware logic design automation
Cassidy, Andrew S., Garreau, Guillaume, Sivagnaname, Jay, Grassi, Mike, Brezzo, Bernard, Arthur, John V., Modha, Dharmendra S.
We show that, for invertible problems that transform data from a source domain (for example, Logic Condition Tables (LCTs)) to a destination domain (for example, Hardware Description Language (HDL) code), using Large Language Models (LLMs) as a lossless encoder from source to destination and then as a lossless decoder back to the source, comparable to lossless compression in information theory, can mitigate most of the LLM drawbacks of hallucinations and omissions. Specifically, using LCTs as inputs, we generate the full HDL for a two-dimensional network-on-chip router (13 units, 1500-2000 lines of code) using seven different LLMs, reconstruct the LCTs from the auto-generated HDL, and compare the original and reconstructed LCTs. This approach yields significant productivity improvements, not only confirming correctly generated LLM logic and detecting incorrectly generated LLM logic but also assisting developers in finding design specification errors.
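The round-trip comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' tooling: the `LCT` dictionary representation, `diff_lcts`, `round_trip_ok`, and the toy `toy_encode`/`toy_decode` pair are all hypothetical stand-ins, with the toy functions taking the place of the LLM encoder and decoder.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical LCT representation: a tuple of conditions maps to one action.
LCT = Dict[Tuple[str, ...], str]


def diff_lcts(original: LCT, reconstructed: LCT) -> Dict[str, List[Tuple[str, ...]]]:
    """Compare two tables row by row and classify every disagreement."""
    omitted = sorted(k for k in original if k not in reconstructed)
    added = sorted(k for k in reconstructed if k not in original)
    changed = sorted(
        k for k in original
        if k in reconstructed and original[k] != reconstructed[k]
    )
    return {"omitted": omitted, "added": added, "changed": changed}


def round_trip_ok(original: LCT,
                  encode: Callable[[LCT], str],
                  decode: Callable[[str], LCT]) -> bool:
    """True when decoding the encoded table reproduces the original exactly."""
    report = diff_lcts(original, decode(encode(original)))
    return not any(report.values())


# Toy stand-ins for the LLM encoder (LCT -> text) and decoder (text -> LCT).
def toy_encode(table: LCT) -> str:
    return "\n".join(f"{' & '.join(conds)} -> {action}"
                     for conds, action in table.items())


def toy_decode(text: str) -> LCT:
    table: LCT = {}
    for line in text.splitlines():
        conds, action = line.split(" -> ")
        table[tuple(conds.split(" & "))] = action
    return table
```

Rows listed under `omitted` correspond to omissions, rows under `added` to hallucinated logic, and rows under `changed` to logic that survived the round trip with a different action. A non-empty report can also point to an ambiguity in the specification itself.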
Training Over a Distribution of Hyperparameters for Enhanced Performance and Adaptability on Imbalanced Classification
Lieberman, Kelsey, Ravindran, Swarna Kamlam, Yuan, Shuai, Tomasi, Carlo
Although binary classification is a well-studied problem, training reliable classifiers under severe class imbalance remains a challenge. Recent techniques mitigate the ill effects of imbalance on training by modifying the loss functions or optimization methods. We observe that different hyperparameter values on these loss functions perform better at different recall values. We propose to exploit this fact by training one model over a distribution of hyperparameter values, instead of a single value, via Loss Conditional Training (LCT). Experiments show that training over a distribution of hyperparameters not only approximates the performance of several models but actually improves the overall performance of models on both CIFAR and real medical imaging applications, such as melanoma and diabetic retinopathy detection. Furthermore, training models with LCT is more efficient because some hyperparameter tuning can be conducted after training to meet individual needs without needing to retrain from scratch.
Consider a classifier that takes images of skin lesions and predicts whether they are melanoma or benign (Rotemberg et al., 2020). Such a system could be especially valuable in underdeveloped countries where expert resources for screening are scarce (Cassidy et al., 2022). The dataset for this problem, along with many other practical problems, is inherently imbalanced (i.e., there are far more benign samples than melanoma samples). Furthermore, the costs of misclassifying the two classes are uneven: predicting a benign lesion as melanoma incurs the cost of a biopsy, while predicting a melanoma lesion as benign could let the melanoma spread before the patient can receive appropriate treatment. Unfortunately, the exact difference in the misclassification costs may not be known a priori and may even change after deployment. For example, the costs may change depending on the amount of biopsy resources available, or the prior may change depending on the age and condition of the patient. Thus, a good classifier for this problem should (a) have good performance across a wide range of Precision-Recall tradeoffs and (b) be able to adapt to changes in the prior or misclassification costs.
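The idea of training one model over a distribution of hyperparameter values can be sketched on a toy problem. The code below is an illustration under stated assumptions, not the paper's implementation: it uses a logistic model, a class-weighted cross-entropy with weight `lam` as a stand-in for the losses studied in the paper, a uniform sampling range, and conditioning by appending `lam` to the input features. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def featurize(x, lam):
    # Assumption: the model is conditioned on lam by appending it as an input.
    return list(x) + [lam, 1.0]  # the trailing 1.0 is the bias feature


def predict(weights, x, lam):
    """Probability of the positive class for input x under hyperparameter lam."""
    z = sum(w * f for w, f in zip(weights, featurize(x, lam)))
    return sigmoid(z)


def train_lct(data, steps=5000, lr=0.1, lam_range=(0.1, 0.9), seed=0):
    """SGD on a logistic model, sampling a new loss hyperparameter every step."""
    rng = random.Random(seed)
    weights = [0.0] * (len(data[0][0]) + 2)
    for _ in range(steps):
        x, y = rng.choice(data)
        lam = rng.uniform(*lam_range)  # hyperparameter drawn from a distribution
        p = predict(weights, x, lam)
        # Weighted cross-entropy: -lam*y*log(p) - (1-lam)*(1-y)*log(1-p).
        # Its derivative with respect to the logit z is:
        grad_z = -lam * y * (1.0 - p) + (1.0 - lam) * (1 - y) * p
        weights = [w - lr * grad_z * f
                   for w, f in zip(weights, featurize(x, lam))]
    return weights


# Toy imbalanced data with one feature: six negatives, two positives.
toy_data = [([-2.0], 0), ([-1.5], 0), ([-1.0], 0), ([-0.5], 0),
            ([0.0], 0), ([0.5], 0), ([1.0], 1), ([2.0], 1)]
weights = train_lct(toy_data)
for lam in (0.2, 0.5, 0.8):
    print(lam, round(predict(weights, [0.75], lam), 3))
```

Because `lam` is an input to the trained model, the operating point can be changed at inference time by passing a different `lam`, without retraining. That is the adaptability property the text asks for in requirement (b).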
Optimizing for ROC Curves on Class-Imbalanced Data by Training over a Family of Loss Functions
Lieberman, Kelsey, Yuan, Shuai, Ravindran, Swarna Kamlam, Tomasi, Carlo
Although binary classification is a well-studied problem in computer vision, training reliable classifiers under severe class imbalance remains a challenging problem. Recent work has proposed techniques that mitigate the effects of training under imbalance by modifying the loss functions or optimization methods. While this work has led to significant improvements in the overall accuracy in the multi-class case, we observe that slight changes in hyperparameter values of these methods can result in highly variable performance in terms of Receiver Operating Characteristic (ROC) curves on binary problems with severe imbalance. To reduce the sensitivity to hyperparameter choices and train more general models, we propose training over a family of loss functions, instead of a single loss function. We develop a method for applying Loss Conditional Training (LCT) to an imbalanced classification
Figure 1: Distribution of Area Under the ROC Curve (AUC) values obtained by training the same model on the SIIM-ISIC Melanoma classification dataset with 48 different combinations of hyperparameters on VS loss. Results are shown at three different imbalance ratios. As the imbalance becomes more severe, model performance drops and the variance in performance drastically increases.
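For reference, the AUC metric named in Figure 1 can be computed directly from labels and scores. The sketch below uses the standard pairwise (Mann-Whitney) formulation, where AUC is the fraction of positive-negative pairs in which the positive sample receives the higher score and ties count as half. The function name `roc_auc` is our own, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Evaluating this metric once per trained hyperparameter combination would give the kind of per-setting AUC distribution that the figure caption describes.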
OSNet & MNetO: Two Types of General Reconstruction Architectures for Linear Computed Tomography in Multi-Scenarios
Wang, Zhisheng, Deng, Zihan, Liu, Fenglin, Huang, Yixing, Yu, Haijun, Cui, Junning
Recently, linear computed tomography (LCT) systems have attracted considerable attention. To weaken projection truncation and image the region of interest (ROI) for LCT, the backprojection filtration (BPF) algorithm is an effective solution. However, in BPF for LCT, it is difficult to achieve stable interior reconstruction, and for differentiated backprojection (DBP) images of LCT, the repeated sequence of rotation, finite inversion of the Hilbert transform (Hilbert filtering), and inverse rotation blurs the image. To satisfy multiple reconstruction scenarios for LCT, including interior ROI, complete object, and exterior region beyond the field of view (FOV), and to avoid the rotation operations of Hilbert filtering, we propose two types of reconstruction architectures. The first overlays multiple DBP images to obtain a complete DBP image, then uses a network to learn the overlying Hilbert filtering function, referred to as the Overlay-Single Network (OSNet). The second uses multiple networks to train different directional Hilbert filtering models for DBP images of multiple linear scannings, respectively, and then overlays the reconstructed results, i.e., Multiple Networks Overlaying (MNetO). In both architectures, we introduce a Swin Transformer (ST) block into the generator of pix2pixGAN to extract local and global features from DBP images simultaneously. We evaluate the two architectures across different networks, FOV sizes, pixel sizes, numbers of projections, geometric magnifications, and processing times. Experimental results show that both architectures can recover images. OSNet outperforms BPF in various scenarios. Among the different networks, ST-pix2pixGAN is superior to pix2pixGAN and CycleGAN. MNetO exhibits a few artifacts due to the differences among the multiple models, but any one of its models is suitable for imaging the exterior edge in a certain direction.
Prediction of Success or Failure for Final Examination using Nearest Neighbor Method to the Trend of Weekly Online Testing
Using the outputs of online testing, it is not difficult to collect large-scale learning data. We may be able to actively analyze the collected data to find optimal strategies for better learning methods. It is also important to analyze the data theoretically (see [23]). This paper aims to obtain effective learning strategies for students at risk of failing courses and/or dropping out, using large-scale learning data collected from online tests. In this paper, unlike the conventional methods that use the correct answer rate (CAR) to identify the ability of a student (e.g., see [13]), we use the ability obtained from item response theory (IRT, e.g., see [1], [4], [17]), and we present a new method to identify students at risk as early as possible using the IRT results.
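A nearest-neighbor prediction over weekly ability trends, as the title suggests, might look like the sketch below. The excerpt does not give the paper's actual procedure, so everything here is assumed: Euclidean distance over the weeks observed so far, majority voting among `k` neighbors, the `knn_predict` name, and the example ability values.

```python
import math
from collections import Counter


def knn_predict(history, query, k=3):
    """Predict an outcome from the k past students with the most similar trend.

    history: list of (weekly_abilities, outcome) pairs from past students.
    query:   weekly ability estimates observed so far for the current student.
    """
    weeks = len(query)
    # Compare only the weeks observed so far; skip shorter past trajectories.
    ranked = sorted(
        (math.dist(trajectory[:weeks], query), outcome)
        for trajectory, outcome in history
        if len(trajectory) >= weeks
    )
    if not ranked:
        raise ValueError("no past trajectory is long enough to compare")
    votes = Counter(outcome for _, outcome in ranked[:k])
    return votes.most_common(1)[0][0]


# Illustrative ability values only; these are not data from the paper.
past_students = [
    ([-1.0, -1.2, -0.9], "fail"),
    ([-0.7, -0.8, -1.0], "fail"),
    ([0.5, 0.7, 0.9], "pass"),
    ([0.2, 0.4, 0.3], "pass"),
    ([1.0, 1.1, 1.3], "pass"),
]
print(knn_predict(past_students, [-0.8, -0.9]))  # fail
```

Because the query may cover only the first few weeks, the same function can be called each week as new ability estimates arrive, which matches the goal of identifying at-risk students as early as possible.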