occ
Machine learning in an expectation-maximisation framework for nowcasting
Wilsens, Paul, Antonio, Katrien, Claeskens, Gerda
Decision making often occurs in the presence of incomplete information, leading to the under- or overestimation of risk. Leveraging the observable information to learn the complete information is called nowcasting. In practice, incomplete information is often a consequence of reporting or observation delays. In this paper, we propose an expectation-maximisation (EM) framework for nowcasting that uses machine learning techniques to model both the occurrence as well as the reporting process of events. We allow for the inclusion of covariate information specific to the occurrence and reporting periods as well as characteristics related to the entity for which events occurred. We demonstrate how the maximisation step and the information flow between EM iterations can be tailored to leverage the predictive power of neural networks and (extreme) gradient boosting machines (XGBoost). With simulation experiments, we show that we can effectively model both the occurrence and reporting of events when dealing with high-dimensional covariate information. In the presence of non-linear effects, we show that our methodology outperforms existing EM-based nowcasting frameworks that use generalised linear models in the maximisation step. Finally, we apply the framework to the reporting of Argentinian Covid-19 cases, where the XGBoost-based approach again is most performant.
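To make the EM idea in the abstract above concrete, here is a minimal sketch of EM-based nowcasting in its simplest form. It is not the authors' method: it assumes no covariates, a Poisson count model with one occurrence intensity per period and one shared reporting-delay distribution, and a closed-form maximisation step in place of a neural network or XGBoost fit. The function name `em_nowcast` and the data layout are hypothetical.

```python
def em_nowcast(observed, n_iter=500):
    """EM for a run-off triangle of event counts (illustrative sketch).

    observed[t][d] is the number of events that occurred in period t and
    were reported with delay d. Only cells with t + d <= T - 1 are known;
    values stored in the other cells are ignored.
    Assumed model: count[t][d] ~ Poisson(lam[t] * p[d]) with sum(p) == 1.
    Returns (lam, p): estimated total events per period and delay probabilities.
    """
    T, D = len(observed), len(observed[0])
    is_obs = [[t + d <= T - 1 for d in range(D)] for t in range(T)]

    # Start from the reported totals and a uniform delay distribution.
    lam = [sum(observed[t][d] for d in range(D) if is_obs[t][d]) for t in range(T)]
    p = [1.0 / D] * D

    for _ in range(n_iter):
        # E-step: fill each unreported cell with its expected count.
        complete = [
            [observed[t][d] if is_obs[t][d] else lam[t] * p[d] for d in range(D)]
            for t in range(T)
        ]
        # M-step: closed-form Poisson maximum likelihood on the completed data.
        lam = [sum(row) for row in complete]
        total = sum(lam)
        p = [sum(complete[t][d] for t in range(T)) / total for d in range(D)]
    return lam, p


if __name__ == "__main__":
    # Noise-free toy data: the unreported cells (t + d > 3) are placeholders.
    lam_true = [100.0, 120.0, 80.0, 150.0]
    p_true = [0.5, 0.3, 0.2]
    triangle = [
        [lam_true[t] * p_true[d] if t + d <= 3 else 0.0 for d in range(3)]
        for t in range(4)
    ]
    lam_hat, p_hat = em_nowcast(triangle)
    print([round(v, 3) for v in lam_hat], [round(v, 3) for v in p_hat])
```

The entries of `lam` for the most recent periods are the nowcasts: estimates of the total number of events once all reports have arrived. In a covariate-dependent setting, the two closed-form updates in the M-step are where a fitted model for occurrence and for reporting would be substituted.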
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- Oceania > New Zealand (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Epidemiology (1.00)
- Banking & Finance > Insurance (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.88)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York > Onondaga County > Syracuse (0.04)
- (5 more...)
VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM
Jin, Lesheng, Ruan, Zhenyuan, Mai, Haohui, Shang, Jingbo
Modern GPUs evolve rapidly, yet production compilers still rely on hand-crafted register allocation heuristics that require substantial re-tuning for each hardware generation. We introduce VeriLocc, a framework that combines large language models (LLMs) with formal compiler techniques to enable generalizable and verifiable register allocation across GPU architectures. VeriLocc fine-tunes an LLM to translate intermediate representations (MIRs) into target-specific register assignments, aided by static analysis for cross-architecture normalization and generalization and by a verifier-guided regeneration loop to ensure correctness. Evaluated on matrix multiplication (GEMM) and multi-head attention (MHA), VeriLocc achieves 85-99% single-shot accuracy and near-100% pass@100. A case study shows that VeriLocc discovers more performant assignments than expert-tuned libraries, outperforming rocBLAS by over 10% in runtime.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
A Variable Occurrence-Centric Framework for Inconsistency Handling (Extended Version)
In this paper, we introduce a syntactic framework for analyzing and handling inconsistencies in propositional bases. Our approach focuses on examining the relationships between variable occurrences within conflicts. We propose two dual concepts: Minimal Inconsistency Relation (MIR) and Maximal Consistency Relation (MCR). Each MIR is a minimal equivalence relation on variable occurrences that results in inconsistency, while each MCR is a maximal equivalence relation designed to prevent inconsistency. Notably, MIRs capture conflicts overlooked by minimal inconsistent subsets. Using MCRs, we develop a series of non-explosive inference relations. The main strategy involves restoring consistency by modifying the propositional base according to each MCR, followed by employing the classical inference relation to derive conclusions. Additionally, we propose an unusual semantics that assigns truth values to variable occurrences instead of the variables themselves. The associated inference relations are established through Boolean interpretations compatible with the occurrence-based models.
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (7 more...)
DynFrs: An Efficient Framework for Machine Unlearning in Random Forest
Wang, Shurong, Shen, Zhuoyang, Qiao, Xinbao, Zhang, Tongning, Zhang, Meng
Random Forests are widely recognized for their established efficacy in classification and regression tasks, standing out in various domains such as medical diagnosis, finance, and personalized recommendations. These domains, however, are inherently sensitive to privacy concerns, as personal and confidential data are involved. With increasing demand for the right to be forgotten, particularly under regulations such as GDPR and CCPA, the ability to perform machine unlearning has become crucial for Random Forests. However, insufficient attention has been paid to this topic, and existing approaches are difficult to apply in real-world scenarios. Addressing this gap, we propose the DynFrs framework, designed to enable efficient machine unlearning in Random Forests while preserving predictive accuracy. DynFrs leverages the subsampling method Occ(q) and a lazy tag strategy Lzy, and remains adaptable to any Random Forest variant. In essence, Occ(q) ensures that each sample in the training set occurs in only a proportion of trees so that the impact of deleting samples is limited, and Lzy delays the reconstruction of a tree node until necessary, thereby avoiding unnecessary modifications to tree structures. In experiments, applying DynFrs to Extremely Randomized Trees yields substantial improvements, achieving orders of magnitude faster unlearning and better predictive accuracy than existing machine unlearning methods for Random Forests.
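The subsampling idea described above can be illustrated with a short sketch. The abstract only states that each sample occurs in a proportion of the trees; the specific rule used here (each sample goes to k = ceil(q * n_trees) distinct trees chosen uniformly at random) is an assumption, and the function names are hypothetical rather than taken from the DynFrs code.

```python
import math
import random


def assign_samples_to_trees(n_samples, n_trees, q, seed=0):
    """Assign each sample to k = ceil(q * n_trees) distinct trees (assumed rule).

    Returns (tree_samples, sample_trees):
      tree_samples[t] lists the sample ids that tree t is trained on,
      sample_trees[i] lists the tree ids that contain sample i.
    """
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    rng = random.Random(seed)
    k = math.ceil(q * n_trees)
    tree_samples = [[] for _ in range(n_trees)]
    sample_trees = []
    for i in range(n_samples):
        chosen = sorted(rng.sample(range(n_trees), k))
        sample_trees.append(chosen)
        for t in chosen:
            tree_samples[t].append(i)
    return tree_samples, sample_trees


def trees_affected_by_deletion(sample_trees, sample_id):
    """Only these trees need any update when sample_id is unlearned."""
    return sample_trees[sample_id]


if __name__ == "__main__":
    tree_samples, sample_trees = assign_samples_to_trees(n_samples=20, n_trees=8, q=0.25)
    print(trees_affected_by_deletion(sample_trees, sample_id=3))
```

Under this rule, deleting one sample touches at most k of the trees, which is the sense in which the impact of a deletion stays limited; the remaining trees are left unchanged.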
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Singapore (0.04)
- Asia > China (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.94)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
LiDAR-based Quadrotor for Slope Inspection in Dense Vegetation
Liu, Wenyi, Ren, Yunfan, Guo, Rui, Kong, Vickie W. W., Hung, Anthony S. P., Zhu, Fangcheng, Cai, Yixi, Zou, Yuying, Zhang, Fu
This work presents a LiDAR-based quadrotor system for slope inspection in dense vegetation environments. Cities like Hong Kong are vulnerable to climate hazards, which often result in landslides. To mitigate the landslide risks, the Civil Engineering and Development Department (CEDD) has constructed steel flexible debris-resisting barriers on vulnerable natural catchments to protect residents. However, it is necessary to carry out regular inspections to identify any anomalies, which may affect the proper functioning of the barriers. Traditional manual inspection methods face challenges and high costs due to steep terrain and dense vegetation. Compared to manual inspection, unmanned aerial vehicles (UAVs) equipped with LiDAR sensors and cameras have advantages such as maneuverability in complex terrain, and access to narrow areas and high spots. However, conducting slope inspections using UAVs in dense vegetation poses significant challenges. First, in terms of hardware, the overall design of the UAV must carefully consider its maneuverability in narrow spaces, flight time, and the types of onboard sensors required for effective inspection. Second, regarding software, navigation algorithms need to be designed to enable obstacle avoidance flight in dense vegetation environments. To overcome these challenges, we develop a LiDAR-based quadrotor, accompanied by a comprehensive software system. The goal is to deploy our quadrotor in field environments to achieve efficient slope inspection. To assess the feasibility of our hardware and software system, we conduct functional tests in non-operational scenarios. Subsequently, invited by CEDD, we deploy our quadrotor in six field environments, including five flexible debris-resisting barriers located in dense vegetation and one slope that experienced a landslide. These experiments demonstrated the superiority of our quadrotor in slope inspection.
- Asia > China > Hong Kong (0.26)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (2 more...)
- Transportation > Air (0.68)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
- Energy > Power Industry (0.46)
Learning Anchor Planes for Classification
Ladický, L'ubor, Torr, Philip H.S., Saffari, Amir
Local Coordinate Coding (LCC) [18] is a method for modeling functions of data lying on non-linear manifolds. It provides a set of anchor points which form a local coordinate system, such that each data point on the manifold can be approximated by a linear combination of its anchor points, and the linear weights become the local coordinate coding. In this paper we propose encoding data using orthogonal anchor planes, rather than anchor points. Our method needs only a few orthogonal anchor planes for coding, and it can linearize any (α, β, p)-Lipschitz smooth nonlinear function with a fixed expected value of the upper-bound approximation error on any high dimensional data. In practice, the orthogonal coordinate system can be easily learned by minimizing this upper bound using singular value decomposition (SVD). We apply our method to model the coordinates locally in linear SVMs for classification tasks, and our experiment on MNIST shows that using only 50 anchor planes our method achieves 1.72% error rate, while LCC achieves 1.90% error rate using 4096 anchor points.
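The anchor-point coding that the abstract attributes to LCC can be illustrated in one dimension. This is a toy sketch, not the paper's anchor-plane method: it assumes sorted, distinct scalar anchors and codes each point with the two anchors that bracket it, so the weights reduce to linear interpolation. The function names are hypothetical.

```python
import bisect


def local_coding(x, anchors):
    """Return {anchor_index: weight} so that x == sum(weight * anchors[index]).

    anchors must be sorted and distinct, and x must lie within their range.
    Only the bracketing anchors receive non-zero weight (a local coding).
    """
    if not anchors[0] <= x <= anchors[-1]:
        raise ValueError("x lies outside the range covered by the anchors")
    hi = bisect.bisect_left(anchors, x)
    if anchors[hi] == x:
        return {hi: 1.0}
    lo = hi - 1
    w_hi = (x - anchors[lo]) / (anchors[hi] - anchors[lo])
    return {lo: 1.0 - w_hi, hi: w_hi}


def approximate(values_at_anchors, coding):
    """Approximate f(x) by the same linear combination of f at the anchors."""
    return sum(w * values_at_anchors[i] for i, w in coding.items())


if __name__ == "__main__":
    anchors = [0.0, 1.0, 2.0, 4.0]
    values = [v * v for v in anchors]  # f(v) = v^2 evaluated at the anchors
    coding = local_coding(3.0, anchors)
    print(coding, approximate(values, coding))
```

The approximation is exact for linear functions and degrades with the curvature of f between neighbouring anchors, which is why the quality of the coding depends on how the anchors are placed.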
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)