Are GATs Out of Balance?
While the expressive power and computational capabilities of graph neural networks (GNNs) have been theoretically studied, their optimization and learning dynamics remain largely unexplored. Our study examines the Graph Attention Network (GAT), a popular GNN architecture in which a node's neighborhood aggregation is weighted by parameterized attention coefficients. We derive a conservation law of GAT gradient flow dynamics, which explains why a large fraction of parameters in GATs with standard initialization struggle to change during training. This effect is amplified in deeper GATs, which perform significantly worse than their shallow counterparts. To alleviate this problem, we devise an initialization scheme that balances the GAT network. Our approach i) allows more effective propagation of gradients and in turn enables trainability of deeper networks, and ii) attains a considerable speedup in training and convergence time in comparison to the standard initialization. Our main theorem serves as a stepping stone to studying the learning dynamics of positive homogeneous models with attention mechanisms.
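The attention coefficients referenced in the abstract can be illustrated with a minimal single-head sketch: each node's neighbors are scored with a shared attention vector and the scores are softmax-normalized over the neighborhood. This is an illustrative NumPy implementation under assumed shapes, not the paper's code; `gat_attention_scores` and its arguments are hypothetical names.

```python
import numpy as np

def gat_attention_scores(h, W, a, adj, slope=0.2):
    """Single-head GAT attention (illustrative sketch).

    h:   (N, F)  node features
    W:   (F, Fp) shared linear transform
    a:   (2*Fp,) attention vector
    adj: (N, N)  boolean adjacency (True where an edge exists)
    """
    z = h @ W                                     # (N, Fp) transformed features
    N = z.shape[0]
    # e_ij = LeakyReLU(a^T [z_i || z_j]) for every node pair
    e = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else slope * s   # LeakyReLU
    # mask non-edges, then softmax over each node's neighborhood
    e = np.where(adj, e, -np.inf)
    e -= e.max(axis=1, keepdims=True)
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha  # row i holds attention weights over i's neighbors

# toy graph: 3 nodes with self-loops
rng = np.random.default_rng(0)
adj = np.array([[True, True, False],
                [True, True, True],
                [False, True, True]])
h = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
a = rng.normal(size=(4,))
alpha = gat_attention_scores(h, W, a, adj)
```

Each row of `alpha` sums to one, and non-neighbors receive exactly zero weight; in a full GAT layer these weights multiply the transformed neighbor features before aggregation.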
The Benefits of Balance: From Information Projections to Variance Reduction
Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g., in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected benefit: variance reduction. We present a non-asymptotic statistical bound that quantifies this variance reduction effect and relates it to the eigenvalue decay of Markov operators. Furthermore, we describe how various forms of data balancing in contrastive multimodal learning and self-supervised clustering can be better understood, and even improved upon, owing to our variance reduction viewpoint.
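One common form of the data balancing the abstract refers to is iterative proportional fitting (Sinkhorn-style raking): alternately rescale the rows and columns of a source-by-modality count matrix until both marginals are uniform. The sketch below is an assumption about which balancing scheme is meant, offered only to make the idea concrete; `balance_weights` is a hypothetical name.

```python
import numpy as np

def balance_weights(counts, n_iters=200):
    """Alternately rescale rows and columns of a joint count matrix so
    that both marginals become uniform (Sinkhorn-style raking)."""
    P = counts.astype(float)
    P /= P.sum()
    r = np.full(P.shape[0], 1.0 / P.shape[0])  # target row marginal
    c = np.full(P.shape[1], 1.0 / P.shape[1])  # target column marginal
    for _ in range(n_iters):
        P *= (r / P.sum(axis=1))[:, None]      # match row sums
        P *= (c / P.sum(axis=0))[None, :]      # match column sums
    return P

# imbalanced source-by-modality counts
counts = np.array([[50., 5.],
                   [2., 40.]])
P = balance_weights(counts)
```

After convergence, every source and every modality carries equal total weight, which is the reweighting whose variance-reduction effect the paper quantifies.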
BALanCe: Deep Bayesian Active Learning via Equivalence Class Annealing
Zhang, Renyu, Khan, Aly A., Grossman, Robert L., Chen, Yuxin
Active learning has demonstrated data efficiency in many fields. Existing active learning algorithms, especially in the context of deep Bayesian active learning, rely heavily on the quality of the model's uncertainty estimates. However, such uncertainty estimates can be heavily biased, especially with limited and imbalanced training data. In this paper, we propose BALanCe, a Bayesian deep active learning framework that mitigates the effect of such biases. Concretely, BALanCe employs a novel acquisition function which leverages the structure captured by equivalence hypothesis classes and facilitates differentiation among different equivalence classes. Intuitively, each equivalence class consists of instantiations of deep models with similar predictions, and BALanCe adaptively adjusts the size of the equivalence classes as learning progresses. Besides the fully sequential setting, we further propose Batch-BALanCe -- a generalization of the sequential algorithm to the batched setting -- to efficiently select batches of training examples that are jointly effective for model improvement. We show that Batch-BALanCe achieves state-of-the-art performance on several benchmark datasets for active learning, and that both algorithms can effectively handle realistic challenges that often involve multi-class and imbalanced data.
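The abstract does not spell out the acquisition function, but the equivalence-class intuition can be sketched: draw several posterior samples of the model, group samples that make the same predictions on the pool into equivalence classes, and prefer querying points on which the classes disagree. The code below is a loose, hypothetical reading of that idea (not BALanCe's actual criterion); all function names are illustrative.

```python
import numpy as np

def equivalence_classes(pred_samples):
    """Group posterior samples by their hard predictions on the pool:
    samples that label every pool point identically share a class.
    pred_samples: (K, N) predicted labels from K posterior draws on
    N pool points. Returns {prediction tuple: list of sample ids}."""
    classes = {}
    for k, row in enumerate(pred_samples):
        classes.setdefault(tuple(row), []).append(k)
    return classes

def disagreement_score(pred_samples, classes):
    """Illustrative acquisition: for each pool point, count how many
    pairs of equivalence classes disagree on its label, so querying
    that point would help tell the classes apart."""
    reps = np.array([pred_samples[ids[0]] for ids in classes.values()])
    C, N = reps.shape
    scores = np.zeros(N, dtype=int)
    for i in range(C):
        for j in range(i + 1, C):
            scores += (reps[i] != reps[j]).astype(int)
    return scores

# toy example: 4 posterior draws, 5 pool points, binary labels
preds = np.array([[0, 0, 1, 1, 0],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0],
                  [0, 1, 0, 0, 0]])
classes = equivalence_classes(preds)
scores = disagreement_score(preds, classes)
query = int(np.argmax(scores))  # pool point the classes split on most
```

Here the four draws collapse into three equivalence classes, and points where all classes agree score zero, so they are never queried.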
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
AAAI News
If you are an undergraduate or graduate student enrolled in a degree program at any college or university and would like to assist the staff members during AAAI-91 in Anaheim, California (14-19 July 1991), please contact Paul O'Rorke or Michael Pazzani at the address given here. In exchange for assisting AAAI staff members during your volunteer shift, you will receive a complimentary AAAI-91 conference registration, the AAAI-91 conference proceedings, and a special AAAI-91 T-shirt. Our 1991 volunteer coordinators, Paul O'Rorke and Michael Pazzani, can be reached by mail at AAAI, 445 Burgess Drive, Menlo Park, CA 94025 or by email at ncai@ics.uci.edu. All inquiries should include your name, address, telephone, adviser's name, and email address. Abstracts of Ph.D. dissertations submitted for publication in AI Magazine should be limited to 1500 words.
- Banking & Finance (1.00)
- Education > Educational Setting > Higher Education (0.55)
Case-Based Reasoning
Workshop Report
The 1994 Workshop on Case-Based Reasoning (CBR) focused on the evaluation of CBR theories, models, systems, and system components. The CBR community addressed the evaluation of theories and implemented systems, with the consensus that a balance between novel innovations and evaluations could maximize progress. The 4 invited talks, 14 paper presentations, 19 poster presentations, and 1 summary panel discussion were attended by 66 participants. The four invited speakers discussed how CBR approaches can be evaluated in research projects, industrial applications, and military tasks. Katia Sycara (Carnegie Mellon University [CMU]) outlined an exhaustive set of measures for evaluating CBR systems and discussed how she applied some of these measures in empirical comparisons with other approaches for solving job shop scheduling problems.
AAAI News
Requests for information and suggestions for future news columns can be sent by electronic mail to aaai-news@sumex-aim.stanford.edu or by US mail to the AAAI office. Suggestions, comments, and questions on all aspects of the society are welcome. -William J. Clancey and Claudia Mazzetti
Present were William Clancey, Hector Levesque, Kathy McKeown, Daniel Bobrow, Bruce Buchanan, Lynn Conway, William Woods, Raj Reddy, Elaine Rich, Reid Smith, Geoffrey Hinton, Douglas Lenat, Nils Nilsson, Claudia Mazzetti, and Peter Patel-Schneider. Financial Committee (reported by Bruce Buchanan): The final financial report for the conference's net income was incomplete at this time. The anticipated interest income for 1988 will be $340,000. Raj Reddy and Bruce Buchanan proposed a policy on the allocation of the AAAI annual interest income for philanthropic purposes. It was suggested and approved that the annual interest income would be used to support grants, scholarships, and other projects (such as the electronic library). However, during years when anticipated revenues decrease, the interest income will support the operation of the office and philanthropic funding will decrease. At this time, the Council approved the following activities. Smith described the number of submissions, the overall topical distribution, and the ratio of accepted to rejected papers. Reid noted that the survey talks were increasing in popularity and were becoming a feature of the conference.
Workshop Grants, AAAI-84 Conference (paid in 1985): (6,870)
Misc. Income: 130
Gross Profit (combined): 944,383
Operating Expenses: (256,698)
Net Income: 687,685
Fund Balance, beginning of year: 874,634
Fund Balance, end of year: $1,562,319
We have examined the balance sheet of the American Association for Artificial Intelligence as of December 31, 1985.
A Knowledge-Based
Within the academic and professional auditing communities, there has been growing concern about how to accurately assess the various risks associated with performing an audit. These risks are difficult to conceptualize in terms of numeric estimates. Models of decision making under conditions of risk are well established in the decision-theory literature. In these models, risk and return (payoffs) are specified in terms of numeric estimates, and the goal is to make a decision that maximizes some expected value. In addition, new information can be combined using a decision rule (such as Bayes' rule) for deriving revised estimates of risk.
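The Bayes'-rule revision mentioned above can be made concrete with a small worked example. The numbers below are hypothetical, chosen only to show the mechanics of revising a risk estimate after new audit evidence arrives.

```python
def bayes_update(prior, p_evidence_given_risk, p_evidence_given_no_risk):
    """Revise a risk estimate after observing new evidence:
    P(risk | evidence) =
        P(evidence | risk) * P(risk) /
        [P(evidence | risk) * P(risk)
         + P(evidence | no risk) * P(no risk)]"""
    num = p_evidence_given_risk * prior
    den = num + p_evidence_given_no_risk * (1.0 - prior)
    return num / den

# hypothetical numbers: prior risk of material misstatement is 10%,
# and an exception found in testing is four times likelier when a
# misstatement is present (0.60) than when it is absent (0.15)
posterior = bayes_update(prior=0.10,
                         p_evidence_given_risk=0.60,
                         p_evidence_given_no_risk=0.15)
# posterior = 0.06 / (0.06 + 0.135), approximately 0.308
```

A single piece of moderately diagnostic evidence thus triples the assessed risk, which is the kind of numeric revision the decision-theoretic models described above require.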