 nci


Structural Concentration in Weighted Networks: A Class of Topology-Aware Indices

arXiv.org Machine Learning

This paper develops a unified framework for measuring concentration in weighted systems embedded in networks of interactions. While traditional indices such as the Herfindahl-Hirschman Index capture dispersion in weights, they neglect the topology of relationships among the elements receiving those weights. To address this limitation, we introduce a family of topology-aware concentration indices that jointly account for weight distributions and network structure. At the core of the framework lies a baseline Network Concentration Index (NCI), defined as a normalized quadratic form that measures the fraction of potential weighted interconnection realized along observed network links. Building on this foundation, we construct a flexible class of extensions that modify either the interaction structure or the normalization benchmark, including weighted, density-adjusted, null-model, degree-constrained, transformed-data, and multi-layer variants. This family of indices preserves key properties such as normalization, invariance, and interpretability, while allowing concentration to be evaluated across different dimensions of dependence, including intensity, higher-order interactions, and extreme events. Theoretical results characterize the indices and establish their relationship with classical concentration and network measures. Empirical and simulation evidence demonstrates that systems with identical weight distributions may exhibit markedly different levels of structural concentration depending on network topology, highlighting the additional information captured by the proposed framework. The approach is broadly applicable to economic, financial, and complex systems in which weighted elements interact through networks.
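The abstract does not state the index formula, so the following is only a minimal sketch under an assumed reading of "normalized quadratic form": the weight products w_i * w_j summed over observed links, divided by the same sum over all distinct pairs. The function names and the toy star networks are hypothetical, chosen to illustrate how identical weights can give different structural concentration.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared weight shares."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def network_concentration(weights, adjacency):
    """Assumed form (not taken from the paper): the share of pairwise
    weight products w_i * w_j, i != j, that lies on observed links."""
    n = len(weights)
    realized = 0.0   # weight products on links that exist
    potential = 0.0  # weight products over all distinct pairs
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            product = weights[i] * weights[j]
            potential += product
            realized += adjacency[i][j] * product
    return realized / potential if potential else 0.0


def star(n, center):
    """Adjacency matrix of an undirected star with the given center node."""
    return [[1 if (i == center) != (j == center) else 0 for j in range(n)]
            for i in range(n)]


weights = [0.4, 0.3, 0.2, 0.1]
star_heavy = star(4, center=0)  # hub is the heaviest element
star_light = star(4, center=3)  # hub is the lightest element

print(hhi(weights))                                # same for both networks
print(network_concentration(weights, star_heavy))  # larger value
print(network_concentration(weights, star_light))  # smaller value
```

Both networks share the same weight vector, and therefore the same Herfindahl-Hirschman value, yet the assumed index is higher when the hub carries the largest weight.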



A Neural Corpus Indexer for Document Retrieval

Neural Information Processing Systems

Current state-of-the-art document retrieval solutions mainly follow an index-retrieve paradigm, in which the index is hard to optimize directly for the final retrieval target. In this paper, we aim to show that an end-to-end deep neural network unifying the training and indexing stages can significantly improve the recall performance of traditional methods. To this end, we propose the Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a given query. To optimize the recall performance of NCI, we design a prefix-aware weight-adaptive decoder architecture and leverage tailored techniques including query generation, semantic document identifiers, and consistency-based regularization. Empirical studies demonstrate the superiority of NCI on two commonly used academic benchmarks, achieving +21.4% and +16.8% relative improvements in Recall@1 on the NQ320k dataset and R-Precision on the TriviaQA dataset, respectively, compared to the best baseline method.
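A model that generates document identifiers token by token is usually paired with a decoding step that only allows identifiers present in the corpus. The sketch below illustrates that general idea with a prefix trie and a small beam search. It is not the paper's prefix-aware weight-adaptive decoder; the identifiers, the `TOY_LOGPROBS` table, and `toy_score` are hypothetical stand-ins for a trained sequence-to-sequence model.

```python
def build_trie(docids):
    """Build a nested-dict trie over identifier token sequences.
    The key None marks the end of a valid identifier."""
    root = {}
    for docid in docids:
        node = root
        for token in docid:
            node = node.setdefault(token, {})
        node[None] = True
    return root


def constrained_beam_search(trie, score_fn, beam_size=2):
    """Beam search that can only extend a prefix with tokens found in the trie,
    so every finished hypothesis is a valid document identifier."""
    beams = [((), 0.0, trie)]  # (prefix, cumulative log-score, trie node)
    finished = []
    while beams:
        candidates = []
        for prefix, score, node in beams:
            for token, child in node.items():
                if token is None:
                    finished.append((prefix, score))
                else:
                    new_score = score + score_fn(prefix, token)
                    candidates.append((prefix + (token,), new_score, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[1], reverse=True)
    return finished


# Hypothetical corpus of three documents with two-token semantic identifiers.
DOCIDS = [(0, 1), (0, 2), (1, 0)]

# Hypothetical per-step log-probabilities standing in for a trained decoder.
TOY_LOGPROBS = {
    ((), 0): -0.1, ((), 1): -2.0,
    ((0,), 1): -1.5, ((0,), 2): -0.2,
    ((1,), 0): -0.1,
}


def toy_score(prefix, token):
    return TOY_LOGPROBS.get((prefix, token), -5.0)


results = constrained_beam_search(build_trie(DOCIDS), toy_score, beam_size=2)
for docid, score in results:
    print(docid, round(score, 2))
```

Because the search never leaves the trie, the ranked output contains only identifiers from `DOCIDS`, whatever the scoring function prefers.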



Autoassociative Learning of Structural Representations for Modeling and Classification in Medical Imaging

arXiv.org Artificial Intelligence

Annotation of medical imaging is notoriously time-consuming, prone to human biases, and hard to reconcile with the insatiable demands of contemporary machine learning. Deep Learning (DL) models trained on annotated data are often narrow in focusing on features that are specific to a given context (anomaly, pathology, etc.) rather than discovering and capturing general characteristics of observed structures and processes, which may make them susceptible to deceptive image features and lead to inferior generalization. We posit that one of the primary causes of this challenge is the unstructured character of DL architectures. Contemporary DL models are essentially intertwined compositions of dot products and nonlinearities, conglomerates of often billions of unsophisticated units that process data in a highly distributed and continuous, non-symbolic fashion. Their training requires large volumes of data, which are often hard to come by, and involves exorbitant amounts of compute and energy. If the task is posed within the supervised learning paradigm, those data need to be not only curated, but also annotated (labeled), which limits their availability even further. Last but not least, as each processing unit takes care only of a minuscule fraction of inference, it is very hard to explain the model and its decisions to a human in a transparent and succinct fashion. In this study, we argue for stronger involvement of unlabeled data in the construction of analytic and diagnostic ML models and propose ASR, a neurosymbolic architecture trained to form Auto-associative Structural Representations, in which a generative decoder synthesizes physically plausible structural models that explain the observed image.


Generative Dense Retrieval: Memory Can Be a Burden

arXiv.org Artificial Intelligence

Generative Retrieval (GR), which autoregressively decodes relevant document identifiers given a query, has been shown to perform well on small-scale corpora. By memorizing the document corpus in its model parameters, GR implicitly achieves deep interaction between query and document. However, such a memorizing mechanism has three drawbacks: (1) poor memory accuracy for fine-grained features of documents; (2) memory confusion that worsens as the corpus size increases; (3) high memory update costs for new documents. To alleviate these problems, we propose the Generative Dense Retrieval (GDR) paradigm. Specifically, GDR first uses the limited memory volume to achieve inter-cluster matching from the query to relevant document clusters. A memorization-free matching mechanism from Dense Retrieval (DR) is then introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents. The coarse-to-fine process maximizes the advantages of GR's deep interaction and DR's scalability. In addition, we design a cluster identifier construction strategy to facilitate corpus memory and a cluster-adaptive negative sampling strategy to enhance the intra-cluster mapping ability. Empirical results show that GDR obtains an average improvement of 3.0 in R@100 on the NQ dataset under multiple settings and has better scalability.
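The coarse-to-fine process can be sketched in two stages. This is a toy illustration, not the GDR implementation: the generative cluster-matching stage is replaced by a hypothetical table of per-cluster scores, the dense stage is a plain dot product, and the paper may combine the two stages' scores differently. All names and vectors are assumptions made for the example.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def coarse_to_fine(query_vec, cluster_scores, clusters, doc_vecs,
                   top_clusters=1, top_docs=2):
    """Stage 1 keeps the best-scoring clusters (inter-cluster matching).
    Stage 2 ranks only the documents inside them by embedding similarity
    (intra-cluster matching)."""
    chosen = sorted(cluster_scores, key=cluster_scores.get, reverse=True)
    chosen = chosen[:top_clusters]
    candidates = [doc for cluster in chosen for doc in clusters[cluster]]
    ranked = sorted(candidates,
                    key=lambda doc: dot(query_vec, doc_vecs[doc]),
                    reverse=True)
    return ranked[:top_docs]


# Hypothetical corpus: two clusters, three documents, 2-d embeddings.
clusters = {"c0": ["d1", "d2"], "c1": ["d3"]}
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.6, 0.8], "d3": [0.0, 1.0]}

# Stand-in for the scores a generative model would assign to cluster identifiers.
cluster_scores = {"c0": -0.2, "c1": -1.7}

query_vec = [0.9, 0.1]
top = coarse_to_fine(query_vec, cluster_scores, clusters, doc_vecs)
print(top)
```

Adding a document in this sketch only requires storing its vector and cluster assignment, which reflects why the dense stage is described as memorization-free.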


Artificial Intelligence - FY2021 Annual Plan - National Cancer Institute

#artificialintelligence

Artificial intelligence (AI) is everywhere: personal digital assistants answer our questions, robo-advisors trade stocks for us, and driverless cars will someday take us where we want to go. AI has penetrated our lives, and its use is exploding in biomedical research and health care, including across all dimensions of cancer research, where the potential applications for AI are vast. AI refers to a computer performing tasks commonly associated with human intelligence. Humans code or program a computer to act, reason, and learn. An algorithm or model is the code that tells the computer how to act, reason, and learn.


PATH: New Organization Fostering AI, Automation and Healthcare

#artificialintelligence

Technological advances in artificial intelligence (AI), automation, and robotics, once only dreamed of, are beginning to take shape and promise to revolutionize healthcare and biotechnology. By 2021, AI alone could generate $6.7 billion in revenue from healthcare globally, according to Frost & Sullivan. Some of the frontrunning scientists and physicians who have worked toward AI and robotics advances for decades spoke at the recent first annual Partnership for Artificial Intelligence, Automation, and Robotics in Healthcare (PATH) Summit, offering a distinct message: as these advanced technologies become integrated into healthcare and biotechnology, there will be a significant change in the development and delivery of medicine, resulting in improved outcomes, increased productivity, and wider dissemination. It is clear that doctors, health systems, and research institutes that don't embrace this new wave of technology will be left behind. PATH, a mission-driven, membership-based group launched by Jonathan Linkous, founding CEO of the American Telemedicine Association (ATA), and Mary Ann Liebert, publisher & CEO of Genetic Engineering & Biotechnology News (GEN), is dedicated to bringing together all stakeholders involved in the advancement of AI, automation, and robotics, to help guide policy related to their use and to promote their integration into health systems.