AITopics | bit vector

ReLU Neural Networks, Polyhedral Decompositions, and Persistent Homology

Liu, Yajing, Cole, Christina M, Peterson, Chris, Kirby, Michael

arXiv.org Artificial Intelligence, Jun-30-2023

A ReLU neural network leads to a finite polyhedral decomposition of input space and a corresponding finite dual graph. We show that while this dual graph is a coarse quantization of input space, it is sufficiently robust that it can be combined with persistent homology to detect homological signals of manifolds in the input space from samples. This property holds for a variety of networks trained for a wide range of purposes that have nothing to do with this topological application. We found this feature to be surprising and interesting; we hope it will also be useful.

artificial intelligence, machine learning, vector, (17 more...)

arXiv.org Artificial Intelligence

2306.17418

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > Colorado (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A transparent approach to data representation

Deyo, Sean, Elser, Veit

arXiv.org Artificial Intelligence, Jun-5-2023

In 2006 Netflix released a data set -- roughly 100 million ratings of 17770 titles, given by 480189 viewers -- and posed a challenge: Use this training data to predict the ratings in a separate, hidden set of ratings involving the same movies and viewers. The first to do so with a root-mean-square prediction error (RMSE) at least 10% lower than that of Netflix's own system would receive a …

We take inspiration from the non-negative matrix factorization (NMF) problem. In NMF, one large $m \times n$ matrix $M$ with non-negative values is factored as a product of two smaller non-negative matrices $R$ and $C$ of size $m \times l$ and $l \times n$, respectively (where $l \ll m, n$). Imagining the set of ratings as the $M$ matrix, with each row corresponding to a viewer and each column corresponding to a movie, one can think of each row of $R$ as an attribute vector for the corresponding viewer.

artificial intelligence, machine learning, movie, (19 more...)

arXiv.org Artificial Intelligence

2304.14209

Country:

North America > United States > Alabama (0.05)
North America > United States > New York > Tompkins County > Ithaca (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.40)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Training Data Attribution for Diffusion Models

Dai, Zheng, Gifford, David K

arXiv.org Artificial Intelligence, Jun-3-2023

Diffusion models have become increasingly popular for synthesizing high-quality samples based on training datasets. However, given the oftentimes enormous sizes of the training datasets, it is difficult to assess how training data impact the samples produced by a trained diffusion model. The difficulty of relating diffusion model inputs and outputs poses significant challenges to model explainability and training data attribution. Here we propose a novel solution that reveals how training data influence the output of diffusion models through the use of ensembles. In our approach individual models in an encoded ensemble are trained on carefully engineered splits of the overall training data to permit the identification of influential training examples. The resulting model ensembles enable efficient ablation of training data influence, allowing us to assess the impact of training data on model outputs. We demonstrate the viability of these ensembles as generative models and the validity of our approach to assessing influence.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.02174

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Dual Graphs of Polyhedral Decompositions for the Detection of Adversarial Attacks

Jamil, Huma, Liu, Yajing, Cole, Christina M., Blanchard, Nathaniel, King, Emily J., Kirby, Michael, Peterson, Christopher

arXiv.org Artificial Intelligence, Dec-2-2022

Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
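
The bit encoding described in this abstract can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the authors' code: the function names `relu_bit_vector` and `hamming_distance` are hypothetical, the layer is limited to 64 nodes so that the pattern fits in one machine word, and a node is taken to "fire" when its pre-activation is strictly positive. The Hamming distance counts how many nodes change firing state between two inputs, which is one plausible way to compare a clean image with its adversarial counterpart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode the firing pattern of up to 64 ReLU nodes as a bit vector:
   bit i is 1 if pre-activation i is strictly positive (the node fires),
   and 0 otherwise. The 64-node limit is an assumption of this sketch. */
uint64_t relu_bit_vector(const float *pre_activations, size_t n)
{
    assert(n <= 64);
    uint64_t bits = 0;
    for (size_t i = 0; i < n; ++i) {
        if (pre_activations[i] > 0.0f) {
            bits |= (uint64_t)1 << i;
        }
    }
    return bits;
}

/* Hamming distance between two bit vectors: the number of nodes whose
   firing state differs between the two inputs. */
unsigned hamming_distance(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;
    unsigned count = 0;
    while (diff != 0) {
        diff &= diff - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

Calling `relu_bit_vector` on the pre-activations of a clean image and of a perturbed image, then passing both results to `hamming_distance`, gives the number of ReLU bits that flipped. A real network such as ResNet-50 has far more than 64 ReLU nodes, so a practical version would store the pattern in an array of words.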

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2211.13305

Country:

North America > United States > Colorado > Larimer County > Fort Collins (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Industry:

Government > Military (1.00)
Information Technology > Security & Privacy (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

FLInt: Exploiting Floating Point Enabled Integer Arithmetic for Efficient Random Forest Inference

Hakert, Christian, Chen, Kuan-Hsun, Chen, Jian-Jia

arXiv.org Artificial Intelligence, Sep-9-2022

In many machine learning applications, e.g., tree-based ensembles, floating point numbers are used extensively due to their expressiveness. Data analysis of dynamic data masses on embedded devices is now becoming feasible, but such systems often lack hardware support for floating point numbers, which introduces large processing overheads. Even where such hardware is present, as in general computing systems, using integer operations instead of floating point operations promises to reduce operation overheads and improve performance. In this paper, we provide FLInt, a full precision floating point comparison for random forests that uses only integer and logic operations. To ensure that functionality is preserved, we formally prove the correctness of this comparison. Since random forests only require comparison of floating point numbers during inference, we implement FLInt in low level realizations and thereby eliminate the need for floating point hardware entirely, while keeping the model accuracy unchanged. The usage of FLInt basically boils down to a one-by-one replacement of conditions: for instance, the comparison statement in C if(pX[3]<=(float)10.074347) becomes if((*(((int*)(pX))+3))<=((int)(0x41213087))). Experimental evaluation on X86 and ARMv8 desktop and server class systems shows that the execution time can be reduced by up to $\approx 30\%$ with our novel approach.
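
The replacement of a floating point comparison by an integer comparison can be illustrated with a small C sketch. It is not the FLInt implementation: the helper names `float_bits` and `le_positive_threshold` are hypothetical, the sketch assumes IEEE 754 single precision and a 32-bit two's complement integer, and it only covers a positive threshold and a non-NaN feature value. Negative thresholds need additional handling that is not reproduced here. The bits are copied with `memcpy` rather than read through a pointer cast, which sidesteps strict-aliasing concerns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a single-precision float as a signed 32-bit
   integer. No floating point arithmetic is performed. */
static int32_t float_bits(float x)
{
    int32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Integer-only test of `x <= threshold`.
   Assumptions of this sketch: threshold_bits encodes a positive float
   and x is not NaN. For non-negative floats the bit patterns are ordered
   like the values; a negative x has its sign bit set, so it maps to a
   negative integer and compares below any positive threshold. */
int le_positive_threshold(float x, int32_t threshold_bits)
{
    return float_bits(x) <= threshold_bits;
}
```

In a tree node the threshold bits would be computed once, for example with `float_bits` at code generation time, and then embedded as an integer constant in the condition.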

artificial intelligence, implementation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2209.04181

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.93)

Sentiment Analysis with KNIME - KDnuggets

#artificialintelligence, Dec-7-2021, 01:55:33 GMT

Sentiment analysis of free-text documents is a common task in the field of text mining. In sentiment analysis, predefined sentiment labels, such as "positive" or "negative", are assigned to texts. Texts (here called documents) can be reviews about products or movies, articles, tweets, etc. In this article, we show you how to assign predefined sentiment labels to documents, using the KNIME Text Processing extension in combination with traditional KNIME learner and predictor nodes. A set of 2000 documents has been sampled from the training set of the Large Movie Review Dataset v1.0.

node, sentiment label, vector, (13 more...)

#artificialintelligence

Country: Europe > Germany > Berlin (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.74)

Gaussian Process Regression on Molecules in GPflow

#artificialintelligence, Jul-15-2020, 04:40:18 GMT

This post demonstrates how to train a Gaussian Process (GP) to predict molecular properties using the GPflow library by creating a custom-defined Tanimoto kernel to operate on Morgan fingerprints. In this example, we'll be trying to predict the experimentally-determined electronic transition wavelengths of molecular photoswitches, a class of molecule that undergoes a reversible transformation between its E and Z isomers upon irradiation by light. We'll start by importing all of the machine learning and chemistry libraries we're going to use. For our molecular representation, we're going to be working with the widely-used Morgan fingerprints. Under this representation, molecules are represented as bit vectors.
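
The quantity a Tanimoto kernel evaluates on two bit-vector fingerprints is the number of bits set in both divided by the number of bits set in either. The C sketch below computes that ratio for fingerprints packed into 64-bit words. It is not the GPflow kernel from the post: the names `popcount64` and `tanimoto_similarity` are hypothetical, and returning 1.0 for two all-zero fingerprints is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one 64-bit word. */
static unsigned popcount64(uint64_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Tanimoto similarity of two bit vectors stored as nwords 64-bit words:
   |a AND b| / |a OR b|. Two all-zero vectors are treated as identical
   (similarity 1.0), which is an assumption of this sketch. */
double tanimoto_similarity(const uint64_t *a, const uint64_t *b, size_t nwords)
{
    assert(a != NULL && b != NULL);
    unsigned long both = 0;
    unsigned long either = 0;
    for (size_t i = 0; i < nwords; ++i) {
        both += popcount64(a[i] & b[i]);
        either += popcount64(a[i] | b[i]);
    }
    return either == 0 ? 1.0 : (double)both / (double)either;
}
```

A 2048-bit fingerprint, for instance, would be passed as an array of 32 words with `nwords` set to 32.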

artificial intelligence, machine learning, modeling & simulation, (12 more...)

#artificialintelligence

Technology:

Information Technology > Modeling & Simulation (0.75)
Information Technology > Artificial Intelligence > Machine Learning (0.52)

Learning across label confidence distributions using Filtered Transfer Learning

Tonekaboni, Seyed Ali Madani, Brereton, Andrew E., Safikhani, Zhaleh, Windemuth, Andreas, Haibe-Kains, Benjamin, MacKinnon, Stephen

arXiv.org Machine Learning, Jun-3-2020

Performance of neural network models relies on the availability of large datasets with minimal levels of uncertainty. Transfer Learning (TL) models have been proposed to resolve the issue of small dataset size by letting the model train on a bigger, task-related reference dataset and then fine-tune on a smaller, task-specific dataset. In this work, we apply a transfer learning approach to improve predictive power in noisy data systems with large variable confidence datasets. We propose a deep neural network method called Filtered Transfer Learning (FTL) that defines multiple tiers of data confidence as separate tasks in a transfer learning setting. The deep neural network is fine-tuned in a hierarchical process by iteratively removing (filtering) data points with lower label confidence, and retraining. In this report we use FTL for predicting the interaction of drugs and proteins. We demonstrate that using FTL to learn stepwise, across the label confidence distribution, results in higher performance compared to deep neural network models trained on a single confidence range. We anticipate that this approach will enable the machine learning community to benefit from large datasets with uncertain labels in fields such as biology and medicine.

artificial intelligence, confidence range, machine learning, (16 more...)

arXiv.org Machine Learning

2006.02528

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)

Materializing Inferred and Uncertain Knowledge in RDF Datasets

McGlothlin, James P. (The University of Texas at Dallas) | Khan, Latifur (The University of Texas at Dallas)

AAAI Conferences, Jul-15-2010

There is a growing need for efficient and scalable semantic web queries that handle inference. There is also a growing interest in representing uncertainty in semantic web knowledge bases. In this paper, we present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. We propose a system for materializing and storing inferred knowledge using this schema. We show experimental results that demonstrate that our solution drastically improves the performance of inference queries. We also propose a solution for materializing uncertain information and probabilities using multiple bit vectors and thresholds.

bit vector, inference, vector, (12 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas > Dallas County > Richardson (0.05)

Genre: Research Report (0.35)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.97)

Materializing and Persisting Inferred and Uncertain Knowledge in RDF Datasets

McGlothlin, James P. (The University of Texas at Dallas) | Khan, Latifur (The University of Texas at Dallas)

AAAI Conferences, Jul-15-2010

As the semantic web grows in popularity and enters the mainstream of computer technology, RDF (Resource Description Framework) datasets are becoming larger and more complex. Advanced semantic web ontologies, especially in medicine and science, are developing. As more complex ontologies are developed, there is a growing need for efficient queries that handle inference. In areas such as research, it is vital to be able to perform queries that retrieve not just facts but also inferred knowledge and uncertain information. OWL (Web Ontology Language) defines rules that govern provable inference in semantic web datasets. In this paper, we detail a database schema using bit vectors that is designed specifically for RDF datasets. We introduce a framework for materializing and storing inferred triples. Our bit vector schema enables storage of inferred knowledge without a query performance penalty. Inference queries are simplified and performance is improved. Our evaluation results demonstrate that our inference solution is more scalable and efficient than the current state-of-the-art. There are also standards being developed for representing probabilistic reasoning within OWL ontologies. We specify a framework for materializing uncertain information and probabilities using these ontologies. We define a multiple vector schema for representing probabilities and classifying uncertain knowledge using thresholds. This solution increases the breadth of information that can be efficiently retrieved.

artificial intelligence, information retrieval query processing, natural language, (15 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.46)
