How does the Mind store Information?

arXiv.org Artificial Intelligence

How the mind stores information is a major open question. We approach this question not from a physiological standpoint (how information is physically stored in the brain) but from a conceptual and algorithmic standpoint: what are the right data structures for organizing and indexing information? Here we propose a memory architecture based directly on the recursive sketching ideas from this paper, storing information in memory as concise sketches. We also give a high-level, informal exposition of the paper's recursive sketching idea, which uses subspace embeddings to capture deep network computations in a concise sketch. These sketches form an implicit knowledge graph that can be used, while processing an event, to find related information via sketches from the past.
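The core intuition can be illustrated with a minimal sketch: a Gaussian random projection, one simple subspace-embedding-style map, compresses high-dimensional representations into short sketches while approximately preserving similarity, so stored sketches of past events can be searched for related information. The dimensions, event vectors, and function names below are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not values from the paper):
d, k = 1024, 64  # original representation size and sketch size

# Gaussian random projection: compresses vectors while approximately
# preserving angles and inner products (Johnson-Lindenstrauss style).
S = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))

def sketch(x):
    """Compress a d-dimensional representation into a k-dimensional sketch."""
    return S @ x

def cosine(u, v):
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Three "events" as dense vectors; event_b is a noisy copy of event_a,
# event_c is unrelated.
event_a = rng.normal(size=d)
event_b = event_a + 0.1 * rng.normal(size=d)
event_c = rng.normal(size=d)

sa, sb, sc = sketch(event_a), sketch(event_b), sketch(event_c)

# Related events stay close in sketch space, so a store of past sketches
# can be queried to retrieve related information while processing an event.
print(cosine(sa, sb), cosine(sa, sc))
```

Note that the projection is lossy by design: the sketch is much smaller than the original representation, and retrieval relies only on approximate similarity being preserved.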


When Is Machine Learning Right for Enterprise Search?

#artificialintelligence

Artificial intelligence (AI) and its little brother machine learning (ML) are receiving tremendous hype -- and deservedly so. From smart cars to computer-assisted medical diagnoses, machine learning is an incredibly powerful technology that has only scratched the surface of the impact it will eventually have on the world. One obvious use case is for online search, where Google continually uses ML to refine results based on user behavior. For large corporations, cognitive search capabilities can provide employees with valuable insights from massive amounts of structured and unstructured data. Machine learning can play a key role here as well, but it's not appropriate in every situation.


Non-Gaussian Component Analysis via Lattice Basis Reduction

arXiv.org Machine Learning

Non-Gaussian Component Analysis (NGCA) is the following distribution learning problem: Given i.i.d. samples from a distribution on $\mathbb{R}^d$ that is non-gaussian in a hidden direction $v$ and an independent standard Gaussian in the orthogonal directions, the goal is to approximate the hidden direction $v$. Prior work \cite{DKS17-sq} provided formal evidence for the existence of an information-computation tradeoff for NGCA under appropriate moment-matching conditions on the univariate non-gaussian distribution $A$. The latter result does not apply when the distribution $A$ is discrete. A natural question is whether information-computation tradeoffs persist in this setting. In this paper, we answer this question in the negative by obtaining a sample and computationally efficient algorithm for NGCA in the regime that $A$ is discrete or nearly discrete, in a well-defined technical sense. The key tool leveraged in our algorithm is the LLL method \cite{LLL82} for lattice basis reduction.
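The NGCA generative model is easy to state concretely. Below is a minimal sketch of drawing samples that are non-Gaussian along a hidden direction $v$ and standard Gaussian orthogonally; the dimensions, the random choice of $v$, and taking $A$ uniform on $\{-1, +1\}$ (a discrete distribution, as in the regime the paper addresses) are all illustrative assumptions, and the LLL-based recovery algorithm itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 2000  # ambient dimension and sample count (illustrative)

# Hidden unit direction v (chosen at random here for illustration).
v = rng.normal(size=d)
v /= np.linalg.norm(v)

# Discrete non-Gaussian component A: uniform on {-1, +1}.
a = rng.choice([-1.0, 1.0], size=n)

# Standard Gaussian noise, projected to be orthogonal to v.
g = rng.normal(size=(n, d))
g_perp = g - np.outer(g @ v, v)  # remove the component along v

# Each sample is Gaussian orthogonal to v and takes value a_i along v.
samples = g_perp + np.outer(a, v)
```

Projecting the samples onto `v` recovers the discrete values exactly (`samples @ v == a`), which is the structure a lattice-based method such as LLL can exploit; projections onto any orthogonal direction remain Gaussian.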


Getting Rid of the Deep Learning Silo in the Data Center

#artificialintelligence

With an electrical engineering education from Purdue University (PhD) and the Indian Institute of Technology Bombay (BS, MS), and nearly 25 years of experience in the semiconductor, systems, and hyperscale service provider industries, he has a broad perspective on hardware design and deployment. The elasticity of cloud infrastructure is a key enabler for enterprises and internet services, creating a shared pool of compute resources that tenants can draw from as their workloads ebb and flow. Cloud tenants are spared the details of capacity and supply planning. This model has worked well because modern server systems are efficient at a wide range of general computing tasks. Deep learning, however, creates new complexities for this model.


Applications of Word Embeddings in NLP - DZone AI

#artificialintelligence

Word embeddings are a form of word representation that bridges the gap between human understanding of language and that of a machine. They are distributed representations of text in an n-dimensional space, and they are essential for solving most NLP problems. Domain adaptation is a technique that allows machine learning and transfer learning models to handle niche datasets that are written in the same language but are still linguistically distinct. For example, legal documents, customer survey responses, and news articles are all unique datasets that need to be analyzed differently.
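The idea that embeddings place related words close together in vector space can be sketched in a few lines. The vectors below are tiny hand-picked toys for illustration only; real embeddings are trained (e.g. with word2vec or GloVe) and typically have hundreds of dimensions.

```python
import numpy as np

# Toy 4-dimensional "embeddings" (hand-picked, not trained vectors).
emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.1, 0.8, 0.2]),
    "car":   np.array([0.1, 0.1, 0.1, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: the standard closeness measure for embeddings."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Semantically related words sit closer together in the embedding space.
print(cosine(emb["king"], emb["queen"]))  # relatively high
print(cosine(emb["king"], emb["car"]))    # relatively low
```

Distances in this space, rather than the raw strings, are what downstream NLP models consume, which is why domain adaptation amounts to adjusting where a niche dataset's vocabulary lands in the space.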