 Scientific Discovery


The Silent Rockstar of BigData: Machine Learning

#artificialintelligence

Sure, the world is crying out loud that big data's biggest problem will be resources. Demand has skyrocketed, and everyone is going into a tailspin trying to meet it. Companies are going frantic and overspending to hire data scientists to protect themselves from any upcoming shortfall. This is nothing but a sign that the world needs our robot algorithm friends to absorb some of that demand and lend credibility to new paradigms. Who could forget Steve Ballmer's famous quote describing Big Data as a Machine Learning problem?


Key-Object – A New Paradigm in Search?

@machinelearnbot

Summary: The premise of this new Key Object architecture is that search is broken, at least as it applies to complex merchandise like computers, printers, and cameras. An innovative and workable solution is described. The question remains, is the pain sufficient to justify a switch? As we are all fond of saying, innovation follows pain points. Are we missing something in our uber-critical search capabilities that needs to be resolved?


Accelerating Science: A Computing Research Agenda

arXiv.org Artificial Intelligence

The emergence of "big data" offers unprecedented opportunities for not only accelerating scientific advances but also enabling new modes of discovery. Scientific progress in many disciplines is increasingly enabled by our ability to examine natural phenomena through the computational lens, i.e., using algorithmic or information processing abstractions of the underlying processes; and our ability to acquire, share, integrate and analyze disparate types of data. However, there is a huge gap between our ability to acquire, store, and process data and our ability to make effective use of the data to advance discovery. Despite successful automation of routine aspects of data management and analytics, most elements of the scientific process currently require considerable human expertise and effort. Accelerating science to keep pace with the rate of data acquisition and data processing calls for the development of algorithmic or information processing abstractions, coupled with formal methods and tools for modeling and simulation of natural processes, as well as major innovations in cognitive tools for scientists, i.e., computational tools that leverage and extend the reach of human intellect, and partner with humans on a broad range of tasks in scientific discovery (e.g., identifying, prioritizing, and formulating questions; designing, prioritizing, and executing experiments to answer a chosen question; drawing inferences and evaluating the results; and formulating new questions, in a closed-loop fashion). This calls for a concerted research agenda aimed at: development, analysis, integration, sharing, and simulation of algorithmic or information processing abstractions of natural processes, coupled with formal methods and tools for their analysis and simulation; and innovations in cognitive tools that augment and extend human intellect and partner with humans in all aspects of science.


Zaloni's new Mica release makes data discovery, curation and self-service data preparation more collaborative and intuit

#artificialintelligence

Zaloni, the data lake company, released today a new version of its Mica self-service data preparation platform at Strata Hadoop World. Mica provides users with an on-ramp for self-service data discovery, curation, and preparation of data in the data lake. With Mica, business users have the tools they need for rapidly discovering data sets, interacting with them and uncovering needed business insights. According to a January 2016 report, entitled Overcoming Obstacles That Prevent the Deployment and Use of a Modern BI and Analytics Platform, Gartner predicts that by "2017, most business users and analysts in organizations will have access to self-service tools to prepare data for analysis." "Data preparation can no longer exclusively be an IT function," said Ben Sharma, Zaloni's co-founder and CEO.


Is data science a new paradigm, or recycled material?

@machinelearnbot

Data science is the result of a new paradigm taking place in IT. The question was raised recently, and here I explain how and why data science is part of this new paradigm, and not recycled material. Many data science techniques are very different from, if not the opposite of, old techniques that were designed to be implemented on an abacus rather than on computers. These new tools are often model-free. Indeed, old techniques such as logistic regression and classification trees don't even belong to data science; more stable techniques are used in data science.


Bayesian hypothesis testing for one bit compressed sensing with sensing matrix perturbation

arXiv.org Machine Learning

H. Zayyani, M. Korki and F. Marvasti. This letter proposes a low-computational Bayesian algorithm for noisy sparse recovery in the context of one bit compressed sensing with sensing matrix perturbation. The proposed algorithm, called BHT-MLE, comprises a sparse support detector and an amplitude estimator. The support detector uses a Bayesian hypothesis test, while the amplitude estimator uses an ML estimator obtained by solving a convex optimization problem. Simulation results show that the BHT-MLE algorithm offers higher reconstruction accuracy than an ML estimator (MLE) at a low computational cost. Introduction: One bit compressed sensing, the extreme case of quantized compressed sensing [1], has been extensively investigated recently [2-9]. In the one bit compressed sensing framework, it has been proved that accurate and stable recovery can be achieved by using only the sign of linear measurements [2].
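To make the one bit setting concrete, here is a minimal sketch of the measurement model and of a very simple direction estimate. It is not the BHT-MLE algorithm from the letter: the estimator below is plain back-projection (the normalized vector A^T y), swapped in only to illustrate that the signs of the measurements alone carry information about the signal. The signal, the dimensions, the seed, and all function names are assumptions made for this example, and the sensing matrix is left unperturbed.

```python
import math
import random


def one_bit_measure(A, x):
    """Keep only the sign of each linear measurement <a_i, x>."""
    return [1.0 if sum(a * v for a, v in zip(row, x)) >= 0 else -1.0 for row in A]


def back_projection(A, y):
    """Estimate the direction of x as the normalized vector A^T y."""
    n = len(A[0])
    est = [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(n)]
    norm = math.sqrt(sum(v * v for v in est))
    return [v / norm for v in est]


def demo(m=2000, n=20, seed=0):
    rng = random.Random(seed)
    # Hypothetical sparse signal with three nonzero entries, scaled to unit norm.
    x = [0.0] * n
    x[2], x[7], x[15] = 1.0, -2.0, 1.5
    scale = math.sqrt(sum(v * v for v in x))
    x = [v / scale for v in x]
    # Gaussian sensing matrix (no perturbation in this sketch).
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    y = one_bit_measure(A, x)
    x_hat = back_projection(A, y)
    # Cosine similarity between the true and estimated directions.
    cosine = sum(a * b for a, b in zip(x, x_hat))
    return x, x_hat, cosine
```

Because the sign discards all scale information, only the direction of the signal can be recovered from such measurements, which is why the sketch compares unit-norm vectors through their cosine similarity.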


Bayesian Hypothesis Testing for Block Sparse Signal Recovery

arXiv.org Machine Learning

This letter presents a novel Block Bayesian Hypothesis Testing Algorithm (Block-BHTA) for reconstructing block sparse signals with unknown block structures. The Block-BHTA comprises the detection and recovery of the supports, and the estimation of the amplitudes of the block sparse signal. The support detection and recovery are performed using Bayesian hypothesis testing. Then, based on the detected and reconstructed supports, the nonzero amplitudes are estimated by linear MMSE. The effectiveness of Block-BHTA is demonstrated by numerical experiments.
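The amplitude step can be sketched as follows, under assumptions that the abstract does not state: a linear model y = A x + noise, white noise with variance noise_var, and zero-mean uncorrelated nonzero amplitudes with variance signal_var. With those assumptions the linear MMSE estimate on a support S is (A_S^T A_S + (noise_var / signal_var) I)^{-1} A_S^T y. The support is taken as already detected; the hypothesis testing stage is not shown, and the function names are hypothetical.

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def lmmse_on_support(A, y, support, noise_var, signal_var):
    """LMMSE amplitudes on a given support; all other entries stay zero."""
    m, n = len(A), len(A[0])
    # Columns of A restricted to the support (the matrix A_S).
    cols = [[A[i][j] for i in range(m)] for j in support]
    k = len(support)
    ratio = noise_var / signal_var
    # Regularized Gram matrix A_S^T A_S + ratio * I.
    gram = [[sum(cols[p][i] * cols[q][i] for i in range(m)) + (ratio if p == q else 0.0)
             for q in range(k)] for p in range(k)]
    # Right-hand side A_S^T y.
    rhs = [sum(cols[p][i] * y[i] for i in range(m)) for p in range(k)]
    amps = solve(gram, rhs)
    x_hat = [0.0] * n
    for j, a in zip(support, amps):
        x_hat[j] = a
    return x_hat
```

When noise_var is very small relative to signal_var the estimate approaches ordinary least squares on the support, and as the ratio grows the amplitudes are shrunk toward zero.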


Advances in Nonparametric Hypothesis Testing

AAAI Conferences

My research goal involves simultaneously addressing statistical and computational tradeoffs encountered in modern data analysis and high-dimensional machine learning (e.g., hypothesis testing, regression, classification). My future interests include incorporating additional constraints like privacy or communication, and settings involving hidden utilities of multiple cooperative agents or competitive adversaries.


Everyone's Invited: A New Paradigm for Evaluation on Non-Transferable Datasets

AAAI Conferences

Social media data mining and analytics has stimulated a wide array of computational research. Traditionally, individual researchers are responsible for acquiring and managing their own datasets. However, the temporal nature of social data, the challenges involved in correctly preparing a dataset, the sheer scale of many datasets, and the proprietary nature of many data sources can make extending and comparing computational methods difficult and often impossible. In light of this, because replicability is a fundamental pillar of the scientific process and because method comparison is essential to characterizing computational advancements, we require an alternative to the traditional model of researcher-owned datasets. In this paper we propose FREESR, a framework that gives researchers the ability to develop and test method performance without requiring direct access to “shared” datasets. As a case study and first community resource, we have implemented the FREESR paradigm around the task of Tweet geolocation. The implementation showcases the clear suitability of this framework for the social media research context. Beyond the implementation, we see the FREESR paradigm as being an important step towards making study reproducibility and method comparison more principled and ubiquitous in the social media research community.


EXPERT SYSTEMS AND AI APPLICATIONS

AI Classics

Another concern has been to exploit the AI methodology to understand better some fundamental questions in the philosophy of science, for example the processes by which explanatory hypotheses

(d) detection of metabolic disorders of genetic, developmental, toxic or infectious origins by identification of organic constituents excreted in abnormal quantities in human body fluids.