
Avoiding False Positive in Multi-Instance Learning

Neural Information Processing Systems

In multi-instance learning, there are two kinds of prediction failure: false negatives and false positives. Current research mainly focuses on avoiding the former. We attempt to utilize the geometric distribution of instances inside positive bags to avoid both. Based on kernel principal component analysis, we define a projection constraint for each positive bag that classifies its constituent instances far away from the separating hyperplane while placing positive instances and negative instances on opposite sides. We apply the Constrained Concave-Convex Procedure to solve the resulting problem.

Predictive modelling, how to build ground-truth and extract features for action prediction? • /r/MachineLearning


I have a dataset of users; each user has daily information about his/her activities (numerical values representing measurements of physical activity). In addition, each user has a boolean value for each day that indicates whether he/she took a particular action. The dataset is not fixed: new activity information and actions are added for each user every day. The task is to build a model that predicts which users are likely to take the action in the near future (e.g. on any of the next 7 days). My approach is to build feature vectors representing the activity values for each user over a period of time, and to use the action column as the source of ground truth.
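One way to turn that approach into training examples is a sliding window: the features are the activity values from the last few days, and the label says whether the action occurred on any of the following days. The sketch below is only an illustration under assumptions: the function name `build_examples`, the `window` length, and the toy data are hypothetical, and only the 7-day horizon idea comes from the question (the demo uses a shorter horizon to stay readable).

```python
from typing import List, Tuple


def build_examples(
    activity: List[List[float]],
    actions: List[bool],
    window: int,
    horizon: int,
) -> List[Tuple[List[float], bool]]:
    """Build (features, label) pairs for one user.

    activity[d] holds the numeric measurements for day d.
    actions[d] is True if the user took the action on day d.
    The window length is an assumed hyperparameter, not from the question.
    """
    examples = []
    n_days = len(activity)
    # t is the last day included in the feature window.
    # Stop early enough that the full label horizon is observed.
    for t in range(window - 1, n_days - horizon):
        past = activity[t - window + 1 : t + 1]
        # Flatten the per-day measurements into one feature vector.
        features = [value for day in past for value in day]
        # Label: did the action happen on any of the next `horizon` days?
        label = any(actions[t + 1 : t + 1 + horizon])
        examples.append((features, label))
    return examples


# Toy data: 10 days, one measurement per day, action taken only on day 6.
activity = [[float(d)] for d in range(10)]
actions = [d == 6 for d in range(10)]

examples = build_examples(activity, actions, window=3, horizon=2)
labels = [label for _, label in examples]
```

Days at the end of the history produce no example because their horizon is not yet fully observed; as new days arrive, those examples can be generated and added. Splitting train and test sets by time rather than at random helps keep future information out of training.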

How Intelligent Is Your AI? - Digital Transformation Xperience


To determine whether the strategy or approach you're evaluating requires artificial intelligence, let's turn back to our definition of AI as any computer-based system that observes, analyzes, and learns. Thus, a true AI system is able to sense its own environment and augment its base of knowledge in close to real time. A Tesla's onboard computers analyze the images, blips, and other data it collects to make sense of its surroundings, allowing for the automation of several driving decisions. Using this data, companies and sales professionals are able to arrive at many counterintuitive insights -- for instance, calls with more positive sentiment are actually associated with lower closing rates than calls with less positive sentiment. The ability to test, learn, and improve is only available to the most advanced machine learning systems today.

Positive Semidefinite Metric Learning with Boosting

Neural Information Processing Systems

The learning of appropriate distance metrics is a critical problem in classification. In this work, we propose a boosting-based technique, termed BoostMetric, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that the Mahalanobis matrix remains positive semidefinite. Semidefinite programming is sometimes used to enforce this constraint, but does not scale well. BoostMetric is instead based on a key observation that any positive semidefinite matrix can be decomposed into a linear positive combination of trace-one rank-one matrices.
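The key observation can be checked numerically. The sketch below uses an eigendecomposition, which is one simple way to exhibit such a decomposition; it is not BoostMetric's learning procedure, and the function name and example matrix are hypothetical. Each unit eigenvector u gives a rank-one matrix u uᵀ with trace 1, and the positive eigenvalues act as the weights.

```python
import numpy as np


def trace_one_rank_one_decomposition(M, tol=1e-10):
    """Split a symmetric PSD matrix into positive weights and
    trace-one rank-one matrices via its eigendecomposition.

    Illustrative only: this is not how BoostMetric learns the metric.
    """
    # eigh is meant for symmetric matrices and returns orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(M)
    weights, components = [], []
    for lam, u in zip(eigvals, eigvecs.T):
        if lam > tol:  # skip (numerically) zero eigenvalues
            weights.append(lam)
            # outer(u, u) has rank one and trace ||u||^2 = 1.
            components.append(np.outer(u, u))
    return np.array(weights), components


# Example PSD matrix (assumed for illustration); its eigenvalues are 1 and 3.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
weights, components = trace_one_rank_one_decomposition(A)

# The weighted sum of the components recovers the original matrix.
reconstructed = sum(w * Z for w, Z in zip(weights, components))
```

Because every weight is positive and every component is itself positive semidefinite, any matrix assembled this way is positive semidefinite by construction.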

Precision and Recall


Imagine a machine learning algorithm is tasked with identifying the bananas in a bowl of fruit. In total, the bowl contains 10 pieces of fruit, 4 of which are bananas and 6 of which are apples. The algorithm determines that there are 5 bananas and 5 apples. The bananas that were identified correctly are known as true positives, while the items that were incorrectly identified as bananas are called false positives. In this example, there are 4 true positives and 1 false positive, making the algorithm's precision 4/5. Its recall is 4/4, because all 4 actual bananas were found and none were missed.
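The arithmetic for the fruit bowl can be written out as a short sketch. Precision divides the true positives by everything the algorithm labelled a banana, while recall divides them by the number of actual bananas (4), not by the total number of fruit. The helper name `precision_recall` is hypothetical.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Return (precision, recall) from counts of true positives,
    false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


actual_bananas = 4      # bananas really in the bowl
predicted_bananas = 5   # items the algorithm labelled as bananas
true_positives = 4      # labelled banana and really a banana

false_positives = predicted_bananas - true_positives  # apples labelled as bananas
false_negatives = actual_bananas - true_positives     # bananas that were missed

precision, recall = precision_recall(true_positives, false_positives, false_negatives)
```

Here precision is 4/5 = 0.8 and recall is 4/4 = 1.0: the algorithm found every banana but also mislabelled one apple.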