Analysis of Variational Bayesian Latent Dirichlet Allocation: Weaker Sparsity Than MAP

Neural Information Processing Systems

Latent Dirichlet allocation (LDA) is a popular generative model of various objects such as texts and images, where an object is expressed as a mixture of latent topics. In this paper, we theoretically investigate variational Bayesian (VB) learning in LDA. More specifically, we analytically derive the leading term of the VB free energy under an asymptotic setup, and show that there exist transition thresholds in the Dirichlet hyperparameters around which the sparsity-inducing behavior changes drastically. We further show theoretically that VB tends to induce weaker sparsity than MAP in the LDA model, in contrast to other models. We experimentally demonstrate the practical validity of our asymptotic theory on real-world Last.FM music data.
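To make the "mixture of latent topics" concrete, here is a minimal sketch of the standard LDA generative process for a single document, using only the Python standard library. It does not implement VB or MAP learning and is not taken from the paper; the function names, the fixed topic-word table, and the symmetric hyperparameter `alpha` are illustrative assumptions.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k components.

    Uses the usual construction: normalise k independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, n_words, rng):
    """Generate one document as a list of word ids (hypothetical helper).

    topic_word[z] is the word distribution of topic z, assumed fixed here.
    """
    k = len(topic_word)
    # Per-document topic proportions; a small alpha tends to give sparse draws.
    theta = sample_dirichlet(alpha, k, rng)
    words = []
    for _ in range(n_words):
        # Pick a topic for this token, then a word from that topic.
        z = rng.choices(range(k), weights=theta)[0]
        w = rng.choices(range(len(topic_word[z])), weights=topic_word[z])[0]
        words.append(w)
    return theta, words


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy topics over a four-word vocabulary (assumed values).
    topics = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    for alpha in (0.1, 10.0):
        theta, words = generate_document(topics, alpha, 20, rng)
        print(alpha, [round(t, 3) for t in theta], words)
```

Running the script with a small and a large `alpha` gives a feel for how the Dirichlet hyperparameter controls how concentrated the topic proportions are, which is the quantity whose sparsity the abstract discusses.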


Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction

Neural Information Processing Systems

Extracting the 3D shape of deforming objects in monocular videos, a task known as non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic datasets and in controlled environments. Typically, the objects to reconstruct are pre-segmented, they exhibit limited rotations and occlusions, or full-length trajectories are assumed. In order to integrate NRSfM into current video analysis pipelines, one needs to consider as input realistic (and thus incomplete) tracking, and perform spatio-temporal grouping to segment the objects from their surroundings. Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", and optical flow "bleeding". In this paper, we make a first attempt towards this goal, and propose a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos. For each trajectory cluster, we compute multiple reconstructions by minimizing the reprojection error and the rank of the 3D shape under different rank bounds of the trajectory matrix. We show that dense 3D shape is extracted and trajectories are completed across occlusions and low-textured regions, even under mild relative motion between the object and the camera. We achieve competitive results on a public NRSfM benchmark while using fixed parameters across all sequences and handling incomplete trajectories, in contrast to existing approaches. We further test our approach on popular video segmentation datasets. To the best of our knowledge, our method is the first to extract dense object models from realistic videos, such as those found on YouTube or in Hollywood movies, without object-specific priors.


Worst-Case Linear Discriminant Analysis as Scalable Semidefinite Feasibility Problems

arXiv.org Artificial Intelligence

In this paper, we propose an efficient semidefinite programming (SDP) approach to worst-case linear discriminant analysis (WLDA). Compared with traditional LDA, WLDA considers the dimensionality reduction problem from the worst-case viewpoint, which is in general more robust for classification. However, the original WLDA problem is non-convex and difficult to optimize. We reformulate the WLDA optimization problem as a sequence of semidefinite feasibility problems. To solve these feasibility problems efficiently, we design a new scalable optimization method with quasi-Newton methods and eigen-decomposition as the core components. The proposed method is orders of magnitude faster than standard interior-point based SDP solvers. Experiments on a variety of classification problems demonstrate that our approach achieves better performance than standard LDA. Our method is also much faster and more scalable than WLDA solved with standard interior-point SDP solvers. The computational complexity for an SDP with $m$ constraints and matrices of size $d$ by $d$ is roughly reduced from $\mathcal{O}(m^3+md^3+m^2d^2)$ to $\mathcal{O}(d^3)$ ($m>d$ in our case).


On Coarse Graining of Information and Its Application to Pattern Recognition

arXiv.org Machine Learning

One of the goals of any scientific study is to identify regularities in observations and classify them into possibly separate and simpler structures or categories. These categories can in turn be used to make inferences on the objects of interest. The major advantage of this approach is that one breaks down a complicated reality into a collection of simpler structures. In a similar way, in pattern recognition one is concerned with the discovery of regularities in data, but through the use of computer algorithms which can be used to classify the data into different categories [Bis06]. Independent of one's point of view, any such analysis must start with a definition of the categories. If one has sufficient information about the categories and their members, it is an easy task to establish a precise definition. However, for most real-life situations this is not the case and the notion of category cannot be precisely defined. Under such conditions a fruitful approach is to consider a category as a collection of objects which are likely to share the same properties.


Toward Automatic Character Identification in Unannotated Narrative Text

AAAI Conferences

We present a case-based approach to character identification in natural language text in the context of our Voz system. Voz first extracts entities from the text, and for each one of them, computes a feature-vector using both linguistic information and external knowledge. We propose a new similarity measure called Continuous Jaccard that exploits those feature-vectors to compute the similarity between a given entity and those in the case-base, and thus determine which entities are characters or not. We evaluate our approach by comparing it with different similarity measures and feature sets. Results show an identification accuracy of up to 93.49%, significantly higher than recent related work.
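The abstract does not define Continuous Jaccard, so the sketch below is not the authors' measure. It shows one common real-valued generalisation of the Jaccard index (sum of element-wise minima over sum of element-wise maxima) together with a nearest-case lookup, to illustrate how a similarity over feature vectors can drive a case-based decision. The function names, the label strings, and the assumption of non-negative feature values are all hypothetical.

```python
def real_valued_jaccard(a, b):
    """Generalised Jaccard similarity for non-negative feature vectors.

    Assumption: this min/max form stands in for the paper's Continuous Jaccard,
    whose exact definition is not given in the abstract. For 0/1 vectors it
    reduces to the ordinary set-based Jaccard index.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    # Two all-zero vectors are treated as identical.
    return 1.0 if denominator == 0 else numerator / denominator


def classify(entity, case_base):
    """Return the label of the most similar case (hypothetical helper).

    case_base is a list of (feature_vector, label) pairs.
    """
    best_vector, best_label = max(
        case_base, key=lambda case: real_valued_jaccard(entity, case[0])
    )
    return best_label


if __name__ == "__main__":
    cases = [([1.0, 0.0], "character"), ([0.0, 1.0], "non-character")]
    print(classify([0.9, 0.2], cases))
```

A single nearest case is used here only to keep the example short; a real case-based system might aggregate over several retrieved cases.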


Minimal Narrative Annotation Schemes and Their Applications

AAAI Conferences

The increased use of large corpora in narrative research has created new opportunities for empirical research and intelligent narrative technologies. To best exploit the value of these corpora, several research groups are eschewing complex discourse analysis techniques in favor of high-level minimalist narrative annotation schemes that can be quickly applied, achieve high inter-rater agreement, and are amenable to automation using machine-learning techniques. In this paper we compare different annotation schemes that have been employed by two groups of researchers to annotate large corpora of narrative text. Using a dual-annotation methodology, we investigate the correlation between narrative clauses distinguished by their structural role (orientation, action, evaluation), their subjectivity, and their narrative level within the discourse. We find that each simple narrative annotation scheme captures a structurally distinct characteristic of real-world narratives, and each combination of labels is evident in a corpus of 19 weblog narratives (951 narrative clauses). We discuss several potential applications of minimalist narrative annotation schemes, noting the combinations of labels across these two annotation schemes that best support each task.


Temporal and Object Relations in Plan and Activity Recognition for Robots Using Topic Models

AAAI Conferences

For robots to effectively interact with human users, it is necessary that they recognize what people in the environment are doing. This is especially the case when robots are performing complementary tasks since the human users are not following any specific process. There is much uncertainty in how people act and the duration of time they need to perform their actions. In this work, we discuss the use of topic models for such plan and activity recognition tasks. We begin with the development of a domain-independent representation of human postural information obtained from RGB-D sensor data. This representation may be used with Latent Dirichlet Allocation (LDA) topic models as an integration of plan and activity recognition. This is followed by a proposition of extensions to LDA that allow temporal and object relational information to also be used in plan and activity recognition tasks.


Crowd-Training Machine Learning Systems for Human Rights Abuse Documentation

AAAI Conferences

In this talk, I will describe efforts being undertaken in a collaboration between human rights advocates and computer scientists at Carnegie Mellon University to develop tools, methods and algorithms that will make it [...] Key to this process, and apropos of this session, is the development of mechanisms to enable "the crowd" (i.e., those individuals around the world who care about human rights and have relevant [...] In presenting this project, I hope to get feedback from other participants in the workshop on how to achieve this goal, particularly by [...] Social media and mobile phones with good cameras and Internet access are dramatically changing the nature of human rights documentation, reporting and advocacy. [...] like YouTube, Live Leak, Vimeo, and Facebook every week. In Syria, more than 650,000 videos have been uploaded to social media sites since the conflict started three years ago. This trove of [...] interest dies down or moves on to new issues or places. [...] relevant in the long-term, what is irrelevant to the situation or repetitive, and what is patently false or misleading.


Adapting Collaborative Filtering to Personalized Audio Production

AAAI Conferences

Recommending media objects to users typically requires users to rate existing media objects so as to understand their preferences. The number of ratings required to produce good suggestions can be reduced through collaborative filtering. Collaborative filtering is more difficult when prior users have not rated the same set of media objects as the current user or each other. In this work, we describe an approach to applying prior user data in a way that does not require users to rate the same media objects and that does not require imputation (estimation) of prior user ratings of objects they have not rated. This approach is applied to the problem of finding good equalizer settings for music audio and is shown to greatly reduce the number of ratings the current user must make to find a good equalization setting.


Crowdsourcing for Participatory Democracies: Efficient Elicitation of Social Choice Functions

AAAI Conferences

We present theoretical and empirical results demonstrating the usefulness of social choice functions in crowdsourcing for participatory democracies. First, we demonstrate the scalability of social choice functions by defining a natural notion of epsilon-approximation, and giving algorithms which efficiently elicit such approximations for two prominent social choice functions: the Borda rule and the Condorcet winner. This result circumvents previous prohibitive lower bounds and is surprisingly strong: even if the number of ideas is as large as the number of participants, each participant will only have to make a logarithmic number of comparisons, an exponential improvement over the linear number of comparisons previously needed. Second, we apply these ideas to Finland's recent off-road traffic law reform, an experiment on participatory democracy in real life. This allows us to verify the scaling predicted in our theory and show that the constant involved is also not large. In addition, by collecting data on the time that users take to complete rankings of varying sizes, we observe that eliciting partial rankings can further decrease elicitation time as compared to the common method of eliciting pairwise comparisons.
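For readers unfamiliar with the two social choice functions named above, the following sketch computes Borda scores and the Condorcet winner from complete rankings. It is a plain reference computation, not the paper's elicitation algorithm: it assumes every participant ranks every idea, which is exactly the cost the epsilon-approximation results are meant to avoid. Function names and the toy data are illustrative.

```python
from itertools import combinations


def borda_scores(rankings):
    """Sum Borda points per candidate.

    Each ranking lists all candidates, best first; with m candidates the
    candidate in position i (0-based) receives m - 1 - i points.
    """
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (m - 1 - position)
    return scores


def condorcet_winner(rankings):
    """Return the candidate that beats every other in pairwise majority
    contests, or None if no such candidate exists."""
    candidates = list(rankings[0])
    pairwise_wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_over_b = sum(r.index(a) < r.index(b) for r in rankings)
        b_over_a = len(rankings) - a_over_b
        if a_over_b > b_over_a:
            pairwise_wins[a] += 1
        elif b_over_a > a_over_b:
            pairwise_wins[b] += 1
    for c in candidates:
        if pairwise_wins[c] == len(candidates) - 1:
            return c
    return None


if __name__ == "__main__":
    votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
    print(borda_scores(votes))
    print(condorcet_winner(votes))
```

With n participants and m ideas, this direct computation needs each participant to supply a full ranking of all m ideas, which is the baseline the logarithmic-comparison result improves on.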