The Weighted Tsetlin Machine: Compressed Representations with Weighted Clauses
Phoulady, Adrian | Granmo, Ole-Christoffer | Gorji, Saeed Rahimi | Phoulady, Hady Ahmady
The Tsetlin Machine (TM) is an interpretable mechanism for pattern recognition that constructs conjunctive clauses from data. The clauses capture frequent patterns with high discriminating power, and each additional clause increases expressive power. However, the resulting accuracy gain comes at the cost of linear growth in computation time and memory usage. In this paper, we present the Weighted Tsetlin Machine (WTM), which reduces computation time and memory usage by \emph{weighting} the clauses. Real-valued weighting allows one clause to replace multiple clauses and supports fine-tuning the impact of each clause. Our novel scheme learns both the composition of the clauses and their weights simultaneously. Furthermore, we increase training efficiency by replacing $k$ Bernoulli trials with success probability $p$ by a uniform sample of average size $pk$, with the sample size drawn from a binomial distribution. In our empirical evaluation, the WTM achieved the same accuracy as the TM on MNIST, IMDb, and Connect-4, requiring only $1/4$, $1/3$, and $1/50$ of the clauses, respectively. With the same number of clauses, the WTM outperformed the TM, obtaining peak test accuracies of $98.58\%$, $90.15\%$, and $87.49\%$, respectively. Finally, our novel sampling scheme reduced sample generation time by a factor of $7$.
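The sampling trick mentioned in the abstract can be sketched as follows: instead of running one Bernoulli($p$) trial per item, draw the sample size once from Binomial($k$, $p$) and then take that many items uniformly without replacement, which yields the same distribution over sampled sets. This is a minimal illustrative sketch, not the authors' implementation; the function names and the use of NumPy are assumptions.

```python
import numpy as np

def bernoulli_sample(items, p, rng):
    """Naive scheme: one independent Bernoulli(p) trial per item (k trials)."""
    return [x for x in items if rng.random() < p]

def binomial_sample(items, p, rng):
    """Equivalent-in-distribution shortcut: draw the sample size
    m ~ Binomial(k, p) once, then pick m items uniformly without
    replacement. Average sample size is p*k, matching k Bernoulli
    trials, but only one binomial draw is needed."""
    k = len(items)
    m = rng.binomial(k, p)
    idx = rng.choice(k, size=m, replace=False)
    return [items[i] for i in idx]

rng = np.random.default_rng(42)
sample = binomial_sample(list(range(100)), 0.3, rng)
```

Both functions produce samples with the same distribution; the second avoids generating $k$ random numbers when $p$ is small, which is where the speedup reported in the paper comes from.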
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > California > Sacramento County > Sacramento (0.04)
- Europe > Norway (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Did You Know? — Mining Interesting Trivia for Entities from Wikipedia
Prakash, Abhay (Indian Institute of Technology, Roorkee) | Chinnakotla, Manoj Kumar (Microsoft) | Patel, Dhaval (Indian Institute of Technology, Roorkee) | Garg, Puneet (Microsoft)
Trivia is any fact about an entity that is interesting due to its unusualness, uniqueness, unexpectedness, or weirdness. In this paper, we propose a novel approach for mining entity trivia from their Wikipedia pages. Given an entity, our system extracts relevant sentences from its Wikipedia page and produces a list of sentences ranked by their interestingness as trivia. At the heart of our system lies an interestingness ranker, which learns the notion of interestingness through a rich set of domain-independent linguistic and entity-based features. Our ranking model is trained by leveraging existing user-generated trivia data available on the Web instead of creating new labeled data. We evaluated our system on the movies domain and observed that it performs significantly better than the defined baselines. A thorough qualitative analysis of the results revealed that our rich feature set indeed helps surface interesting trivia in the top ranks.
- North America > United States (0.14)
- Asia > South Korea > Busan > Busan (0.04)
- Asia > India > Uttarakhand > Roorkee (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)