The Decision List Machine
Sokolova, Marina, Marchand, Mario, Japkowicz, Nathalie, Shawe-Taylor, John S.
We introduce a new learning algorithm for decision lists to allow features that are constructed from the data and to allow a tradeoff between accuracy and complexity. We bound its generalization error in terms of the number of errors and the size of the classifier it finds on the training data. We also compare its performance on some natural data sets with the set covering machine and the support vector machine.
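To make the decision-list setting concrete, here is a minimal Python sketch (assuming numpy) of a greedy learner that builds an ordered list of data-constructed threshold rules and uses a penalty term as a crude accuracy/complexity tradeoff. It only illustrates the general idea, not the authors' DLM algorithm or its bound; the function names and the penalty and max_rules parameters are invented for this example, and labels are assumed to be in {-1, +1}.

    import numpy as np

    def learn_decision_list(X, y, max_rules=5, penalty=0.01):
        # Greedy toy decision-list learner (labels y in {-1, +1}).
        # Each rule is a data-constructed threshold test
        # (feature j, threshold t, direction) -> label, and `penalty`
        # crudely trades accuracy against the number of rules.
        rules, idx = [], np.arange(len(y))          # idx: uncovered examples
        for _ in range(max_rules):
            best = None
            for j in range(X.shape[1]):
                for t in np.unique(X[idx, j]):
                    for sign in (+1, -1):
                        mask = sign * (X[idx, j] - t) >= 0
                        if not mask.any():
                            continue
                        label = np.sign(y[idx][mask].sum()) or 1.0
                        score = (y[idx][mask] == label).mean() - penalty
                        if best is None or score > best[0]:
                            best = (score, j, t, sign, label, mask)
            if best is None or best[0] <= 0.5:      # no useful rule left
                break
            _, j, t, sign, label, mask = best
            rules.append((j, t, sign, label))
            idx = idx[~mask]
            if len(idx) == 0:
                break
        default = (np.sign(y[idx].sum()) if len(idx) else 1.0) or 1.0
        return rules, default

    def predict(rules, default, x):
        # Output the label of the first rule that fires, else the default.
        for j, t, sign, label in rules:
            if sign * (x[j] - t) >= 0:
                return label
        return default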
Automatic Derivation of Statistical Algorithms: The EM Family and Beyond
Fischer, Bernd, Schumann, Johann, Buntine, Wray, Gray, Alexander G.
Machine learning has reached a point where many probabilistic methods can be understood as variations, extensions and combinations of a much smaller set of abstract themes, e.g., as different instances of the EM algorithm. This enables the systematic derivation of algorithms customized for different models. Here, we describe the AUTOBAYES system which takes a high-level statistical model specification, uses powerful symbolic techniques based on schema-based program synthesis and computer algebra to derive an efficient specialized algorithm for learning that model, and generates executable code implementing that algorithm. This capability is far beyond that of code collections such as Matlab toolboxes or even tools for model-independent optimization such as BUGS for Gibbs sampling: complex new algorithms can be generated without new programming, algorithms can be highly specialized and tightly crafted for the exact structure of the model and data, and efficient and commented code can be generated for different languages or systems.
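As a reference point for the kind of output described, below is a hand-written Python sketch (assuming numpy) of the specialized EM updates for a two-component 1-D Gaussian mixture, the sort of model-specific code such a synthesis system would generate automatically from a model specification. This snippet is not AUTOBAYES output; it is standard EM written by hand for illustration.

    import numpy as np

    def em_gmm_1d(x, n_iter=50):
        # Hand-written EM for a two-component 1-D Gaussian mixture: the kind
        # of specialized, closed-form update code the text describes deriving
        # automatically (this snippet itself is not AUTOBAYES output).
        w = np.array([0.5, 0.5])                       # mixing weights
        mu = np.array([x.min(), x.max()])              # component means
        var = np.array([x.var(), x.var()])             # component variances
        for _ in range(n_iter):
            # E-step: posterior responsibility of each component per point
            lik = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
            r = w * lik
            r /= r.sum(axis=1, keepdims=True)
            # M-step: closed-form updates specific to this model
            nk = r.sum(axis=0)
            w = nk / len(x)
            mu = (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        return w, mu, var

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])
    print(em_gmm_1d(x))   # roughly equal weights, means near -3 and 3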
Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch
Saul, Lawrence K., Lee, Daniel D., Isbell, Charles L., Cun, Yann L.
We have implemented a real time front end for detecting voiced speech and estimating its fundamental frequency. The front end performs the signal processing for voice-driven agents that attend to the pitch contours of human speech and provide continuous audiovisual feedback. The algorithm we use for pitch tracking has several distinguishing features: it makes no use of FFTs or autocorrelation at the pitch period; it updates the pitch incrementally on a sample-by-sample basis; it avoids peak picking and does not require interpolation in time or frequency to obtain high resolution estimates; and it works reliably over a four octave range, in real time, without the need for postprocessing to produce smooth contours. The algorithm is based on two simple ideas in neural computation: the introduction of a purposeful nonlinearity, and the error signal of a least squares fit. The pitch tracker is used in two real time multimedia applications: a voice-to-MIDI player that synthesizes electronic music from vocalized melodies, and an audiovisual Karaoke machine with multimodal feedback. Both applications run on a laptop and display the user's pitch scrolling across the screen as he or she sings into the computer.
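The abstract does not give the algorithm itself, but one of its two ingredients, the error signal of a least squares fit driving an incremental, sample-by-sample frequency update without FFTs or autocorrelation, can be illustrated with a toy Python sketch (assuming numpy). The purposeful nonlinearity of the real front end is not reproduced here, and the function name, step size, and sample rate are illustrative choices rather than values from the paper.

    import numpy as np

    def track_frequency(x, fs, mu=0.05):
        # Toy incremental frequency tracker (illustrative only, not the
        # authors' pitch tracker). A resonator model
        # x[n] ~= a*x[n-1] - x[n-2] is fit by stochastic least squares;
        # the coefficient a = 2*cos(2*pi*f/fs) encodes the dominant
        # frequency, and each sample nudges a down the squared error.
        a, track = 1.0, []
        for n in range(2, len(x)):
            err = x[n] - (a * x[n - 1] - x[n - 2])   # least-squares error signal
            a = np.clip(a + mu * err * x[n - 1], -2.0, 2.0)
            track.append(fs * np.arccos(a / 2.0) / (2 * np.pi))
        return np.array(track)

    fs = 8000                                        # Hz, illustrative rate
    t = np.arange(fs) / fs                           # one second of signal
    print(track_frequency(np.sin(2 * np.pi * 220 * t), fs)[-1])   # ~220 Hz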
Automatic Acquisition and Efficient Representation of Syntactic Structures
Solan, Zach, Ruppin, Eytan, Horn, David, Edelman, Shimon
The distributional principle according to which morphemes that occur in identical contexts belong, in some sense, to the same category [1] has been advanced as a means for extracting syntactic structures from corpus data. We extend this principle by applying it recursively, and by using mutual information for estimating category coherence. The resulting model learns, in an unsupervised fashion, highly structured, distributed representations of syntactic knowledge from corpora. It also exhibits promising behavior in tasks usually thought to require representations anchored in a grammar, such as systematicity.
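A minimal Python sketch of the base distributional step may help: words sharing identical (left, right) contexts in a toy corpus are grouped into the same category. The recursive application and the mutual-information coherence score of the actual model are not reproduced; the function name and the corpus are invented for illustration.

    from collections import defaultdict

    def distributional_categories(corpus):
        # Group words that occur in identical (left, right) contexts,
        # the base step of the distributional principle.
        contexts = defaultdict(set)
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            for i in range(1, len(tokens) - 1):
                contexts[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
        groups = defaultdict(set)
        for word, ctx in contexts.items():
            groups[frozenset(ctx)].add(word)
        return [words for words in groups.values() if len(words) > 1]

    corpus = ["the cat sleeps", "the dog sleeps", "the cat eats", "the dog eats"]
    print(distributional_categories(corpus))
    # e.g. [{'cat', 'dog'}, {'sleeps', 'eats'}]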
Binary Tuning is Optimal for Neural Rate Coding with High Temporal Resolution
Bethge, Matthias, Rotermund, David, Pawelzik, Klaus
Here we derive optimal gain functions for minimum mean square reconstruction from neural rate responses subjected to Poisson noise. The shape of these functions strongly depends on the length T of the time window within which spikes are counted in order to estimate the underlying firing rate. A phase transition towards pure binary encoding occurs if the maximum mean spike count becomes smaller than approximately three, provided the minimum firing rate is zero. For a particular function class, we were able to prove the existence of a second-order phase transition analytically. The critical decoding time window length obtained from the analytical derivation is in precise agreement with the numerical results. We conclude that under most circumstances relevant to information processing in the brain, rate coding can be better ascribed to a binary (low-entropy) code than to the other extreme of rich analog coding.
1 Optimal neuronal gain functions for short decoding time windows
The use of action potentials (spikes) as a means of communication is the striking feature of neurons in the central nervous system.
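The flavour of the binary-versus-analog comparison above can be reproduced numerically with a small Monte Carlo sketch in Python (assuming numpy): a uniform stimulus is encoded as a Poisson spike count whose mean is bounded by a small maximum, and the reconstruction error of a graded (linear) gain function is compared with that of a binary one. The specific gain functions, decoders, and the n_max value below are illustrative choices, not the paper's derivation.

    import numpy as np

    rng = np.random.default_rng(0)

    def mc_mse(gain, decode, n_trials=200_000):
        # Monte Carlo mean squared reconstruction error for a stimulus
        # x ~ Uniform(0, 1) encoded as a Poisson spike count with mean
        # gain(x) and decoded from that count alone.
        x = rng.uniform(0.0, 1.0, n_trials)
        k = rng.poisson(gain(x))
        return np.mean((decode(k) - x) ** 2)

    n_max = 2.0   # maximum mean spike count in the counting window (below ~3)

    # Graded (linear) gain: mean count proportional to the stimulus.
    mse_linear = mc_mse(lambda x: n_max * x,
                        lambda k: np.clip(k / n_max, 0.0, 1.0))

    # Binary gain: silent below x = 1/2, maximal rate above it.
    mse_binary = mc_mse(lambda x: n_max * (x > 0.5),
                        lambda k: np.where(k > 0, 0.75, 0.25))

    print(mse_linear, mse_binary)   # the binary code wins at this small budget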
Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Wang, Xiaofeng, Sandholm, Tuomas
Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play.
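To see the coordination problem in the smallest possible setting, here is a hedged Python sketch (assuming numpy) of a one-shot team game with two Nash equilibria, only one of which is optimal, together with independent learners that use an optimistic (max-based) update to settle on the optimal one. This is a generic illustration of the issue, not the algorithm proposed in the paper; the payoff matrix and the learner are invented for the example.

    import numpy as np

    # A one-shot team game with two Nash equilibria: both agents play 0
    # (payoff 10) or both play 1 (payoff 5); mismatches are penalized.
    # Both equilibria are Nash, but only (0, 0) is optimal.
    PAYOFF = np.array([[10.0, -10.0],
                       [-10.0,   5.0]])

    def optimistic_independent_learning(episodes=500, seed=0):
        rng = np.random.default_rng(seed)
        q = [np.zeros(2), np.zeros(2)]        # one optimistic Q-table per agent
        for _ in range(episodes):
            a = [int(rng.integers(2)), int(rng.integers(2))]   # explore
            r = PAYOFF[a[0], a[1]]             # shared team payoff
            for i in range(2):
                q[i][a[i]] = max(q[i][a[i]], r)   # optimistic (max) update
        return [int(np.argmax(qi)) for qi in q]

    print(optimistic_independent_learning())   # -> [0, 0], the optimal equilibrium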
Clustering with the Fisher Score
Tsuda, Koji, Kawanabe, Motoaki, Müller, Klaus-Robert
Recently, the Fisher score (or the Fisher kernel) has been increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of the log-likelihood of a probabilistic model. This paper gives a theoretical analysis of how class information is preserved in the space of the Fisher score; it turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit the important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).
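For concreteness, a short Python sketch (assuming numpy) of the Fisher score for a diagonal Gaussian model shows the phenomenon described: per-example gradients of the log-likelihood form the score vector, and in a toy data set where class information lives in one coordinate, only a few score dimensions separate the classes while the rest are nuisance. The model choice and the numbers are illustrative, not those used in the paper.

    import numpy as np

    def fisher_scores(X, mu, sigma2):
        # Fisher score: per-example gradient of the log-likelihood of a
        # diagonal Gaussian N(mu, sigma2) with respect to (mu, sigma2).
        d_mu = (X - mu) / sigma2
        d_s2 = 0.5 * ((X - mu) ** 2 / sigma2 ** 2 - 1.0 / sigma2)
        return np.concatenate([d_mu, d_s2], axis=1)

    # Toy data: class information lives only in the first coordinate.
    rng = np.random.default_rng(0)
    X = np.concatenate([rng.normal([-2.0, 0.0], 1.0, (100, 2)),
                        rng.normal([+2.0, 0.0], 1.0, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    S = fisher_scores(X, X.mean(axis=0), X.var(axis=0))

    # Only the first score dimension separates the two classes; the other
    # three are nuisance dimensions, which is why plain K-means over all
    # dimensions can be misled.
    print(np.round(S[y == 0].mean(axis=0), 2))   # roughly [-0.4, 0., 0., 0.]
    print(np.round(S[y == 1].mean(axis=0), 2))   # roughly [ 0.4, 0., 0., 0.]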
Selectivity and Metaplasticity in a Unified Calcium-Dependent Model
Yeung, Luk Chong, Blais, Brian S., Cooper, Leon N., Shouval, Harel Z.
A unified, biophysically motivated Calcium-Dependent Learning model has been shown to account for various rate-based and spike time-dependent paradigms for inducing synaptic plasticity. Here, we investigate the properties of this model for a multi-synapse neuron that receives inputs with different spike-train statistics. In addition, we present a physiological form of metaplasticity, an activity-driven regulation mechanism, that is essential for the robustness of the model.
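The qualitative structure of such a calcium-dependent rule can be sketched in Python (assuming numpy): the weight relaxes towards a calcium-dependent target that produces no change at low calcium, depression at intermediate calcium, and potentiation at high calcium, with a calcium-dependent learning rate. The functional forms and constants below are illustrative placeholders, not the parameters of the model studied in the paper.

    import numpy as np

    def omega(ca, a1=0.35, a2=0.55, beta=80.0):
        # Qualitative target curve: no change at low calcium, depression
        # at intermediate calcium, potentiation at high calcium (two
        # sigmoids with illustrative thresholds a1, a2 and steepness beta).
        sig = lambda u: 1.0 / (1.0 + np.exp(-beta * u))
        return 0.25 + sig(ca - a2) - 0.25 * sig(ca - a1)

    def eta(ca, tau0=150.0):
        # Calcium-dependent learning rate: faster updates at higher calcium.
        return 1.0 / (tau0 / (ca + 1e-4) + 1.0)

    def update_weight(w, ca, dt=1.0):
        # One Euler step of dw/dt = eta(Ca) * (Omega(Ca) - w).
        return w + dt * eta(ca) * (omega(ca) - w)

    w = 0.25
    for ca in (0.1, 0.45, 0.9):            # low, intermediate, high calcium
        print(ca, update_weight(w, ca))    # ~no change, decrease, increase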
String Kernels, Fisher Kernels and Finite State Automata
Saunders, Craig, Vinokourov, Alexei, Shawe-Taylor, John S.
In this paper we show how the generation of documents can be thought of as a k-stage Markov process, which leads to a Fisher kernel from which the n-gram and string kernels can be reconstructed. The Fisher kernel view gives a more flexible insight into the string kernel and suggests how it can be parametrised in a way that reflects the statistics of the training corpus. Furthermore, the probabilistic modelling approach suggests extending the Markov process to consider subsequences of varying length, rather than the standard fixed-length approach used in the string kernel. We give a procedure for determining which subsequences are informative features and hence generate a Finite State Machine model, which can again be used to obtain a Fisher kernel. By adjusting the parametrisation we can also influence the weighting received by the features. In this way we are able to obtain a logarithmic weighting in a Fisher kernel. Finally, experiments are reported comparing the different kernels using the standard Bag of Words kernel as a baseline.
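As a baseline for the kernels being compared, here is a minimal Python implementation of the plain n-gram (spectrum) kernel, the fixed-length feature space that the Fisher-kernel view described above reweights and generalises. The Fisher kernel and the finite state machine construction themselves are not reproduced here.

    from collections import Counter

    def ngram_kernel(s, t, n=3):
        # Plain n-gram (spectrum) kernel: inner product of the two
        # documents' counts of contiguous substrings of length n.
        grams = lambda x: Counter(x[i:i + n] for i in range(len(x) - n + 1))
        u, v = grams(s), grams(t)
        return sum(u[g] * v[g] for g in u.keys() & v.keys())

    print(ngram_kernel("machine learning", "kernel machines", n=3))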