Goto

Collaborating Authors

 Genre


Tuned Models of Peer Assessment in MOOCs

arXiv.org Machine Learning

In massive open online courses (MOOCs), peer grading serves as a critical tool for scaling the grading of complex, open-ended assignments to courses with tens or hundreds of thousands of students. But despite promising initial trials, it does not always deliver accurate results compared to human experts. In this paper, we develop algorithms for estimating and correcting for grader biases and reliabilities, showing significant improvement in peer grading accuracy on real data with 63,199 peer grades from Coursera's HCI course offerings --- the largest peer grading networks analysed to date. We relate grader biases and reliabilities to other student factors such as student engagement, performance as well as commenting style. We also show that our model can lead to more intelligent assignment of graders to gradees.


Bayesian Discovery of Multiple Bayesian Networks via Transfer Learning

arXiv.org Machine Learning

Bayesian network structure learning algorithms with limited data are being used in domains such as systems biology and neuroscience to gain insight into the underlying processes that produce observed data. Learning reliable networks from limited data is difficult, therefore transfer learning can improve the robustness of learned networks by leveraging data from related tasks. Existing transfer learning algorithms for Bayesian network structure learning give a single maximum a posteriori estimate of network models. Yet, many other models may be equally likely, and so a more informative result is provided by Bayesian structure discovery. Bayesian structure discovery algorithms estimate posterior probabilities of structural features, such as edges. We present transfer learning for Bayesian structure discovery which allows us to explore the shared and unique structural features among related tasks. Efficient computation requires that our transfer learning objective factors into local calculations, which we prove is given by a broad class of transfer biases. Theoretically, we show the efficiency of our approach. Empirically, we show that compared to single task learning, transfer learning is better able to positively identify true edges. We apply the method to whole-brain neuroimaging data.


Bridging Information Criteria and Parameter Shrinkage for Model Selection

arXiv.org Machine Learning

Model selection based on classical information criteria, such as BIC, is generally computationally demanding, but its properties are well studied. On the other hand, model selection based on parameter shrinkage by $\ell_1$-type penalties is computationally efficient. In this paper we make an attempt to combine their strengths, and propose a simple approach that penalizes the likelihood with data-dependent $\ell_1$ penalties as in adaptive Lasso and exploits a fixed penalization parameter. Even for finite samples, its model selection results approximately coincide with those based on information criteria; in particular, we show that in some special cases, this approach and the corresponding information criterion produce exactly the same model. One can also consider this approach as a way to directly determine the penalization parameter in adaptive Lasso to achieve information criteria-like model selection. As extensions, we apply this idea to complex models including Gaussian mixture model and mixture of factor analyzers, whose model selection is traditionally difficult to do; by adopting suitable penalties, we provide continuous approximators to the corresponding information criteria, which are easy to optimize and enable efficient model selection.


Machine Learning in Proof General: Interfacing Interfaces

arXiv.org Artificial Intelligence

It allows users to gather proof statistics related to shapes of goals, sequences of applied tactics, and proof tree structures from the libraries of interactive higher-order proofs written in Coq and SSReflect. The gathered data is clustered using the state-of-the-art machine learning algorithms available in MATLAB and Weka. ML4PG provides automated interfacing between Proof General and MATLAB/Weka. The results of clustering are used by ML4PG to provide proof hints in the process of interactive proof development.


A Dynamic Algorithm for the Longest Common Subsequence Problem using Ant Colony Optimization Technique

arXiv.org Artificial Intelligence

We present a dynamic algorithm for solving the Longest Common Subsequence Problem using Ant Colony Optimization Technique. The Ant Colony Optimization Technique has been applied to solve many problems in Optimization Theory, Machine Learning and Telecommunication Networks etc. In particular, application of this theory in NP-Hard Problems has a remarkable significance. Given two strings, the traditional technique for finding Longest Common Subsequence is based on Dynamic Programming which consists of creating a recurrence relation and filling a table of size . The proposed algorithm draws analogy with behavior of ant colonies function and this new computational paradigm is known as Ant System. It is a viable new approach to Stochastic Combinatorial Optimization. The main characteristics of this model are positive feedback, distributed computation, and the use of constructive greedy heuristic. Positive feedback accounts for rapid discovery of good solutions, distributed computation avoids premature convergence and greedy heuristic helps find acceptable solutions in minimum number of stages. We apply the proposed methodology to Longest Common Subsequence Problem and give the simulation results. The effectiveness of this approach is demonstrated by efficient Computational Complexity. To the best of our knowledge, this is the first Ant Colony Optimization Algorithm for Longest Common Subsequence Problem.


Achieving greater Explanatory Power and Forecasting Accuracy with Non-uniform spread Fuzzy Linear Regression

arXiv.org Artificial Intelligence

Fuzzy regression models have been applied to several Operations Research applications viz., forecasting and prediction. Earlier works on fuzzy regression analysis obtain crisp regression coefficients for eliminating the problem of increasing spreads for the estimated fuzzy responses as the magnitude of the independent variable increases. But they cannot deal with the problem of non-uniform spreads. In this work, a three-phase approach is discussed to construct the fuzzy regression model with non-uniform spreads to deal with this problem. The first phase constructs the membership functions of the least-squares estimates of regression coefficients based on extension principle to completely conserve the fuzziness of observations. They are then defuzzified by the centre of area method to obtain crisp regression coefficients in the second phase. Finally, the error terms of the method are determined by setting each estimated spread equal to its corresponding observed spread. The Tagaki-Sugeno inference system is used for improving the accuracy of forecasts. The simulation example demonstrates the strength of fuzzy linear regression model in terms of higher explanatory power and forecasting performance.


Fuzzy Integer Linear Programming Mathematical Models for Examination Timetable Problem

arXiv.org Artificial Intelligence

ETP is NP Hard combinatorial optimization problem. It has received tremendous research attention during the past few years given its wide use in universities. In this Paper, we develop three mathematical models for NSOU, Kolkata, India using FILP technique. To deal with impreciseness and vagueness we model various allocation variables through fuzzy numbers. The solution to the problem is obtained using Fuzzy number ranking method. Each feasible solution has fuzzy number obtained by Fuzzy objective function. The different FILP technique performance are demonstrated by experimental data generated through extensive simulation from NSOU, Kolkata, India in terms of its execution times. The proposed FILP models are compared with commonly used heuristic viz. ILP approach on experimental data which gives an idea about quality of heuristic. The techniques are also compared with different Artificial Intelligence based heuristics for ETP with respect to best and mean cost as well as execution time measures on Carter benchmark datasets to illustrate its effectiveness. FILP takes an appreciable amount of time to generate satisfactory solution in comparison to other heuristics. The formulation thus serves as good benchmark for other heuristics. The experimental study presented here focuses on producing a methodology that generalizes well over spectrum of techniques that generates significant results for one or more datasets. The performance of FILP model is finally compared to the best results cited in literature for Carter benchmarks to assess its potential. The problem can be further reduced by formulating with lesser number of allocation variables it without affecting optimality of solution obtained. FLIP model for ETP can also be adapted to solve other ETP as well as combinatorial optimization problems.


Discovering Stock Price Prediction Rules of Bombay Stock Exchange Using Rough Fuzzy Multi Layer Perception Networks

arXiv.org Artificial Intelligence

In India financial markets have existed for many years. A functionally accented, diverse, efficient and flexible financial system is vital to the national objective of creating a market driven, productive and competitive economy. Today markets of varying maturity exist in equity, debt, commodities and foreign exchange. In this work we attempt to generate prediction rules scheme for stock price movement at Bombay Stock Exchange using an important Soft Computing paradigm viz., Rough Fuzzy Multi Layer Perception. The use of Computational Intelligence Systems such as Neural Networks, Fuzzy Sets, Genetic Algorithms, etc. for Stock Market Predictions has been widely established. The process is to extract knowledge in the form of rules from daily stock movements. These rules can then be used to guide investors. To increase the efficiency of the prediction process, Rough Sets is used to discretize the data. The methodology uses a Genetic Algorithm to obtain a structured network suitable for both classification and rule extraction. The modular concept, based on divide and conquer strategy, provides accelerated training and a compact network suitable for generating a minimum number of rules with high certainty values. The concept of variable mutation operator is introduced for preserving the localized structure of the constituting Knowledge Based sub-networks, while they are integrated and evolved. Rough Set Dependency Rules are generated directly from the real valued attribute table containing Fuzzy membership values. The paradigm is thus used to develop a rule extraction algorithm. The extracted rules are compared with some of the related rule extraction techniques on the basis of some quantitative performance indices. The proposed methodology extracts rules which are less in number, are accurate, have high certainty factor and have low confusion with less computation time.


Trapezoidal Fuzzy Numbers for the Transportation Problem

arXiv.org Artificial Intelligence

Transportation Problem is an important problem which has been widely studied in Operations Research domain. It has been often used to simulate different real life problems. In particular, application of this Problem in NP Hard Problems has a remarkable significance. In this Paper, we present the closed, bounded and non empty feasible region of the transportation problem using fuzzy trapezoidal numbers which ensures the existence of an optimal solution to the balanced transportation problem. The multivalued nature of Fuzzy Sets allows handling of uncertainty and vagueness involved in the cost values of each cells in the transportation table. For finding the initial solution of the transportation problem we use the Fuzzy Vogel Approximation Method and for determining the optimality of the obtained solution Fuzzy Modified Distribution Method is used. The fuzzification of the cost of the transportation problem is discussed with the help of a numerical example. Finally, we discuss the computational complexity involved in the problem. To the best of our knowledge, this is the first work on obtaining the solution of the transportation problem using fuzzy trapezoidal numbers.


A Comparative study of Transportation Problem under Probabilistic and Fuzzy Uncertainties

arXiv.org Artificial Intelligence

Transportation Problem is an important aspect which has been widely studied in Operations Research domain. It has been studied to simulate different real life problems. In particular, application of this Problem in NP- Hard Problems has a remarkable significance. In this Paper, we present a comparative study of Transportation Problem through Probabilistic and Fuzzy Uncertainties. Fuzzy Logic is a computational paradigm that generalizes classical two-valued logic for reasoning under uncertainty. In order to achieve this, the notation of membership in a set needs to become a matter of degree. By doing this we accomplish two things viz., (i) ease of describing human knowledge involving vague concepts and (ii) enhanced ability to develop cost-effective solution to real-world problem. The multi-valued nature of Fuzzy Sets allows handling uncertain and vague information. It is a model-less approach and a clever disguise of Probability Theory. We give comparative simulation results of both approaches and discuss the Computational Complexity. To the best of our knowledge, this is the first work on comparative study of Transportation Problem using Probabilistic and Fuzzy Uncertainties.