python


Using Uninformed & Informed Search Algorithms to Solve 8-Puzzle (n-Puzzle) in Python

@machinelearnbot

For any such board, the empty space may be legally swapped with any tile horizontally or vertically adjacent to it. Given an initial state of the board, the combinatorial search problem is to find a sequence of moves that transitions this state to the goal state; that is, the configuration with all tiles arranged in ascending order 0, 1, …, n² − 1. The search space is the set of all possible states reachable from the initial state. The total cost of a path is equal to the number of moves made from the initial state to the goal state.
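As a rough illustration of the uninformed case, here is a minimal breadth-first search sketch for the 3×3 board. It is not the article's implementation: the tuple encoding of the board, the move names (which describe the direction the empty space travels) and the function names `neighbors` and `bfs` are all assumptions made for this example.

```python
from collections import deque

# Goal configuration in row-major order; 0 marks the empty space.
GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)


def neighbors(state, n=3):
    """Yield (move, next_state) for every legal swap of the empty space."""
    i = state.index(0)
    row, col = divmod(i, n)
    steps = (("Up", -1, 0), ("Down", 1, 0), ("Left", 0, -1), ("Right", 0, 1))
    for move, dr, dc in steps:
        r, c = row + dr, col + dc
        if 0 <= r < n and 0 <= c < n:
            j = r * n + c
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            yield move, tuple(nxt)


def bfs(start, goal=GOAL):
    """Return a shortest list of moves from start to goal, or None."""
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, move taken)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            moves = []
            while parent[state] is not None:
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for move, nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = (state, move)
                frontier.append(nxt)
    return None  # goal not reachable from this start state
```

Calling `bfs((1, 2, 5, 3, 4, 0, 6, 7, 8))` returns the move list for that board. Because every move costs the same, breadth-first search finds a path with the fewest moves; an informed search such as A* would replace the queue with a priority queue ordered by path cost plus a heuristic.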


Is Python or Perl faster than R?

@machinelearnbot

A lot of statistical / machine learning algorithms are now being implemented in Python (see Python and R articles), and it seems that Python is more appropriate for production code and big data flowing in real time, while R is often used for EDA (exploratory data analysis) in manual mode. My question is, if you make a true apples-to-apples comparison, what kind of computations does Python perform much faster than R (or the other way around), depending on data size / memory size? Here I have in mind algorithms such as classifying millions of keywords, something requiring trillions of operations and not easy to do with Hadoop, requiring very efficient algorithms designed for sparse data (sometimes called sparse computing). For instance, the following article topic (see data science book pp 118-122) shows a Perl script running 10 times faster than the R equivalent to produce R videos. This is not because of a language or compiler issue: the Perl version pre-computes all video frames very fast and loads them in memory, then the video is displayed (using R, ironically), while the R version produces (and displays) one frame at a time and does the whole job in R. What about accelerating tools, such as the CUDA accelerator for R?


YOLO: Core ML versus MPSNNGraph

@machinelearnbot

The Core ML conversion tools do not support Darknet, so we'll first convert the Darknet files to Keras format. However, as I'm writing this the Core ML conversion tools only support Keras version 1.2.2. Now that we have YOLO in a format that the Core ML conversion tools support, we can write a Python script to turn it into the .mlmodel file. Note: You do not need to perform these steps if you just want to run the demo app. This means we need to put our input images into a CVPixelBuffer object somehow, and also resize this pixel buffer to 416×416 pixels, or Core ML won't accept it.


Some Image and Video Processing: Motion Estimation with Block-Matching in Videos, Noisy and Motion-blurred Image Restoration with Inverse Filter in Python and OpenCV

@machinelearnbot

The following figure shows how the quality of the transformed image (measured in terms of PSNR against the original image) degrades when an n×n LPF is applied and n (the LPF kernel width) increases. The quality of the final image obtained by down/up sampling the original image likewise decreases as the kernel size n increases, as shown in the following figure. The first one is the video of some students working in a university corridor, as shown below (obtained from YouTube). We extract some consecutive frames, mark a face in one image and use that image to mark the faces in the remaining consecutive frames, thereby marking the entire video and estimating the motion using the simple block matching technique only. The following figure shows the frame with the face marked; now we shall use this image and the block matching technique to estimate the motion of the student in the video, by marking his face in all the consecutive frames and reconstructing the video, as shown below. As can be seen from the following figure, the optimal median filter size is 5×5, which generates the highest quality output when compared to the original image.
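The two measurements mentioned here can be sketched in plain Python on small grayscale images stored as nested lists. This is only an illustrative sketch, not the article's OpenCV code: the sum-of-absolute-differences cost, the exhaustive search window and the names `psnr`, `sad` and `match_block` are assumptions chosen for this example.

```python
import math


def psnr(original, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-sized images."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(original, processed)
               for a, b in zip(row_a, row_b)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def sad(frame, block, top, left):
    """Sum of absolute differences between block and the patch at (top, left)."""
    return sum(abs(frame[top + r][left + c] - block[r][c])
               for r in range(len(block))
               for c in range(len(block[0])))


def match_block(frame, block, top, left, search=2):
    """Search a window around (top, left); return the best offset (dy, dx)."""
    h, w = len(block), len(block[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions where the block would leave the frame.
            if 0 <= y <= len(frame) - h and 0 <= x <= len(frame[0]) - w:
                cost = sad(frame, block, y, x)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]
```

Here `block` would be the marked face region cut from one frame, and `match_block` is called on the next frame starting from the block's previous position; the returned offset is the estimated motion vector. A higher PSNR means the processed image is closer to the original.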


Python: Implementing a k-means algorithm with sklearn

@machinelearnbot

The purpose of k-means clustering is to partition the observations in a dataset into a specific number of clusters in order to aid in analysis of the data. Specifically, the k-means scatter plot will illustrate the clustering of specific stock returns according to their dividend yield. We devise a range from 1 to 20 (which represents our number of clusters), and our score variable denotes the percentage of variance explained by the number of clusters. Therefore, we set n_clusters equal to 3, and upon generating the k-means output use the data originally transformed using PCA in order to plot the clusters. From the above, we see that the clustering algorithm demonstrates an overall positive correlation between stock returns and dividend yields, implying that stocks paying higher dividend yields can be expected to have higher overall returns.
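To make the procedure concrete without depending on any library, the sketch below implements the basic k-means loop (Lloyd's algorithm) in plain Python, together with the within-cluster sum of squares that is typically compared across cluster counts. It is a stand-in for illustration only, not the sklearn code the article uses; the function names, the random initialisation and the fixed seed are assumptions.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))


def kmeans(points, k, iterations=100, seed=0):
    """Return (centroids, labels) after running Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k observations as starting centroids
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def inertia(points, centroids, labels):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(dist2(p, centroids[label]) for p, label in zip(points, labels))
```

Running `kmeans` for each cluster count in a range and plotting `inertia` against the count gives the usual elbow curve: the count is chosen where adding another cluster stops reducing the within-cluster spread by much.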


How to hire a great Data Scientist - Saikat Sarkar, Data Science Exper

#artificialintelligence

He is presently acting as the Subject Matter Expert of Python in Aegis School of Data Science. He also talks about the importance of the Data Science domain today and in the years to come. Typically, a company should look at candidates with fairly strong programming skills and statistical understanding of ML. Modern day ML concepts are not known by the senior people of the organisation; they have worked on old school statistics, so that's the field they try to drill the candidates in.


k-nearest neighbor algorithm using Python

@machinelearnbot

Here's the article (short version). In machine learning, you may often wish to build predictors that classify things into categories based on some set of associated values. Our task is to predict the species labels of a set of flowers based on their flower measurements. Since you'll be building a predictor based on a set of known correct classifications, kNN is a type of supervised machine learning (though somewhat confusingly, in kNN there is no explicit training phase; see lazy learning). More generally, her research interests lie in data-intensive molecular biology, machine learning (especially deep learning) and data science.
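A minimal sketch of the idea, assuming Euclidean distance and a simple majority vote among the k closest labelled examples. The function name `knn_predict` and the data layout (a list of `(features, label)` pairs) are choices made for this example, not the article's code.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of query from its k nearest labelled examples.

    train is a list of (features, label) pairs; features and query are
    equal-length tuples of numbers.
    """
    # "Lazy learning": there is no training step, only a sort at query time.
    closest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in closest)
    return votes.most_common(1)[0][0]
```

With `train` holding flower measurements and their known species, `knn_predict(train, measurements)` returns the species that most of the k nearest flowers share. Note that `math.dist` requires Python 3.8 or later.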


Infiniteconf 2017 - the conference on Big Data and Fast Data 6th - 7th Jul 2017

#artificialintelligence

Big data is transforming almost every aspect of science and the humanities, driven by the emergence of a data society. Join us at Infiniteconf and learn how to use the amazing technologies, practical tools and methods available to data scientists and engineering teams in three days packed with talks, discussions and practical workshops. We're now ready to unveil the line-up of speakers and experts who will make InfiniteConf 2017 the go-to Big Data and Data Science conference! Want to stay in the loop with the latest developments within the data community?


How to produce sounds in Python, R, Java, C, Perl, Javascript or even Linux?

@machinelearnbot

I want to create music generated by mathematical algorithms, or even turn big data files into sound files, just like NASA turned electromagnetic signals from space into music. My next question is how to save this data (frequency and duration, for each note) as a sound file? Interestingly, many of the answers I've found on the Internet deal with producing a beep on your machine, to make you aware of when a long program running in the background has completed. Anyway, none of the many solutions offered - some as simple as Beep (412, 100) or echo -e '\a' issued from the command line - worked on my Windows/Cygwin laptop, despite the fact that I routinely watch videos with audible sound on it.
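One possible answer to the saving question, sketched with Python's standard `wave` module: render each (frequency, duration) pair as a sine tone and write the samples as 16-bit mono PCM. The function name `write_notes`, the default sample rate and the amplitude are assumptions for this example; it produces pure tones with no envelope, so notes may click at their boundaries.

```python
import math
import struct
import wave


def write_notes(path, notes, sample_rate=44100, amplitude=0.5):
    """Write (frequency_hz, duration_s) notes to a mono 16-bit WAV file.

    Returns the number of samples written.
    """
    frames = bytearray()
    for frequency, duration in notes:
        n_samples = int(sample_rate * duration)
        for i in range(n_samples):
            value = amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
            # Scale to the signed 16-bit range and pack as little-endian.
            frames += struct.pack("<h", int(value * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return len(frames) // 2
```

For instance, `write_notes("melody.wav", [(440.0, 0.5), (523.25, 0.5)])` writes two half-second tones that any ordinary media player should be able to open, which sidesteps the machine-specific beep commands entirely.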


The Guide to Learning Python for Data Science

@machinelearnbot

We will discuss steps you should take for learning Python accompanied with some essential resources, such as the free Python for Data Analysis courses and tutorials from DataCamp as well as reading and learning materials. The most convenient way to go about this is to download the free Anaconda package from Continuum Analytics, as it contains the core Python language, as well as all of the essential libraries including NumPy, Pandas, SciPy, Matplotlib, and IPython. The analytics begins with statistical modeling, machine learning algorithms, data mining techniques, inferences and so on. Of course, as Python is a general purpose programming language, you are also free to program your own methods when you become an advanced user, though make sure you are not replicating what already exists.