

How Adversarial Attacks Work – XIX.ai

#artificialintelligence

Recent studies by Google Brain have shown that any machine learning classifier can be tricked into giving incorrect predictions, and with a little bit of skill, you can get it to produce pretty much any result you want. This fact becomes steadily more worrisome as more and more systems are powered by artificial intelligence, many of them crucial to our safe and comfortable lives. Until recently, safety concerns about AI revolved around ethics; today we are going to talk about more pressing and immediate issues. Machine learning algorithms accept input in the form of numeric vectors. Designing an input in a specific way to get the wrong result from the model is called an adversarial attack.
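The excerpt does not show a concrete attack, but the idea can be illustrated with the fast gradient sign method (FGSM), one well-known way to craft such inputs. Below is a minimal sketch assuming a trained Keras classifier `model` that maps images in [0, 1] to class probabilities; the function name and epsilon value are illustrative, not taken from the article.

```python
import tensorflow as tf

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Fast Gradient Sign Method: nudge each pixel in the direction
    that increases the model's loss, producing an adversarial input."""
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    gradient = tape.gradient(loss, image)
    # A small step along the sign of the gradient is often enough to flip the prediction.
    adversarial = image + epsilon * tf.sign(gradient)
    return tf.clip_by_value(adversarial, 0.0, 1.0)
```

In practice, epsilon is tuned so the perturbation stays imperceptible to humans while still changing the model's output.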


NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community

#artificialintelligence

The NIH Clinical Center recently released over 100,000 anonymized chest x-ray images and their corresponding data to the scientific community. The dataset was compiled from scans of more than 30,000 patients, including many with advanced lung disease. The release will allow researchers across the country and around the world to freely access the data and improve their ability to teach computers how to detect and diagnose disease. Ultimately, this artificial intelligence mechanism can help clinicians make better diagnostic decisions for patients.


Profiting from Python & Machine Learning in the Financial Markets

#artificialintelligence

I finally beat the S&P 500 by 10%. This might not sound like much, but when we're dealing with large amounts of capital and good liquidity, the profits are pretty sweet for a hedge fund. More aggressive approaches have resulted in much higher returns. It all started after I read the paper by Gur Huberman and Tomer Regev, "Contagious Speculation and a Cure for Cancer: A Non-Event that Made Stock Prices Soar" (Journal of Finance, February 2001). A Sunday New York Times article on a potential development of new cancer-curing drugs caused EntreMed's stock price to rise from 12.063 at the Friday close to open at 85 and close near 52 on Monday.


Predicting Political Bias with Python – Linalgo – Medium

#artificialintelligence

Recent scandals around fake news have spurred interest in programmatically gauging the journalistic quality of an article. Companies like Factmata and Full Fact have received funding from Google, and Facebook launched its "Journalism Project" earlier this year to fight the spread of fake stories in its feed. Discriminating between facts and fake information is a daunting task, but looking at the publisher is often a good proxy for the journalistic quality of an article. And while there is no objective metric for evaluating the quality of a newspaper, its overall quality and political bias are generally agreed upon (one can, for example, refer to https://mediabiasfactcheck.com/). In this article, we present a few techniques to automatically assess the journalistic quality of a newspaper.
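The article's actual pipeline is not reproduced in this excerpt; as a hedged baseline sketch, publisher-level bias labels (for example, taken from mediabiasfactcheck.com) can be used to train a simple text classifier with scikit-learn. The `texts` and `bias_labels` arrays below are hypothetical placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# Hypothetical data: article bodies and the political bias of their publisher.
texts = ["...article 1...", "...article 2...", "...article 3...", "...article 4..."]
bias_labels = ["left", "right", "left", "right"]

X_train, X_test, y_train, y_test = train_test_split(texts, bias_labels, test_size=0.5)

# TF-IDF features plus logistic regression is a common first baseline for this task.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1), LogisticRegression())
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

With real data, the interesting question is whether the classifier picks up on the topic mix of a publisher or on its actual framing; the article's techniques are aimed at that distinction.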


Cool Projects from Udacity Students – Self-Driving Cars – Medium

#artificialintelligence

I have a pretty awesome backlog of blog posts from Udacity Self-Driving Car students, partly because they're doing awesome things and partly because I fell behind on reviewing them for a bit. Here are five that look pretty neat. This is a great blog post if you're looking to get started with point cloud files. The most popular laptop among Silicon Valley software developers is the MacBook Pro. The current version of the MacBook Pro, however, does not include an NVIDIA GPU, which prevents it from using CUDA and cuDNN, NVIDIA's tools for accelerating deep learning.


The Art of Storytelling in Data Science and How to Create Data Stories

@machinelearnbot

The idea of storytelling is fascinating: to take an idea or an incident and turn it into a story. It brings the idea to life and makes it more interesting. This happens in our day-to-day lives. Whether we narrate a funny incident or our findings, stories have always been the go-to way to draw interest from listeners and readers alike. For instance, when we talk of how one of our friends got scolded by a teacher, we tend to narrate the incident from the beginning so that a flow is maintained.


Becoming a Machine Learning Engineer Step 2: Pick a Process

@machinelearnbot

After working through a few applied machine learning problems, you usually develop a pattern or process for quickly getting started and achieving good results. Once you have this process, it is trivial to use it again and again, project after project. The more developed your process, the faster you can get to results. Let me give you a head start and teach you the 5-step systematic process that I developed while becoming a machine learning engineer. The first step is all about learning more about the problem at hand.


Generic Representation Learning

@machinelearnbot

The 2-dimensional t-SNE embeddings of our representation for the MIT Places dataset ('library' category) and an unseen subset of our dataset are provided below. The representation organizes the images based on their 3D content (scene layout, relative camera pose to the scene, etc.) and independently of their semantics (visible objects, architectural styles) or low-level properties (color, texture, etc.). This suggests that the representation must have a notion of certain basic 3D concepts, even though it was never given explicit supervision for such tasks (especially for non-matching images, while all t-SNE images are non-matching). The t-SNE of our dataset also suggests that the patches are organized based on their coarse surface normals (again, a task the representation received no supervision for). See the section below for a quantitative evaluation of our representation for surface normal estimation on the NYUv2 dataset.
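The figures themselves are not reproduced here, but 2-D views like these are typically produced by running t-SNE over the learned feature vectors. A minimal sketch with scikit-learn, assuming a hypothetical `features` matrix with one row per image (here random, as a stand-in for the learned representation):

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical stand-in for the learned representation: one feature vector per image.
features = np.random.rand(500, 128)

# Project the high-dimensional features down to 2-D for visual inspection.
embedding = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], s=5)
plt.title("t-SNE of learned image representations")
plt.show()
```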


amir32002/3D_Street_View

@machinelearnbot

This repository shares a large dataset of street view images (25 million images and 118 million matching image pairs) with their 6DOF camera poses, 3D models of 8 cities, and extended metadata. The data comes in bundles of matching images; the matching pairs show the same physical point, while the camera viewpoints differ by a large baseline (often 120 degrees). The dataset can be used for learning 6DOF camera pose estimation/visual odometry, image matching, and various 3D estimation tasks. You can see a few sample image bundles from the dataset below and more examples here and here. The dataset was collected automatically, without human annotation, by developing a system to integrate georeferenced 3D models of cities with Google Street View images and their geo-metadata.
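The repository's own tooling is not shown in this excerpt; as a rough sketch of the kind of task the matching pairs support, OpenCV can recover a relative camera pose (up to scale) from two matching views, given the camera intrinsics. The images and the intrinsics matrix `K` below are hypothetical inputs, not part of the dataset's API.

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate the relative rotation and translation between two matching views."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Brute-force Hamming matching for ORB's binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix plus decomposition gives the relative 6DOF pose (up to scale).
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```

With the dataset's ground-truth 6DOF poses, estimates like these can be evaluated directly, which is what makes the matching pairs useful for pose estimation and visual odometry benchmarks.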


tensorflow/lattice

@machinelearnbot

TensorFlow Lattice is a library that implements lattice-based models: fast-to-evaluate, interpretable (optionally monotonic) models, also known as interpolated look-up tables. It includes a collection of TensorFlow Lattice Estimators, which you can use like any TensorFlow Estimator, as well as lattices and piecewise linear calibration as layers that can be composed into custom models. Note that TensorFlow Lattice is not an official Google product. A lattice is an interpolated look-up table that can approximate arbitrary input-output relationships in your data. It overlays a regular grid on your input space and learns values for the output at the vertices of the grid.
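To make the look-up-table picture concrete, here is a conceptual sketch in plain NumPy (not the TensorFlow Lattice API): a 2-D lattice stores one learned output value per grid vertex and evaluates a point by bilinear interpolation between the four surrounding vertices. The vertex values below are illustrative, standing in for values the library would learn from data.

```python
import numpy as np

def lattice_eval(x, vertex_values):
    """Evaluate a 2-D lattice (interpolated look-up table) at a point x in [0, 1]^2.

    vertex_values has shape (m, n): the learned output at each grid vertex.
    """
    m, n = vertex_values.shape
    # Scale the point into grid coordinates and find the surrounding cell.
    gx, gy = x[0] * (m - 1), x[1] * (n - 1)
    i, j = min(int(np.floor(gx)), m - 2), min(int(np.floor(gy)), n - 2)
    fx, fy = gx - i, gy - j

    # Bilinear interpolation between the four surrounding vertex values.
    return ((1 - fx) * (1 - fy) * vertex_values[i, j]
            + fx * (1 - fy) * vertex_values[i + 1, j]
            + (1 - fx) * fy * vertex_values[i, j + 1]
            + fx * fy * vertex_values[i + 1, j + 1])

# Example: a 3x3 lattice whose vertex values would normally be learned from data.
values = np.array([[0.0, 0.2, 0.5],
                   [0.1, 0.4, 0.8],
                   [0.3, 0.7, 1.0]])
print(lattice_eval(np.array([0.25, 0.6]), values))
```

Monotonicity constraints, one of the library's selling points, amount to requiring the vertex values to be non-decreasing along chosen input dimensions.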