Researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) believe that analyzing photos like these could help us learn recipes and better understand people's eating habits. In a new paper with the Qatar Computing Research Institute (QCRI), the team trained an artificial intelligence system called Pic2Recipe to look at a photo of food, predict the ingredients, and suggest similar recipes. "In computer vision, food is mostly neglected because we don't have the large-scale datasets needed to make predictions," says Yusuf Aytar, an MIT postdoc who co-wrote a paper about the system with MIT Professor Antonio Torralba. The CSAIL team's project aims to build on earlier work but dramatically expand its scope.
They then used that data to train a neural network to find patterns and make connections between the food images and the corresponding ingredients and recipes.
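The excerpt doesn't include the model itself, but the image-to-recipe matching it describes can be illustrated as retrieval over a shared embedding space. The sketch below is a toy, not the paper's method: the embeddings, recipe names, and the `nearest_recipe` helper are all invented stand-ins for the outputs of trained encoders.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest_recipe(image_embedding, recipe_index):
    # recipe_index: list of (recipe name, recipe embedding) pairs.
    best = max(recipe_index,
               key=lambda pair: cosine_similarity(image_embedding, pair[1]))
    return best[0]

# Invented toy embeddings standing in for trained-encoder outputs.
recipe_index = [
    ("margherita pizza", (0.9, 0.1, 0.0)),
    ("beef stew",        (0.1, 0.8, 0.3)),
    ("caesar salad",     (0.0, 0.2, 0.9)),
]
query = (0.85, 0.15, 0.05)  # hypothetical embedding of a pizza photo
print(nearest_recipe(query, recipe_index))  # -> margherita pizza
```

The key idea is that once images and recipes live in one vector space, "suggest similar recipes" reduces to a nearest-neighbour search.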
An alternative to labeling huge amounts of data is to use synthetic images from a simulator. This is cheap because there is no labeling cost, but the synthetic images may not be realistic enough, resulting in poor generalization on real test images. To help close this performance gap, we've developed a method for refining synthetic images to make them look more realistic. We show that training models on these refined images leads to significant improvements in accuracy on various machine learning tasks.
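The refinement method isn't spelled out in this excerpt. A common formulation of this idea, and the assumption behind the sketch below, trains a refiner network against two objectives: an adversarial "realism" term (a discriminator's judgment that the refined image looks real) and a self-regularization term that keeps the refined image close to the synthetic input so its labels remain valid. This toy function shows only the loss structure, on 1-D pixel lists:

```python
import math

def refiner_loss(refined, synthetic, realism_score, lam=0.5):
    # Adversarial term: low when the discriminator believes the refined
    # image is real (realism_score is that estimated probability).
    adversarial = -math.log(realism_score)
    # Self-regularization term: an L1 penalty keeping the refined image
    # close to the synthetic input, so the original annotations stay valid.
    self_reg = sum(abs(r - s) for r, s in zip(refined, synthetic)) / len(refined)
    return adversarial + lam * self_reg

# Toy 1-D "images": refinement nudged two pixels; discriminator says 80% real.
print(refiner_loss([0.50, 0.62], [0.40, 0.60], 0.8))
```

The weight `lam` trades realism against annotation preservation: too low and the refiner can hallucinate content, too high and it barely changes the synthetic image.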
The last few years have seen a growing range of technologies deployed to assist humanitarian efforts, whether it's peacekeeping drones, crowdsourcing, or image analytics. One recent paper uses AI to predict the gender of pre-paid mobile phone users with a high degree of accuracy. Rescue teams already use mobile phone data to help track those in need of assistance, but this new approach aims to go further by helping to identify their gender, and therefore to identify vulnerable groups such as women and children. Whilst mobile phones are almost ubiquitous in the developing world, many are pre-paid, meaning that the data often lacks key demographic identifiers.
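The paper's actual features and model aren't given in this excerpt. As a hedged illustration only, a classifier over usage features derived from call records could be as simple as nearest-centroid matching; the feature names, centroid values, and neutral class labels below are all invented:

```python
def nearest_centroid_predict(features, centroids):
    # Classify by squared Euclidean distance to each class centroid
    # (centroids would be per-class feature means from labeled data).
    def dist(c):
        return sum((f - x) ** 2 for f, x in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical per-user features: (calls per day, mean call minutes, top-ups per week)
centroids = {
    "group_a": (4.0, 2.5, 1.0),
    "group_b": (9.0, 1.0, 3.0),
}
user = (8.5, 1.2, 2.8)
print(nearest_centroid_predict(user, centroids))  # -> group_b
```

A production system would use far richer features and a trained model, but the principle is the same: usage patterns stand in for the demographic fields that pre-paid records lack.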
But when competition in tech depends on machine learning systems powered by huge stockpiles of data, slaying a tech giant may be harder than ever. Their experiments on image recognition tied up 50 powerful graphics processors for two solid months, and used an unprecedentedly huge collection of 300 million labeled images (much work in image recognition uses a standard collection of just 1 million images). Crunching Google's giant dataset of 300 million images didn't produce a huge benefit--jumping from 1 million to 300 million images increased the object detection score by just 3 percentage points--but the paper's authors say they think they can widen that advantage by tuning their software to be better suited to super-large datasets. Tech companies do release data: Last year, Google released a vast dataset drawn from more than 7 million YouTube videos, and Salesforce opened up one drawn from Wikipedia to help algorithms work with language.
Computers have achieved near-human accuracy on many of these tasks. But even if you somehow manage to live with the large size of these models, the amount of run-time memory (RAM) required to run them is far too high and limits their usage. Hence, the current trend is to deploy these models on servers with large graphics processing units (GPUs), but issues like data privacy and internet connectivity demand on-device ("embedded") deep learning. So huge efforts from people all around the world are geared toward accelerating the inference run-time of these networks, decreasing the size of the model, and decreasing the run-time memory requirement.
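One of the simplest of these size reductions is post-training weight quantization: storing 32-bit float weights as 8-bit integers plus a scale factor, cutting weight memory roughly 4x at a small accuracy cost. The sketch below shows symmetric linear quantization in plain Python; it is an illustration of the general technique, not any particular framework's API:

```python
def quantize_int8(weights):
    # Symmetric linear quantization: map [-max|w|, +max|w|] onto [-127, 127].
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [int(round(w / scale)) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float weights from the int8 values.
    return [q * scale for q in quantized]

weights = [0.6, -1.0, 0.25]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight is within half a quantization step of the original.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q)  # -> [76, -127, 32]
```

Each weight now needs one byte instead of four, and integer arithmetic is typically faster on embedded hardware, which addresses both the memory and the inference-speed goals mentioned above.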
A panel of experts at the recent 2017 Wharton Global Forum in Hong Kong outlined their views on the future for artificial intelligence (AI), robots, drones, other tech advances and how it all might affect employment in the future. Their comments came in a panel session titled "Engineering the Future of Business," with Wharton Dean Geoffrey Garrett moderating and speakers Pascale Fung, a professor of electronic and computer engineering at Hong Kong University of Science and Technology; Vijay Kumar, dean of engineering at the University of Pennsylvania; and Nicolas Aguzin, Asian-Pacific chairman and CEO for J.P. Morgan. A fundamental problem is that most observers do not realize just how vast an amount of data is needed to operate in the physical world -- ever-increasing amounts or, as Kumar calls it, "exponential" amounts. "To have electric power and motors and batteries to power drones that can lift people in the air -- I think this is a pipe dream."
The authors have hypothesized that the development of intuition and creativity combined with the raw computing power of AI heralds an age where well-designed and well-executed AI algorithms can solve complex medical problems, including the interpretation of diagnostic images, thereby replacing the microscopist. A separate image algorithm study was erroneously reported to differentiate between small cell and non–small cell lung carcinoma with the accuracy of expert pulmonary pathologists; in fact, multiple computational algorithms were used to subtype known non–small cell lung carcinomas and gliomas in separate experiments.[4] The accuracy of each algorithm was roughly 70% to 85%. To the question of whether AI will replace the microscopist, we believe the answer is still "no"--because the question makes an erroneous comparison between 2 very dissimilar activities: high-level cognition (a human forte) versus high-level computation (an AI forte, at least for now). In the interim, because the language-based foundation of medicine is unlikely to disappear anytime soon, it would be more reasonable to see NGS, digital pathology, whole-slide imaging, and AI as technologies synergistic with human cognition.
The idea was this: if the machine was taught to translate English to Korean and vice versa, and also English to Japanese and vice versa, could it translate Korean to Japanese without resorting to English as a bridge between them? Based on the relationships among sentences in the memory space of the neural network, Google's language experts and AI researchers believe this is possible. Separately, a specific type of neural network -- a deep convolutional neural network -- optimised for image classification was trained to create an algorithm for automated detection of diabetic retinopathy and diabetic macular edema in retinal fundus photographs. And the Neural Architecture Search network generated a new cell, called the NASCell, that outperforms all the previous human-generated ones -- so much so that it is already available in TensorFlow.
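To make the zero-shot translation idea concrete: if sentences from every language are encoded into one shared vector space, translating Korean to Japanese reduces to finding the nearest Japanese sentence to the Korean sentence's vector, with English never consulted. Everything below -- the vectors and the romanized sentences -- is an invented toy stand-in for the learned shared representation, not Google's system:

```python
# Invented toy "interlingua": sentences with the same meaning land near each
# other in one shared vector space, regardless of source language.
shared_space = {
    ("ko", "annyeonghaseyo"): (0.90, 0.10),
    ("ja", "konnichiwa"):     (0.88, 0.12),
    ("ja", "sayonara"):       (0.10, 0.90),
}

def zero_shot_translate(src_key, target_lang):
    # Nearest-neighbour lookup among target-language sentences only;
    # English is never used as a bridge.
    src_vec = shared_space[src_key]
    candidates = [k for k in shared_space if k[0] == target_lang]
    def dist(k):
        return sum((a - b) ** 2 for a, b in zip(src_vec, shared_space[k]))
    return min(candidates, key=dist)[1]

print(zero_shot_translate(("ko", "annyeonghaseyo"), "ja"))  # -> konnichiwa
```

In the real system the shared space emerges from multilingual training rather than a lookup table, but the geometric intuition -- meaning-equivalent sentences clustering together across languages -- is the same.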