"What exactly is computer vision then? Computer vision is a research field working to equip computers with the ability to process and understand visual data, as sighted humans can. Human brains process the gigabytes of data passing through our eyes every second and translate that data into sight - that is, into discrete objects and entities we can recognise or understand. Similarly, computer vision aims to give computers the ability to understand what they are seeing, and act intelligently on that knowledge."
– Computer vision: Cheat Sheet. ZDNet.com (December 6, 2011), by Natasha Lomas.
In this article, we focus on how to use a couple of utility methods from the Keras (TensorFlow) API to streamline the training of such models (specifically for a classification task) with proper data pre-processing. In the end, we aim to write a single utility function that takes just the name of the folder where your training images are stored and gives you back a fully trained CNN model. We use a dataset consisting of 4,000 images of flowers for this demo. The dataset can be downloaded from the Kaggle website. The images were collected from Flickr, Google Images, and Yandex Images.
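The excerpt does not show the article's own utility function, so the following is only a minimal sketch of what such a "folder in, trained model out" helper might look like. It assumes a recent TensorFlow 2.x release and a layout of one subfolder per flower class; the names list_classes and train_cnn_from_folder, the image size, the layer sizes, and the 80/20 split are all illustrative choices, not the author's.

```python
from pathlib import Path


def list_classes(folder):
    """Return the sorted names of the (non-hidden) class subfolders."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def train_cnn_from_folder(folder, img_size=(128, 128), batch_size=32, epochs=5):
    """Train a small CNN on images stored as folder/<class_name>/<image>.

    Hypothetical helper: the architecture and hyperparameters are assumptions.
    """
    # Imported lazily so list_classes() stays usable without TensorFlow.
    import tensorflow as tf

    num_classes = len(list_classes(folder))

    def load(subset):
        # The same seed in both calls keeps the 80/20 split non-overlapping.
        return tf.keras.utils.image_dataset_from_directory(
            folder,
            validation_split=0.2,
            subset=subset,
            seed=42,
            image_size=img_size,
            batch_size=batch_size,
        )

    train_ds, val_ds = load("training"), load("validation")

    model = tf.keras.Sequential([
        tf.keras.Input(shape=tuple(img_size) + (3,)),
        tf.keras.layers.Rescaling(1.0 / 255),  # scale pixels to [0, 1]
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Labels are loaded as integer class ids, hence the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    model.fit(train_ds, validation_data=val_ds, epochs=epochs)
    return model
```

With the flower images unpacked into a folder such as "flowers/" (a placeholder path), a call like train_cnn_from_folder("flowers") would return the fitted model. Class labels are inferred from the subfolder names, which is why the helper needs nothing beyond the folder name.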
The recent surveys, studies, forecasts and other quantitative assessments of the health and progress of AI provided new numbers regarding business leaders' assessment of China as a global AI leader, the current worldwide ranking of China's AI-related entrepreneurial and research activities, plans for AI adoption by U.S. enterprises and expectations regarding its impact on jobs, and the use of AI in face recognition, physical security monitoring, cashierless retail, categorizing open-ended survey responses, and detecting plant diseases and atrial fibrillation.
A doctor examines a magnetic resonance image on a computer screen during the CHAIN Cup at the China National Convention Center in Beijing, June 30, 2018. A computer running artificial intelligence software defeated two teams of human doctors in accurately recognizing maladies in magnetic resonance images in a contest that was billed as the world's first competition in neuroimaging between AI and human experts.
The U.S. Department of Homeland Security estimates face recognition will scrutinize 97% of outbound airline passengers by 2023 [The Economist]
More than 4.5 million websites use reCAPTCHA and the system collects hundreds of millions of daily solves or more than 100 person-years of labor every day; Google/reCAPTCHA has extracted to date over $7 billion of free labor [hcaptcha]
The Bureau of Labor Statistics' injury and illness database is built upon text-based descriptions of work-related injuries and illnesses it receives from workplaces across the country each year; categorizing the description into actionable data used to be done manually, but this year, the BLS has done 80% of that automatically using deep neural networks [governmentCIO]
The AI market worldwide is estimated to grow by $75.54 billion from 2019 to 2023 [Technavio]
The AI market worldwide is estimated to reach $202.57
Data is eating the world quote of the week: "The market for data labeling passed $500 million in 2018 and it will reach $1.2 billion by 2023, according to the research firm Cognilytica. This kind of work, the study showed, accounted for 80 percent of the time spent building A.I. technology"--The New York Times
AI is "mimicking the brain" quote of the week: "Computer vision… is nothing like the human sort"--The Economist
Robots are eating the world quote of the week: "A human can certainly move a part faster than a cobot [collaborative robot]. However, it does not take coffee breaks and continues to work for several hours after we have already gone home"--Pekka Myller, Ket-Met
Robots are eating the world quote of the 19th century: "[A Linotype] could work like six men and do everything but drink, swear, and go out on strike"--Mark Twain
Facial recognition technology is all around us--it's at concerts, airports, and apartment buildings. But its use by law enforcement agencies and in courtrooms raises particular concerns about privacy, fairness, and bias, according to Jennifer Lynch, the Surveillance Litigation Director at the Electronic Frontier Foundation. Studies have found that some of the major facial recognition systems are inaccurate. In one test, Amazon's software falsely matched 28 members of Congress with mugshots of people who had been arrested. These inaccuracies tend to be far worse for people of color and women.
Jimmy Gomez is a California Democrat, a Harvard graduate and one of the few Hispanic lawmakers serving in the US House of Representatives. But to Amazon's facial recognition system, he looks like a potential criminal. Gomez was one of 28 US Congress members falsely matched with mugshots of people who've been arrested, as part of a test the American Civil Liberties Union ran last year of the Amazon Rekognition program. Nearly 40 percent of the false matches by Amazon's tool, which is being used by police, involved people of color. This is part of a CNET special report exploring the benefits and pitfalls of facial recognition.
The Socionext team highlighted our collaboration with Network Optix on AI neural network and computer vision solutions at Securetech 2019. Video data is exploding, and we aren't just talking about smartphone video clips or the latest hit streaming programs. All these cameras generate an enormous amount of video every day – way too much data for any number of human eyes to process effectively. Computer vision technology is a natural fit to analyze and categorize bulk video data. The obstacle, however, has been finding a way to cost-effectively and efficiently apply AI neural networking technology to video analytics at scale.
Object detection remains the primary driver for applications such as autonomous driving and intelligent video analytics. Object detection applications require substantial training using vast datasets to achieve high levels of accuracy. NVIDIA GPUs provide the parallel compute performance required to train the large networks used for object detection inference. This post covers what you need to get up to speed using NVIDIA GPUs to run high-performance object detection pipelines quickly and efficiently. Our Python application takes frames from a live video stream and performs object detection on GPUs. We use a pre-trained Single Shot Detector (SSD) model with Inception V2, apply TensorRT's optimizations, generate a runtime for our GPU, and then perform inference on the video feed to get labels and bounding boxes.
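The last step of such a pipeline, turning raw detector output into labels and bounding boxes for a frame, can be sketched in plain Python. This is an illustration under stated assumptions rather than the post's actual code: the row layout (class_id, score, xmin, ymin, xmax, ymax) with coordinates normalised to [0, 1], the Detection type, and the decode_detections name are all hypothetical, and a real SSD or TensorRT engine may order its output fields differently.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: str
    score: float
    box: tuple  # (left, top, right, bottom) in pixels


def decode_detections(raw, labels, frame_w, frame_h, min_score=0.5):
    """Convert raw detector rows into labelled pixel-space boxes.

    Assumed row layout: (class_id, score, xmin, ymin, xmax, ymax), with
    coordinates normalised to [0, 1]. Check your model's real output format.
    """
    def to_px(value, size):
        # Scale to pixels and clamp to the frame boundaries.
        return max(0, min(size, int(round(value * size))))

    results = []
    for class_id, score, xmin, ymin, xmax, ymax in raw:
        if score < min_score:
            continue  # drop low-confidence detections
        box = (
            to_px(xmin, frame_w),
            to_px(ymin, frame_h),
            to_px(xmax, frame_w),
            to_px(ymax, frame_h),
        )
        results.append(Detection(labels[int(class_id)], float(score), box))
    return results
```

In a live-video loop, a function like this would run once per frame on the inference output, and the resulting boxes could then be drawn onto the frame before display.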
SHANAHAN: Without question there are some of those. Computer vision is well-established in commercial industry. However, I have not seen in any case yet [where] you just take a commercial capability and immediately apply it to a military problem. First of all, the data set ... you've got to train it against the different kind of datasets. With Maven, we had to get real, full-motion video from the Middle East, tens of thousands of hours of it, curate it, label it, train against it.
Element AI, a company that builds artificial intelligence (AI) tools for enterprises, has raised CAD $200 million (USD $151 million) in a series B round of funding from a host of existing and new investors, including Gouvernement du Québec, Data Collective (DCVC), Hanwha Asset Management, BDC, Real Ventures, Caisse de dépôt et placement du Québec (CDPQ), and McKinsey & Company. Founded in 2016, Element AI develops AI software "that helps people work smarter," according to its marketing blurb. So far, the startup has focused on partnering with enterprises that want to use AI but lack the required expertise, connecting businesses with machine learning experts in-house and elsewhere to address specific problems. Earlier this year, Element AI officially launched its first products for enterprise customers in the form of "decision-making automation tools." Using computer vision, optical character recognition (OCR), and other AI mechanisms, Element AI promises to enable machines to do things like "read" documents or answer workers' questions about internal operations using natural language queries.