A new AI from Microsoft aims to automatically caption images in documents and emails so that assistive software for people with visual impairments can read them aloud. Researchers from Microsoft explained their machine learning model in a paper on the preprint repository arXiv. The model uses VIsual VOcabulary pre-training (VIVO), which leverages large amounts of paired image-tag data to learn a visual vocabulary. A second dataset of properly captioned images is then used to help teach the AI how best to describe the pictures. "Ideally, everyone would include alt text for all images in documents, on the web, in social media – as this enables people who are blind to access the content and participate in the conversation. But, alas, people don't," said Saqib Shaikh, a software engineering manager with Microsoft's AI platform group.
With its unique ability to identify and 'learn' from data patterns and to develop predictive mappings between variables – machine and deep learning – artificial intelligence (AI) has proved to be an indispensable tool in the fight against the coronavirus pandemic. AI has enabled the deployment of predictive models of potential disease contagion and containment, and has been used for screening and tracking patients. AI has been deployed across the globe to improve understanding of the potential consequences of the viral infection for different sectors of the economy. Companies have increasingly relied on machine-learning-enabled systems to reengineer production and delivery in the face of a massive disruption in supply chains. Policy-makers have also turned to AI technologies due to their great promise in strengthening the quality of remote education delivery, at a time when schools and education systems struggle to remain accessible to learners.
Last month, an artificial intelligence agent defeated human F-16 pilots in a Defense Advanced Research Projects Agency challenge, reigniting discussions about lethal AI and whether it can be trusted. Allies, non-governmental organizations, and even the U.S. Defense Department have weighed in on whether AI systems can be trusted. But why is the U.S. military worried about trusting algorithms when it does not even trust its AI developers? Any organization's adoption of AI and machine learning requires three technical tools: usable digital data that machine learning algorithms learn from, computational capabilities to power the learning process, and the development environment that engineers use to code. However, the military's precious few uniformed data scientists, machine learning engineers, and data engineers who create AI-enabled applications are currently hamstrung by a lack of access to these tools.
Google detailed a host of new improvements at its "Search On" event that it will make to its foundational Google search service in the coming weeks and months. The changes are largely focused on using new AI and machine learning techniques to provide better search results for users. Chief among them: a new spell-checking tool that Google promises will help interpret even the most poorly spelled queries. According to Prabhakar Raghavan, Google's head of search, 15 percent of Google search queries each day are ones that Google has never seen before, meaning the company has to constantly work to improve its results. Part of that is because of poorly spelled queries.
Companies are leveraging data and artificial intelligence to create scalable solutions -- but they're also scaling their reputational, regulatory, and legal risks. For instance, Los Angeles is suing IBM for allegedly misappropriating data it collected with its ubiquitous weather app. Optum is being investigated by regulators for creating an algorithm that allegedly recommended that doctors and nurses pay more attention to white patients than to sicker black patients. Goldman Sachs is being investigated by regulators for using an AI algorithm that allegedly discriminated against women by granting larger credit limits to men than to women on the Apple Card. Facebook infamously granted Cambridge Analytica, a political firm, access to the personal data of more than 50 million users.
In a proof-of-concept study, education and artificial intelligence researchers have demonstrated the use of a machine-learning model to predict how long individual museum visitors will engage with a given exhibit. The finding opens the door to a host of new work on improving user engagement with informal learning tools. "Education is an important part of the mission statement for most museums," says Jonathan Rowe, co-author of the study and a research scientist in North Carolina State University's Center for Educational Informatics (CEI). "The amount of time people spend engaging with an exhibit is used as a proxy for engagement and helps us assess the quality of learning experiences in a museum setting. It's not like school--you can't make visitors take a test."
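The study's modeling details aren't given here, but the general approach it describes -- predicting how long a visitor will engage with an exhibit from observable features -- can be sketched as a simple regression. All feature names and numbers below are invented for illustration, not taken from the study:

```python
import numpy as np

# Hypothetical per-visitor features: prior exhibits viewed, interactions
# per minute, group size. Target: seconds spent at this exhibit.
# All numbers are invented for illustration.
X = np.array([
    [3, 1.2, 1],
    [7, 0.4, 2],
    [1, 2.5, 1],
    [5, 0.9, 4],
    [2, 1.8, 3],
    [6, 0.6, 2],
], dtype=float)
y = np.array([95.0, 40.0, 160.0, 70.0, 120.0, 55.0])

# Ordinary least squares with an intercept column appended to the features.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_dwell_seconds(features):
    """Predict how long a visitor will engage with the exhibit."""
    return float(np.append(np.asarray(features, dtype=float), 1.0) @ coef)

print(predict_dwell_seconds([4, 1.0, 2]))
```

A deployed model would likely use richer interaction logs and a nonlinear learner, but the prediction target -- dwell time as a proxy for engagement -- is the same.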
Machine learning typically requires tons of examples. To get an AI model to recognize a horse, you need to show it thousands of images of horses. This is what makes the technology computationally expensive--and very different from human learning. A child often needs to see just a few examples of an object, or even only one, before being able to recognize it for life. In fact, children sometimes don't need any examples to identify something.
The Boston Fire Department started to use emerging technology to fight fires in the last couple of years. In collaboration with Karen Panetta, an IEEE fellow and dean of Graduate Education at Tufts University's School of Engineering, the department is using AI for object recognition. The goal is to be able to use a drone or robot that can locate objects in a burning building. Panetta worked with the department to develop prototype technology that leverages IoT sensors and AI in tandem with robotics to help first responders "see" through blazes to detect and locate objects – and people. The AI technology she developed analyzes data coming from sensors that firefighters wear, and it recognizes objects so that responders can navigate through a fire.
Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level. This can seem puzzling: in the worst case, such models need not generalize at all. The puzzle has inspired a great deal of work examining when overparametrization reduces test error, a phenomenon called "double descent". Recent work has aimed to understand more deeply why overparametrization helps generalization. It has found that variance is unimodal as a function of the level of parametrization, and has decomposed the variance into components arising from label noise, initialization, and randomness in the training data, in order to pinpoint the sources of the error.
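The decomposition described above can be written out explicitly (the notation here is mine, not necessarily the papers'). The expected test error of a trained predictor $f_{\hat\theta}$ first splits in the standard way:

```latex
\[
\mathbb{E}\big[(f_{\hat\theta}(x)-y)^2\big]
 = \underbrace{\big(\mathbb{E}[f_{\hat\theta}(x)]-f^\ast(x)\big)^2}_{\text{bias}^2}
 + \underbrace{\operatorname{Var}\big(f_{\hat\theta}(x)\big)}_{\text{variance}}
 + \sigma^2 ,
\]
where $\sigma^2$ is the irreducible label noise. Applying the law of total
variance twice, conditioning in turn on the training sample $S$ and the
initialization $\theta_0$, splits the variance into three non-negative parts:
\[
\operatorname{Var}(f_{\hat\theta})
 = \underbrace{\mathbb{E}\big[\operatorname{Var}(f_{\hat\theta}\mid S,\theta_0)\big]}_{\text{label noise}}
 + \underbrace{\mathbb{E}\big[\operatorname{Var}\big(\mathbb{E}[f_{\hat\theta}\mid S,\theta_0]\,\big|\,S\big)\big]}_{\text{initialization}}
 + \underbrace{\operatorname{Var}\big(\mathbb{E}[f_{\hat\theta}\mid S]\big)}_{\text{sampling}} .
\]
```

Because each term is non-negative, such a decomposition lets one ask which source of randomness dominates at each level of parametrization.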
Alexander Tong GRD '23, a computer science graduate student, and Smita Krishnaswamy, professor of genetics and computer science, won the award for best paper at the annual 2020 Machine Learning for Signal Processing conference, hosted by the Institute of Electrical and Electronics Engineers. From Sept. 21 to Sept. 24, the MLSP conference was hosted virtually at Aalto University in Espoo, Finland. Tong and Krishnaswamy's paper, "Fixing bias in reconstruction-based anomaly detection with Lipschitz discriminators," won the best student paper award alongside two other teams. The paper identified problems present in many machine-learning-based outlier detection models. The researchers found cases where these systems fail -- failures that occur quite often for the kinds of data found in larger data sets.
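The paper itself concerns fixing bias in reconstruction-based detectors using Lipschitz discriminators; as background, the underlying idea those detectors share is to score each point by how poorly a model fit to normal data reconstructs it. A minimal sketch of that idea, using PCA in place of a learned autoencoder (the data and function names are illustrative, not from the paper):

```python
import numpy as np

def reconstruction_anomaly_scores(train, test, n_components=2):
    """Score points by PCA reconstruction error: anomalies reconstruct poorly."""
    mean = train.mean(axis=0)
    # Principal directions of the centered training ("normal") data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:n_components]            # top principal directions
    centered = test - mean
    recon = centered @ basis.T @ basis   # project onto the learned subspace
    return np.linalg.norm(centered - recon, axis=1)  # error = anomaly score

rng = np.random.default_rng(0)
# Inliers lie on a 2-D plane inside 5-D space; the outlier does not.
inliers = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))
outlier = np.full((1, 5), 6.0)
scores = reconstruction_anomaly_scores(inliers, np.vstack([inliers[:5], outlier]))
print(scores)  # the last score (the outlier) should dominate
```

The bias the paper targets arises because such reconstruction errors can be systematically miscalibrated for some regions of the data; the Lipschitz-discriminator fix itself is not reproduced here.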