This article is about the cognitive science of visual art. Artists create physical artifacts (such as sculptures or paintings) which depict people, objects, and events. These depictions are usually stylized rather than photo-realistic. How is it that humans are able to understand and create stylized representations? Does this ability depend on general cognitive capacities or an evolutionary adaptation for art? What role is played by learning and culture? Machine learning can shed light on these questions. It is possible to train convolutional neural networks (CNNs) to recognize objects without training them on any visual art. If such CNNs can generalize to visual art (by creating and understanding stylized representations), then CNNs provide a model for how humans could understand art without innate adaptations or cultural learning. I argue that Deep Dream and Style Transfer show that CNNs can create a basic form of visual art, and that humans could create art by similar processes. This suggests that artists make art by optimizing for effects on the human object-recognition system. Physical artifacts are optimized to evoke real-world objects for this system (e.g. to evoke people or landscapes) and to serve as superstimuli for this system.
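The core of Deep Dream is gradient ascent on the image itself: rather than updating network weights, the pixels are adjusted to amplify whatever features a layer of an object-recognition network already responds to. The sketch below illustrates this, assuming PyTorch; the tiny randomly initialized CNN is a stand-in for a pretrained recognition network (a real run would use something like a pretrained Inception or VGG), so the output here will not look like the familiar Deep Dream imagery.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained object-recognition network.
# Deep Dream proper maximizes activations in a mid-level layer
# of a network trained on natural images.
cnn = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)

def deep_dream(image, model, steps=20, lr=0.05):
    """Gradient ascent on the image to amplify the features
    the model's layers respond to (the Deep Dream objective)."""
    img = image.clone().requires_grad_(True)
    for _ in range(steps):
        activations = model(img)
        loss = activations.norm()  # maximize feature activations
        loss.backward()
        with torch.no_grad():
            # Normalized gradient step on the pixels themselves.
            img += lr * img.grad / (img.grad.norm() + 1e-8)
            img.grad.zero_()
    return img.detach()

start = torch.rand(1, 3, 64, 64)   # random "canvas" image
dreamed = deep_dream(start, cnn)
```

Style Transfer works by the same optimize-the-image principle, but with a loss that matches feature statistics of a style image while preserving the content image's activations, which is why both methods can be read as optimizing an artifact for its effect on an object-recognition system.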
Currently, digital media is in a transitional phase, with the dominant format shifting from text to visuals. Due to this significant shift, advertising has to play catch-up to stay up-to-date with the latest trends in the industry. On top of that, the marketing industry has to deal with ad-blockers, which block out intrusive advertisements. According to a study done by PageFair, there are at least 615 million devices that use Adblock regularly. As you can imagine, getting through these ad-blockers is an uphill task, because they keep disruptive advertisements at bay.
Pinterest is bringing full automation to Shop the Look, a feature that helps users buy products from companies that work with Pinterest, so you can, for example, buy a pair of jeans you see in a picture. Previously, Shop the Look, which made its debut in 2017, was powered by artificial intelligence to find objects in Pinned photos that resemble products in stock from vendors, but included a human in the loop for curation. The automated Shop the Look will begin with the home decor category. Full automation for Shop the Look also moves the company closer to its goal to make the world pinnable. Altogether, Pinterest users have made more than 175 billion Pins.
While vision-language integration is important for a wide range of Artificial Intelligence (AI) prototypes and applications, the notion of integration has not been established within a theoretical framework that would allow for more thorough research on the issue. In this paper, we attempt to explore the reasons that dictate this content integration by bringing together Searle's theory of intentionality, the symbol grounding problem, as well as arguments regarding the nature of images and language developed within different AI subfields. In doing so, the Double-Grounding theory emerges which provides an explanatory theoretical definition for vision-language integration. In correlating the need for vision-language integration with inherent characteristics of the integrated media and in associating this need with an agent's intentionality and intelligence, the work presented in this paper aims at providing a theoretically established --and therefore solid-- common ground for currently isolated and scattered multimedia integration research in AI subfields.