

AI-Labeling Crowdsourcing Platforms


Artificial intelligence (AI) is widely used in business today, for example in data analytics, natural language processing, and process automation. The emergence of artificial intelligence is based on decades of research into solving difficult computer science tasks, and it is now rapidly transforming business model innovation. Companies that do not consider artificial intelligence will be vulnerable to those that are equipped with it. While companies like Google, Amazon, and Tesla have already innovated their business models with artificial intelligence, small- and mid-cap companies have limited budgets for setting up such capabilities. One high-effort task in creating artificial intelligence services is the pre-processing of data and the training of machine learning models.

Biotech startup Gero grabs €1.9M to hack ageing and COVID-19


Based in Singapore, Gero develops new drugs for ageing and other complicated disorders using its proprietary artificial intelligence (AI) platform. Recently, the company secured $2.2 million (€1.9 million) in Series A funding, bringing the total capital raised since Gero's founding to over $7.5 million (€6.4 million). Gero's founder, Peter Fedichev, said, "We are happy with the recognition and support from these strategic investors who themselves are acknowledged leaders in the fields of AI and biotechnology. This will help us attain the necessary knowledge at the junction of biological sciences and AI/ML technologies that is necessary for the radical acceleration of drug discovery battling the toughest medical challenges of the 21st century. We hope that the technology will soon lead to a meaningful healthspan extension and quality of life improvements." The round was led by Bulba Ventures with participation from previous investors and serial entrepreneurs in the fields of pharmaceuticals, IT, and AI.


Communications of the ACM

Entity matching (EM) finds data instances that refer to the same real-world entity. In 2015, we started the Magellan project at UW-Madison, jointly with industrial partners, to build EM systems. Most current EM systems are stand-alone monoliths. In contrast, Magellan borrows ideas from the field of data science (DS) to build a new kind of EM system: ecosystems of interoperable tools for multiple execution environments, such as on-premise, cloud, and mobile. This paper describes Magellan, focusing on the system aspects. We argue that EM can be viewed as a special class of DS problems and thus can benefit from system-building ideas in DS. We discuss how these ideas have been adapted to build PyMatcher and CloudMatcher, sophisticated on-premise tools for power users and self-service cloud tools for lay users. These tools exploit techniques from the fields of machine learning, big data scaling, efficient user interaction, databases, and cloud systems. They have been successfully used in 13 companies and domain science groups, have been pushed into production for many customers, and are being commercialized. We discuss the lessons learned and explore applying the Magellan template to other tasks in data exploration, cleaning, and integration.

Entity matching (EM) finds data instances that refer to the same real-world entity, such as tuples (David Smith, UW-Madison) and (D. Smith, UWM). This problem, also known as entity resolution, record linkage, deduplication, data matching, et cetera, has been a long-standing challenge in the database, AI, KDD, and Web communities [2, 6]. As data-driven applications proliferate, EM will become even more important. For example, to analyze raw data for insights, we often integrate multiple raw data sets into a single unified one before performing the analysis, and such integration often requires EM. To build a knowledge graph, we often start with a small graph and then expand it with new data sets, and such expansion requires EM. When managing a data lake, we often use EM to establish semantic linkages among the disparate data sets in the lake.
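
To make the matching problem concrete, here is a minimal sketch that decides whether two (name, affiliation) tuples refer to the same entity. It is not how PyMatcher or CloudMatcher work; the function names, the 0.7/0.3 weights, and the 0.6 threshold are assumptions chosen purely for illustration, and real EM systems typically learn such rules from labeled pairs.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and split into tokens."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return kept.split()


def name_similarity(a, b):
    """Score two person names in [0, 1]; an initial may stand in for a first name."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb or ta[-1] != tb[-1]:  # require the same last name
        return 0.0
    first_a, first_b = ta[0], tb[0]
    if first_a == first_b:
        return 1.0
    if (len(first_a) == 1 or len(first_b) == 1) and first_a[0] == first_b[0]:
        return 0.8  # an initial is weaker evidence than a full first name
    return 0.0


def affiliation_similarity(a, b):
    """Character-level similarity of the compacted affiliation strings."""
    return SequenceMatcher(None, "".join(normalize(a)), "".join(normalize(b))).ratio()


def is_match(left, right, threshold=0.6):
    """Combine per-attribute scores with hypothetical weights and a threshold."""
    score = (0.7 * name_similarity(left[0], right[0])
             + 0.3 * affiliation_similarity(left[1], right[1]))
    return score >= threshold


print(is_match(("David Smith", "UW-Madison"), ("D. Smith", "UWM")))
print(is_match(("David Smith", "UW-Madison"), ("Dana Smith", "MIT")))
```

The first call compares the two tuples from the text; the second compares a different person who happens to share a last name. Hand-written rules like these are brittle, which is one reason EM is usually treated as a learning problem.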

AI predicts a Dodgers World Series win after a COVID-shortened season


Major League Baseball is entering uncharted waters with the start of its COVID-abridged season today. Nobody's really sure if the 60-game season will even be able to get through the World Series without disruption by the pandemic's spread. However, one crowd-sourced AI system already has a pretty good guess as to who will be taking home the Commissioner's Trophy. The folks at Unanimous AI have been making high-profile predictions like these since 2016, when their UNU platform correctly figured 11 of 15 winners for that year's Academy Awards. In 2017, the company followed up by correctly guessing the Kentucky Derby's top four finishers -- in order, no less -- and in 2019, correctly figured that the Houston Astros would make it to the series (though nobody could have seen the Nats' miraculous postseason run coming). "The fundamental core of our system is a technology that captures input from groups of people by connecting them together in real time using AI algorithms modeled after swarms," Dr. Louis Rosenberg, Unanimous' founder and chief scientist, told Engadget.

Are Clogged Blood Vessels the Key to Treating Alzheimer's Disease?

Discover - Top Stories

Citizen Science Salon is a partnership between Discover and ... In 2016, a team of Alzheimer's disease researchers at Cornell University hit a dead end. The scientists were studying mice, looking for links between Alzheimer's and blood flow changes in the brain. For years, scientists have known that reduced blood flow in the brain is a symptom of Alzheimer's disease. More recent research has also shown that this reduced blood flow can be caused by clogged blood vessels -- or "stalls." And by reversing these stalls in mice, scientists were able to restore the animals' memory.

Coronaprofile: Can AI speed up the hunt for COVID-19 research results?


Throughout the research world, artificial intelligence is increasingly being applied to scanning complicated scientific literature more quickly than humans alone can. At Utrecht University, Prof. Rens van de Schoot and his team are part of an international research community now applying that technology to COVID-19 publications. In an edited email exchange with Diane M. Fresquez of Science Business, van de Schoot talks about his work and his search for collaborators (have you got coding talent?) – initially, while under lockdown with his three children, aged six and under, who played quietly (or not so quietly) underfoot. Q. Tell us about your COVID-19 project. With an increase in COVID-19 research literature, and an urgency to find cures and treatments, it is essential that data collection is done in real time.

Crop Disease Detection Using Machine Learning and Computer Vision - KDnuggets


The International Conference on Learning Representations (ICLR) and the Consultative Group on International Agricultural Research (CGIAR) jointly ran a challenge in which over 800 data scientists worldwide competed to detect crop diseases from close-up pictures. The objective of the challenge was to build a machine learning algorithm that correctly classifies whether a plant is healthy, has stem rust, or has leaf rust. Wheat rust is a devastating plant disease that affects many crops, reducing yields, harming the livelihoods of farmers, and decreasing food security across Africa. The disease is difficult to monitor at a large scale, which makes it difficult to control and eradicate. An accurate image recognition model that can detect wheat rust from any image will enable a crowd-sourced approach to monitoring crops. The imagery data came from a variety of sources.
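
The three-way decision described above can be sketched as the final step of a classifier: turning raw per-class scores into a label and a confidence. This is only an illustration, not any competitor's method. The class names, the use of softmax, and the example scores are assumptions, and the image model that would produce the scores is left out.

```python
import math

# Hypothetical label names for the three outcomes named in the challenge.
CLASSES = ("healthy", "stem_rust", "leaf_rust")


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable class and its probability."""
    probs = softmax(scores)
    best = max(range(len(CLASSES)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Made-up scores standing in for the output of an image model.
label, confidence = classify([0.2, 2.5, -1.0])
print(label, round(confidence, 3))
```

In a crowd-sourced monitoring setting, the confidence value could be used to route uncertain photos to a human reviewer, though the source does not describe such a step.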

Learning Compact Visual Descriptors for Low Bit Rate Mobile Landmark Search

AI Magazine

With the ever-growing computational power of mobile devices, mobile visual search has undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly on a mobile device and delivered as queries, rather than raw images, to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with offline learning from geotagged photos from online photo sharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection by discrete geographical regions using a Gaussian mixture model, and then boosting a ranking-sensitive vocabulary within each region, with an "entropy"-based descriptor compactness feedback to refine both phases iteratively.
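
The bag-of-words representation mentioned above can be illustrated with a small sketch: each local descriptor is assigned to its nearest visual word, and the image is summarized as a normalized histogram of word counts. The toy 2-D descriptors, the three-word codebook, and the function names are assumptions for illustration. The region-specific vocabulary learning and boosting described in the article are not shown.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalized histogram of visual-word assignments for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy data: a three-word codebook and four local descriptors from one image.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (0.0, 2.0), (9.0, 9.0), (1.0, 9.0)]
print(bow_histogram(descriptors, codebook))
```

The histogram has one entry per visual word regardless of how many local descriptors the image contains, which is what lets a small vocabulary yield a short query to send over the network instead of the raw image.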

NASA crowdsourcing helps build a better Moon digging robot


NASA's Artemis program will eventually need robots to help live off the lunar soil, and it's enlisting help from the public to make those robots viable. The space agency has picked winners from a design challenge that tasked people with improving the bucket drums RASSOR (Regolith Advanced Surface Systems Operations Robot) will use to dig on the Moon. The victors all had clever designs that should capture lunar regolith with little effort -- important when any long-term presence might depend on bots like this. The winner was a trap from Caleb Clausing that uses a passive door to grab large amounts of soil while remaining dust-tolerant. Others included a simple-yet-effective drum from Michael R, another from Kyle St. Thomas that uses narrow drums, an efficient double-helix design from Stephan Weißenböck and a model from Clix that uses both gravity and weight to help movement.