
SA3DIP: Segment Any 3D Instance with Potential 3D Priors

Neural Information Processing Systems

The proliferation of 2D foundation models has sparked research into adapting them for open-world 3D instance segmentation. Recent methods introduce a paradigm that leverages superpoints as geometric primitives and incorporates 2D multi-view masks from the Segment Anything Model (SAM) as merging guidance, achieving outstanding zero-shot instance segmentation results. However, the limited use of 3D priors restricts segmentation performance. Previous methods compute 3D superpoints solely from normals estimated on spatial coordinates, resulting in under-segmentation for instances with similar geometry. In addition, the heavy reliance on SAM and hand-crafted algorithms in 2D space suffers from over-segmentation due to SAM's inherent part-level segmentation tendency. To address these issues, we propose SA3DIP, a novel method for Segmenting Any 3D Instances via exploiting potential 3D Priors.
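The normal-based superpoint grouping that the abstract critiques can be illustrated with a toy sketch: nearby points whose estimated normals agree are fused into one superpoint. All thresholds and names below are illustrative assumptions, not SA3DIP's actual pipeline; this is a minimal numpy sketch of the general idea.

```python
import numpy as np

def superpoints_from_normals(points, normals, radius=0.3, cos_thresh=0.9):
    """Greedy union-find merge: neighboring points whose normals agree
    (|dot product| above cos_thresh) join the same superpoint."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(points[i] - points[j]) < radius:
                if abs(np.dot(normals[i], normals[j])) > cos_thresh:
                    parent[find(i)] = find(j)

    return np.array([find(i) for i in range(n)])

# Two touching planes: a floor (normal +z) and a wall (normal +x).
floor = np.array([[x * 0.1, 0.0, 0.0] for x in range(5)])
wall = np.array([[0.5, 0.0, z * 0.1] for z in range(1, 6)])
points = np.vstack([floor, wall])
normals = np.vstack([np.tile([0.0, 0.0, 1.0], (5, 1)),
                     np.tile([1.0, 0.0, 0.0], (5, 1))])

labels = superpoints_from_normals(points, normals)
print(len(set(labels)))  # -> 2 superpoints: floor vs. wall
```

This also makes the failure mode concrete: two adjacent objects lying on the same plane share normals, so a purely normal-based merge fuses them into a single superpoint, i.e. the under-segmentation the paper describes.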


Segment Any Change

Neural Information Processing Systems

Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem. In this paper, we propose the segment any change models (AnyChange), a new type of change detection model that supports zero-shot prediction and generalization on unseen change types and data distributions. AnyChange is built on the segment anything model (SAM) via our training-free adaptation method, bitemporal latent matching. By revealing and exploiting intra-image and inter-image semantic similarities in SAM's latent space, bitemporal latent matching endows SAM with zero-shot change detection capabilities in a training-free way. We also propose a point query mechanism to enable AnyChange's zero-shot object-centric change detection capability. We perform extensive experiments to confirm the effectiveness of AnyChange for zero-shot change detection. AnyChange sets a new record on the SECOND benchmark for unsupervised change detection, exceeding the previous SOTA by up to 4.4% F_1 score, and achieves comparable accuracy with negligible manual annotations (1 pixel per image) for supervised change detection.
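The core intuition of matching latent embeddings across the two dates can be sketched conceptually: pixels (or regions) whose embeddings stay semantically similar across time are unchanged, while those that diverge are flagged as change. The embedding shapes and threshold below are assumptions for illustration, not AnyChange's actual implementation.

```python
import numpy as np

def change_mask(emb_t1, emb_t2, sim_thresh=0.5):
    """Flag locations whose embeddings diverge across the two dates:
    cosine similarity below sim_thresh is treated as change."""
    a = emb_t1 / np.linalg.norm(emb_t1, axis=-1, keepdims=True)
    b = emb_t2 / np.linalg.norm(emb_t2, axis=-1, keepdims=True)
    sim = np.sum(a * b, axis=-1)   # (H, W) cosine similarity map
    return sim < sim_thresh        # True where change is detected

# Toy embeddings: a 4x4 grid of 8-dim vectors; one cell flips entirely.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4, 8))
later = base.copy()
later[0, 0] = -base[0, 0]          # flipped embedding -> cosine -1

mask = change_mask(base, later)
print(mask.sum())  # -> 1 changed cell
```

In the real setting the embeddings would come from SAM's image encoder on the two acquisition dates; the sketch only shows why a training-free similarity comparison can separate changed from unchanged regions.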


Artificial Intelligence (AI) in Healthcare Market Size, Share, Trends, Analysis and Forecast by Region, Segment, Offering, Technology and End User, 2022-2027

#artificialintelligence

Summary: The AI in healthcare market was valued at US$7,679.39 million in 2021 and is expected to grow at a compound annual growth rate (CAGR) of 39.05% during 2022-2027. Growth has been driven by increasing investment and development in AI and by strategic moves among market players; key strategic partnerships and mergers and acquisitions are expected to further accelerate market growth. Healthcare, including pharma, medical devices, healthcare providers, and payers, is a highly regulated industry and can therefore be slow to adopt new technologies and modernize. However, the healthcare industry is realizing the benefits artificial intelligence (AI) can bring, and AI is now being used in different areas across the entire value chain; its use in the healthcare space is expected to continue to increase over the next five years. The integration of software with artificial intelligence is creating growth avenues for the global artificial intelligence in healthcare market, offering immediate decision support for diagnosing diseases.


Robotic implant could help children with rare disorder eat again

New Scientist Online News

Some children are born with their oesophagus in two segments, so the tube doesn't connect to their stomach. A new robotic implant might help treat this serious condition, known as oesophageal atresia. The robot consists of two steel rings, some sensors and a motor, all sealed in a protective waterproof skin. The device is attached to the outside of one section of the oesophagus and gently elongates it by moving the rings apart. Once the organ is long enough, the two segments can be stitched together.


To Know or Not to Know

AI Magazine

JEEVES can reverse the direction of the brush. It is equipped with seven ultrasonic proximity sensors (only five were used in the competition), a wide-angle color camera, and a high-speed color-based vision system manufactured by Newton Research Labs. Prior to the competition, the vision system was trained to recognize yellow tennis balls, pink squiggle balls, and cyan markers that marked the gate. The vision system proved extremely reliable during the competition, benefiting from clear color cues provided by the objects.


KBEmacs: Where's the AI?

AI Magazine

The Programmer's Apprentice project uses the domain of programming as a vehicle for studying (and attempting to duplicate) human problem solving behavior. Recognizing that it will be a long time before it is possible to fully duplicate an expert programmer's abilities, the project seeks to develop an intelligent assistant system, the Programmer's Apprentice (PA), which will help a programmer in various phases of the programming task. The Knowledge-Based Editor in Emacs (KBEmacs) is an initial step in the direction of the PA. A question that has been asked about KBEmacs is, "Where's the AI?" Going beyond this, the article uses the development of KBEmacs as an example that illustrates a number of general features of the process of developing an applied AI system. As part of this, the article compares the way AI ideas are used in KBEmacs with the way they were used in the initial proposal for the PA.


Steps toward a Cognitive Vision System

AI Magazine

An adequate natural language description of developments in a real-world scene can be taken as proof of "understanding what is going on." An algorithmic system that generates natural language descriptions from video recordings of road traffic scenes can be said to "understand" its input to the extent that the algorithmically generated text is acceptable to the humans judging it. The ability to present a "variant formulation" without distorting the essential parts of the original message is taken as a cue that these essentials have been "understood." During art lessons, in particular those concerned with classical or ecclesiastic paintings, students are initially invited to merely describe what they see. Frequently, considerable a priori knowledge about ancient mythology or biblical traditions is required to succinctly characterize the depicted scene. Lack of the corresponding knowledge about other cultures can make it difficult for someone with only a European education to really understand and describe in an appropriate manner a painting by, for example, a Far East classical artist. The familiar human experiences mentioned in the preceding paragraph will now be "morphed" into a scientific challenge: to design and implement an algorithmic engine that generates an appropriate textual description of essential developments in a video sequence recorded from a real-world scene. Such an algorithmic engine will serve as one example of a cognitive vision system (CVS), which leaves room, as the experienced reader has noticed, for more than one way to introduce the concept of a CVS. An alternative clearly consists in coupling a computer vision system with a robotic system of some kind and assessing the reactions of such a compound system. Whoever accepts the formulation "one of the actions available to an agent is to produce language. This is called a speech act" (Russell and Norvig 1995) is unlikely to consider the two variants of a CVS alluded to previously as being fundamentally different. With regard to the first CVS version in particular, the following remarks are submitted for consideration: obviously, we avoid a precise definition of understanding in favor of having humans compare the reaction of an algorithmic engine to that expected from a human. This fuzzy approach toward the circumscription of a CVS opens the road to constructive criticism--that is, to incremental system improvement--by pinpointing aspects of an output text that are not yet considered satisfactory.


2011 Robert S. Engelmore Memorial Lecture Award

AI Magazine

Following a brief overview discussing why people prefer listening to expressive music instead of nonexpressive synthesized music, we examine a representative selection of well-known approaches to expressive computer music performance, with an emphasis on AI-related approaches. In the main part of the article we focus on the existing CBR approaches to the problem of synthesizing expressive music, and particularly on Tempo-Express, a case-based reasoning system developed at our institute for applying musically acceptable tempo transformations to monophonic audio recordings of musical performances. Finally, we briefly describe an ongoing extension of our previous work consisting of complementing audio information with information about the gestures of the musician. Music is played through our bodies; therefore, capturing the gesture of the performer is a fundamental aspect that has to be taken into account in future expressive music renderings. This article is based on the "2011 Robert S. Engelmore Memorial Lecture" given by the first author at AAAI/IAAI 2011.


Evidence Accumulation & Flow of Control in a Hierarchical Spatial Reasoning System

AI Magazine

To elaborate, suppose a helicopter-based computer vision system is looking at a snow-covered terrain; this terrain knowledge must then be explicitly taken into account in a target recognition procedure. Clearly, the processing required for a snow-covered background is different from that for, say, a wooded area in spring. As a simpler example of knowledge-based processing, consider the problem of self-location for a vehicle-mounted vision system (Kak et al. 1987). Let's say the vehicle's whereabouts are approximately known from the position encoders mounted on the wheels, with the precision of this information limited by the extent of slippage in the wheels, and so on. Given this approximate information, is it possible to make a more precise fix on the location of the vehicle by integrating the vision data with the map knowledge while the two are out of registration?
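The self-location question at the end has a classic least-squares flavor: each map landmark re-observed by the vision system constrains the vehicle's position. The sketch below is a minimal translation-only illustration under assumed known landmark correspondences (no rotation, no outliers), not the method of Kak et al.; all values are synthetic.

```python
import numpy as np

def refine_position(map_landmarks, observations):
    """Each vehicle-relative observation o_i = m_i - vehicle_pos (+ noise),
    so m_i - o_i is one vote for the vehicle position; the least-squares
    estimate for a pure translation is the mean of those votes."""
    return (map_landmarks - observations).mean(axis=0)

rng = np.random.default_rng(1)
true_pos = np.array([10.0, 5.0])                      # unknown to the vehicle
map_landmarks = rng.uniform(0, 20, size=(6, 2))       # known map coordinates
# Vision measurements: landmark positions relative to the vehicle, noisy.
observations = map_landmarks - true_pos + rng.normal(0, 0.05, size=(6, 2))

odometry_guess = true_pos + np.array([0.8, -0.6])     # drifted wheel-encoder fix
refined = refine_position(map_landmarks, observations)
print(np.linalg.norm(refined - true_pos) < 0.2)       # -> True: vision beats odometry
```

The odometry guess would in practice be used to establish which observed landmark corresponds to which map landmark; once correspondences are fixed, even this trivial averaging pulls the position error well below the wheel-slippage drift.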


Cognitive Architectures and General Intelligent Systems

AI Magazine

In this article, I claim that research on cognitive architectures is an important path to the development of general intelligent systems. I contrast this paradigm with other approaches to constructing such systems, and I review the theoretical commitments associated with a cognitive architecture. The field's earliest ambition was to construct complete intelligent agents: entities intended to have the same intellectual capacity as humans and supposed to exhibit their intelligence in a general way across many different domains. I will refer to this research agenda as aimed at the creation of general intelligent systems. Unfortunately, modern artificial intelligence has largely abandoned this objective, having instead divided into many distinct subfields that care little about generality, intelligence, or even systems.