Models Out of Line: A Fourier Lens on Distribution Shift Robustness
Improving the accuracy of deep neural networks on out-of-distribution (OOD) data is critical to the acceptance of deep learning in real-world applications. It has been observed that accuracies on in-distribution (ID) versus OOD data follow a linear trend, and models that outperform this baseline are exceptionally rare (and referred to as "effectively robust"). Recently, some promising approaches have been developed to improve OOD robustness: model pruning, data augmentation, and ensembling or zero-shot evaluating large pretrained models. However, there is still no clear understanding of the conditions on OOD data and model properties that are required to observe effective robustness. We approach this issue by conducting a comprehensive empirical study of diverse approaches that are known to impact OOD robustness on a broad range of natural and synthetic distribution shifts of CIFAR-10 and ImageNet. In particular, we view the effective robustness puzzle through a Fourier lens and ask how spectral properties of both models and OOD data correlate with OOD robustness. We find that this Fourier lens offers some insight into why certain robust models, particularly those from the CLIP family, achieve OOD robustness. However, our analysis also makes clear that no known metric consistently best explains OOD robustness. Thus, to aid future research into the OOD puzzle, we address the gap in publicly available models with effective robustness by introducing a set of pretrained CIFAR-10 models---RobustNets---with varying levels of OOD robustness.
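The notion of "effective robustness" above can be made concrete: fit the linear ID-vs-OOD accuracy trend over a pool of baseline models, then measure how far a candidate model sits above that line. A minimal sketch, with made-up accuracy numbers (not from the paper; the literature often fits the trend in logit-transformed accuracy space, which this sketch omits for brevity):

```python
import numpy as np

# Hypothetical ID/OOD accuracy pairs for a family of baseline models
# (illustrative numbers only).
id_acc  = np.array([0.80, 0.85, 0.90, 0.93, 0.95])
ood_acc = np.array([0.55, 0.62, 0.70, 0.75, 0.78])

# Fit the linear ID-vs-OOD trend across the baseline pool.
slope, intercept = np.polyfit(id_acc, ood_acc, 1)

def effective_robustness(model_id_acc, model_ood_acc):
    """OOD accuracy in excess of what the linear trend predicts."""
    predicted = slope * model_id_acc + intercept
    return model_ood_acc - predicted

# A model on the trend line has ~zero effective robustness;
# a CLIP-like outlier sits clearly above it.
print(effective_robustness(0.90, 0.70))  # ~0.0
print(effective_robustness(0.90, 0.80))  # clearly positive
```

Models such as zero-shot CLIP are notable precisely because this quantity stays positive across many shifts, where most training interventions merely move a model along the line.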
TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight
Semantic Scene Completion (SSC) aims to perform geometric completion and semantic segmentation simultaneously. Despite the promising results achieved by existing studies, the inherently ill-posed nature of the task presents significant challenges in diverse driving scenarios. This paper introduces TALoS, a novel test-time adaptation approach for SSC that exploits the information available in driving environments. Specifically, we focus on the fact that observations made at one moment can serve as Ground Truth (GT) for scene completion at another. Given the characteristics of the LiDAR sensor, an observation of an object at a certain location confirms both 1) the occupancy of that location and 2) the absence of obstacles along the line of sight from the LiDAR to that point.
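The line-of-sight observation described above amounts to free-space carving: every voxel the LiDAR ray passes through before its return must be empty, and the voxel at the return must be occupied. A minimal sketch of this idea (illustrative code, not the TALoS implementation; the grid is a sparse dict mapping voxel indices to -1 unknown, 0 free, 1 occupied):

```python
import numpy as np

def carve_line_of_sight(grid, sensor, hit, voxel_size=1.0):
    """Mark voxels along the ray sensor -> hit as free (0),
    and the voxel containing the return as occupied (1)."""
    sensor = np.asarray(sensor, dtype=float)
    hit = np.asarray(hit, dtype=float)
    # Sample the ray densely enough to touch each traversed voxel.
    n = int(np.ceil(np.linalg.norm(hit - sensor) / voxel_size)) + 1
    for t in np.linspace(0.0, 1.0, n, endpoint=False):
        p = sensor + t * (hit - sensor)
        idx = tuple((p // voxel_size).astype(int))
        grid[idx] = 0  # known-free along the line of sight
    grid[tuple((hit // voxel_size).astype(int))] = 1  # occupied at the return
    return grid

grid = {}
carve_line_of_sight(grid, sensor=(0, 0, 0), hit=(4, 0, 0))
```

A production system would use an exact voxel traversal (e.g. Amanatides-Woo) rather than uniform sampling, but the supervisory signal is the same: occupancy at the endpoint, emptiness along the ray.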
Export Reviews, Discussions, Author Feedback and Meta-Reviews
The rebuttal from the authors was concise. However, I was not convinced about the assumption in Eq. 14 of the paper and how the authors defended it. The authors say: "Many documents (text categorization),[..] or time series signals (speech recognition) in a training set are alike. This fact is not systematically exploited by any existing stochastic optimization method!" I don't think this is correct.
Reviews: One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities
In my view, the main reason the proposed lower bound is interesting is that it offers a potential way to speed up training for multi-class models with a very large number of classes. While it is useful to understand other properties of the lower bound, the paper could be improved by emphasizing this primary use case in machine learning. Figure 1c and Figure 3 need a clearer explanation of what is being displayed and why it is important. In particular, what value is being plotted on the y-axis, and at what setting of the parameters w? Here is how I understand it, for Figure 1c say:
- Blue line: value of Eq. (13) at the setting of parameters w that maximize (13)
- Red line: value of Eq. (13) at the setting of parameters w that maximize (14)
- Green line: value of Eq. (13)? at the setting of parameters w that maximize the Bouchard lower bound (?)
- Red dashed line: value of Eq. (13)? at parameters w based on the given iterations of training?
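For readers unfamiliar with the bound under review: the one-vs-each (OVE) construction lower-bounds the softmax probability of the correct class by a product of pairwise sigmoids, which is what makes subsampling the classes tractable. A small sketch of the bound itself (my own illustration of the published form, not the authors' code; the equation numbering here does not track the paper's):

```python
import numpy as np

def logsumexp(a):
    m = a.max()
    return m + np.log(np.sum(np.exp(a - m)))

def log_softmax(scores, y):
    """Exact softmax log-probability of class y."""
    return scores[y] - logsumexp(scores)

def log_ove_bound(scores, y):
    """One-vs-each lower bound:
    p(y) >= prod_{j != y} sigmoid(s_y - s_j), so
    log p(y) >= -sum_{j != y} log(1 + exp(s_j - s_y)).
    The sum over j != y can be subsampled, unlike the logsumexp."""
    diffs = np.delete(scores, y) - scores[y]
    return -np.sum(np.log1p(np.exp(diffs)))

scores = np.array([2.0, 0.5, -1.0, 0.3])
print(log_softmax(scores, 0), log_ove_bound(scores, 0))
# The bound never exceeds the exact log-probability.
```

The speed-up for large class counts comes from replacing the full normalizer with a sum of per-class terms that admits unbiased stochastic estimates.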
China issues draft rules for fakes in cyberspace
Jan 28 (Reuters) - China's cyberspace regulator issued draft rules on Friday for content providers that alter facial and voice data, the latest measure to crack down on "deepfakes" and mould a cyberspace that promotes Chinese socialist values. The rules are aimed at further regulating technologies such as those using algorithms to generate and modify text, audio, images and videos, according to documents published on the website of the Cyberspace Administration of China. Any platform or company that uses deep learning or virtual reality to alter any online content, what the CAC calls "deep synthesis service providers", will now be expected to "respect social morality and ethics, adhere to the correct political direction". The regulations provide for people to be protected from being impersonated without their consent by deepfakes - images that are virtually indistinguishable from the original, and easily used for manipulation or misinformation. "Where a deep synthesis service provider provides significant editing functions for biometric information such as face and human voice, it shall prompt the (provider) to notify and obtain the individual consent of the subject whose personal information is being edited," Article 12 of the draft says.
On the Other Hand …
Clusters of conversation provide a more valuable way to spend one's time than attending sessions. At the last national meeting we escaped from the celebrations of the recent victory of Deep Blue over the dreaded Kasparov, to find just such a group, already engaged in an animated discussion: A: We need to draw a line. A: Between a program that has some intelligence in it and one that doesn't. All Deep Blue does is brute-force search. That hardly counts as AI.
Three-Dimensional
The growing field of three-dimensional (3-D) computer vision---programs that can interpret the world from sensor data---is the topic of Three-Dimensional Computer Vision by Yoshiaki Shirai (Springer-Verlag, Berlin, 1987, 297 pp., $95.00). The term "three-dimensional" is used to distinguish the field from two-dimensional (2-D) pattern recognition, such as character recognition or the recognition of silhouettes. The 3-D scene-understanding problem is made difficult by shadows, uneven lighting, texture, and objects that occlude other objects. The sensors used include those that obtain a grey-level or color-intensity image of a scene, methods that project a sheet of light on an object to reveal its 3-D structure, and distance-measuring devices that provide a "range image" in which the value of each picture element represents a distance from the sensor to a point in the scene. Such range sensors are important because they are not affected by lighting conditions and shadows. This book, not to be confused with Takeo Kanade's Three-Dimensional Machine Vision (Kluwer Academic Publishers, 1987), describes the fundamental technology of 3-D computer vision for various applications. The first four chapters are devoted to basic methods of computer vision. This is followed by chapters on image feature extraction (edge analysis, edge linking and following, and region methods) and image feature description (representing lines, segmenting a sequence of points, fitting line equations, and converting between lines and regions). Once these preliminaries are completed, the author concentrates on the 3-D world.
An important task in postal automation technology is determining the position and orientation of the destination address block in the image of a mail piece such as a letter, magazine, or parcel. The corresponding subimage is then presented to a human operator or a machine reader (optical character reader) that can read the zip code and, if necessary, other address information and direct the mail piece to the appropriate sorting bin. Analysis of physical characteristics of mail pieces indicates that in order to automate the address-finding task, several different image analysis operations are necessary. Some examples are locating a rectangular white address label on a multicolor background, progressively grouping characters into text lines and text lines into text blocks, eliminating candidate regions by specialized detectors (for example, detecting regions such as postage stamps), and identifying handwritten regions. A typical mail piece has several regions or blocks that are meaningful to mail processing, for example, address blocks (destination and return) and postage (meter mark or stamp), as well as extraneous blocks (see Figure 1). The heuristics listed in the previous section suggest that the design of ABLS consist of several specialized tools that are appropriately deployed. Rule R2 suggests the need for a tool to detect postage fluorescence, rule R3 a tool for isolating blocks of a certain color, rule R4 for discriminating between handwriting and print, and so on.
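The rule-to-tool mapping described above is essentially a dispatch table: each heuristic rule names a specialized detector, and regions are classified by running the tools in turn. A hypothetical sketch of that structure (the rule labels R2-R4 follow the text; the detector functions and region fields are illustrative stand-ins, not the ABLS implementation):

```python
# Illustrative detectors, one per rule from the text.
def detect_postage_fluorescence(region):          # rule R2
    return region.get("fluorescent", False)

def isolate_color_blocks(region, color="white"):  # rule R3
    return region.get("color") == color

def is_handwritten(region):                       # rule R4
    # Hypothetical feature: handwriting shows higher stroke variance.
    return region.get("stroke_variance", 0.0) > 0.5

# Dispatch table pairing each rule with its tool and the label it assigns.
RULES = [
    ("R2", detect_postage_fluorescence, "postage"),
    ("R3", isolate_color_blocks, "address_label"),
    ("R4", is_handwritten, "handwritten_block"),
]

def classify_region(region):
    """Deploy each rule's tool in turn; return the first matching label."""
    for rule_id, tool, label in RULES:
        if tool(region):
            return rule_id, label
    return None, "unclassified"

print(classify_region({"color": "white"}))  # → ('R3', 'address_label')
```

Keeping the rules in a table, rather than hard-coding the sequence, makes it easy to add or reorder specialized detectors as new mail-piece characteristics are identified.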