Justice Department Says Anthropic Can't Be Trusted With Warfighting Systems

WIRED

In response to Anthropic's lawsuit, the government said it lawfully penalized the company for trying to limit how its Claude AI models could be used by the military. The Trump administration argued in a court filing on Tuesday that it did not violate Anthropic's First Amendment rights by designating the AI developer a supply-chain risk and predicted that the company's lawsuit against the government will fail. "The First Amendment is not a license to unilaterally impose contract terms on the government, and Anthropic cites nothing to support such a radical conclusion," US Department of Justice attorneys wrote. The response was filed in a federal court in San Francisco, one of two venues where Anthropic is challenging the Pentagon's decision to sanction the company with a label that can bar companies from defense contracts over concerns about potential security vulnerabilities. Anthropic argues the Trump administration overstepped its authority in applying the label and preventing the company's technologies from being used inside the department.


Towards Accurate Binary Convolutional Neural Network

Neural Information Processing Systems

We introduce a novel scheme to train binary convolutional neural networks (CNNs) -- CNNs with weights and activations constrained to {-1,+1} at run-time. It has been known that using binary weights and activations drastically reduces memory size and accesses, and can replace arithmetic operations with more efficient bitwise operations, leading to much faster test-time inference and lower power consumption.
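
To make the bitwise replacement concrete, here is a minimal sketch of the widely used XNOR-and-popcount trick for a dot product of two {-1,+1} vectors. It is an illustration under stated assumptions, not the training scheme this abstract proposes; the helper names `pack_bits` and `binary_dot` are hypothetical.

```python
def pack_bits(values):
    """Pack a sequence of -1/+1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v not in (-1, 1):
            raise ValueError("values must be -1 or +1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1,+1} vectors given as packed bits.

    Positions where the bits agree contribute +1 and positions where they
    differ contribute -1, so the result is 2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR, restricted to the n valid bits
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n


# Hypothetical example vectors, chosen only for illustration.
a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, which shows that the multiply-accumulate loop can be replaced by one XOR, one NOT, and one bit count per machine word.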


Differentiable Learning of Logical Rules for Knowledge Base Reasoning

Neural Information Processing Systems

We study the problem of learning probabilistic first-order logical rules for knowledge base reasoning. This learning problem is difficult because it requires learning the parameters in a continuous space as well as the structure in a discrete space. We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model. This approach is inspired by a recently-developed differentiable logic called TensorLog [5], where inference tasks can be compiled into sequences of differentiable operations. We design a neural controller system that learns to compose these operations. Empirically, our method outperforms prior work on multiple knowledge base benchmark datasets, including Freebase and WikiMovies.
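
As a rough illustration of how inference can be compiled into a sequence of differentiable operations, the sketch below encodes each relation as a matrix and each entity as a one-hot vector, so that applying a chain rule becomes repeated matrix-vector multiplication. The entities, facts, rule, and confidence value are all hypothetical, and the matrix layout is one possible convention rather than necessarily the one used by TensorLog or Neural Logic Programming.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """One-hot encoding of an entity index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def apply_rule(chain, start, confidence):
    """Score every entity reachable from `start` by following the relation chain.

    Each step is a matrix-vector product, so the final scores are a
    differentiable function of the confidence and the matrix entries.
    """
    vector = start
    for matrix in chain:
        vector = matvec(matrix, vector)
    return [confidence * score for score in vector]


# Hypothetical knowledge base: parent_of(ann, bob), parent_of(bob, cat).
entities = ["ann", "bob", "cat"]
# Assumed convention: PARENT_OF[j][i] == 1.0 when parent_of(entities[i], entities[j]).
PARENT_OF = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

# Rule: grandparent_of(X, Y) <- parent_of(X, Z), parent_of(Z, Y), with assumed confidence 0.9.
scores = apply_rule([PARENT_OF, PARENT_OF], one_hot(0, len(entities)), 0.9)
best = entities[scores.index(max(scores))]
```

Starting from `ann`, the only entity with a nonzero score is `cat`. In a learned system the confidence and the choice of chain would be parameters to optimize; here they are fixed by hand.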


Binarized Neural Networks

Neural Information Processing Systems

We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time. At train-time the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power-efficiency. To validate the effectiveness of BNNs, we conducted two sets of experiments on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results over the MNIST, CIFAR-10 and SVHN datasets. We also report our preliminary results on the challenging ImageNet dataset. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available on-line.


Embedding Logical Queries on Knowledge Graphs

Neural Information Processing Systems

Learning low-dimensional embeddings of knowledge graphs is a powerful approach used to predict unobserved or missing edges between entities. However, an open challenge in this area is developing techniques that can go beyond simple edge prediction and handle more complex logical queries, which might involve multiple unobserved edges, entities, and variables. For instance, given an incomplete biological knowledge graph, we might want to predict "what drugs are likely to target proteins involved with both diseases X and Y?" -- a query that requires reasoning about all possible proteins that might interact with diseases X and Y. Here we introduce a framework to efficiently make predictions about conjunctive logical queries -- a flexible but tractable subset of first-order logic -- on incomplete knowledge graphs. In our approach, we embed graph nodes in a low-dimensional space and represent logical operators as learned geometric operations (e.g., translation, rotation) in this embedding space. By performing logical operations within a low-dimensional embedding space, our approach achieves a time complexity that is linear in the number of query variables, compared to the exponential complexity required by a naive enumeration-based approach. We demonstrate the utility of this framework in two application studies on real-world datasets with millions of relations: predicting logical relationships in a network of drug-gene-disease interactions and in a graph-based representation of social interactions derived from a popular web forum.
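
The drug query above can be sketched with hand-picked two-dimensional embeddings. This is a simplified illustration, not the paper's model: projection along an edge is assumed to be a translation, intersection is assumed to be an element-wise mean (a stand-in for a learned operator), and every vector and name below is hypothetical.

```python
def translate(point, relation):
    """Projection step: follow an edge by adding the relation's translation vector."""
    return [p + r for p, r in zip(point, relation)]


def intersect(points):
    """Conjunction step: element-wise mean, a stand-in for a learned intersection."""
    count = len(points)
    return [sum(column) / count for column in zip(*points)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical embeddings, chosen by hand for illustration.
disease_x = [1.0, 0.0]
disease_y = [0.0, 1.0]
assoc_inv = [1.0, 1.0]     # disease -> associated protein
target_inv = [0.5, -0.5]   # protein -> drug that targets it
drugs = {"drug_a": [2.0, 1.0], "drug_b": [0.0, 0.0]}

# Query: drugs D with target(D, P), assoc(P, X), assoc(P, Y).
proteins_x = translate(disease_x, assoc_inv)
proteins_y = translate(disease_y, assoc_inv)
shared_proteins = intersect([proteins_x, proteins_y])
query = translate(shared_proteins, target_inv)

# Rank candidate drugs by closeness to the query embedding.
ranking = sorted(drugs, key=lambda name: squared_distance(drugs[name], query))
```

Each operation touches only a fixed-size vector, so the cost grows with the number of steps in the query rather than with the number of proteins that would have to be enumerated.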


Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution

Neural Information Processing Systems

The high-dimensional convolution is widely used in various disciplines but has a serious performance problem due to its high computational complexity. Over the decades, people took a handmade approach to design fast algorithms for the Gaussian convolution. Recently, requirements for various non-Gaussian convolutions have emerged and are continuously getting higher. However, the handmade acceleration approach is no longer feasible for so many different convolutions since it is a time-consuming and painstaking job. Instead, we propose an Acceleration Network (AccNet) which turns the work of designing new fast algorithms into training the AccNet. This is done by (1) interpreting splatting, blurring, and slicing operations as convolutions, and (2) turning these convolutions into $g$CP layers to build AccNet. After training, the activation function $g$ together with AccNet weights automatically define the new splatting, blurring and slicing operations. Experiments demonstrate AccNet is able to design acceleration algorithms for a wide range of convolutions including Gaussian/non-Gaussian convolutions and produce state-of-the-art results.


Mesh-TensorFlow: Deep Learning for Supercomputers

Neural Information Processing Systems

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and to implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations.
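
For readers unfamiliar with batch-splitting, the sketch below shows the core idea on a one-parameter least-squares model: every worker runs the same gradient function on its own shard of the batch, and averaging the per-shard gradients recovers the full-batch gradient when shards are equal in size. It does not use the Mesh-TensorFlow API; the model, data, and function names are hypothetical.

```python
def gradient(w, xs, ys):
    """Gradient of the mean squared error of the model y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def split(items, shards):
    """Split a list into `shards` equal contiguous pieces (length must divide evenly)."""
    size = len(items) // shards
    return [items[i * size:(i + 1) * size] for i in range(shards)]


# Hypothetical batch and starting weight.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 1.0
workers = 2

# Each "worker" runs the same program on a different shard (SPMD style).
shard_grads = [
    gradient(w, shard_x, shard_y)
    for shard_x, shard_y in zip(split(xs, workers), split(ys, workers))
]

# An all-reduce style average combines the per-worker gradients.
averaged = sum(shard_grads) / workers
full = gradient(w, xs, ys)
```

Every worker still holds the whole parameter `w`, which is why this strategy alone cannot train a model whose parameters do not fit in one device's memory.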


Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

Neural Information Processing Systems

Distributed training of deep nets is an important technique to address some of the present-day computing challenges like memory consumption and computational demands. Classical distributed approaches, synchronous or asynchronous, are based on the parameter server architecture, i.e., worker nodes compute gradients which are communicated to the parameter server while updated parameters are returned. Recently, distributed training with AllReduce operations gained popularity as well. While many of those operations seem appealing, little is reported about wall-clock training time improvements. In this paper, we carefully analyze the AllReduce-based setup, propose timing models which include network latency, bandwidth, cluster size and compute time, and demonstrate that a pipelined training with a width of two combines the best of both synchronous and asynchronous training. Specifically, for a setup consisting of a four-node GPU cluster we show wall-clock time training improvements of up to 5.4x compared to conventional approaches.
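
The abstract does not spell out its timing models, so the following is only a generic sketch of how latency, bandwidth, cluster size, and compute time can interact. It assumes the textbook latency-bandwidth estimate for a ring AllReduce and an idealized pipeline in which communication fully overlaps with the next iteration's compute. All numbers and function names are hypothetical and are not taken from the paper.

```python
def ring_allreduce_time(workers, nbytes, latency, bandwidth):
    """Textbook estimate for a ring AllReduce (an assumption, not the paper's model).

    The ring runs 2 * (workers - 1) steps; each step pays one network latency
    and transfers a chunk of nbytes / workers bytes.
    """
    steps = 2 * (workers - 1)
    return steps * (latency + nbytes / (workers * bandwidth))


def iteration_time(compute, comm, pipelined):
    """Per-iteration wall-clock time.

    Sequential training waits for communication after compute; the idealized
    pipeline hides whichever of the two is shorter behind the other.
    """
    return max(compute, comm) if pipelined else compute + comm


# Hypothetical setup: 4 workers, 8 MB of gradients, 1 ms latency, 1 GB/s links.
comm = ring_allreduce_time(workers=4, nbytes=8e6, latency=1e-3, bandwidth=1e9)
sequential = iteration_time(compute=0.02, comm=comm, pipelined=False)
pipelined = iteration_time(compute=0.02, comm=comm, pipelined=True)
```

Under these assumptions the pipelined iteration is bounded by the slower of compute and communication, which is the intuition behind overlapping the two; real systems will see less than this ideal overlap.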


'100 Video Calls Per Day': Models Are Applying to Be the Face of AI Scams

WIRED

Dozens of Telegram channels reviewed by WIRED include job listings for "AI face models." The (mostly) women who land these gigs are likely being used to dupe victims out of their money. "I can speak fluent English, I can speak good Chinese, I also speak Russian and Turkish," the glamorous, 24-year-old Uzbekistani woman explains in a selfie-style video made for recruiters. Angel had arrived in the Cambodian city of Sihanoukville that day, she said, and was ready to start work immediately. Those impressive language skills, however, have likely been put to use as part of elaborate "pig-butchering" scams targeting Americans.


iGarden M1 Pro Max 100 Review: A Sports Car for Your Pool

WIRED

Managed a perfect cleaning record, if you leave it in the water long enough. Basket is quite difficult to clean. Must be retrieved with a pole when finished. In an aquatic world dominated by robotic pool cleaners that mostly look identical, a company called iGarden has been that breath of fresh air you take after reaching the water's surface. The company's pool cleaners have always featured designs that feel inspired more by high-end automobiles than underwater janitors, and with the new M1 series, its gear is sportier than ever.