Exchanging gradients is a widely used method in modern multi-node machine learning systems (e.g., distributed training, collaborative learning). For a long time, gradients were believed to be safe to share: that is, the training data would not be leaked by gradient exchange. However, we show that it is possible to obtain the private training data from the publicly shared gradients. We name this leakage Deep Leakage from Gradients and empirically validate its effectiveness on both computer vision and natural language processing tasks. Experimental results show that our attack is much stronger than previous approaches: the recovery is pixel-wise accurate for images and token-wise matching for texts. We hope to raise awareness and prompt a rethinking of the safety of sharing gradients. Finally, we discuss several possible strategies to prevent such deep leakage; the most effective defense method is gradient pruning.
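The attack described above recovers data by optimizing dummy inputs until their gradients match the shared ones. As a simpler illustration of *why* gradients carry input information at all, here is a sketch for a single fully connected layer z = Wx + b: since dL/dW = (dL/dz)·xᵀ and dL/db = dL/dz, the private input x can be read off analytically by dividing one gradient by the other. This is a deliberate simplification (a hypothetical single-layer setting with toy numbers), not the paper's optimization-based method:

```python
def leak_input_from_fc_gradients(grad_W, grad_b):
    """Recover the input x of a fully connected layer z = W x + b
    from its gradients: dL/dW = (dL/dz) x^T and dL/db = dL/dz,
    so x[j] = grad_W[i][j] / grad_b[i] for any row i with grad_b[i] != 0."""
    for i, gb in enumerate(grad_b):
        if abs(gb) > 1e-12:
            return [gw / gb for gw in grad_W[i]]
    raise ValueError("all bias gradients are zero; cannot recover input this way")

# Toy backward pass for z = W x + b (hypothetical numbers for illustration):
x = [0.3, -1.5, 2.0]                              # victim's private input
dz = [1.0, 2.0]                                   # some upstream gradient dL/dz
grad_W = [[d * xj for xj in x] for d in dz]       # dL/dW = (dL/dz) x^T
grad_b = dz[:]                                    # dL/db = dL/dz

assert leak_input_from_fc_gradients(grad_W, grad_b) == x  # exact recovery
```

In deeper networks this closed form is no longer available, which is why the iterative gradient-matching attack is needed, but the single-layer case already shows that gradients are far from anonymized.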
Could Earth pick up signals from a hypothetical 'clone' of Earth, 12 lightyears away? Radar beams are potentially detectable by current radio facilities such as the Arecibo Observatory and FAST, and by the planned Square Kilometre Array, at distances of tens to hundreds of thousands of lightyears. However, they are transient and only very rarely aimed at any star, because the radars are tracking objects moving across the sky in the foreground. So unless an astronomer on an Earth duplicate at Tau Ceti were aiming their equivalent of the Deep Space Network directly at the solar system, we would be very unlikely to pick up those radar beams. Television broadcasts, meanwhile, are aimed towards the local horizon, because that's where most of the customers are.
In my last blog, I spoke about the magnitude of damage P2P fraud can cause to an organization and the need to address the problem with a different, more data-driven mindset. As is conceivable, some of these incidents are due to human error and others to an intent to deceive. While traditional approaches help uncover some gaps, they suffer from inherent shortcomings, as discussed in one of our earlier posts: high false-positive rates, an inability to uncover newer anomalies or recognize patterns in large datasets, and a failure to learn from feedback. We have seen significant upside potential through the use of data analytics and machine learning in fraud detection.
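As a minimal, hypothetical illustration of the kind of statistical anomaly flagging such approaches build on (not the actual system discussed in the blog), the sketch below flags unusual invoice amounts using a modified z-score based on the median absolute deviation, which a single large outlier cannot inflate the way it inflates a plain standard deviation; the invoice data here is made up:

```python
from statistics import median

def flag_anomalies(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds `threshold`.

    Uses the median absolute deviation (MAD) instead of the standard
    deviation, so one extreme value cannot mask itself by inflating
    the spread estimate -- a weakness of naive static-limit rules.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []
    return [i for i, a in enumerate(amounts)
            if 0.6745 * abs(a - med) / mad > threshold]

# Hypothetical invoice amounts: one payment is wildly out of pattern.
invoices = [102.0, 98.5, 101.2, 99.9, 100.4, 5000.0, 97.8, 103.1]
print(flag_anomalies(invoices))  # flags index 5, the 5000.0 payment
```

A production pipeline would of course segment by vendor, currency, and period and combine many such signals with learned models, but even this baseline shows how data-driven checks differ from fixed rule thresholds.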