Goto

Collaborating Authors

 Asia



The Download: supercharged scams and studying AI healthcare

MIT Technology Review

Plus: DeepSeek has unveiled its long-awaited new AI model. When ChatGPT was released in late 2022, it showed how easily generative AI could create human-like text. This quickly caught the eye of cybercriminals, who began using LLMs to compose malicious emails. Since then, they've adopted AI for everything from turbocharged phishing and hyperrealistic deepfakes to automated vulnerability scans. Many organizations are now struggling to cope with the sheer volume of cyberattacks. AI is making them faster, cheaper, and easier to carry out, a problem set to worsen as more cybercriminals adopt these tools--and their capabilities improve.


DeepSeek promises its new AI model has 'world-class' reasoning

Engadget

DeepSeek promises its new AI model has'world-class' reasoning The new models give users access to a'cost effective 1 million context length.' DeepSeek has released its latest AI models, the V4 Pro and Flash versions, a bit over a year after it went viral and became the top rated free app on Apple's App Store in the US. "Welcome to the era of cost-effective 1 million context length," DeepSeek said in its announcement . Context length is what you call the maximum number of tokens that an AI model can remember, so the bigger it is, the more coherent and consistent an AI is when it comes to extended conversations. OpenAI's recently announced GPT 5.5 has a context window ranging from 400,000 to 1 million, for instance.


Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision

Neural Information Processing Systems

In Vision-and-Language Navigation (VLN) task, an agent is asked to navigate inside 3D indoor environments following given instructions. Cross-modal alignment is one of the most critical challenges in VLN because the predicted trajectory needs to match the given instruction accurately. In this paper, we address the cross-modal alignment challenge from the perspective of fine-grain. Firstly, to alleviate weak cross-modal alignment supervision from coarse-grained data, we introduce a human-annotated fine-grained VLN dataset, namely Landmark-RxR. Secondly, to further enhance local cross-modal alignment under fine-grained supervision, we investigate the focal-oriented rewards with soft and hard forms, by focusing on the critical points sampled from fine-grained Landmark-RxR. Moreover, to fully evaluate the navigation process, we also propose a re-initialization mechanism that makes metrics insensitive to difficult points, which can cause the agent to deviate from the correct trajectories. Experimental results show that our agent has superior navigation performance on Landmark-RxR, en-RxR and R2R.


Efficient and Effective Augmentation Strategy for Adversarial Training

Neural Information Processing Systems

Adversarial training of Deep Neural Networks is known to be significantly more data-hungry when compared to standard training. Furthermore, complex data augmentations such as AutoAugment, which have led to substantial gains in standard training of image classifiers, have not been successful with Adversarial Training. We first explain this contrasting behavior by viewing augmentation during training as a problem of domain generalization, and further propose Diverse Augmentationbased Joint Adversarial Training (DAJAT) to use data augmentations effectively in adversarial training. We aim to handle the conflicting goals of enhancing the diversity of the training dataset and training with data that is close to the test distribution by using a combination of simple and complex augmentations with separate batch normalization layers during training. We further utilize the popular JensenShannon divergence loss to encourage the joint learning of the diverse augmentations, thereby allowing simple augmentations to guide the learning of complex ones. Lastly, to improve the computational efficiency of the proposed method, we propose and utilize a two-step defense, Ascending Constraint Adversarial Training (ACAT), that uses an increasing epsilon schedule and weight-space smoothing to prevent gradient masking. The proposed method DAJAT achieves substantially better robustness-accuracy trade-off when compared to existing methods on the RobustBench Leaderboard on ResNet-18 and WideResNet-34-10. The code for implementing DAJAT is available here: https://github.com/val-iisc/DAJAT.