granular control
Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization
Direct Preference Optimization (DPO) has emerged as a simple and effective method for aligning large language models. However, its reliance on a fixed temperature parameter leads to suboptimal training on diverse preference data, causing overfitting on easy examples and under-learning from informative ones. Recent methods have emerged to counter this. While IPO addresses general overfitting, its uniform regularization can be overly conservative. The more targeted approach of $β$-DPO suffers from its own limitations: its batch-level adaptation applies a single, compromised temperature to mixed-margin pairs, its linear update rule can produce unstable negative $β$ values, and its filtering mechanism discards potentially useful training signals. In this work, we introduce Margin-Adaptive Direct Preference Optimization (MADPO), a method that provides a stable, data-preserving, and instance-level solution. MADPO employs a practical two-step approach: it first trains a reward model to estimate preference margins and then uses these margins to apply a continuous, adaptive weight to the DPO loss for each individual training sample. This re-weighting scheme creates an effective target margin that is amplified for hard pairs and dampened for easy pairs, allowing for granular control over the learning signal. We provide a comprehensive theoretical analysis, proving that MADPO has a well-behaved optimization landscape and is robust to reward model estimation errors. We validate our theory with experiments on a sentiment generation task, where MADPO consistently and significantly outperforms strong baselines across datasets of varying quality. It achieves performance gains of up to +33.3\% on High Quality data and +10.5\% on Low Quality data over the next-best method. Our results establish MADPO as a more robust and principled approach to preference alignment.
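The abstract above does not give MADPO's weighting formula, so the following is only a minimal sketch of the general idea it describes: a per-sample, continuous weight derived from a reward-model margin and multiplied into the standard DPO loss. The function `margin_weight`, its parameters `tau` and `scale`, and the sigmoid-shaped weight are all hypothetical choices made for illustration, not the paper's method. The weight is bounded in (0, 2) and always positive, which loosely mirrors the stated goal of avoiding unstable negative temperature values.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_margin, beta=0.1):
    # Standard DPO loss for one pair. policy_margin is the difference of
    # policy-vs-reference log-ratios between chosen and rejected responses.
    return -log_sigmoid(beta * policy_margin)


def margin_weight(reward_margin, tau=1.0, scale=1.0):
    # HYPOTHETICAL weight (an assumption, not the paper's formula):
    # above 1 for small reward margins (hard pairs), below 1 for large
    # margins (easy pairs), exactly 1 when reward_margin == tau.
    return 2.0 * sigmoid(-(reward_margin - tau) / scale)


def weighted_dpo_loss(policy_margins, reward_margins, beta=0.1):
    # Mean of per-sample DPO losses, each scaled by its own weight.
    total = 0.0
    for h, m in zip(policy_margins, reward_margins):
        total += margin_weight(m) * dpo_loss(h, beta)
    return total / len(policy_margins)


if __name__ == "__main__":
    # One hard pair (reward margin 0.1) and one easy pair (reward margin 3.0).
    print(margin_weight(0.1), margin_weight(3.0))
    print(weighted_dpo_loss([0.5, 0.5], [0.1, 3.0]))
```

Because each sample gets its own weight, a batch that mixes easy and hard pairs is not forced onto a single shared temperature, and no pair is filtered out.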
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
OpenAI promises more 'granular control' to copyright owners after Sora 2 generates videos of popular characters
OpenAI's Sora 2 app allows users to make AI-generated videos based on a text prompt. Company behind the AI video app says it will work with rights holders to 'block characters from Sora at their request'. Mon 6 Oct 2025 00.10 EDT. Last modified on Mon 6 Oct 2025 00.11 EDT. Sora 2, a video generator powered by artificial intelligence, was launched last week on an invite-only basis. The app allows users to generate short videos based on a text prompt. Varun Shetty, OpenAI's head of media partnerships, said: "We'll work with rights holders to block characters from Sora at their request and respond to takedown requests."
- Oceania > Australia (0.19)
- North America > United States (0.17)
- Europe > Ukraine (0.07)
- Government > Regional Government (0.74)
- Leisure & Entertainment > Sports (0.72)
- Law > Intellectual Property & Technology Law (0.52)
- Media > News (0.49)
"Do it my way!": Impact of Customizations on Trust perceptions in Human-Robot Collaboration
Kapoor, Parv, Chu, Simon, Chen, Angela
Trust has been shown to be a key factor in effective human-robot collaboration. In the context of assistive robotics, the effect of trust factors on human experience is further pronounced. Personalization of assistive robots is an orthogonal factor positively correlated with robot adoption and user perceptions. In this work, we investigate the relationship between these factors through a within-subjects study (N=17). We provide different levels of customization possibilities over baseline autonomous robot behavior and investigate their impact on trust. Our findings indicate that increased levels of customization were associated with higher trust and comfort perceptions. The assistive robot design process can benefit significantly from our insights for designing trustworthy and customized robots.
- Research Report > Experimental Study (0.70)
- Research Report > New Finding (0.49)
Samsung Galaxy Note 8: 10 Killer tips and tricks
Samsung has come back from last year's Note 7 disaster with the Galaxy Note 8, a phone so jam-packed with features, you might still be learning things about it when the time comes for your next upgrade. The Note 8 includes all the cool stuff Samsung bakes into all its phones, and then adds all the S Pen stuff. So why not accelerate your learning process? Here are 10 tips that will guide you to the very best features the Note 8 offers. The Galaxy Note 8 has one of the best displays available on a smartphone, and you can make a few adjustments to tweak it just right for your eyes.
- Information Technology > Communications > Mobile (1.00)
- Information Technology > Artificial Intelligence (0.91)
Drone Off! GoPro Karma and DJI Mavic Pro Fly Head-to-Head
Announced within a week of each other, the GoPro Karma and the DJI Mavic Pro are the season's (the year's?) hottest drones. They both fold up, they both shoot stabilized 4K video, and they'll both scare the hell out of your cat. Both have things that are absolutely fantastic, and both have things that are completely infuriating. If I had to recommend one it would be the GoPro Karma, but certainly not without reservations. The Mavic is wonderfully tiny--small enough to literally sit on the palm of your hand.