 Media


Beyond the Battlefield: Framing Analysis of Media Coverage in Conflict Reporting

arXiv.org Artificial Intelligence

Framing used by news media, especially in times of conflict, can have a substantial impact on readers' opinions, potentially aggravating the conflict itself. Current studies of conflict framing offer limited insight, either because they are qualitative or because they examine only surface-level generic frames. In this work, we identify indicators of war and peace journalism, as outlined by prior work in conflict studies, in a corpus of news articles reporting on the Israel-Palestine war. Our analysis is computational, combining frame semantics and large language models to identify both communicative framing and its connection to linguistic framing. The analysis reveals a stronger focus on war-based reporting than on peace-based reporting. We also show substantial differences across US, UK, and Middle Eastern news outlets in how they frame who the assailants and victims of the conflict are, surfacing biases within the media.


Improving the performance of optical inverse design of multilayer thin films using CNN-LSTM tandem neural networks

arXiv.org Artificial Intelligence

The optical properties of thin films are greatly influenced by the thickness of each layer. Accurately predicting these thicknesses and their corresponding optical properties is important in the optical inverse design of thin films. However, traditional inverse design methods usually demand extensive numerical simulations and optimization procedures, which are time-consuming. In this paper, we utilize deep learning for the inverse design of the transmission spectra of SiO2/TiO2 multilayer thin films. We implement a tandem neural network (TNN), which can solve the one-to-many mapping problem that greatly degrades the performance of deep-learning-based inverse designs. In general, a TNN is implemented as a back-to-back connection of an inverse neural network and a pre-trained forward neural network, both based on multilayer perceptron (MLP) algorithms. In this paper, we propose to use not only MLP, but also convolutional neural network (CNN) or long short-term memory (LSTM) algorithms in the configuration of the TNN. We show that an LSTM-LSTM-based TNN yields the highest accuracy but takes the longest training time among nine configurations of TNNs. We also find that a CNN-LSTM-based TNN offers the best balance of accuracy and speed because it integrates the strengths of the CNN and LSTM algorithms.


Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models

arXiv.org Artificial Intelligence

Advances in generative artificial intelligence have altered multimedia creation, allowing for automatic cinematic video synthesis from text inputs. This work describes a method for creating 60-second cinematic movies incorporating Stable Diffusion for high-fidelity image synthesis, GPT-2 for narrative structuring, and a hybrid audio pipeline using gTTS and YouTube-sourced music. It uses a five-scene framework, which is augmented by linear frame interpolation, cinematic post-processing (e.g., sharpening), and audio-video synchronization to provide professional-quality results. It was created in a GPU-accelerated Google Colab environment using Python 3.11. It has a dual-mode Gradio interface (Simple and Advanced), which supports resolutions of up to 1024x768 and frame rates of 15-30 FPS. Optimizations such as CUDA memory management and error handling ensure reliability. The experiments demonstrate outstanding visual quality, narrative coherence, and efficiency, furthering text-to-video synthesis for creative, educational, and industrial applications.
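The linear frame interpolation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the names `lerp_frames` and `interpolate`, and the representation of a frame as nested lists of grayscale intensities, are assumptions made only for illustration.

```python
from typing import List

# Assumption: a frame is a grid of grayscale pixel intensities (rows of floats).
Frame = List[List[float]]


def lerp_frames(a: Frame, b: Frame, t: float) -> Frame:
    """Blend two equally sized frames pixel by pixel: (1 - t) * a + t * b."""
    return [
        [(1.0 - t) * pa + t * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def interpolate(a: Frame, b: Frame, n_between: int) -> List[Frame]:
    """Return frame a, then n_between evenly spaced blends, then frame b."""
    steps = n_between + 1
    return [lerp_frames(a, b, i / steps) for i in range(steps + 1)]


if __name__ == "__main__":
    first = [[0.0, 10.0]]
    last = [[10.0, 20.0]]
    for frame in interpolate(first, last, n_between=1):
        print(frame)
```

Inserting blended frames between generated key images raises the effective frame rate without extra calls to the image model, at the cost of cross-fade artifacts when the two images differ strongly.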


Q-Ponder: A Unified Training Pipeline for Reasoning-based Visual Quality Assessment

arXiv.org Artificial Intelligence

Recent studies demonstrate that multimodal large language models (MLLMs) can proficiently evaluate visual quality through interpretable assessments. However, existing approaches typically treat quality scoring and reasoning descriptions as separate tasks with disjoint optimization objectives, leading to a trade-off: models adept at quality reasoning descriptions struggle with precise score regression, while score-focused models lack interpretability. This limitation hinders the full potential of MLLMs in visual quality assessment, where accuracy and interpretability should be mutually reinforcing. To address this, we propose a unified two-stage training framework comprising a cold-start stage and a reinforcement learning-based fine-tuning stage. Specifically, in the first stage, we distill high-quality data from a teacher model through expert-designed prompts, initializing reasoning capabilities via cross-entropy loss supervision. In the second stage, we introduce a novel reward with Group Relative Policy Optimization (GRPO) to jointly optimize scoring accuracy and reasoning consistency. We designate the models derived from these two stages as Q-Ponder-CI and Q-Ponder. Extensive experiments show that Q-Ponder achieves state-of-the-art (SOTA) performance on quality score regression benchmarks, delivering up to 6.5% higher SRCC on cross-domain datasets. Furthermore, Q-Ponder significantly outperforms description-based SOTA models, including its teacher model Qwen-2.5-VL-72B, particularly in description accuracy and reasonableness, demonstrating the generalization potential over diverse tasks.
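The group-relative idea behind GRPO can be sketched as follows: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its group. The abstract does not specify the paper's reward, so the reward values below are hypothetical, and the function name `group_relative_advantages`, the use of the population standard deviation, and the `eps` term are assumptions.

```python
from statistics import fmean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its sampling group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    The eps term (an assumption here) avoids division by zero when all
    rewards in the group are identical.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population std; implementations may differ
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four sampled responses to one image.
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

Because advantages are computed relative to the group, no separate value network is needed to provide a baseline, and responses scoring above the group average are reinforced while those below it are discouraged.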


Netflix users outraged over new 'diabolical' update to app: 'Everyone needs to cancel'

Daily Mail - Science & tech

Netflix users around the world are lashing out over a new update that many say has ruined their viewing experience. The streaming giant, which boasts over 300 million subscribers, has rolled out a redesigned user interface (UI) to deliver better recommendations and a more personalized experience. However, the update has triggered a wave of frustration, with some subscribers calling it 'diabolical.' 'Everyone needs to cancel,' one user posted on X. Users are now presented with enlarged title cards, the rectangular graphics that preview shows and movies, which they said take up more screen space and reduce the number of titles visible at once. The 'clunky' title boxes have also taken the place of key features that users said helped them find movies and shows easily.


How AI Is Being Used to Spread Misinformation--and Counter It--During the L.A. Protests

TIME - Tech

Here's how AI has been used during the L.A. protests. Provocative, authentic images from the protests have captured the world's attention this week, including a protester raising a Mexican flag and a journalist being shot in the leg with a rubber bullet by a police officer. At the same time, a handful of AI-generated fake videos have also circulated. Over the past couple of years, tools for creating these videos have improved rapidly, allowing users to create convincing deepfakes within minutes. Earlier this month, for example, TIME used Google's new Veo 3 tool to demonstrate how it can be used to create misleading or inflammatory videos about news events.



Dua Lipa and Sir Elton John's bid to force government to change tack on AI fails

BBC News

"So this is good news for NHS workers and the police who will be freed from over a million hours of time spent doing admin, bereaved parents who will be supported to get the answers they deserve, and people who will be kept safer online thanks to new offences for deepfake abuse," DSIT said. But even though the Lords have decided they had made their point on AI, the argument has not gone away. Those who fought the battle have not changed their minds. Baroness Kidron, a film maker who led the charge for the amendment, told me the passing of the bill was "a pyrrhic victory at best" for the government, meaning it would lose more than it gains. That cost, she argues, is the giving away of UK assets, in the form of creative content, to largely US-based AI developers.


AI Agents Are Too Cheap for Our Own Good

WIRED

In 2007, Luke Arrigoni, an AI entrepreneur, earned $63,000 at his first job as a junior software developer. Today, he says AI tools that write better code than he did back then cost just $120 annually. The numbers don't sit right with him. Arrigoni, who runs Loti AI, a company that helps Hollywood stars find unauthorized deepfakes, worries that underpriced AI tools encourage companies to eliminate entry-level roles. He wants to flip the incentive structure so people's careers don't end before they begin.


EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits

arXiv.org Artificial Intelligence

Text-guided image editing, fueled by recent advancements in generative AI, is becoming increasingly widespread. This trend highlights the need for a comprehensive framework to verify text-guided edits and assess their quality. To address this need, we introduce EditInspector, a novel benchmark for evaluation of text-guided image edits, based on human annotations collected using an extensive template for edit verification. We leverage EditInspector to evaluate the performance of state-of-the-art (SoTA) vision and language models in assessing edits across various dimensions, including accuracy, artifact detection, visual quality, seamless integration with the image scene, adherence to common sense, and the ability to describe edit-induced changes. Our findings indicate that current models struggle to evaluate edits comprehensively and frequently hallucinate when describing the changes. To address these challenges, we propose two novel methods that outperform SoTA models in both artifact detection and difference caption generation.