 Marketing


Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards

Neural Information Processing Systems

Online advertising platforms use automated auctions to connect advertisers with potential customers, requiring effective bidding strategies to maximize profits. Accurate ad impact estimation requires considering three key factors: delayed and long-term effects, cumulative ad impacts such as reinforcement or fatigue, and customer heterogeneity. However, these effects are often not jointly addressed in previous studies. To capture these factors, we model ad bidding as a Contextual Markov Decision Process (CMDP) with delayed Poisson rewards. For efficient estimation, we propose a two-stage maximum likelihood estimator combined with data-splitting strategies, ensuring controlled estimation error based on the first-stage estimator's (in)accuracy. Building on this, we design a reinforcement learning algorithm to derive efficient personalized bidding strategies. This approach achieves a near-optimal regret bound of $\tilde{\mathcal{O}}(dH^2\sqrt{T})$, where $d$ is the contextual dimension, $H$ is the number of rounds, and $T$ is the number of customers. Our theoretical findings are validated by simulation experiments.


CHASM: Unveiling Covert Advertisements on Chinese Social Media

Neural Information Processing Systems

Current benchmarks for evaluating large language models (LLMs) in social media moderation completely overlook a serious threat: covert advertisements, which disguise themselves as regular posts to deceive and mislead consumers into making purchases, leading to significant ethical and legal concerns. In this paper, we present CHASM, a first-of-its-kind dataset designed to evaluate the capability of Multimodal Large Language Models (MLLMs) in detecting covert advertisements on social media. CHASM is a high-quality, anonymized, manually curated dataset consisting of 4,992 instances, based on real-world scenarios from the Chinese social media platform Rednote. The dataset was collected and annotated under strict privacy protection and quality control protocols. It includes many product experience sharing posts that closely resemble covert advertisements, making the dataset particularly challenging.


Can ChatGPT Be a Criminal Accomplice?

Slate

With swiftly circumvented filters and no discernment, LLMs deliver "expertise" even when they shouldn't.


OpenAI to introduce ads to ChatGPT in Japan

The Japan Times

The ads will appear on the free version and the Go plan, priced at ¥1,400 per month, but will not be shown to users under 18 or to subscribers of higher-priced tiers.


Appendix

Neural Information Processing Systems

The DeceptionBench is designed as a research benchmark to systematically study deception behaviors in LLMs, fostering a deeper understanding of their decision-making processes in real-world scenarios. Our primary intent is to provide a standardized, transparent tool for the research community to evaluate and improve LLMs' ethical alignment, not to enable or encourage deceptive practices. To prevent potential misuse by malicious actors, we commit to publicly releasing all evaluation data under an open license. This transparency ensures that DeceptionBench's methodology and outcomes are subject to scrutiny, replication, and improvement by the research community, reducing the risk of hidden exploitation. By prioritizing openness, we aim to advance responsible AI development while safeguarding against misuse in harmful contexts. The field of Large Language Models (LLMs) has undergone remarkable evolution in recent years, reshaping the landscape of natural language processing.


Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text

Neural Information Processing Systems

The increasing capabilities of Large Language Models (LLMs) have raised concerns about their misuse in AI-generated plagiarism and social engineering. While various AI-generated text detectors have been proposed to mitigate these risks, many remain vulnerable to simple evasion techniques such as paraphrasing. However, recent detectors have shown greater robustness against such basic attacks. In this work, we introduce Adversarial Paraphrasing, a training-free attack framework that universally humanizes any AI-generated text to evade detection more effectively. Our approach leverages an off-the-shelf instruction-following LLM to paraphrase AI-generated content under the guidance of an AI text detector, producing adversarial examples that are specifically optimized to bypass detection. Extensive experiments show that our attack is both broadly effective and highly transferable across several detection systems. For instance, compared to simple paraphrasing attack--which, ironically, increases the true positive at 1% false positive (T@1%F) by 8.57% on RADAR and 15.03% on Fast-DetectGPT--adversarial paraphrasing, guided by OpenAI-RoBERTa-Large, reduces T@1%F by 64.49% on RADAR and a striking 98.96% on Fast-DetectGPT. Across a diverse set of detectors--including neural network-based, watermark-based, and zero-shot approaches--our attack achieves an average T@1%F reduction of 87.88% under the guidance of OpenAI-RoBERTa-Large. We also analyze the tradeoff between text quality and attack success to find that our method can significantly reduce detection rates, with mostly a slight degradation in text quality. Our adversarial setup highlights the need for more robust and resilient detection strategies in the light of increasingly sophisticated evasion techniques.


Elon Musk's Trillion-Dollar Week Turned Out to Be Something Much Darker

Slate

His fortunes reached new heights while his online behavior reached new lows.


Learning-Augmented Online Bipartite Fractional Matching

Neural Information Processing Systems

Online bipartite matching is a fundamental problem in online optimization, extensively studied both in its integral and fractional forms due to its theoretical significance and practical applications, such as online advertising and resource allocation. Motivated by recent progress in learning-augmented algorithms, we study online bipartite fractional matching when the algorithm is given advice in the form of a suggested matching in each iteration. We develop algorithms for both the vertex-weighted and unweighted variants that provably dominate the naïve "coin flip" strategy of randomly choosing between the advice-following and advice-free algorithms. Moreover, our algorithm for the vertex-weighted setting extends to the AdWords problem under the small bids assumption, yielding a significant improvement over the seminal work of Mahdian, Nazerzadeh, and Saberi (EC 2007, TALG 2012). Complementing our positive results, we establish a hardness bound on the robustness-consistency tradeoff that is attainable by any algorithm.


You don't need to worry about recursive-self-improving AI – yet

New Scientist

One of the world's leading artificial intelligence companies has implored the industry to pause development on AI, because the latest models could be reaching a tipping point where they become capable of redesigning themselves, growing ever more powerful and finally escaping our control. At least, that's what the headlines said. In truth, Anthropic's co-founder Jack Clark and the boss of spin-out think-tank The Anthropic Institute, Marina Favaro, have published a long blog post bigging up the capabilities of their Claude model, shortly before the company floats on the stock exchange in an initial public offering (IPO) for a rumoured $1 trillion. Let's, for a moment, ignore the vast financial elephant in the room and look at the technological claims. An AI that becomes capable of designing a more powerful version of itself, which is in turn able to pull off the same feat, is an obvious gamechanger, but it is also not a new idea.