 Media


Control Variates for Slate Off-Policy Evaluation

Neural Information Processing Systems

We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates. The problem is common to recommender systems and user-interface optimization, and it is particularly challenging because of the combinatorially-sized action space. Swaminathan et al. (2017) have proposed the pseudoinverse (PI) estimator under the assumption that the conditional mean rewards are additive in actions. Using control variates, we consider a large class of unbiased estimators that includes as specific cases the PI estimator and (asymptotically) its self-normalized variant. By optimizing over this class, we obtain new estimators with risk improvement guarantees over both the PI and the self-normalized PI estimators.
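The control-variate idea in this abstract can be illustrated in a much simpler setting than slates. The sketch below is not the paper's PI estimator: it assumes a single-action bandit with ordinary importance weights, and the function name `ips_with_control_variate` is hypothetical. It relies only on the fact that an importance weight has expectation 1 under the logging policy, so `w - 1` is a zero-mean control variate. With a fixed coefficient the adjusted estimate stays unbiased; here the coefficient is estimated from the same sample, which is only asymptotically so.

```python
def ips_with_control_variate(weights, rewards):
    """Return (plain IPS estimate, control-variate-adjusted estimate).

    weights: importance weights pi_target(a|x) / pi_logging(a|x), one per sample.
    rewards: observed rewards, one per sample.
    Illustrative sketch for single actions, not the slate PI estimator.
    """
    n = len(weights)
    weighted = [w * r for w, r in zip(weights, rewards)]
    mean_wr = sum(weighted) / n
    mean_w = sum(weights) / n

    # Variance-minimizing coefficient: Cov(w * r, w) / Var(w),
    # estimated from the sample (an assumption of this sketch).
    cov = sum((a - mean_wr) * (w - mean_w) for a, w in zip(weighted, weights)) / n
    var = sum((w - mean_w) ** 2 for w in weights) / n
    beta = cov / var if var > 0 else 0.0

    # Subtract the zero-mean term beta * (mean_w - 1) from the IPS estimate.
    adjusted = mean_wr - beta * (mean_w - 1.0)
    return mean_wr, adjusted
```

When every reward equals 1, the plain estimate inherits all the noise of the weights, while the adjusted estimate recovers the true value of 1. Setting the coefficient to the plain estimate itself gives, to first order, the self-normalized estimator, which is in the spirit of the asymptotic inclusion the abstract mentions.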


VigDet: Knowledge Informed Neural Temporal Point Process for Coordination Detection on Social Media

Neural Information Processing Systems

Recent years have witnessed an increasing use of coordinated accounts on social media, operated by misinformation campaigns to influence public opinion and manipulate social outcomes. Consequently, there is an urgent need to develop an effective methodology for coordinated group detection to combat misinformation on social media. However, the sparsity of account activities on social media limits the performance of existing deep-learning-based coordination detectors, as they cannot exploit useful prior knowledge. Conversely, detectors that do incorporate prior knowledge suffer from limited expressive power and poor performance. Therefore, in this paper we propose a coordination detection framework that combines a neural temporal point process with prior knowledge such as temporal logic or pre-defined filtering functions. Specifically, when modeling the observed data from social media with a neural temporal point process, we jointly learn a Gibbs distribution of group assignment based on how consistent an assignment is with (1) the account embedding space and (2) the prior knowledge.


Trump DOJ jumps into Musk xAI court battle as diversity fight heats up

FOX News

The DOJ joined Elon Musk's xAI in suing Colorado, alleging a state AI regulation law violates the First and Fourteenth amendments by forcing developers to adopt DEI ideology.


Appendix of Learning to Break the Loop Analyzing and Mitigating Repetitions for Neural Text Generation

Neural Information Processing Systems

Previous work [2, 1] has observed that standard training and greedy decoding usually cause models to generate consecutive repetitive texts. These consecutive repetitive texts are redundant and do not convey new information, which human language avoids. There are three types of consecutive repetition: word-level, phrase-level and sentence-level. Phrase-level means that a phrase consisting of several words is repeated consecutively. Sentence-level means that a sentence, which in our paper refers to a sequence split by '.', '!' or '?', is repeated consecutively. We calculate the ratio of consecutive repetition in a sequence x as follows.
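The paper's own formula is not reproduced in this excerpt. As a rough illustration only, the sketch below shows one possible way to measure consecutive repetition over a token list. The function name `consecutive_repetition_ratio`, the window parameter `n`, and the definition itself (the share of positions where an n-gram is immediately followed by an identical n-gram) are all assumptions, not the authors' metric.

```python
def consecutive_repetition_ratio(tokens, n=1):
    """Share of positions where the n-gram starting there is immediately
    followed by an identical n-gram.

    n=1 approximates word-level repetition; larger n approximates
    phrase-level repetition. Hypothetical definition for illustration.
    """
    if len(tokens) < 2 * n:
        return 0.0
    # Number of positions where two adjacent n-grams both fit in the sequence.
    positions = len(tokens) - 2 * n + 1
    repeats = sum(
        tokens[i:i + n] == tokens[i + n:i + 2 * n] for i in range(positions)
    )
    return repeats / positions
```

For example, on the tokens `a a a b` with `n=1`, two of the three adjacent pairs are repeats. Sentence-level repetition could be handled the same way by passing a list of sentences instead of words.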


Fox News AI Newsletter: Your next Dairy Queen order could be taken by AI

FOX News

AI newsletter covers Dairy Queen's automated drive-thru backlash, Meta's 8,000 employee layoffs amid its AI push, and voter concerns about privacy and paychecks.


'Chemical-spraying' drones reportedly stolen from New Jersey facility sparks fears of 'nightmare scenario'

Daily Mail - Science & tech

An alarm has erupted after 15 powerful agricultural spray drones were stolen in a suspected coordinated heist in New Jersey last month. A report from The High Side claimed the FBI is investigating the theft amid fears the machines could be used to disperse dangerous materials.


Uniform Sampling over Episode Difficulty

Neural Information Processing Systems

Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data. Despite its success, episodic training remains largely understudied, prompting us to ask the question: what is the best way to sample episodes? In this paper, we first propose a method to approximate episode sampling distributions based on their difficulty. Building on this method, we perform an extensive analysis and find that sampling uniformly over episode difficulty outperforms other sampling schemes, including curriculum and easy-/hard-mining. As the proposed sampling method is algorithm agnostic, we can leverage these insights to improve few-shot learning accuracies across many episodic training algorithms. We demonstrate the efficacy of our method across popular few-shot learning datasets, algorithms, network architectures, and protocols.



MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining

Neural Information Processing Systems

Although BERT-style encoder models are heavily used in NLP research, many researchers do not pretrain their own BERTs from scratch due to the high cost of training. In the past half-decade since BERT first rose to prominence, many advances have been made with other transformer architectures and training configurations that have yet to be systematically incorporated into BERT. Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining. This efficient architecture incorporates FlashAttention, Attention with Linear Biases (ALiBi), Gated Linear Units (GLU), a module to dynamically remove padded tokens, and low-precision LayerNorm into the classic transformer encoder block. The training recipe includes a 30% masking ratio for the Masked Language Modeling (MLM) objective, bfloat16 precision, and a vocabulary size optimized for GPU throughput, in addition to best practices from RoBERTa and other encoder models. When pretrained from scratch on the C4 dataset, this base model achieves a downstream average GLUE (dev) score of 79.6 in 1.13 hours on 8 A100 80 GB GPUs at a cost of roughly $20. We plot extensive accuracy vs. pretraining speed Pareto curves and show that MosaicBERT base and large are consistently Pareto optimal when compared to a competitive BERT base and large. This empirical speedup in pretraining enables researchers and engineers to pretrain custom BERT-style models at low cost instead of finetuning existing generic models.


'The View' hosts blast RFK Jr's leadership as Joy Behar says policies are 'trying to kill us'

FOX News
