Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos

Neural Information Processing Systems

Understanding how biological visual systems process information is challenging because of the nonlinear relationship between visual input and neuronal responses. Artificial neural networks allow computational neuroscientists to create predictive models that connect biological and machine vision. Machine learning has benefited tremendously from benchmarks that compare different models on the same task under standardized conditions. However, there was no standardized benchmark to identify state-of-the-art dynamic models of the mouse visual system. To address this gap, we established the SENSORIUM 2023 Benchmark Competition with dynamic input, featuring a new large-scale dataset from the primary visual cortex of ten mice.






3 Questions: Using AI to help Olympic skaters land a quint

AIHub

Why apply AI to figure skating? Skaters can always keep pushing: higher, faster, stronger. OOFSkate is all about helping skaters figure out how to rotate a little faster in their jumps or jump a little higher. The system helps skaters catch things that might pass an eye test but point to high-value areas of opportunity. The artistic side of skating is much harder to evaluate than the technical elements because it is subjective.


NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating Large Language Models in Offensive Security Motivation

Neural Information Processing Systems

For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that needed to be filled?

The dataset was created to evaluate the effectiveness of large language models (LLMs) in solving Capture the Flag (CTF) challenges within the domain of offensive security. There was a specific need to thoroughly assess the capabilities of LLMs in this context, as their potential for handling such tasks had not been systematically evaluated. The goal was to develop a scalable, open-source benchmark database specifically designed for these applications. The dataset includes diverse CTF challenges from popular competitions, with metadata to support LLM testing and adaptive learning, and it addresses a critical gap by providing a comprehensive resource for the systematic evaluation of LLMs' performance in real-world cybersecurity tasks. Together with the accompanying automated framework, it supports the continuous improvement and refinement of LLM-based approaches to vulnerability detection and resolution. By making the dataset open-source, the project aims to foster further research and development in this area, providing a platform for developing, testing, and refining LLM-based approaches to cybersecurity challenges.

Who created the dataset (e.g., which team, research group) and on behalf of which entity (e.g., company, institution, organization)?

The students listed above compiled and validated these challenges from all previous global CSAW competitions by manually checking their setup and ensuring they remain solvable despite software changes. This work was conducted in collaboration with the OSIRIS Lab and the Center for Cybersecurity at NYU, which organize CSAW and attract global participation[1].



A Meta-Analysis of Overfitting in Machine Learning

Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, Ludwig Schmidt

Neural Information Processing Systems

In each competition, numerous practitioners repeatedly evaluated their progress against a holdout set that forms the basis of a public ranking available throughout the competition. Performance on a separate test set used only once determined the final ranking.
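The concern behind this setup is adaptive overfitting: repeatedly selecting the best score on a fixed holdout set can inflate apparent performance relative to a fresh test set. A minimal sketch of that effect (hypothetical illustration, not code from the paper): many "submissions" with identical true accuracy are ranked by holdout score, and the winner's score is compared against its score on a test set seen only once.

```python
import random

random.seed(0)

def simulate_leaderboard(n_submissions=1000, holdout_size=200,
                         test_size=200, true_acc=0.5):
    """Simulate a public leaderboard: every submission has the same true
    accuracy, but picking the best holdout score inflates the apparent
    result relative to a fresh, once-used test set."""
    best_holdout, best_test = 0.0, 0.0
    for _ in range(n_submissions):
        # Each evaluation draws per-example correctness at random.
        holdout_acc = sum(random.random() < true_acc
                          for _ in range(holdout_size)) / holdout_size
        test_acc = sum(random.random() < true_acc
                       for _ in range(test_size)) / test_size
        if holdout_acc > best_holdout:
            # Rank by holdout; record the same model's test-set score.
            best_holdout, best_test = holdout_acc, test_acc
    return best_holdout, best_test

holdout, test = simulate_leaderboard()
print(f"best holdout accuracy: {holdout:.3f}")  # inflated above 0.5
print(f"same model on fresh test set: {test:.3f}")
```

The gap between the winning holdout score and the same submission's test score is exactly what the once-used final test set in these competitions is designed to measure.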