ai.txt: A Domain-Specific Language for Guiding AI Interactions with the Internet

Li, Yuekang, Song, Wei, Zhu, Bangshuo, Gong, Dong, Liu, Yi, Deng, Gelei, Chen, Chunyang, Ma, Lei, Sun, Jun, Walsh, Toby, Xue, Jingling

arXiv.org Artificial Intelligence

We introduce ai.txt, a novel domain-specific language (DSL) designed to explicitly regulate interactions between AI models, agents, and web content, addressing critical limitations of the widely adopted robots.txt standard. As AI increasingly engages with online materials for tasks such as training, summarization, and content modification, existing regulatory methods lack the necessary granularity and semantic expressiveness to ensure ethical and legal compliance. ai.txt extends traditional URL-based access controls by enabling precise element-level regulations and incorporating natural language instructions interpretable by AI systems. To facilitate practical deployment, we provide an integrated development environment with code autocompletion and automatic XML generation. Furthermore, we propose two compliance mechanisms: XML-based programmatic enforcement and natural language prompt integration, and demonstrate their effectiveness through preliminary experiments and case studies. Our approach aims to aid the governance of AI-Internet interactions, promoting responsible AI use in digital ecosystems.
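The abstract's "XML-based programmatic enforcement" mechanism can be pictured with a small sketch. The schema below is purely hypothetical: the element names, attributes, and deny-by-default policy are illustrative assumptions, not the actual ai.txt format.

```python
# Hypothetical sketch of XML-based element-level enforcement.
# <element> names and attributes are invented for illustration only.
import xml.etree.ElementTree as ET

POLICY = """\
<aiPolicy>
  <element selector="article" training="deny" summarization="allow"/>
  <element selector="comments" training="deny" summarization="deny"/>
</aiPolicy>
"""

def allowed(policy_xml, selector, use):
    """Return True if the given use (e.g. 'training') is allowed for the
    page element matched by `selector`; deny by default."""
    root = ET.fromstring(policy_xml)
    for el in root.iter("element"):
        if el.get("selector") == selector:
            return el.get(use) == "allow"
    return False

print(allowed(POLICY, "article", "summarization"))  # True
print(allowed(POLICY, "article", "training"))       # False
```

The point of the sketch is the granularity: unlike robots.txt, rules attach to page elements and to specific uses rather than to whole URLs.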


Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk of Language Models

Zhang, Andy K., Perry, Neil, Dulepet, Riya, Jones, Eliot, Lin, Justin W., Ji, Joey, Menders, Celeste, Hussein, Gashon, Liu, Samantha, Jasper, Donovan, Peetathawatchai, Pura, Glenn, Ari, Sivashankar, Vikram, Zamoshchin, Daniel, Glikbarg, Leo, Askaryar, Derek, Yang, Mike, Zhang, Teddy, Alluri, Rishi, Tran, Nathan, Sangpisit, Rinnara, Yiorkadjis, Polycarpos, Osele, Kenny, Raghupathi, Gautham, Boneh, Dan, Ho, Daniel E., Liang, Percy

arXiv.org Artificial Intelligence

Language Model (LM) agents for cybersecurity that are capable of autonomously identifying vulnerabilities and executing exploits have the potential to cause real-world impact. Policymakers, model providers, and other researchers in the AI and cybersecurity communities are interested in quantifying the capabilities of such agents to help mitigate cyberrisk and investigate opportunities for penetration testing. Toward that end, we introduce Cybench, a framework for specifying cybersecurity tasks and evaluating agents on those tasks. We include 40 professional-level Capture the Flag (CTF) tasks from 4 distinct CTF competitions, chosen to be recent, meaningful, and spanning a wide range of difficulties. Each task includes its own description and starter files, and is initialized in an environment where an agent can execute bash commands and observe outputs. Since many tasks are beyond the capabilities of existing LM agents, we introduce subtasks, which break down a task into intermediary steps for more gradated evaluation; we add subtasks for 17 of the 40 tasks. To evaluate agent capabilities, we construct a cybersecurity agent and evaluate 7 models: GPT-4o, Claude 3 Opus, Claude 3.5 Sonnet, Mixtral 8x22b Instruct, Gemini 1.5 Pro, Llama 3 70B Chat, and Llama 3.1 405B Instruct. Without guidance, we find that agents are able to solve only the easiest complete tasks, which took human teams up to 11 minutes to solve, with Claude 3.5 Sonnet and GPT-4o having the highest success rates. Finally, subtasks provide more signal for measuring performance compared to unguided runs, with models achieving a 3.2% higher success rate on complete tasks with subtask-guidance than without. All code and data are publicly available at https://cybench.github.io.


Amazon investigating Perplexity AI after accusations it scrapes websites without consent

Engadget

Amazon Web Services has started an investigation to determine whether Perplexity AI is breaking its rules, according to Wired. To be precise, the company's cloud division is looking into allegations that the service is using a crawler, hosted on its servers, that ignores the Robots Exclusion Protocol. This protocol is a web standard wherein developers place a robots.txt file on their websites containing instructions for automated crawlers. Complying with those instructions is voluntary, but crawlers from reputable companies have generally respected them since web developers started implementing the standard in the '90s. In an earlier piece, Wired reported that it discovered a virtual machine that was bypassing its website's robots.txt instructions.
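For context, the Robots Exclusion Protocol is just a plain-text file served at the site root (/robots.txt) listing per-crawler rules; the paths below are illustrative:

```
User-agent: *
Disallow: /private/
```

Each `User-agent` line names a crawler (or `*` for all), and the `Disallow` lines below it list URL path prefixes that crawler is asked not to fetch. Nothing in the protocol technically prevents a crawler from fetching them anyway, which is what makes the allegations above possible.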


AI companies are reportedly still scraping websites despite protocols meant to block them

Engadget

Perplexity, a company that describes its product as "a free AI search engine," has been under fire over the past few days. Shortly after Forbes accused it of stealing its story and republishing it across multiple platforms, Wired reported that Perplexity has been ignoring the Robots Exclusion Protocol, or robots.txt. Technology website The Shortcut also accused the company of scraping its articles. Now, Reuters has reported that Perplexity isn't the only AI company bypassing robots.txt. Reuters said it saw a letter addressed to publishers from TollBit, a startup that pairs them up with AI firms so they can reach licensing deals, warning them that "AI agents from multiple sources (not just one company) are opting to bypass the robots.txt protocol."


Testing Language Model Agents Safely in the Wild

Naihin, Silen, Atkinson, David, Green, Marc, Hamadi, Merwane, Swift, Craig, Schonholtz, Douglas, Kalai, Adam Tauman, Bau, David

arXiv.org Artificial Intelligence

A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild. Yet real-world autonomous tests face several unique safety challenges, both due to the possibility of causing harm during a test, as well as the risk of encountering new unsafe agent behavior through interactions with real-world and potentially malicious actors. We propose a framework for conducting safe autonomous agent tests on the open internet: agent actions are audited by a context-sensitive monitor that enforces a stringent safety boundary to stop an unsafe test, with suspect behavior ranked and logged to be examined by humans. We design a basic safety monitor (AgentMonitor) that is flexible enough to monitor existing LLM agents, and, using an adversarial simulated agent, we measure its ability to identify and stop unsafe situations. Then we apply the AgentMonitor on a battery of real-world tests of AutoGPT, and we identify several limitations and challenges that will face the creation of safe in-the-wild tests as autonomous agents grow more capable.


New York Times, CNN and ABC block OpenAI's GPTBot web crawler from accessing content

The Guardian

News outlets including the New York Times, CNN, Reuters and the Australian Broadcasting Corporation (ABC) have blocked a tool from OpenAI, limiting the company's ability to continue accessing their content. OpenAI is behind one of the best known artificial intelligence chatbots, ChatGPT. Its web crawler – known as GPTBot – may scan webpages to help improve its AI models. The Verge was first to report the New York Times had blocked GPTBot on its website. The Guardian subsequently found that other major news websites, including CNN, Reuters, the Chicago Tribune, the ABC and Australian Community Media (ACM) brands such as the Canberra Times and the Newcastle Herald, appear to have also disallowed the web crawler.
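Blocking a crawler this way amounts to adding a `User-agent` group naming it to the site's robots.txt. Python's standard-library `urllib.robotparser` can check the effect; the rules and URLs below are illustrative, not taken from any of the sites named above:

```python
from urllib import robotparser

# Illustrative robots.txt: block GPTBot entirely, restrict others
# only from /private/.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS)

print(rp.can_fetch("GPTBot", "https://example.com/article"))     # False
print(rp.can_fetch("SomeBot", "https://example.com/article"))    # True
print(rp.can_fetch("SomeBot", "https://example.com/private/x"))  # False
```

The first group applies only to the crawler identifying itself as GPTBot and disallows every path; other crawlers fall through to the `*` group.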


Week#5 Tracking of Students' Facial Expressions

#artificialintelligence

What Have We Done Last Week? Last week we researched which object detection algorithms we could use to detect students' facial expressions. This research showed that deep learning-based object detection methods are widely used. These methods use convolutional neural networks to learn features of the data and perform object detection. Among them, we examined the R-CNN, Faster R-CNN and YOLO methods.


FOON Creation and Traversal for Recipe Generation

Patel, Raj

arXiv.org Artificial Intelligence

Task completion by robots is still far from being completely dependable and usable. One way a robot may decipher information given to it and accomplish tasks is by utilizing a FOON, which stands for functional object-oriented network. The network first needs to be created by having a human create action nodes as well as input and output nodes in a .txt file. After the network is sizeable, it can be traversed in a variety of ways, such as choosing steps via iterative deepening search using the first valid option seen. Another mechanism is heuristics, such as choosing steps based on the highest success rate or the lowest number of input ingredients. Via any of these methods, a program can traverse the network given an output product and derive the series of steps that need to be taken to produce that output.
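The heuristic traversal described above can be sketched in a few lines. The network structure, item names, and success rates below are invented for illustration and do not reflect the paper's actual FOON file format:

```python
# Hypothetical network: each producible item maps to candidate
# "functional units", each listing the inputs it consumes, an assumed
# success rate, and the action performed.
NETWORK = {
    "salad": [
        {"inputs": ["lettuce", "tomato"], "success": 0.9, "action": "mix"},
    ],
    "tomato": [
        {"inputs": ["whole tomato"], "success": 0.8, "action": "slice"},
    ],
}
BASE_ITEMS = {"lettuce", "whole tomato"}  # assumed already available

def derive_steps(goal):
    """Greedy traversal: pick the candidate unit with the highest success
    rate, recursively satisfy its inputs, then emit its action."""
    if goal in BASE_ITEMS:
        return []
    options = NETWORK.get(goal)
    if not options:
        raise ValueError(f"no way to produce {goal!r}")
    best = max(options, key=lambda u: u["success"])
    steps = []
    for item in best["inputs"]:
        steps += derive_steps(item)
    steps.append(f"{best['action']} -> {goal}")
    return steps

print(derive_steps("salad"))  # ['slice -> tomato', 'mix -> salad']
```

Swapping the `max(..., key=...)` criterion for lowest input count, or replacing the greedy recursion with iterative deepening, gives the other traversal strategies the abstract mentions.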


Build a Named Entity Recognition App with Streamlit

#artificialintelligence

In my previous article, we fine-tuned a Named Entity Recognition (NER) model trained on the wnut_17 [1] dataset. In this article, we show step by step how to integrate this model with Streamlit and deploy it using Hugging Face Spaces. The goal of this app is to tag input sentences per user request in real time. Also, keep in mind that, unlike trivial ML models, deploying a large language model on Streamlit is tricky. We address those challenges as well.


So, I made an AI to attend my online classes for me.

#artificialintelligence

We all know how boring it gets after a while to attend online classes. So, I made an AI to attend them for me. Let's see how the AI will get the data from the class. From the above image, I hope you get the basic gist of how the data collection will work. Basically, the above code takes the screenshotted image, makes it negative so that the black turns white and the white turns black, and then detects the text and draws rectangles around it on the original image.
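The "make it negative" step is just per-pixel inversion: each 8-bit grayscale value v becomes 255 - v, so dark text on a light slide becomes light text on a dark background. A minimal pure-Python sketch (real code would use an image library on actual screenshot data):

```python
def invert(pixels):
    """Invert 8-bit grayscale pixel values: v -> 255 - v.
    `pixels` is a list of rows, each a list of ints in 0..255."""
    return [[255 - v for v in row] for row in pixels]

# Tiny stand-in for a screenshot: 0 is black, 255 is white.
screenshot = [
    [0, 128, 255],
    [30, 200, 10],
]
print(invert(screenshot))  # [[255, 127, 0], [225, 55, 245]]
```

After inversion, the text-detection step runs on the negative image while the rectangles are drawn back onto the original, as described above.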