Scott Aaronson
Testing GPT-4-o1-preview on math and science problems: A follow-up study
In August 2023, Scott Aaronson and I reported the results of testing GPT-4 with the Wolfram Alpha and Code Interpreter plug-ins over a collection of 105 original high-school-level and college-level science and math problems (Davis and Aaronson, 2023). In September 2024, I tested the recently released model GPT-4o1-preview on the same collection. Overall, I found that performance had significantly improved but was still considerably short of perfect. In particular, problems that involve spatial reasoning are often stumbling blocks. On September 12, OpenAI (2024) released two preliminary versions, "ChatGPT-o1-preview" and "ChatGPT-o1-mini", of a forthcoming product "ChatGPT-o1".
- Europe > France (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > Canada > Quebec (0.05)
- Education > Educational Setting (0.55)
- Government > Space Agency (0.47)
How The ChatGPT Watermark Works And Why It Could Be Defeated
OpenAI's ChatGPT introduced a way to automatically create content, but plans to introduce a watermarking feature that makes such content easy to detect are making some people nervous. This is how ChatGPT watermarking works and why there may be a way to defeat it. ChatGPT is an incredible tool that online publishers, affiliates and SEOs simultaneously love and dread. Some marketers love it because they're discovering new ways to use it to generate content briefs, outlines and complex articles. Online publishers are afraid of the prospect of AI content flooding the search results, supplanting expert articles written by humans.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.32)
How Classical Cryptography Will Survive Quantum Computers - Facts So Romantic
Justin Trudeau, the Canadian prime minister, certainly raised the profile of quantum computing a few notches last year, when he gamely--if vaguely--described it for a press conference. But we've heard a lot about quantum computers in the past few years, as Google, I.B.M., and N.A.S.A., as well as many, many universities, have all been working on, or putting money into, quantum computers for various ends. The N.S.A., for instance, as the Snowden documents revealed, wants to build one for codebreaking, and it seems to be a common belief that if a full-scale, practical quantum computer is built, it could be really useful in that regard. A New Yorker article early this year, for example, stated that a quantum computer "would, on its first day of operation, be capable of cracking the Internet's most widely used codes." But maybe they won't be as useful as we have been led to believe.
- North America > Canada (0.55)
- North America > United States > New York (0.25)
- Information Technology > Hardware (1.00)
- Information Technology > Artificial Intelligence (1.00)