The Accidental Winners of the War on Higher Ed

The Atlantic - Technology

Go to a small liberal-arts college if you can. In the waning heat of last summer, freshly back in my office at a major research university, I found myself considering the higher-education hellscape that had lately descended upon the nation. I'd spent months reporting on the Trump administration's attacks on universities, speaking with dozens of administrators, faculty, and students about the billions of dollars in cuts to public funding for research and the resulting collapse of "college life." Initially, I surveyed the situation from the safe distance of a journalist who happens to also be a career professor and university administrator. I saw myself as an envoy between America's college campuses and its citizens, telling the stories of the people whose lives had been shattered by these transformations. By the summer, though, that safe distance had collapsed back on me.


In 1925, seven students went 60 hours without sleep--for science

Popular Science

Scientists were out to prove sleep was just a waste of time. Among the students who participated in the sleep-deprivation study was the future head of the psychology department at George Washington University. The grueling Medical College Admission Test, or MCAT, was first devised in the 1920s by George Washington University professor Frederick August Moss. Originally called the Scholastic Aptitude Test for Medical Students, it was developed by Moss as a way to curb high dropout rates in medical schools.


18 months. 12,000 questions. A whole lot of anxiety. What I learned from reading students' ChatGPT logs

The Guardian

Making new friends is hard. Finding out what trousers exist in the world other than black ones is also, apparently, hard. Fortunately, for an AI-enabled generation of students, help with the complexities of campus life is just a prompt away. If you are really stuck on an essay, or can't decide between management consulting and a legal career, or need suggestions on what you can cook with tomatoes, mushrooms, beetroot, mozzarella, olive oil and rice, then ChatGPT is there. It will listen to you, analyse your inputs, and offer up a perfectly structured paper, a convincing cover letter, or a workable recipe for tomato and mushroom risotto with roasted beetroot and mozzarella. I know this because three undergraduates have given me permission to eavesdrop on every conversation they have had with ChatGPT over the past 18 months.


What Happens After A.I. Destroys College Writing?

The New Yorker

On a blustery spring Thursday, just after midterms, I went out for noodles with Alex and Eugene, two undergraduates at New York University, to talk about how they use artificial intelligence in their schoolwork. When I first met Alex, last year, he was interested in a career in the arts, and he devoted a lot of his free time to photo shoots with his friends. But he had recently decided on a more practical path: he wanted to become a C.P.A. His Thursdays were busy, and he had forty-five minutes until a study session for an accounting class. He stowed his skateboard under a bench in the restaurant and shook his laptop out of his bag, connecting to the internet before we sat down. Alex has wavy hair and speaks with the chill, singsong cadence of someone who has spent a lot of time in the Bay Area.


Large Language Models Pass the Turing Test

Jones, Cameron R., Bergen, Benjamin K.

arXiv.org Artificial Intelligence

We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomised, controlled, and pre-registered Turing tests on independent populations. Participants had 5-minute conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant. LLaMa-3.1, with the same prompt, was judged to be the human 56% of the time -- not significantly more or less often than the humans it was being compared to -- while baseline models (ELIZA and GPT-4o) achieved win rates significantly below chance (23% and 21% respectively). The results constitute the first empirical evidence that any artificial system passes a standard three-party Turing test. The results have implications for debates about what kind of intelligence is exhibited by Large Language Models (LLMs), and the social and economic impacts these systems are likely to have.
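The abstract's claims hinge on comparing observed win rates against the 50% chance level of a two-way judgment. As a rough illustration of the kind of check involved, here is a minimal exact two-sided binomial test in plain Python; the trial count of 100 is invented for the sketch, since the abstract quotes only percentages, and this is not the paper's actual analysis pipeline.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(q for q in probs if q <= probs[k] + 1e-12)

# Hypothetical: 73 "judged human" verdicts out of 100 interrogations,
# mirroring the 73% rate quoted for GPT-4.5.
p_value = binom_two_sided_p(73, 100)
print(p_value < 0.05)  # prints True: far from the 50% chance level
```

The same test applied to a rate near 50% (like LLaMa-3.1's 56% over a modest number of trials) would yield a large p-value, which is what "not significantly more or less often" describes.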


AI can beat university students, study suggests

BBC News

In the study, fake exam answers and essays were submitted for first-, second- and third-year modules, without the knowledge of those marking them. The scores by the AI students beat those achieved by the real undergraduates in the first two years. But the humans scored better in the third-year exams - which "is consistent with the notion that current AI struggles with more abstract reasoning", the researchers said. They described theirs as the largest and most robust blind study of its kind to date. Academics have raised concerns about the influence of AI in education, with Glasgow University recently reintroducing in-person exams for one course.


EnviroExam: Benchmarking Environmental Science Knowledge of Large Language Models

Huang, Yu, Guo, Liang, Guo, Wanqian, Tao, Zhe, Lv, Yang, Sun, Zhihao, Zhao, Dongfang

arXiv.org Artificial Intelligence

In the field of environmental science, it is crucial to have robust evaluation metrics for large language models to ensure their efficacy and accuracy. We propose EnviroExam, a comprehensive evaluation method designed to assess the knowledge of large language models in the field of environmental science. EnviroExam is based on the curricula of top international universities, covering undergraduate, master's, and doctoral courses, and includes 936 questions across 42 core courses. By conducting 0-shot and 5-shot tests on 31 open-source large language models, EnviroExam reveals the performance differences among these models in the domain of environmental science and provides detailed evaluation standards. The results show that 61.3% of the models passed the 5-shot tests, while 48.39% passed the 0-shot tests. By introducing the coefficient of variation as an indicator, we evaluate the performance of mainstream open-source large language models in environmental science from multiple perspectives, providing effective criteria for selecting and fine-tuning language models in this field. Future research will involve constructing more domain-specific test sets using specialized environmental science textbooks to further enhance the accuracy and specificity of the evaluation.
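The abstract introduces the coefficient of variation (standard deviation divided by the mean) as an indicator of how consistently a model performs across courses. A minimal sketch of that metric, with invented model names and per-course accuracies purely for illustration (the actual EnviroExam scores and course breakdown are in the paper, not here):

```python
def coefficient_of_variation(scores):
    """CV = population standard deviation / mean; lower means the model's
    accuracy is more consistent across courses."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / n
    return (variance ** 0.5) / mean

# Hypothetical per-course accuracies for two models.
model_scores = {
    "model-a": [0.72, 0.68, 0.75, 0.70],  # steady performer
    "model-b": [0.90, 0.40, 0.85, 0.45],  # similar mean, erratic
}

for name, scores in model_scores.items():
    print(name, round(coefficient_of_variation(scores), 3))
```

Two models with similar average accuracy can differ sharply on this indicator, which is why the authors use it alongside raw pass rates when recommending models for the domain.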


More than half of UK undergraduates say they use AI to help with essays

The Guardian

More than half of undergraduates say they consult artificial intelligence programmes to help with their essays, while schools are trialling its use in the classroom. A survey of more than 1,000 UK undergraduates, conducted by the Higher Education Policy Institute (Hepi), found 53% were using AI to generate material for work they would be marked on. One in four are using applications such as Google Bard or ChatGPT to suggest topics and one in eight are using them to create content. Just 5% admitted to copying and pasting unedited AI-generated text into their assessments. Teachers are also seeking to use AI to streamline their work, with the Education Endowment Foundation (EEF) signing up secondary schools for a new research project into the use of AI to generate lesson plans and teaching materials as well as exams and model answers.


How Jensen Huang's Nvidia Is Powering the A.I. Revolution

The New Yorker

The revelation that ChatGPT, the astonishing artificial-intelligence chatbot, had been trained on an Nvidia supercomputer spurred one of the largest single-day gains in stock-market history. When the Nasdaq opened on May 25, 2023, Nvidia's value increased by about two hundred billion dollars. A few months earlier, Jensen Huang, Nvidia's C.E.O., had informed investors that Nvidia had sold similar supercomputers to fifty of America's hundred largest companies. By the close of trading, Nvidia was the sixth most valuable corporation on earth, worth more than Walmart and ExxonMobil combined. Huang's business position can be compared to that of Samuel Brannan, the celebrated vender of prospecting supplies in San Francisco in the late eighteen-forties.


ChatGPT better than undergraduates at solving SAT problems, study suggests

The Guardian

ChatGPT can solve problems at a level that matches or surpasses an undergraduate student, according to a new study. Researchers found that the GPT-3 large language model that underpins the chatbot performed about as well as US college undergraduates when asked to solve reasoning problems that appear on intelligence tests or exams such as the American college admission test, the SAT. Psychologists at the University of California, Los Angeles tested GPT-3's ability to predict the next image in a complex array of shapes, after converting the images to a text format that the model could process and also ensuring the model would never have encountered the questions before. The same problems were put to 40 UCLA undergraduates and the researchers found that GPT-3 solved 80% of the problems correctly, well above the average score of just below 60% for the human participants. The researchers also prompted the model to solve some SAT "analogy" questions – selecting pairs of words that are linked in some way – that they believe had not been published on the internet and therefore could not have appeared in the vast amount of data it was trained on.