 world war


RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

Yu, Zichun, Xiong, Chenyan

arXiv.org Artificial Intelligence

High-quality pretraining data is the fossil fuel of large language models (LLMs), yet its reserves are running low for frontier models. In this paper, we introduce RePro, a novel web recycling method that trains a relatively small LM with reinforcement learning to generate effective and faithful rephrasings of pretraining data. Specifically, we design one quality reward and three faithfulness rewards, optimizing the LM rephraser to convert organic data into high-quality rephrasings while maintaining its core semantics and structure. In our experiment, we train a 4B rephraser to recycle 72B tokens sampled from DCLM-RefinedWeb. Pretraining results on 400M and 1.4B models demonstrate that RePro delivers 4.7%-14.0% relative accuracy gains over the organic-only baseline on 22 downstream tasks. RePro also outperforms ReWire, the state-of-the-art web recycling method that prompts a 70B rephraser, as well as the organic baseline with a 4x larger data pool. Experiments with different amounts of recycled data highlight that RePro improves organic data efficiency by 2-3x. Individual and distributional analyses validate that RePro preserves more critical information and faithfully reflects the characteristics of organic data compared to prompting-based methods. Together, these results show that RePro provides an efficient and controllable path to effectively harness the fossil fuel of LLM pretraining. We open-source our code, rephraser, and recycled data at https://github.com/cxcscmu/RePro.


Sunken WWII bombs make a surprising home for sea life

Popular Science

A new study finds algae, mussels, and starfish flock to munitions dumped in the Baltic Sea. As the ink dried on Germany's unconditional surrender on May 8, 1945, celebrations erupted across the world. People cheered, wept, and kissed in the streets as World War II finally came to an end in Europe. A few months later at the Potsdam Conference, Germany agreed to demilitarize and dismantle its once formidable army, leaving the nation with lots and lots of leftover munitions.


What Is Noise?

The New Yorker

"Noise" is a fuzzy word--a noisy one, in the statistical sense. Its meanings run the gamut from the negative to the positive, from the overpowering to the mysterious, from anarchy to sublimity. The negative seems to lie at the root: etymologists trace the word to "nuisance" and "nausea." Noise is what drives us mad; it sends the Grinch over the edge at Christmastime. ("Oh, the Noise! Noise!") Noise is the sound of madness itself, the din within our minds. The demented narrator of Poe's "The Tell-Tale Heart" jabbers about noise while he hallucinates his victim's heartbeat: "I found that the noise was not within my ears. . . . The noise steadily increased. . . ." Yet noise can be righteous and majestic. The Psalms are full of joyful noise, noise unto the Lord. In the Book of Ezekiel, the voice of God is said to be "like a noise of many waters." In "Paradise Lost," Heaven makes "infernal noise" as it beats back the armies of Hell. At the same time, the word can summon all manner of ...


Bridging History with AI: A Comparative Evaluation of GPT 3.5, GPT4, and GoogleBARD in Predictive Accuracy and Fact Checking

Tasar, Davut Emre, Tasar, Ceren Ocal

arXiv.org Artificial Intelligence

The rapid proliferation of information in the digital era underscores the importance of accurate historical representation and interpretation. While artificial intelligence has shown promise in various fields, its potential for historical fact-checking and gap-filling remains largely untapped. This study evaluates the performance of three large language models (LLMs), GPT 3.5, GPT 4, and GoogleBARD, in the context of predicting and verifying historical events based on given data. A novel metric, Distance to Reality (DTR), is introduced to assess the models' outputs against established historical facts. The results reveal a substantial potential for AI in historical studies, with GPT 4 demonstrating superior performance. This paper underscores the need for further research into AI's role in enriching our understanding of the past and bridging historical knowledge gaps.


'Eyes and ears': Could drones prove decisive in the Ukraine war?

Al Jazeera

Warning: Some readers may find some of the scenes described in this article disturbing. Kyiv, Ukraine – Ivan Ukraintsev, a stern-faced insurance broker turned director of a wartime charity providing crucial aid to Ukraine's military forces, is on a mission: to help Ukraine win the drone war. He is a polite but no-nonsense character, and he is here to talk about drones. "If we [Ukraine] had enough drones, we could end this war in two months," he says firmly. Ivan, who heads up the charity Starlife, had recently returned from overseeing a drone delivery to Bakhmut, a city in eastern Ukraine that has become the focal point for months of bloody battles between Ukrainian and Russian forces. Trench warfare, pockmarked and corpse-ridden swathes of no man's land, and constant artillery bombardments have drawn comparisons to battlefield conditions during World War I.


Peter Diamandis: 'I hope to see flying cars available by the end of this decade'

#artificialintelligence

When Peter Diamandis took to the stage at Madrid's Palacio de Cibeles for the Audi Summit for Progress last Tuesday, WhatsApp had crashed and the Wi-Fi wasn't working properly. It was a blow to the audience's faith in technology, but Diamandis, the star speaker at the summit, was ready to counter this. The 61-year-old doctor and engineer from New York has blind faith in the power of innovation and science. Diamandis, who is the founder of Singularity University and a friend of tycoon Elon Musk, has set up a number of technology companies and written several books in which he predicts a future of abundance, longevity, flying cars and an exponential increase in resources. It's a vision that is hard to imagine in times of war, an energy crisis and growing fears of recession.


Russian-Ukraine War Could Bring The World Economy Back To 1914

International Business Times

The ongoing Russian-Ukraine war and the unprecedented sanctions the United States and its allies have imposed on Russia could bring the world economy back to 1914, which signaled the end of early globalization and the revival of national and regional conflicts. "History doesn't repeat itself, but it often rhymes," Mark Twain is rightly or wrongly quoted as saying, a line that is as timely today as it was in his time. At the turn of the 20th century, capitalism was on track to conquer the global economy, creating a global market without borders, a trade regime where commodities and resources could flow freely within and across borders. But unfortunately for the world community, it didn't happen. By the beginning of the second decade, this trend of early globalization had stalled and, in some cases, been forestalled by the rise of nationalism and trade protectionism, not to mention the destruction of the two World Wars. For instance, increased trade protectionism limited the flow of resources and commodities across national borders.


This is how the Hobbit director used machine learning to rewrite the story of the Beatles

#artificialintelligence

In January 1969, the Beatles had 21 days to make a new album, record a documentary film, and get back to playing in front of an audience. The idea behind the project, which was contracted to Twickenham Film Studios, came in many ways from Paul McCartney, in what many believe was an attempt to save the group. It all ended with their last concert, their last studio album, and a documentary that apparently captured the deep struggles within "The Fab Four". Since then, this has been seen as the time the Beatles really broke up, although the breach was not officially completed until the spring of the following year. Several of those involved were uncomfortable with Yoko Ono's presence, and controversies led to George Harrison's "resignation" from the Beatles for three weeks, before he returned somewhat reluctantly.


Elon Musk posts cryptic tweet about the 'sun of the old world setting in a dying blaze of splendor'

Daily Mail - Science & tech

SpaceX and Tesla CEO Elon Musk posted an outlandish tweet on Monday in which he references The Guns of August, a 500-page book about the early stages of World War I. Musk, 50, captioned the tweet with the name of the book, written by Barbara Tuchman in 1962, along with the entire first paragraph of the book. 'The muffled tongue of Big Ben tolled nine by the clock as the cortege left the palace, but on history's clock it was sunset, and the sun of the old world was setting in a dying blaze of splendor never to be seen again,' so ends the paragraph. Tuchman's book, centered on the first month of the Great War, was an immediate bestseller and earned her a Pulitzer Prize for general nonfiction. President John F. Kennedy was so impressed with it that he gave a copy to each member of his cabinet and some of his top military advisors and told them to read it.


Will Members of the Military Ever Be Willing to Fight Alongside Autonomous Robots?

Slate

A writer and military historian responds to Justina Ireland's "Collateral Damage." The histories of the military and technology often go hand in hand. Soldiers and military thinkers throughout the past have continually come up with new ways to fill the people over there full of holes as a means to encourage them to stop trying to do the same to their opponents. After the introduction of a new weapon or the improvement of an existing one, strategists spend their time trying to come up with the best way to deploy their forces to take advantage of the tools and/or to blunt their effectiveness by devising countermeasures. The development of the Greek phalanx helped protect soldiers from cavalry, the deployment of English longbows helped stymie large formations of enemy soldiers, new construction methods changed the shape of fortifications, line infantry helped European formations take advantage of firearms, and anti-aircraft cannons helped protect against incoming enemy aircraft.