 evolutionary history


MASCOT: Analyzing Malware Evolution Through A Well-Curated Source Code Dataset

Li, Bojing, Zhong, Duo, Nadendla, Dharani, Terceros, Gabriel, Bhandar, Prajna, S, Raguvir, Nicholas, Charles

arXiv.org Artificial Intelligence

Abstract--In recent years, the explosion of malware and extensive code reuse have formed complex evolutionary connections among malware specimens. The rapid pace of development makes it challenging for existing studies to characterize recent evolutionary trends. In addition, intuitive tools to untangle these intricate connections between malware specimens or categories are urgently needed. This paper introduces a manually-reviewed malware source code dataset containing 6032 specimens. Building on and extending current research from a software engineering perspective, we systematically evaluate the scale, development costs, code quality, as well as security and dependencies of modern malware. We further introduce a multi-view genealogy analysis to clarify malware connections: at an overall view, this analysis quantifies the strength and direction of connections among specimens and categories; at a detailed view, it traces the evolutionary histories of individual specimens. Experimental results indicate that, despite persistent shortcomings in code quality, malware specimens exhibit an increasing complexity and standardization, in step with the development of mainstream software engineering practices. Meanwhile, our genealogy analysis intuitively reveals lineage expansion and evolution driven by code reuse, providing new evidence and tools for understanding the formation and evolution of the malware ecosystem.

With the rapid development of information technology and large language models, malware has experienced a surge in recent years, exhibiting strong connections among categories and specimens, as well as high code reuse rates [1]. In the past 12 months, more than 107 million new malicious or potentially unwanted applications were detected [2], [3]. Many of these malware specimens are variants of previously known malware, which indicates the prevalence of code reuse and family-oriented evolution.
However, the difficulty of collecting, reviewing, and labeling has resulted in a scarcity of source code datasets [4]. Existing datasets lack human curation, reliable labels, and timestamps.


Rhinos once lived in Canada

Popular Science

A newly discovered species of Arctic rhino lived 23 million years ago. About 23 million years ago, a rhinoceros stomped across the Canadian High Arctic. A team of scientists from the Canadian Museum of Nature (CMN) has found a new species of the enigmatic, now-extinct "Arctic rhino." First uncovered almost 40 years ago in lake deposits in Haughton Crater on Devon Island, Nunavut, the rhino was more petite than many of its modern descendants.


Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding

Kong, Yibo, Tiley, George P., Solis-Lemus, Claudia

arXiv.org Machine Learning

The Tree of Life is a massive graphical structure which represents the evolutionary process from single-celled organisms into the immense biodiversity of living species in present time. Estimating the Tree of Life would not only represent the greatest accomplishment in evolutionary biology and systematics, but it would also allow us to fully understand the development and evolution of important biological traits in nature, in particular, those related to resilience to extinction when exposed to environmental threats such as climate change. Therefore, the development of statistical and machine-learning theory to reconstruct the Tree of Life, especially methods scalable to big data, is paramount in evolutionary biology, systematics, and conservation efforts against mass extinctions. Graphical structures that represent evolutionary processes are denoted phylogenetic trees. A phylogenetic tree is a binary tree whose internal nodes represent ancestral species, each of which over time differentiates into two separate species, giving rise to its two child nodes (see Figure 1 left). The evolutionary process is then depicted by this bifurcating tree from the root (the origin of life) to the external nodes of the tree (also denoted leaves), which represent the living organisms today.
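The binary-tree description above can be illustrated with a minimal Python sketch. The `Node` class, the `leaves` helper, and the species labels "A", "B", and "C" are hypothetical names chosen for illustration; they are not taken from the paper and do not represent its split-weight embedding method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One node of a bifurcating (binary) phylogenetic tree."""
    name: Optional[str] = None      # label; only leaves are named in this sketch
    left: Optional["Node"] = None   # first descendant lineage
    right: Optional["Node"] = None  # second descendant lineage

    def is_leaf(self) -> bool:
        # Leaves (external nodes) have no children and stand for living species.
        return self.left is None and self.right is None


def leaves(node: Node) -> List[str]:
    """Collect leaf labels from left to right by walking down from `node`."""
    if node.is_leaf():
        return [node.name]
    return leaves(node.left) + leaves(node.right)


# Hypothetical three-species tree: the root splits into the ancestor of
# A and B on one side and species C on the other.
tree = Node(left=Node(left=Node("A"), right=Node("B")), right=Node("C"))
```

Calling `leaves(tree)` walks from the root to the external nodes and returns the labels of the living species; each internal node (unnamed here) plays the role of an ancestral species.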


Artificial Intelligence Finds Ancient 'Ghosts' in Modern DNA

#artificialintelligence

Could deep learning help paleontologists and geneticists hunt for ghosts? When modern humans first migrated out of Africa 70,000 years ago, at least two related species, now extinct, were already waiting for them on the Eurasian landmass. These were the Neanderthals and Denisovans, archaic humans who interbred with those early moderns, leaving bits of their DNA behind today in the genomes of people of non-African descent. But there have been growing hints of an even more convoluted and colorful history: A team of researchers reported in Nature last summer, for instance, that a bone fragment found in a Siberian cave belonged to the daughter of a Neanderthal mother and a Denisovan father. The finding marked the first fossil evidence of a first-generation human hybrid.


Polar bears and brown bears continued to mate with each other long after the species separated

Daily Mail - Science & tech

Polar bears and brown bears were still mating with each other long after they had split into two distinct species, a new study has found. The two species are known to have separated up to 1.6 million years ago, yet new genomic evidence suggests they have inherited traits from each other much more recently. Scientists from the USA, Mexico and Finland analysed the genomes of 64 modern polar and brown bears, as well as that of an ancient polar bear that lived up to 130,000 years ago. While evidence of hybridisation was found in both brown and polar bear genomes, the latter carried a particularly strong signature of DNA from brown bears. As global warming continues to melt Arctic sea ice, the two bear species may run into each other more frequently, and their shared evolutionary history could become more significant.


Machine learning algorithms accelerate the protein engineering process

#artificialintelligence

Proteins are the molecular machines of all living cells and have been exploited for use in many applications, including therapeutics and industrial catalysts. To overcome the limitations of naturally occurring proteins, protein engineering is used to improve protein characteristics such as stability and functionality. In a new study, researchers demonstrate a machine learning algorithm that accelerates the protein engineering process. The study is reported in the journal Nature Communications. Machine learning algorithms assist in protein engineering by reducing the experimental burden of methods such as directed evolution, which involves multiple rounds of mutagenesis and high-throughput screening.


Artificial intelligence in structural biology is here to stay

#artificialintelligence

"I didn't think we would get to this point in my lifetime." That's how one research leader in structural biology responded to last week's publication of research in which artificial intelligence (AI) was used to predict the structure of more than 20,000 human proteins, as well as that of nearly all the known proteins produced by 20 model organisms such as Escherichia coli, fruit flies and yeast, but also soya bean and Asian rice. That is a combined total of around 365,000 predictions. The data, publicly accessible for the first time (see https://alphafold.ebi.ac.uk), were released online on 22 July by researchers at DeepMind, a London-based AI company owned by Google's parent company, Alphabet, and the European Bioinformatics Institute, based at the European Molecular Biology Laboratory (EBI-EMBL) near Cambridge, UK. The DeepMind team developed a machine-learning tool called AlphaFold.


Applying Behavioral Science to Machine Learning

#artificialintelligence

I recently started a new newsletter focused on AI education that already has over 50,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers and concepts. Understanding the behavior of artificial intelligence (AI) agents is one of the pivotal challenges of the next decade of AI. Interpretability and explainability are among the terms often used to describe methods that provide insights about the behavior of AI programs.


It's Alive!

Communications of the ACM

The biobot developed at the University of Illinois at Urbana-Champaign couples engineered skeletal muscle tissue to a 3D printed flexible skeleton. Although robotic humanoids now perform backflips and autonomous drones fly in formation, even the most advanced robots are relatively primitive when compared with living machines. The running, jumping, swimming, and flying creatures that cover our planet's surface have long inspired engineers. Yet a subset of researchers are not just taking tips from living creatures. These roboticists, computer scientists, and bioengineers are combining artificial materials with living tissue, or making machines entirely from living cells.


The Problem with the Way Scientists Study Reason - Facts So Romantic

Nautilus

Last year, I was in Paris for the International Convention of Psychological Science, one of the most prestigious gatherings in cognitive science. I listened to talks from my field, human reasoning, but I also enjoyed those on ethology, because I find studies on non-human animals, from turtles to parrots, fascinating. Despite their typically small sample sizes, I found the scientific reasoning in the animal-studies talks sounder, and their explanations richer, than the work I heard on human reasoning. The reason is simple: Ethologists evaluate their experimental paradigm, or set-up, in light of its ecological validity, or how well it matches natural surroundings. An animal's true habitat, and its evolutionary history, have always centered the discussion.