Family Tree


This book offers a great insight into the new science of microchimerism

New Scientist

Lise Barnéoud's Hidden Guests shows how this fascinating new field brings with it profound implications for medicine, and even what it means to be human, finds Helen Thomson "We are composed not only of human cells and microbes but also fragments of others " My children were conceived using donated eggs, so you would be forgiven for assuming we share no genetic material. Yet science has proved this isn't entirely true. We now know that during pregnancy, fetal cells cross the placenta into the mother, embedding themselves in every organ yet studied. Likewise, maternal cells, and even those that crossed from my mum to me, can make their way into my kids. And things might get even more chimeric - I have older sisters, so their cells, having passed into my mum during their own gestation, might have then found their way into me and, in turn, into my kids.


Are Large Language Models Capable of Deep Relational Reasoning? Insights from DeepSeek-R1 and Benchmark Comparisons

So, Chi Chiu, Sun, Yueyue, Wang, Jun-Min, Yung, Siu Pang, Loh, Anthony Wai Keung, Chau, Chun Pong

arXiv.org Artificial Intelligence

How far are Large Language Models (LLMs) in performing deep relational reasoning? In this paper, we evaluate and compare the reasoning capabilities of three cutting-edge LLMs, namely, DeepSeek-R1, DeepSeek-V3 and GPT-4o, through a suite of carefully designed benchmark tasks in family tree and general graph reasoning. Our experiments reveal that DeepSeek-R1 consistently achieves the highest F1-scores across multiple tasks and problem sizes, demonstrating strong aptitude in logical deduction and relational inference. However, all evaluated models, including DeepSeek-R1, struggle significantly as problem complexity increases, largely due to token length limitations and incomplete output structures. A detailed analysis of DeepSeek-R1's long Chain-of-Thought responses uncovers its unique planning and verification strategies, but also highlights instances of incoherent or incomplete reasoning, calling attention to the need for deeper scrutiny into LLMs' internal inference dynamics. We further discuss key directions for future work, including the role of multimodal reasoning and the systematic examination of reasoning failures. Our findings provide both empirical insights and theoretical implications for advancing LLMs' reasoning abilities, particularly in tasks that demand structured, multi-step logical inference. Our code repository will be publicly available at https://github.com/kelvinhkcs/Deep-Relational-Reasoning.
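The paper reports F1-scores over relational-reasoning tasks. As a minimal sketch of what such scoring might look like, assuming predicted and gold family-tree relations are represented as (person, relation, person) triples (the representation and names here are illustrative assumptions, not the paper's benchmark code):

```python
def relation_f1(predicted, gold):
    """Precision, recall, and F1 between two sets of
    (person_a, relation, person_b) triples."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)                      # correctly predicted triples
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: two of three gold relations recovered, one spurious prediction.
gold = {("Ann", "parent_of", "Ben"), ("Ben", "sibling_of", "Cal"),
        ("Ann", "grandparent_of", "Dee")}
pred = {("Ann", "parent_of", "Ben"), ("Ben", "sibling_of", "Cal"),
        ("Cal", "parent_of", "Dee")}
p, r, f1 = relation_f1(pred, gold)  # p = r = 2/3, so F1 = 2/3
```

Set-based matching like this is the standard way to score relation extraction; incomplete model outputs (the truncation failures the paper describes) simply show up as lost recall.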


Generalization from Starvation: Hints of Universality in LLM Knowledge Graph Learning

Baek, David D., Li, Yuxiao, Tegmark, Max

arXiv.org Artificial Intelligence

We show that these attractor representations optimize generalization to unseen examples by exploiting properties of knowledge graph relations. We find experimental support for such universality by showing that LLMs and simpler neural networks can be stitched, i.e., that the first part of one model can be joined to the last part of another, mediated only by an affine or almost-affine transformation. We hypothesize that this dynamic toward simplicity and generalization is driven by "intelligence from starvation," whereby overfitting is minimized by pressure to minimize the use of resources that are either scarce or competed for by other tasks. Large Language Models (LLMs), despite being trained primarily for next-token prediction, have shown impressive reasoning capabilities (Bubeck et al., 2023; Anthropic, 2024; Team et al., 2023). However, despite the recent progress reviewed below, it is not well understood what knowledge LLMs represent internally or how they represent it. Improving such understanding could enable valuable progress in transparency, interpretability, fairness, and robustness, for example by discovering and correcting inaccuracies to improve model reliability.
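The "stitching" idea can be sketched numerically: fit an affine map from one model's hidden activations onto another's and check the residual. This toy uses synthetic activations and least squares; the dimensions, variable names, and data are invented for illustration and are not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d_a, d_b, n = 8, 6, 200
H_a = rng.normal(size=(n, d_a))      # "model A" hidden states
W_true = rng.normal(size=(d_a, d_b))
b_true = rng.normal(size=d_b)
H_b = H_a @ W_true + b_true          # "model B" states, affine-reachable by construction

# Least-squares fit of the affine stitch: append a bias column to absorb b.
X = np.hstack([H_a, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(X, H_b, rcond=None)
W_fit, b_fit = coef[:-1], coef[-1]

# Near-zero residual indicates the two representation spaces are
# related by an affine transformation, i.e., a stitch exists.
residual = np.linalg.norm(X @ coef - H_b)
```

With real models, a small (but nonzero) residual on held-out activations would be the evidence of "almost affine" stitchability the abstract describes.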


An Analysis of Letter Dynamics in the English Alphabet

Zhao, Neil, Zheng, Diana

arXiv.org Artificial Intelligence

The tabulation of commonly used letters, as determined by letter frequency, was later utilized to improve typewriter keyboard arrangement by minimizing hand motion [5]. Statistical characteristics of the different letters of the English alphabet were further studied in the context of different sentence structures [6]. The letters 'B', 'S', 'M', 'H', and 'C' were found to occur most frequently as the initial letters of proper nouns, while 'E', 'A', 'R', and 'N' were the most frequently used letters when the entire proper noun is considered. For entire text documents, the most commonly used letters were found to be 'E', 'T', 'A', 'O', and 'N'. Interestingly, 95% of the English vocabulary was found to be represented by just 13 letters of the alphabet. Our manuscript expands upon the statistical study of the English alphabet by evaluating letter frequency in the context of different categories of writing. We analyzed news articles, novels, plays, and scientific articles for letter frequency and distribution. As a result, we determined the information density of the letters of the alphabet. Additionally, we developed a metric called "distance, d" to act as a simple algorithm for recognizing a text's writing category.
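A letter-frequency profile of the kind the abstract describes is straightforward to compute. The sketch below builds a relative-frequency vector over A–Z and compares two profiles with a Euclidean distance; the paper's "distance, d" metric is not specified here, so this distance function is a stand-in assumption:

```python
from collections import Counter
import math

def letter_freq(text):
    """Relative frequency of each letter A-Z in text (case-insensitive)."""
    letters = [c for c in text.upper() if c.isalpha() and c.isascii()]
    counts = Counter(letters)
    total = sum(counts.values())
    return {chr(65 + i): counts.get(chr(65 + i), 0) / total for i in range(26)}

def distance(freq_a, freq_b):
    """Euclidean distance between two frequency profiles
    (an assumed stand-in for the paper's 'distance, d')."""
    return math.sqrt(sum((freq_a[k] - freq_b[k]) ** 2 for k in freq_a))

sample = letter_freq("The council approved the measure on Tuesday.")
# 'E' dominates, consistent with the E, T, A, O, N ordering cited above.
```

Category recognition would then amount to computing d between an unseen document's profile and a reference profile per category, and picking the nearest one.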


Around the GLOBE: Numerical Aggregation Question-Answering on Heterogeneous Genealogical Knowledge Graphs with Deep Neural Networks

Suissa, Omri, Zhitomirsky-Geffet, Maayan, Elmalech, Avshalom

arXiv.org Artificial Intelligence

One of the key AI tools for exploring textual corpora is natural language question answering (QA). Unlike keyword-based search engines, QA algorithms receive and process natural language questions and produce precise answers, rather than long lists of documents that users must scan manually. State-of-the-art QA algorithms based on DNNs have been successfully employed in various domains. However, QA in the genealogical domain remains underexplored, even though researchers in this field (and in other fields in the humanities and social sciences) would greatly benefit from the ability to ask questions in natural language, receive concrete answers, and gain insights hidden within large corpora. While some research has recently been conducted on factual QA in the genealogical domain, to the best of our knowledge there is no previous research on the more challenging task of numerical aggregation QA (i.e., answering questions that combine aggregation functions such as count, average, and max). Numerical aggregation QA is critical for distant reading and analysis by researchers (and members of the general public) interested in investigating cultural heritage domains. Therefore, in this study, we present a new end-to-end methodology for numerical aggregation QA over genealogical trees that includes: 1) an automatic method for training dataset generation; 2) a transformer-based table selection method; and 3) an optimized transformer-based numerical aggregation QA model. The findings indicate that the proposed architecture, GLOBE, outperforms state-of-the-art models and pipelines, achieving 87% accuracy on this task compared to only 21% for current state-of-the-art models. This study may have practical implications for genealogical information centers and museums, making genealogical data research easy and scalable for experts as well as the general public.
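To make "numerical aggregation QA" concrete, here is a deliberately tiny rule-based sketch of answering count/average/max questions over a genealogical table. The schema, records, and string-matching "parser" are illustrative assumptions; GLOBE itself is transformer-based, not rule-based:

```python
# Toy genealogical table (invented records).
people = [
    {"name": "Ada",  "birth_year": 1815, "children": 3},
    {"name": "Ben",  "birth_year": 1842, "children": 0},
    {"name": "Cora", "birth_year": 1870, "children": 5},
]

def answer(question):
    """Map a question to an aggregation over the table."""
    values = [p["children"] for p in people]
    if question.startswith("count"):
        return len(people)
    if question.startswith("average children"):
        return sum(values) / len(values)
    if question.startswith("max children"):
        return max(values)
    raise ValueError("unsupported aggregation")

answer("count")          # 3
answer("max children")   # 5
```

The hard parts that GLOBE addresses, selecting the right table from a heterogeneous corpus and mapping free-form phrasing to the right aggregation, are exactly what this hard-coded dispatch glosses over.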


Question Answering with Deep Neural Networks for Semi-Structured Heterogeneous Genealogical Knowledge Graphs

Suissa, Omri, Zhitomirsky-Geffet, Maayan, Elmalech, Avshalom

arXiv.org Artificial Intelligence

With the rising popularity of user-generated genealogical family trees, new genealogical information systems have been developed. State-of-the-art natural question answering algorithms use deep neural network (DNN) architectures based on self-attention networks. However, some of these models use sequence-based inputs and are not suitable for graph-based structures, while graph-based DNN models rely on a level of comprehensiveness in knowledge graphs that is nonexistent in the genealogical domain. Moreover, these supervised DNN models require training datasets that are absent in the genealogical domain. This study proposes an end-to-end approach for question answering using genealogical family trees by: 1) representing genealogical data as knowledge graphs, 2) converting them to texts, 3) combining them with unstructured texts, and 4) training a transformer-based question answering model. To evaluate the need for a dedicated approach, a comparison was performed between the fine-tuned model (Uncle-BERT), trained on the auto-generated genealogical dataset, and state-of-the-art question-answering models. The findings indicate that there are significant differences between answering genealogical questions and open-domain questions. Moreover, the proposed methodology reduces complexity while increasing accuracy and may have practical implications for genealogical research and real-world projects, making genealogical data accessible to experts as well as the general public.
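The first two pipeline steps (knowledge graph, then text) can be sketched with simple relation triples and verbalization templates. The edges, names, and templates below are invented for illustration; the paper's actual conversion scheme may differ:

```python
# A family tree as (subject, relation, object) edges.
edges = [
    ("Miriam", "spouse_of", "David"),
    ("Miriam", "parent_of", "Noa"),
    ("David",  "parent_of", "Noa"),
]

# Per-relation verbalization templates (assumed, not the paper's).
TEMPLATES = {
    "spouse_of": "{0} is the spouse of {1}.",
    "parent_of": "{0} is a parent of {1}.",
}

def graph_to_text(edges):
    """Serialize graph edges into plain sentences a sequence model can consume."""
    return " ".join(TEMPLATES[rel].format(s, o) for s, rel, o in edges)

text = graph_to_text(edges)
# "Miriam is the spouse of David. Miriam is a parent of Noa. ..."
```

Once verbalized like this, the graph can be concatenated with unstructured biographical text and fed to a standard extractive QA model, which is the shape of the pipeline the abstract outlines.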


A Face Recognition Site Crawled the Web for Dead People's Photos

WIRED

Finding out Taylor Swift was her 11th cousin twice-removed wasn't even the most shocking discovery Cher Scarlett made while exploring her family history. "There's a lot of stuff in my family that's weird and strange that we wouldn't know without Ancestry," says Scarlett, a software engineer and writer based in Kirkland, Washington. "I didn't even know who my mum's paternal grandparents were." In February 2022, the facial recognition search engine PimEyes surfaced non-consensual explicit photos of her at age 19, reigniting decades-old trauma. She attempted to get the pictures removed from the platform, which uses images scraped from the internet to create biometric "faceprints" of individuals.


A.I. Has Helped Humans Know the Family Tree of the Milky Way

#artificialintelligence

Kindly give this article a like or a comment, so I know that you are still reading. I'm a big believer in A.I.'s ability to enable human civilization to become a multi-planetary species. I think artificial intelligence will be critical in enabling us to make this jump in the brief window afforded to us by time and history, since the risks of human extinction will only grow in the decades and centuries ahead. I'm always on the hunt for big stories about how A.I. is shaping our understanding of the world and driving business innovation. Sometimes, however, you have to look up.


Researchers use AI to create the Milky Way's family tree

#artificialintelligence

Artificial intelligence (AI) has helped create the first complete family tree of Earth's home galaxy, the Milky Way. An international team of researchers, led by astrophysicists Diederik Kruijssen of the University of Heidelberg and Joel Pfeffer of Liverpool John Moores University, published their work in Monthly Notices of the Royal Astronomical Society. The researchers used AI to analyse large groups of stars, each containing as many as a million stars, orbiting the Milky Way. "The Milky Way hosts over 150 such clusters, many of which formed in the smaller galaxies that merged to form the galaxy that we live in today," a Royal Astronomical Society (RAS) release noted. With the help of the latest models and observations, the researchers managed to use the clusters as "fossils" to reconstruct the history of the galaxies, it added.


Ancient Kraken hiding inside the Milky Way gets revealed by artificial intelligence

#artificialintelligence

The Milky Way has had a long and eventful life. Throughout its history, our galaxy has collided and merged with multiple other galaxies, events that are hard to disentangle and make sense of. With the aid of artificial intelligence, a team of astronomers took on this painstaking task, piecing together the most complete history of our galaxy to date, and the main attraction is something called the Kraken. Just as geologists look for fossils to see what ancient life might have looked like, astronomers also look for fossils of their own; but instead of trilobites or dinosaurs, astronomers are preoccupied with very old cosmic structures called globular clusters. Globular clusters are spherical, densely packed collections of ancient stars.