How Language Directions Align with Token Geometry in Multilingual LLMs

Kim, JaeSeong, Lee, Suan

arXiv.org Artificial Intelligence

Multilingual LLMs demonstrate strong performance across diverse languages, yet there has been limited systematic analysis of how language information is structured within their internal representation space and how it emerges across layers. We conduct a comprehensive probing study on six multilingual LLMs, covering all 268 transformer layers, using linear and nonlinear probes together with a new Token-Language Alignment analysis to quantify the layer-wise dynamics and geometric structure of language encoding. Our results show that language information becomes sharply separated in the first transformer block (+76.4 ± 8.2 percentage points from Layer 0 to 1) and remains almost fully linearly separable throughout model depth. We further find that the alignment between language directions and vocabulary embeddings is strongly tied to the language composition of the training data. Notably, Chinese-inclusive models achieve a ZH Match@Peak of 16.43%, whereas English-centric models achieve only 3.90%, revealing a 4.21× structural imprinting effect. These findings indicate that multilingual LLMs distinguish languages not by surface script features but by latent representational structures shaped by the training corpus. Our analysis provides practical insights for data composition strategies and fairness in multilingual representation learning. All code and analysis scripts are publicly available at: https://github.com/thisiskorea/How-Language-Directions-Align-with-Token-Geometry-in-Multilingual-LLMs.
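To make "linearly separable" concrete, the following is a minimal sketch of a linear probe on synthetic data. It is not the authors' probe or data: the helper names (`make_hidden_states`, `fit_mean_difference_probe`, `probe_accuracy`), the two-language setup, and the Gaussian "hidden states" are all assumptions made for illustration. The probe here is the simplest linear classifier available, a mean-difference direction with a midpoint threshold.

```python
import random

def make_hidden_states(n_per_lang, dim, shift, seed=0):
    """Synthetic stand-in for layer activations of two languages (assumption)."""
    rng = random.Random(seed)
    data = []
    for label, sign in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_lang):
            vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            vec[0] += sign * shift  # languages differ along one latent direction
            data.append((vec, label))
    return data

def fit_mean_difference_probe(data):
    """Fit a linear probe: w = mean(lang 1) - mean(lang 0), threshold at the midpoint."""
    dim = len(data[0][0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for vec, label in data:
        counts[label] += 1
        for i, x in enumerate(vec):
            sums[label][i] += x
    mu0 = [s / counts[0] for s in sums[0]]
    mu1 = [s / counts[1] for s in sums[1]]
    w = [a - b for a, b in zip(mu1, mu0)]
    midpoint = [(a + b) / 2 for a, b in zip(mu1, mu0)]
    bias = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, bias

def probe_accuracy(data, w, bias):
    """Fraction of vectors whose sign of w.x + bias matches the language label."""
    correct = 0
    for vec, label in data:
        score = sum(wi * xi for wi, xi in zip(w, vec)) + bias
        correct += int((score > 0) == (label == 1))
    return correct / len(data)

train = make_hidden_states(200, 8, 3.0, seed=0)
test = make_hidden_states(200, 8, 3.0, seed=1)
w, b = fit_mean_difference_probe(train)
accuracy = probe_accuracy(test, w, b)
print(f"held-out probe accuracy: {accuracy:.3f}")
```

In a real study the vectors would come from a model's per-layer hidden states, and the probe would be fitted and evaluated once per layer to trace how separability changes with depth.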


Kissing to Find a Match: Efficient Low-Rank Permutation Representation - Supplementary Material

Neural Information Processing Systems

Following our shape-matching experiments described in Sec. The recorded time values align with the accuracy measurements presented in Figure 1b. Moreover, it's possibly also necessary to adapt a network architecture that predicts the


Supplementary Materials Online Map Vectorization for Autonomous Driving: A Rasterization Perspective

Neural Information Processing Systems

The base model takes surround-view images of the ego-vehicle as input. As shown in Figure 1, we provide further visual comparisons of HD map vectorization results. The results reaffirm the necessity of a rasterization perspective in map vectorization. Figure 1 presents more visualization of MapVR's HD map construction results. As discussed in Section 3, the Chamfer-distance-based metric struggles to offer a fair evaluation for such scenarios.


any ground-truth visual relationship annotations, avoiding the challenging manual annotation of visual relationships;

Neural Information Processing Systems

We thank all the reviewers for their efforts and constructive comments! Below we address the important and common issues. On the other hand, the probing loss can further help improve the performance. As mentioned by R4, "this paper introduces a new and BLEU between captions (query image) and reference captions (retrieved images) in Table B. We see that 'Obj.+Rel.' Table B: Results on 1K query images randomly sampled from MSCOCO.


Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient

Neural Information Processing Systems

Across scientific domains, generating new models or optimizing existing ones while meeting specific criteria is crucial. Traditional machine learning frameworks for guided design use a generative model and a surrogate model (discriminator), requiring large datasets. However, real-world scientific applications often have limited data and complex landscapes, making data-hungry models inefficient or impractical. We propose a new framework, PropEn, inspired by "matching", which enables implicit guidance without training a discriminator. By matching each sample with a similar one that has a better property value, we create a larger training dataset that inherently indicates the direction of improvement.
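The matching step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the PropEn implementation: the function name `match_pairs`, the Euclidean distance, and the single `max_distance` threshold are hypothetical choices, and the toy samples and property values are made up.

```python
import math

def match_pairs(samples, properties, max_distance):
    """Pair each sample with every nearby sample that has a strictly better property.

    Returns (source, target) tuples. A model trained to map source -> target
    would implicitly learn a direction of improvement.
    """
    pairs = []
    for i, (x_i, y_i) in enumerate(zip(samples, properties)):
        for j, (x_j, y_j) in enumerate(zip(samples, properties)):
            is_close = math.dist(x_i, x_j) <= max_distance
            if i != j and is_close and y_j > y_i:
                pairs.append((x_i, x_j))
    return pairs

# Toy data: five 2-D designs with a scalar property (higher is better).
samples = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (0.5, 0.5), (5.0, 5.0)]
properties = [1.0, 2.0, 3.0, 4.0, 10.0]

pairs = match_pairs(samples, properties, max_distance=1.0)
for source, target in pairs:
    print(source, "->", target)
```

On this toy input the five samples yield six matched pairs, which shows how matching can enlarge the training set. The distant high-property point at (5.0, 5.0) is never used as a target because it lies outside the similarity threshold.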


A David vs Goliath battle unfolding in the dating app industry

Al Jazeera

More than a decade ago, when Shahzad Younas started a website specifically for Muslims to meet and marry, he thought his problems would be the typical kind – attracting users, expanding the business, earning a profit. Instead, his biggest hurdle has been figuring out how to fend off a competitor that is suing him in multiple countries on multiple fronts with the aim, he said, of "stifling competition". Younas, 38, a British investment banker turned entrepreneur, has been butting heads since 2016 with the online dating giant Match Group, which owns Match.com. At issue are elements of his website's branding – elements that Match has argued create confusion between its platforms and Younas's. The latest blow came in late April when Younas lost a trademark appeal in the United Kingdom.


The Woman Who Made Online Dating Into a 'Science'

The Atlantic - Technology

The anthropologist and famed love expert Helen Fisher seemed ready to dash into oncoming traffic. We were on a sidewalk in Manhattan, opposite the American Museum of Natural History, and nowhere near a safe place to cross the street. She wanted me to stare down the yellow cabs and charge off the curb, though she knew I wouldn't do it: I'd recently taken the personality questionnaire she wrote 17 years ago for a dating website, which produced the insight that I am a cautious, conventional rule follower. She, however, is an "explorer"--she has visited 111 countries, including North Korea--but also, being high in estrogen, a "negotiator" who will use the crosswalk for my benefit. "I am horribly empathetic," she told me. "I look into baby carriages and worry about their future with love." This is how Fisher, the 77-year-old chief scientific adviser for Match.com and one of the best-known, most-often-quoted experts on romance and "mate choice," understands life: Personality is a cocktail of ...


Grindr Public Listing Can't Keep It Casual

WSJ.com: WSJD - Technology

Investors will soon be able to hook up with the world's most-popular gay-dating platform. A merger with the special-purpose acquisition company Tiga Acquisition, announced in May, values Grindr at $2.1 billion and is expected to close by the end of the year. As with any SPAC merger, historical details on the business are slim. In online dating, though, a snapshot often says all you need to know. Grindr's popularity relative to its total market size is impressive.


Better Smatch = Better Parser? AMR evaluation is not so simple anymore

Opitz, Juri, Frank, Anette

arXiv.org Artificial Intelligence

Recently, astonishing advances have been observed in AMR parsing, as measured by the structural Smatch metric. In fact, today's systems achieve performance levels that seem to surpass estimates of human inter annotator agreement (IAA). Therefore, it is unclear how well Smatch (still) relates to human estimates of parse quality, as in this situation potentially fine-grained errors of similar weight may impact the AMR's meaning to different degrees. We conduct an analysis of two popular and strong AMR parsers that -- according to Smatch -- reach quality levels on par with human IAA, and assess how human quality ratings relate to Smatch and other AMR metrics. Our main findings are: i) While high Smatch scores indicate otherwise, we find that AMR parsing is far from being solved: we frequently find structurally small, but semantically unacceptable errors that substantially distort sentence meaning. ii) Considering high-performance parsers, better Smatch scores may not necessarily indicate consistently better parsing quality. To obtain a meaningful and comprehensive assessment of quality differences of parse(r)s, we recommend augmenting evaluations with macro statistics, use of additional metrics, and more human analysis.
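The point that a structurally small error can be semantically severe can be illustrated with a simplified triple-overlap F1. This sketch is not the Smatch implementation: real Smatch searches for the best variable mapping between the two graphs, whereas the hypothetical `triple_f1` below assumes the variables are already aligned. The example graphs are invented for illustration.

```python
def triple_f1(gold, predicted):
    """F1 over exactly matching triples, assuming variables are pre-aligned."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    if matched == 0:
        return 0.0
    precision = matched / len(pred_set)
    recall = matched / len(gold_set)
    return 2 * precision * recall / (precision + recall)

# Hypothetical gold graph for "The boy does not want to go."
gold = [
    ("instance", "w", "want-01"),
    ("instance", "b", "boy"),
    ("instance", "g", "go-02"),
    ("ARG0", "w", "b"),
    ("ARG1", "w", "g"),
    ("ARG0", "g", "b"),
    ("polarity", "w", "-"),
]

# Hypothetical parser output that drops only the negation triple.
predicted = [t for t in gold if t[0] != "polarity"]

score = triple_f1(gold, predicted)
print(f"triple F1: {score:.3f}")
```

Here one missing triple out of seven leaves the score at 12/13, about 0.923, even though the predicted graph now states the opposite of the sentence. That is the kind of gap between a structural score and meaning preservation that the abstract describes.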


What Happens When Artificial Intelligence Creates Images to Match the Lyrics of Iconic Songs: David Bowie's "Starman," Led Zeppelin's "Stairway to Heaven", ELO's "Mr. Blue Sky" & More

#artificialintelligence

Lyricists must write concretely enough to be evocative, yet vaguely enough to allow each listener his personal interpretation. The nineteen-sixties and seventies saw an especially rich balance struck between resonant ambiguity and massive popularity -- aided, as many involved parties have admitted, by the use of certain psychoactive substances. Half a century later, the visions induced by those same substances offer the closest comparison to the striking fruits of visual artificial-intelligence projects like Google's Deep Dream a few years ago or DALL-E today. Only natural, perhaps, that these advanced applications would sooner or later be fed psychedelic song lyrics. The video at the top of the post presents the Electric Light Orchestra's 1977 hit "Mr. Blue Sky" illustrated by images generated by artificial intelligence straight from its words.