 balcony


Feature weighting for data analysis via evolutionary simulation

Daniilidis, Aris, Corella, Alberto Domínguez, Wissgott, Philipp

arXiv.org Artificial Intelligence

We analyze an algorithm for assigning weights prior to scalarization in discrete multi-objective problems arising from data analysis. The algorithm evolves the weights (the relevance of features) by a replicator-type dynamic on the standard simplex, with update indices computed from a normalized data matrix. We prove that the resulting sequence converges globally to a unique interior equilibrium, yielding non-degenerate limiting weights. The method, originally inspired by evolutionary game theory, differs from standard weighting schemes in that it is analytically tractable with provable convergence.
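As a rough illustration of a replicator-type update on the simplex, the sketch below evolves feature weights toward an interior fixed point. The abstract does not give the update indices, so everything specific here is an assumption: the column-max normalization, the column-mean scores, the damping parameter `alpha`, and the fitness rule `(score / weight) ** alpha` are hypothetical choices made only so the example converges to a non-degenerate limit. It is not the authors' algorithm.

```python
def normalize_columns(data):
    """Scale each column of a positive data matrix by its column maximum (assumed normalization)."""
    n_cols = len(data[0])
    col_max = [max(row[j] for row in data) for j in range(n_cols)]
    return [[row[j] / col_max[j] for j in range(n_cols)] for row in data]


def feature_scores(data):
    """Hypothetical per-feature score: mean of the normalized column."""
    norm = normalize_columns(data)
    n_rows = len(norm)
    return [sum(row[j] for row in norm) / n_rows for j in range(len(norm[0]))]


def replicator_step(weights, scores, alpha=0.5):
    """One replicator-type update: w_i <- w_i * f_i / sum_j(w_j * f_j).

    The fitness f_i = (s_i / w_i) ** alpha is an assumption chosen so that
    over-weighted features shrink and under-weighted features grow.
    """
    raw = [w * (s / w) ** alpha for w, s in zip(weights, scores)]
    total = sum(raw)
    return [r / total for r in raw]


def evolve_weights(data, steps=200, alpha=0.5):
    """Iterate the update from uniform weights on the standard simplex."""
    scores = feature_scores(data)
    n = len(scores)
    weights = [1.0 / n] * n
    for _ in range(steps):
        weights = replicator_step(weights, scores, alpha)
    return weights


if __name__ == "__main__":
    toy = [[1.0, 2.0, 4.0], [2.0, 2.0, 1.0], [4.0, 1.0, 2.0]]
    print(evolve_weights(toy))
```

With this particular fitness rule, the fixed point is proportional to the scores, so every weight stays strictly positive as long as every score is positive. A constant fitness, by contrast, would drive the weights to a vertex of the simplex.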


Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models

Jamialahmadi, Benyamin, Kavehzadeh, Parsa, Rezagholizadeh, Mehdi, Farinneya, Parsa, Rajabzadeh, Hossein, Jafari, Aref, Chen, Boxing, Tahaei, Marzieh S.

arXiv.org Artificial Intelligence

Deploying large language models (LLMs) in real-world applications is often hindered by strict computational and latency constraints. While dynamic inference offers the flexibility to adjust model behavior based on varying resource budgets, existing methods are frequently limited by hardware inefficiencies or performance degradation. In this paper, we introduce Balcony, a simple yet highly effective framework for depth-based dynamic inference. By freezing the pretrained LLM and inserting additional transformer layers at selected exit points, Balcony maintains the full model's performance while enabling real-time adaptation to different computational budgets. These additional layers are trained using a straightforward self-distillation loss, aligning the sub-model outputs with those of the full model. This approach requires significantly fewer training tokens and tunable parameters, drastically reducing computational costs compared to prior methods. When applied to the LLaMA3-8B model, using only 0.2% of the original pretraining data, Balcony achieves minimal performance degradation while enabling significant speedups. Remarkably, we show that Balcony outperforms state-of-the-art methods such as Flextron and Layerskip as well as other leading compression techniques on multiple models and at various scales, across a variety of benchmarks.
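The self-distillation objective mentioned above can be sketched in a few lines. The abstract only says the loss aligns sub-model outputs with those of the full model, so the specific form used here, a KL divergence from the full model's next-token distribution to the exit's distribution averaged over positions, is an assumption, and the function names are hypothetical. Plain Python lists stand in for logit tensors.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def self_distillation_loss(full_logits, exit_logits):
    """Mean KL(full || exit) over token positions.

    full_logits: per-position logits from the frozen full model (teacher).
    exit_logits: per-position logits from an early-exit sub-model (student).
    The KL form is an assumed choice, not taken from the paper.
    """
    total = 0.0
    for teacher, student in zip(full_logits, exit_logits):
        t_logp = log_softmax(teacher)
        s_logp = log_softmax(student)
        total += sum(math.exp(tp) * (tp - sp) for tp, sp in zip(t_logp, s_logp))
    return total / len(full_logits)


if __name__ == "__main__":
    full = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
    early = [[1.5, 0.7, -0.5], [0.3, 0.2, 0.1]]
    print(self_distillation_loss(full, early))
```

In a setup like the one described, only the parameters that produce `exit_logits` would receive gradients, since the pretrained model is frozen.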


UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces

Zhao, Baining, Fang, Jianjie, Dai, Zichao, Wang, Ziyou, Zha, Jirong, Zhang, Weichen, Gao, Chen, Wang, Yue, Cui, Jinqiang, Chen, Xinlei, Li, Yong

arXiv.org Artificial Intelligence

Large multimodal models exhibit remarkable intelligence, yet their embodied cognitive abilities during motion in open-ended urban 3D space remain to be explored. We introduce a benchmark to evaluate whether video-large language models (Video-LLMs) can naturally process continuous first-person visual observations like humans, enabling recall, perception, reasoning, and navigation. We manually control drones to collect 3D embodied motion video data from real-world cities and simulated environments, resulting in 1.5k video clips. We then design a pipeline to generate 5.2k multiple-choice questions. Evaluations of 17 widely-used Video-LLMs reveal current limitations in urban embodied cognition. Correlation analysis provides insight into the relationships between different tasks, showing that causal reasoning has a strong correlation with recall, perception, and navigation, while the abilities for counterfactual and associative reasoning exhibit lower correlation with other tasks. We also validate the potential for Sim-to-Real transfer in urban embodiment through fine-tuning.


Text Semantics to Flexible Design: A Residential Layout Generation Method Based on Stable Diffusion Model

Qiu, Zijin, Liu, Jiepeng, Xia, Yi, Qi, Hongtuo, Liu, Pengkun

arXiv.org Artificial Intelligence

Flexibility in AI-based residential layout design remains a significant challenge, as traditional methods like rule-based heuristics and graph-based generation often lack flexibility and require substantial design knowledge from users. To address these limitations, we propose a cross-modal design approach based on the Stable Diffusion model for generating flexible residential layouts. The method offers multiple input types for learning objectives, allowing users to specify both boundaries and layouts. It incorporates natural language as design constraints and introduces ControlNet to enable stable layout generation through two distinct pathways. We also present a scheme that encapsulates design expertise within a knowledge graph and translates it into natural language, providing an interpretable representation of design knowledge. This comprehensibility and diversity of input options enable professionals and non-professionals to directly express design requirements, enhancing flexibility and controllability. Finally, experiments show that the proposed method is more flexible under multimodal constraints than state-of-the-art models, even when specific semantic information about room areas or connections is incomplete.


AI envisions the 'perfect' homes in 20 UK cities - from a pastel pink property in London to a Bond villain-style house in Portsmouth

Daily Mail - Science & tech

Whether it's a grand stately home or a futuristic apartment, we all have different ideas of what we think the 'perfect home' looks like. Now the AI tool Midjourney has revealed what it envisions the perfect home looks like in 20 UK cities. 'The AI-generated representations of houses across the country are captivating,' said Kunle Barker, property expert and content creator for Grand Designs Live. 'They skilfully encapsulate the architectural heritage of various regions, the current state of homes, and, most importantly, envision their future possibilities.' Barbie fans rejoice - the perfect home in London is pastel pink, according to Midjourney. It's known for its industrial history, and that's certainly reflected in Manchester's perfect home.


Tell2Design: A Dataset for Language-Guided Floor Plan Generation

Leng, Sicong, Zhou, Yang, Dupty, Mohammed Haroon, Lee, Wee Sun, Joyce, Sam Conrad, Lu, Wei

arXiv.org Artificial Intelligence

We consider the task of generating designs directly from natural language descriptions, and consider floor plan generation as the initial research area. Language conditional generative models have recently been very successful in generating high-quality artistic images. However, designs must satisfy different constraints that are not present in generating artistic images, particularly spatial and relational constraints. We make multiple contributions to initiate research on this task. First, we introduce a novel dataset, Tell2Design (T2D), which contains more than 80k floor plan designs associated with natural language instructions. Second, we propose a Sequence-to-Sequence model that can serve as a strong baseline for future research. Third, we benchmark this task with several text-conditional image generation models. We conclude by conducting human evaluations on the generated samples and providing an analysis of human performance. We hope our contributions will propel the research on language-guided design generation forward.


Advancing Urban Renewal: An Automated Approach to Generating Historical Arcade Facades with Stable Diffusion Models

Kuang, Zheyuan, Zhang, Jiaxin, Huang, Yiying, Li, Yunqin

arXiv.org Artificial Intelligence

Urban renewal and transformation processes necessitate the preservation of the historical urban fabric, particularly in districts known for their architectural and historical significance. These regions, with their diverse architectural styles, have traditionally required extensive preliminary research, often leading to subjective results. However, the advent of machine learning models has opened up new avenues for generating building facade images. Despite this, creating high-quality images for historical district renovations remains challenging, due to the complexity and diversity inherent in such districts. In response to these challenges, our study introduces a new methodology for automatically generating images of historical arcade facades, utilizing Stable Diffusion models conditioned on textual descriptions. By classifying and tagging a variety of arcade styles, we have constructed several realistic arcade facade image datasets. We trained multiple low-rank adaptation (LoRA) models to control the stylistic aspects of the generated images, supplemented by ControlNet models for improved precision and authenticity. Our approach has demonstrated high levels of precision, authenticity, and diversity in the generated images, showing promising potential for real-world urban renewal projects. This new methodology offers a more efficient and accurate alternative to conventional design processes in urban renewal, bypassing issues of unconvincing image details, lack of precision, and limited stylistic variety. Future research could focus on integrating this two-dimensional image generation with three-dimensional modeling techniques, providing a more comprehensive solution for renovating architectural facades in historical districts.


When You See Yourself in a Robot

Slate

It will be foggy tonight; visibility will be bad tomorrow. The pollen count is high. You have given Galatea your old wool hat, and she looks soft and childlike, as if she has just returned from a long hiking trip and is about to fall asleep. "Do you think they'll remember you?"


EXPLAINER: A look at the missile that killed al-Qaida leader

Associated Press

For a year, U.S. officials have been saying that taking out a terrorist threat in Afghanistan with no American troops on the ground would be difficult but not impossible. Last weekend, the U.S. did just that -- killing al-Qaida leader Ayman al-Zawahri with a CIA drone strike. Other high-profile airstrikes in the past had inadvertently killed innocent civilians. In this case, the U.S. carefully chose to use a type of Hellfire missile that greatly minimized the chance of other casualties. Although U.S. officials have not publicly confirmed which variant of the Hellfire was used, experts and others familiar with counterterrorism operations said a likely option was the highly secretive Hellfire R9X -- known by various nicknames, including the "knife bomb" or the "flying Ginsu."


EXPLAINER: Who was al-Zawahri -- and why did US kill him?

Associated Press

A U.S. drone strike in Afghanistan this weekend killed Ayman al-Zawahri, who helped Osama bin Laden plot the Sept. 11, 2001, attacks on the United States and ensured al-Qaida survived and spread in the years after. President Joe Biden on Monday announced the killing of al-Zawahri, delivering a significant counterterrorism win just 11 months after American troops left the country. A look at the al-Qaida leader, who evaded U.S. capture for 21 years after the suicide airliner attacks that in many ways changed America and its relations with the rest of the world. Americans who lived through the 9/11 attacks may not remember al-Zawahri's name, but many know his face more than two decades on: a man in glasses, slightly smiling, invariably shown in photos by the side of bin Laden as the two arranged the strike on the United States. An Egyptian, al-Zawahri was born June 19, 1951, to a comfortable family in a leafy, drowsy Cairo suburb.