 architectural design


Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition

Neural Information Processing Systems

Federated Learning (FL) is a promising research paradigm that enables the collaborative training of machine learning models among various parties without the need for sensitive information exchange. Nonetheless, retaining data in individual clients introduces fundamental challenges to achieving performance on par with centrally trained models. Our study provides an extensive review of federated learning applied to visual recognition. It underscores the critical role of thoughtful architectural design choices in achieving optimal performance, a factor often neglected in the FL literature. Many existing FL solutions are tested on shallow or simple networks, which may not accurately reflect real-world applications.


Auto-Compressing Networks

Dorovatas, Vaggelis, Paraskevopoulos, Georgios, Potamianos, Alexandros

arXiv.org Artificial Intelligence

Deep neural networks with short residual connections have demonstrated remarkable success across domains, but increasing depth often introduces computational redundancy without corresponding improvements in representation quality. We introduce Auto-Compressing Networks (ACNs), an architectural variant where additive long feedforward connections from each layer to the output replace traditional short residual connections. By analyzing the distinct dynamics induced by this modification, we reveal a unique property we term auto-compression: the ability of a network to organically compress information during training with gradient descent, through architectural design alone. Through auto-compression, information is dynamically "pushed" into early layers during training, enhancing their representational quality and revealing potential redundancy in deeper ones. We theoretically show that this property emerges from layer-wise training patterns present in ACNs, where layers are dynamically utilized during training based on task requirements. We also find that ACNs exhibit enhanced noise robustness compared to residual networks, superior performance in low-data settings, improved transfer learning capabilities, and reduced catastrophic forgetting, suggesting that they learn representations that generalize better despite using fewer parameters. Our results demonstrate up to 18% reduction in catastrophic forgetting and 30-80% architectural compression while maintaining accuracy across vision transformers, MLP-mixers, and BERT architectures. These findings establish ACNs as a practical approach to developing efficient neural architectures that automatically adapt their computational footprint to task complexity, while learning robust representations suitable for noisy real-world tasks and continual learning scenarios.
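The contrast between short residual connections and long connections to the output can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: the function names, the toy `scale` layers, and the choice to include the input itself in the output sum are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def residual_forward(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Short residual connections: each layer's output is added to its own input."""
    for layer in layers:
        x = add(x, layer(x))
    return x


def acn_forward(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Long connections (assumed form): layers are chained without skips,
    and every intermediate activation is summed directly into the output."""
    total = list(x)  # assumption: the input also contributes to the output sum
    hidden = x
    for layer in layers:
        hidden = layer(hidden)      # no short skip around the layer
        total = add(total, hidden)  # long additive connection to the output
    return total


def scale(factor: float) -> Layer:
    """Hypothetical toy layer that multiplies every element by a constant."""
    return lambda v: [factor * t for t in v]


layers = [scale(0.5), scale(0.5)]
res_out = residual_forward([1.0, 2.0], layers)
acn_out = acn_forward([1.0, 2.0], layers)
```

In this toy setup the two forward passes differ only in where the additions happen: inside each block for the residual variant, and once at the output for the long-connection variant.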


dataset release, tournament evaluation, architectural design, input representation, and other insights

Neural Information Processing Systems

We want to thank the reviewers for their helpful comments. The dataset will be made available to any interested researchers. We agree with R3 that there are a lot of non-trivial modeling choices in our architecture. We call the first one unit-based and the latter token-based. We apologize for writing some of the claims without referring to the evidence, like "orders from the last movement ..." Our input representation is a result of both empirical findings and domain knowledge.


Text-to-Layout: A Generative Workflow for Drafting Architectural Floor Plans Using LLMs

Duggempudi, Jayakrishna, Gao, Lu, Senouci, Ahmed, Han, Zhe, Zhang, Yunpeng

arXiv.org Artificial Intelligence

This paper presents the development of an AI-powered workflow that uses Large Language Models (LLMs) to assist in drafting schematic architectural floor plans from natural language prompts. The proposed system interprets textual input to automatically generate layout options including walls, doors, windows, and furniture arrangements. It combines prompt engineering, a furniture placement refinement algorithm, and Python scripting to produce spatially coherent draft plans compatible with design tools such as Autodesk Revit. A case study of a mid-sized residential layout demonstrates the approach's ability to generate functional and structured outputs with minimal manual effort. The workflow is designed for transparent replication, with all key prompt specifications documented to enable independent implementation by other researchers. In addition, the generated models preserve the full range of Revit-native parametric attributes required for direct integration into professional BIM processes.


FloorplanMAE: A self-supervised framework for complete floorplan generation from partial inputs

Yin, Jun, Zhong, Jing, Zeng, Pengyu, Li, Peilin, Zhang, Miao, Luo, Ran, Lu, Shuai

arXiv.org Artificial Intelligence

In architectural design, floorplan design is often a dynamic and iterative process. Architects progressively draw various parts of the floorplan according to their ideas and requirements, continuously adjusting and refining as they go. Therefore, the ability to predict a complete floorplan from a partial one holds significant value. Such prediction can help architects quickly generate preliminary designs, improve design efficiency, and reduce the workload associated with repeated modifications. To address this need, we propose FloorplanMAE, a self-supervised learning framework for restoring incomplete floorplans into complete ones. First, we developed a floorplan reconstruction dataset, FloorplanNet, dedicated specifically to architectural floorplans. Second, we propose a floorplan reconstruction method based on Masked Autoencoders (MAE), which masks sections of the floorplan and trains a lightweight Vision Transformer (ViT) to reconstruct the missing parts. We evaluated the reconstruction accuracy of FloorplanMAE and compared it with state-of-the-art benchmarks. Additionally, we validated the model using real sketches from the early stages of architectural design. Experimental results show that the FloorplanMAE model can generate high-quality complete floorplans from partial ones. This framework provides a scalable solution for floorplan generation, with broad application prospects.
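The masking step that MAE-style training relies on can be sketched as follows. This is a generic illustration of random patch masking, not the FloorplanMAE code: the function name, the 16-patch grid, and the 0.75 mask ratio are assumptions chosen for the example, and the abstract does not state the ratio actually used.

```python
import random
from typing import List, Tuple


def random_patch_mask(
    num_patches: int, mask_ratio: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split patch indices into (visible, masked) sets at random.

    In an MAE-style setup the encoder would see only the visible patches,
    and the model would be trained to reconstruct the masked ones.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be between 0 and 1")
    rng = random.Random(seed)  # local generator keeps the split reproducible
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Hypothetical 4x4 grid of floorplan patches with three quarters hidden.
visible, masked = random_patch_mask(num_patches=16, mask_ratio=0.75)
```

For partial-floorplan completion, the masked set could instead be a contiguous region standing in for the part an architect has not drawn yet; the random split above is only the simplest variant.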


Insights Informed Generative AI for Design: Incorporating Real-world Data for Text-to-Image Output

Gupta, Richa, Kyaw, Alexander Htet

arXiv.org Artificial Intelligence

Generative AI, specifically text-to-image models, have revolutionized interior architectural design by enabling the rapid translation of conceptual ideas into visual representations from simple text prompts. While generative AI can produce visually appealing images, they often lack actionable data for designers. In this work, we propose a novel pipeline that integrates DALL-E 3 with a materials dataset to enrich AI-generated designs with sustainability metrics and material usage insights. After the model generates an interior design image, a post-processing module identifies the top ten materials present and pairs them with carbon dioxide equivalent (CO₂e) values from a general materials dictionary. This approach allows designers to immediately evaluate environmental impacts and refine prompts accordingly. We evaluate the system through three user tests: (1) no mention of sustainability to the user prior to the prompting process with generative AI, (2) sustainability goals communicated to the user before prompting, and (3) sustainability goals communicated along with quantitative CO₂e data included in the generative AI outputs. Our qualitative and quantitative analyses reveal that the introduction of sustainability metrics in the third test leads to more informed design decisions, however, it can also trigger decision fatigue and lower overall satisfaction. Nevertheless, the majority of participants reported incorporating sustainability principles into their workflows in the third test, underscoring the potential of integrated metrics to guide more ecologically responsible practices. Our findings showcase the importance of balancing design freedom with practical constraints, offering a clear path toward holistic, data-driven solutions in AI-assisted architectural design.
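The post-processing step, pairing the most prominent materials with values from a materials dictionary, might look roughly like the sketch below. Everything here is hypothetical: the function name, the coverage scores, and the lookup table are invented for illustration, and the numbers are placeholders rather than real emission factors.

```python
from typing import Dict, List, Optional, Tuple

# PLACEHOLDER values only: these are NOT real CO2e figures.
PLACEHOLDER_CO2E: Dict[str, float] = {"wood": 1.0, "concrete": 2.0, "glass": 3.0}


def top_materials_with_co2e(
    detected: Dict[str, float],
    lookup: Dict[str, float],
    top_k: int = 10,
) -> List[Tuple[str, Optional[float]]]:
    """Rank detected materials by score and attach a CO2e value to each.

    `detected` maps a material name to a prominence score (assumed here to
    be the fraction of the image it covers). Materials missing from the
    lookup table are reported with None instead of a guessed value.
    """
    ranked = sorted(detected.items(), key=lambda item: item[1], reverse=True)
    return [(name, lookup.get(name)) for name, _ in ranked[:top_k]]


# Hypothetical detector output for one generated interior image.
detected = {"glass": 0.10, "wood": 0.55, "concrete": 0.30, "rattan": 0.05}
report = top_materials_with_co2e(detected, PLACEHOLDER_CO2E, top_k=3)
```

Returning `None` for unknown materials keeps gaps in the dictionary visible to the designer rather than silently dropping them.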


ArchSeek: Retrieving Architectural Case Studies Using Vision-Language Models

Li, Danrui, Shi, Yichao, Wang, Yaluo, Shi, Ziying, Kapadia, Mubbasir

arXiv.org Artificial Intelligence

Efficiently searching for relevant case studies is critical in architectural design, as designers rely on precedent examples to guide or inspire their ongoing projects. However, traditional text-based search tools struggle to capture the inherently visual and complex nature of architectural knowledge, often leading to time-consuming and imprecise exploration. This paper introduces ArchSeek, an innovative case study search system with recommendation capability, tailored for architecture design professionals. Powered by the visual understanding capabilities from vision-language models and cross-modal embeddings, it enables text and image queries with fine-grained control, and interaction-based design case recommendations. It offers architects a more efficient, personalized way to discover design inspirations, with potential applications across other visually driven design fields. The source code is available at https://github.com/danruili/ArchSeek.
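Retrieval over cross-modal embeddings usually comes down to ranking stored vectors by similarity to a query vector. The sketch below shows that ranking step with cosine similarity. It is a generic illustration, not code from the ArchSeek repository: the function names, the two-dimensional vectors, and the case names are assumptions, and a real system would obtain embeddings from a vision-language model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_cases(
    query: Sequence[float],
    case_embeddings: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k case studies most similar to the query embedding."""
    scored = [(name, cosine(query, vec)) for name, vec in case_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; text and image queries would share this space.
cases = {"case_a": [1.0, 0.0], "case_b": [0.0, 1.0], "case_c": [1.0, 1.0]}
results = rank_cases([1.0, 0.1], cases, top_k=2)
```

Because text and image queries are embedded into the same space, the same ranking function can serve both kinds of query.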


Machine Apophenia: The Kaleidoscopic Generation of Architectural Images

Tikhonov, Alexey, Sinyavin, Dmitry

arXiv.org Artificial Intelligence

This study investigates the application of generative artificial intelligence in architectural design. We present a novel methodology that combines multiple neural networks to create an unsupervised and unmoderated stream of unique architectural images. Our approach is grounded in a conceptual framework called machine apophenia. We hypothesize that neural networks, trained on diverse human-generated data, internalize aesthetic preferences and tend to produce coherent designs even from random inputs. The methodology involves an iterative process of image generation, description, and refinement, resulting in captioned architectural postcards automatically shared on several social media platforms. Evaluation and ablation studies show improvement in both the technical and aesthetic metrics of the resulting images at each step.


Form Forge: Latent Space Exploration of Architectural Forms via Explicit Latent Variable Manipulation

Dunnell, Kevin, Lippman, Andy

arXiv.org Artificial Intelligence

This paper presents 'Form Forge,' a prototype of a creative system for interactively exploring the latent space of architectural forms, inspired by François Blanciak's SITELESS: 1001 Building Forms, via direct manipulation of latent variables. Utilizing a fine-tuned StyleGAN2-ADA model, the system allows users to navigate an array of possible building forms derived from Blanciak's sketches. Distinct from common latent space exploration tools that often rely on projected navigation landmarks, Form Forge provides direct access to manipulate each latent variable, aiming to offer a more granular exploration of the model's capabilities. Form Forge's design is intended to simplify the interaction with a complex, high-dimensional space and to serve as a preliminary investigation into how such tools might support creative processes in architectural design.