RobuNFR: Evaluating the Robustness of Large Language Models on Non-Functional Requirements Aware Code Generation
Lin, Feng, Kim, Dong Jae, Li, Zhenhao, Yang, Jinqiu, Chen, Tse-Hsun
When using LLMs to address Non-Functional Requirements (NFRs), developers may behave differently (e.g., expressing the same NFR in different words). Robust LLMs should output consistent results across these variations; however, this aspect remains underexplored. We propose RobuNFR for evaluating the robustness of LLMs in NFR-aware code generation across four NFR dimensions: design, readability, reliability, and performance, using three methodologies: prompt variation, regression testing, and diverse workflows. Our experiments show that RobuNFR reveals robustness issues in the tested LLMs when considering NFRs in code generation. Specifically, under prompt variation, including NFRs leads to a decrease in Pass@1 by up to 39 percent and an increase in the standard deviation from 0.48 to 2.48 compared to the baseline without NFRs (i.e., Function-Only). While incorporating NFRs generally improves overall NFR metrics, it also results in higher prompt sensitivity. In regression settings, some LLMs exhibit differences across versions, with improvements in one aspect (e.g., reduced code smells) often accompanied by regressions in another (e.g., decreased correctness), revealing inconsistencies that challenge their robustness. When varying workflows, the tested LLMs show significantly different NFR-aware code generation capabilities between two workflows: (1) integrating NFRs and functional requirements into the initial prompt and (2) enhancing Function-Only-generated code with the same NFR.
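A minimal sketch of the prompt-variation check described above: the same task is posed under several paraphrased NFR prompts, and the spread of Pass@1 across paraphrases is reported. This is not the RobuNFR harness; generate_code, passes_tests, and the sample count are hypothetical stand-ins.

```python
# Hedged sketch of measuring robustness to prompt variation (not the RobuNFR code).
import statistics
from typing import Callable, List

def pass_at_1_per_variant(prompts: List[str],
                          generate_code: Callable[[str], str],
                          passes_tests: Callable[[str], bool],
                          n_samples: int = 20) -> List[float]:
    """Return one Pass@1 estimate per paraphrased prompt variant."""
    scores = []
    for prompt in prompts:
        hits = sum(passes_tests(generate_code(prompt)) for _ in range(n_samples))
        scores.append(hits / n_samples)
    return scores

def robustness_report(scores: List[float]) -> dict:
    # A robust model should show a small standard deviation across paraphrases.
    return {"mean_pass@1": statistics.mean(scores),
            "std_pass@1": statistics.stdev(scores) if len(scores) > 1 else 0.0}
```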
GraphCompNet: A Position-Aware Model for Predicting and Compensating Shape Deviations in 3D Printing
Chen, Lei, Lee, Juheon, Catana, Juan Carlos, Yhdego, Tsegai, Moroney, Nathan, Nabian, Mohammad Amin, Wang, Hui, Zeng, Jun
This paper introduces a data-driven algorithm for modeling and compensating shape deviations in additive manufacturing (AM), addressing challenges in geometric accuracy and batch production. While traditional methods, such as analytical models and metrology, laid the groundwork for geometric precision, they are often impractical for large-scale production. Recent advancements in machine learning (ML) have improved compensation precision, but issues remain in generalizing across complex geometries and adapting to position-dependent variations. For powder bed fusion (PBF) processes, we present GraphCompNet, a computational framework that combines graph-based neural networks with a generative adversarial network (GAN)-inspired training process. By leveraging point cloud data and dynamic graph convolutional neural networks (DGCNNs), GraphCompNet models complex shapes and incorporates position-specific thermal and mechanical factors. A two-stage adversarial training procedure iteratively refines compensated designs via a compensator-predictor architecture, offering real-time feedback and optimization. Experimental validation across diverse shapes and positions shows that the framework significantly improves compensation accuracy (by 35 to 65 percent) across the entire print space, adapting to position-dependent variations. This work advances the development of Digital Twin technology for AM, enabling scalable, real-time monitoring and compensation, and addressing critical gaps in AM process control. The proposed method supports high-precision, automated industrial-scale design and manufacturing systems.
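The two-stage compensator-predictor loop can be pictured as follows. This is an assumption-level sketch, not the GraphCompNet implementation: simple per-point MLPs stand in for the DGCNN backbones, and the loss terms and feature layout (xyz plus bed position) are illustrative.

```python
# Hedged sketch: the predictor learns the printed-shape deviation at a given bed
# position; the compensator learns an offset whose predicted print hits the target.
import torch
import torch.nn as nn

def mlp(dims):
    layers = []
    for i in range(len(dims) - 1):
        layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU()]
    return nn.Sequential(*layers[:-1])

predictor = mlp([6, 64, 64, 3])     # (xyz + bed position) -> printed-point deviation
compensator = mlp([6, 64, 64, 3])   # (xyz + bed position) -> design correction

opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(compensator.parameters(), lr=1e-3)

def training_step(design_pts, printed_pts, bed_pos):
    feats = torch.cat([design_pts, bed_pos], dim=-1)
    # Stage 1: fit the predictor to the observed deviation at this bed position.
    opt_p.zero_grad()
    loss_p = ((design_pts + predictor(feats) - printed_pts) ** 2).mean()
    loss_p.backward()
    opt_p.step()
    # Stage 2: train the compensator so the predicted print of the compensated
    # design lands back on the original target geometry.
    opt_c.zero_grad()
    compensated = design_pts + compensator(feats)
    predicted_print = compensated + predictor(torch.cat([compensated, bed_pos], dim=-1))
    loss_c = ((predicted_print - design_pts) ** 2).mean()
    loss_c.backward()
    opt_c.step()
    return loss_p.item(), loss_c.item()
```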
Slicing Vision Transformer for Flexible Inference
Zhang, Yitian, Coskun, Huseyin, Ma, Xu, Wang, Huan, Ma, Ke, Chen, Xi, Hu, Derek Hao, Fu, Yun
Vision Transformers (ViTs) are known for their scalability. In this work, we aim to scale down a ViT to fit an environment with dynamically changing resource constraints. We observe that smaller ViTs are intrinsically sub-networks of a larger ViT with different widths. Thus, we propose a general framework, named Scala, to enable a single network to represent multiple smaller ViTs with flexible inference capability, which aligns with the inherent design of ViT to vary in width. Concretely, Scala activates several subnets during training, introduces Isolated Activation to disentangle the smallest sub-network from the other subnets, and leverages Scale Coordination to ensure each sub-network receives simplified, steady, and accurate learning objectives. Comprehensive empirical validations on different tasks demonstrate that with only one-shot training, Scala learns slimmable representations without modifying the original ViT structure and matches the performance of Separate Training. Compared with the prior art, Scala achieves an average improvement of 1.6% on ImageNet-1K with fewer parameters. Code is available here.
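The observation that smaller ViTs are width-wise sub-networks of a larger one can be illustrated by evaluating a linear layer at a reduced width. This is a toy sketch under that assumption, not the Scala training code; the dimensions and slicing rule are illustrative.

```python
# Toy illustration: a 0.25x-width sub-network reuses the first slice of the full
# layer's weights, so one set of parameters serves multiple inference budgets.
import torch
import torch.nn.functional as F

def sliced_linear(layer: torch.nn.Linear, x: torch.Tensor, ratio: float) -> torch.Tensor:
    """Run `layer` using only the first `ratio` fraction of its output units."""
    out_dim = max(1, int(layer.out_features * ratio))
    return F.linear(x, layer.weight[:out_dim], layer.bias[:out_dim])

layer = torch.nn.Linear(192, 192)
tokens = torch.randn(1, 197, 192)            # ViT-style token sequence
full = layer(tokens)                         # 1.0x width
slim = sliced_linear(layer, tokens, 0.25)    # 0.25x width sub-network, shared weights
print(full.shape, slim.shape)                # (1, 197, 192) and (1, 197, 48)
```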
Foundation Model for Composite Materials and Microstructural Analysis
Wei, Ting-Ju, Chen, Chuin-Shan
The rapid advancement of machine learning has unlocked numerous opportunities for materials science, particularly in accelerating the design and analysis of materials. However, a significant challenge lies in the scarcity and high cost of obtaining high-quality materials datasets. In other fields, such as natural language processing, foundation models pre-trained on large datasets have achieved exceptional success in transfer learning, effectively leveraging latent features to achieve high performance on tasks with limited data. Despite this progress, the concept of foundation models remains underexplored in materials science. Here, we present a foundation model specifically designed for composite materials. Our model, the MMAE, is pre-trained on a dataset of short-fiber composites to learn robust latent features. During transfer learning, it accurately predicts homogenized stiffness, with an R2 score reaching as high as 0.959 and consistently exceeding 0.91, even when trained on limited data. These findings validate the feasibility and effectiveness of foundation models in composite materials. We anticipate extending this approach to more complex three-dimensional composite materials, polycrystalline materials, and beyond. Moreover, this framework enables high-accuracy predictions even when experimental data are scarce, paving the way for more efficient and cost-effective materials design and analysis.
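A hedged sketch of the transfer-learning step: features from a frozen pretrained encoder are fed to a small linear probe that regresses homogenized stiffness and is scored with R2. The encoder itself is omitted and random features keep the snippet runnable; all names and data are illustrative, not the authors' code.

```python
# Illustrative linear probe on frozen latent features, scored with R2.
import numpy as np

def fit_linear_head(z_train, y_train):
    # Least-squares linear probe; bias handled via an appended constant column.
    Z = np.hstack([z_train, np.ones((len(z_train), 1))])
    w, *_ = np.linalg.lstsq(Z, y_train, rcond=None)
    return w

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# z_* would come from the frozen pretrained encoder; random data keeps this runnable.
rng = np.random.default_rng(0)
z_train, z_test = rng.normal(size=(200, 64)), rng.normal(size=(50, 64))
y_train, y_test = rng.normal(size=200), rng.normal(size=50)
w = fit_linear_head(z_train, y_train)
y_hat = np.hstack([z_test, np.ones((50, 1))]) @ w
print("R2 on held-out stiffness targets:", r2_score(y_test, y_hat))
```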
Bilingual Adaptation of Monolingual Foundation Models
Gosal, Gurpreet, Xu, Yishi, Ramakrishnan, Gokul, Joshi, Rituraj, Sheinin, Avraham, Chen, Zhiming, Mishra, Biswajit, Vassilieva, Natalia, Hestness, Joel, Sengupta, Neha, Sahu, Sunil Kumar, Jia, Bokang, Katipomu, Satheesh, Pandit, Onkar, Kamboj, Samta, Pal, Rahul, Mullah, Parvez, Doraiswamy, Soundar, Chami, Mohamed El Karim
We present an efficient method for adapting a monolingual Large Language Model (LLM) to another language, addressing the challenges of catastrophic forgetting and tokenizer limitations. We focus this study on adapting Llama 2 to Arabic. Our two-stage approach begins with expanding the vocabulary and training only the embedding matrix, followed by full-model continual pretraining on a bilingual corpus. By continually pretraining on a mix of Arabic and English corpora, the model retains its proficiency in English while acquiring capabilities in Arabic. Our approach results in significant improvements in Arabic and slight enhancements in English, demonstrating cost-effective cross-lingual transfer. We also perform extensive ablations on embedding initialization techniques, data mix ratios, and learning rates, and release a detailed training recipe.
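The two-stage recipe can be outlined with standard Hugging Face transformers calls, roughly as below. This is an assumption-level sketch, not the released training recipe; the token list, parameter-name filters, and omitted training loops are placeholders.

```python
# Hedged sketch of the two-stage bilingual adaptation described above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

arabic_tokens = ["<placeholder_arabic_token>"]   # new vocabulary entries (placeholder)
tokenizer.add_tokens(arabic_tokens)
model.resize_token_embeddings(len(tokenizer))    # grow input/output embeddings

# Stage 1: freeze everything except the embedding matrices.
for name, param in model.named_parameters():
    param.requires_grad = ("embed_tokens" in name) or ("lm_head" in name)

# ... train on the bilingual corpus with a standard causal-LM loss ...

# Stage 2: unfreeze the whole model for full continual pretraining.
for param in model.parameters():
    param.requires_grad = True
```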
Physics-Informed Geometric Operators to Support Surrogate, Dimension Reduction and Generative Models for Engineering Design
Khan, Shahroz, Masood, Zahid, Usama, Muhammad, Kostas, Konstantinos, Kaklis, Panagiotis, Chen, Wei
In this work, we propose a set of physics-informed geometric operators (GOs) to enrich the geometric data provided for training surrogate/discriminative models, dimension-reduction models, and generative models, typically employed for performance prediction, dimension reduction, and creating data-driven parameterisations, respectively. However, as both the input and output streams of these models consist of low-level shape representations, they often fail to capture shape characteristics essential for performance analyses. Therefore, the proposed GOs exploit the differential and integral properties of shapes--accessed through Fourier descriptors, curvature integrals, geometric moments, and their invariants--to infuse high-level intrinsic geometric information and physics into the feature vector used for training, even when employing simple model architectures or low-level parametric descriptions. We show that for surrogate modelling, along with the inclusion of the notion of physics, GOs enact regularisation to reduce over-fitting and enhance generalisation to new, unseen designs. Furthermore, through extensive experimentation, we demonstrate that for dimension reduction and generative models, incorporating the proposed GOs enriches the training data with compact global and local geometric features. This significantly enhances the quality of the resulting latent space, thereby facilitating the generation of valid and diverse designs. Lastly, we also show that GOs can enable learning parametric sensitivities to a great extent. Consequently, these enhancements accelerate the convergence rate of shape optimisers towards optimal solutions.
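One of the simpler operator families mentioned above, geometric moments, can be sketched directly from area samples of a shape. The snippet below is illustrative only (2D, low orders, translation invariance via central moments) and is not the authors' implementation.

```python
# Illustrative sketch: low-order central geometric moments computed from points
# sampled inside a 2D shape, usable as extra entries in a training feature vector.
import numpy as np

def raw_moment(pts, p, q):
    return np.mean(pts[:, 0] ** p * pts[:, 1] ** q)

def central_moments(pts, orders=((2, 0), (0, 2), (1, 1))):
    centered = pts - pts.mean(axis=0)          # translation invariance
    return {f"mu_{p}{q}": raw_moment(centered, p, q) for p, q in orders}

# Points sampled inside a unit disc stand in for a real design's area samples.
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(20000, 2))
pts = pts[np.linalg.norm(pts, axis=1) <= 1.0]
print(central_moments(pts))   # mu_20 and mu_02 approach 0.25 for the unit disc
```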
Pearls from Pebbles: Improved Confidence Functions for Auto-labeling
Vishwakarma, Harit, Chen, Reid, Tay, Sui Jiet, Namburi, Satya Sai Srinath, Sala, Frederic, Vinayak, Ramya Korlakai
Auto-labeling is an important family of techniques that produce labeled training sets with minimal manual labeling. A prominent variant, threshold-based auto-labeling (TBAL), works by finding a threshold on a model's confidence scores above which it can accurately label unlabeled data points. However, many models are known to produce overconfident scores, leading to poor TBAL performance. While a natural idea is to apply off-the-shelf calibration methods to alleviate the overconfidence issue, such methods still fall short. Rather than experimenting with ad-hoc choices of confidence functions, we propose a framework for studying the optimal TBAL confidence function. We develop a tractable version of the framework to obtain Colander (Confidence functions for Efficient and Reliable Auto-labeling), a new post-hoc method specifically designed to maximize performance in TBAL systems. We perform an extensive empirical evaluation of Colander and compare it against methods designed for calibration. Colander achieves up to 60% improvements in coverage over the baselines while maintaining auto-labeling error below 5% and using the same amount of labeled data as the baselines.
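The TBAL loop the paper builds on can be sketched as follows: calibrate a threshold on a small labeled validation set so the estimated error stays under a budget, then auto-label unlabeled points whose confidence exceeds it. The confidence function is exactly where a learned method like Colander would plug in; all data and names below are toy placeholders, not the paper's code.

```python
# Minimal sketch of threshold-based auto-labeling (TBAL).
import numpy as np

def find_threshold(val_conf, val_correct, target_error=0.05):
    for t in np.sort(np.unique(val_conf)):
        selected = val_conf >= t
        if selected.any() and (1.0 - val_correct[selected].mean()) <= target_error:
            return t            # lowest threshold within the error budget -> max coverage
    return np.inf               # no threshold is safe; auto-label nothing

def auto_label(unl_conf, unl_pred, threshold):
    mask = unl_conf >= threshold
    return unl_pred[mask], mask.mean()   # auto-labels and coverage

# Toy numbers; in practice confidences/predictions come from the trained model.
rng = np.random.default_rng(2)
val_conf = rng.uniform(size=1000)
val_correct = (rng.uniform(size=1000) < val_conf).astype(float)  # toy calibrated model
t = find_threshold(val_conf, val_correct)
labels, coverage = auto_label(rng.uniform(size=5000), rng.integers(0, 10, 5000), t)
print(f"threshold={t:.2f}, coverage={coverage:.2%}")
```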
3D object quality prediction for Metal Jet Printer with Multimodal thermal encoder
Chen, Rachel, Zheng, Wenjia, Jalui, Sandeep, Suri, Pavan, Zeng, Jun
With the advancements in 3D printing technologies, it is extremely important that the quality and dimensional accuracy of 3D printed objects meet the customer's specifications. Various factors during metal printing affect the printed parts' quality, including the powder quality, the printing stage parameters, the part's location inside the print bed, the curing stage parameters, and the metal sintering process. With the large volume of data gathered from HP's MetJet printing process, AI techniques can be used to analyze, learn, and effectively infer the printed part quality metrics, as well as assist in improving the print yield. In-situ thermal sensing data captured by printer-installed thermal sensors contains the thermal signature of the fusing layers for each part. This thermal signature reflects the convoluted impact of the various factors above. In this paper, we use a multimodal thermal encoder network to fuse data of different natures, including the video data, vectorized printer control data, and the part thermal signatures, with a trained encoder-decoder module. We explored data fusion techniques and the stages at which fusion is performed; the optimized end-to-end model architecture yields improved part quality prediction accuracy.
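A hedged sketch of the fusion idea: separate encoders for thermal-video features, vectorized printer control data, and the part thermal signature, concatenated and decoded into quality metrics. Dimensions, the late-fusion point, and the two output metrics are assumptions, not HP's production model.

```python
# Illustrative multimodal fusion model for part quality prediction.
import torch
import torch.nn as nn

class MultimodalQualityModel(nn.Module):
    def __init__(self, video_dim=256, control_dim=32, thermal_dim=128, out_dim=2):
        super().__init__()
        self.video_enc = nn.Sequential(nn.Linear(video_dim, 64), nn.ReLU())
        self.control_enc = nn.Sequential(nn.Linear(control_dim, 16), nn.ReLU())
        self.thermal_enc = nn.Sequential(nn.Linear(thermal_dim, 64), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(64 + 16 + 64, 64), nn.ReLU(),
                                     nn.Linear(64, out_dim))  # e.g. dimensional error, density

    def forward(self, video, control, thermal):
        fused = torch.cat([self.video_enc(video),
                           self.control_enc(control),
                           self.thermal_enc(thermal)], dim=-1)
        return self.decoder(fused)

model = MultimodalQualityModel()
pred = model(torch.randn(8, 256), torch.randn(8, 32), torch.randn(8, 128))
print(pred.shape)   # torch.Size([8, 2])
```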
Virtual Foundry Graphnet for Metal Sintering Deformation Prediction
Chen, Rachel, Lee, Juheon, Gan, Chuang, Yang, Zijiang, Nabian, Mohammad Amin, Zeng, Jun
Metal sintering is a necessary step for metal injection molded parts and for binder jetting processes such as HP's metal 3D printer. The metal sintering process introduces large deformation, varying from 25 to 50% depending on the green part porosity. In this paper, we use a graph-based deep learning approach to predict the part deformation, which can substantially speed up the deformation simulation at the voxel level. Running a well-trained metal sintering inference engine takes only seconds to obtain the final sintering deformation value. On an example complex geometry, the tested accuracy reaches a 0.7um mean deviation for a 63mm test part.
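The voxel-level graph approach can be sketched as message passing over a voxel-neighborhood graph that regresses a per-voxel displacement. The snippet is illustrative (layer sizes, node features, and the toy chain graph are assumptions), not the Virtual Foundry Graphnet implementation.

```python
# Illustrative voxel-graph regressor for per-voxel sintering displacement.
import torch
import torch.nn as nn

class VoxelGNN(nn.Module):
    def __init__(self, in_dim=4, hidden=64, rounds=3):
        super().__init__()
        self.encode = nn.Linear(in_dim, hidden)
        self.message = nn.ModuleList([nn.Linear(2 * hidden, hidden) for _ in range(rounds)])
        self.decode = nn.Linear(hidden, 3)     # per-voxel displacement (dx, dy, dz)

    def forward(self, x, edge_index):
        src, dst = edge_index                  # edges connect neighboring voxels
        h = torch.relu(self.encode(x))
        for layer in self.message:
            msg = torch.relu(layer(torch.cat([h[src], h[dst]], dim=-1)))
            agg = torch.zeros_like(h).index_add_(0, dst, msg)   # sum messages per node
            h = h + agg
        return self.decode(h)

# Toy graph: 10 voxels with (x, y, z, porosity) features and a chain of neighbors.
x = torch.randn(10, 4)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8],
                           [1, 2, 3, 4, 5, 6, 7, 8, 9]])
print(VoxelGNN()(x, edge_index).shape)   # torch.Size([10, 3])
```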
When LLM-based Code Generation Meets the Software Development Process
Lin, Feng, Kim, Dong Jae, Chen, Tse-Hsun
Software process models play a pivotal role in fostering collaboration and communication within software teams, enabling them to tackle intricate development tasks effectively. This paper introduces LCG, a code generation framework inspired by established software engineering practices. LCG leverages multiple Large Language Model (LLM) agents to emulate various software process models, namely LCGWaterfall, LCGTDD, and LCGScrum. Each model assigns LLM agents specific roles such as requirement engineer, architect, developer, tester, and scrum master, mirroring typical development activities and communication patterns. Through collaborative efforts utilizing chain-of-thought and prompt composition techniques, the agents continuously refine themselves to enhance code quality. Utilizing GPT3.5 as the underlying LLM and baseline (GPT), we evaluate LCG across four code generation benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET. Results indicate that LCGScrum outperforms the other models, achieving Pass@1 scores of 75.2, 65.5, 82.5, and 56.7 on HumanEval, HumanEval-ET, MBPP, and MBPP-ET, respectively - an average 15% improvement over GPT. Analysis reveals distinct impacts of development activities on generated code, with design and code reviews contributing to enhanced exception handling, while design, testing, and code reviews mitigate code smells. Furthermore, temperature values exhibit negligible influence on Pass@1 across all models. However, Pass@1 for the baseline GPT varies notably across GPT3.5 model versions, ranging from 5 to over 60 on HumanEval, whereas LCG remains stable. This stability underscores the importance of adopting software process models to bolster the quality and consistency of LLM-generated code.
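How role-specific agents emulate a process model can be sketched as a chained pipeline in which each role's output is appended to the context seen by the next role. call_llm, the prompts, and the role orderings below are placeholders, not the LCG implementation.

```python
# Hedged sketch of emulating software process models with role-specific LLM agents.
from typing import Callable

ROLE_PROMPTS = {
    "requirement engineer": "Refine the requirements for this task:\n{context}",
    "architect": "Propose a design for:\n{context}",
    "developer": "Write code implementing:\n{context}",
    "tester": "Write tests and report issues for:\n{context}",
}

PROCESS_MODELS = {
    "waterfall": ["requirement engineer", "architect", "developer", "tester"],
    "tdd": ["requirement engineer", "tester", "developer"],
}

def run_pipeline(task: str, process: str, call_llm: Callable[[str], str]) -> str:
    """Chain role-specific agents; each role sees the accumulated context."""
    context = task
    for role in PROCESS_MODELS[process]:
        prompt = ROLE_PROMPTS[role].format(context=context)
        context += f"\n\n[{role}]\n" + call_llm(prompt)
    return context   # final context carries the refined code and review feedback
```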