AITopics | Liang, Zixuan

Collaborating Authors

Liang, Zixuan

Enhanced Estimation Techniques for Certified Radii in Randomized Smoothing

Liang, Zixuan

arXiv.org Machine Learning, Mar-11-2025

This paper presents novel methods for estimating certified radii in randomized smoothing, a technique crucial for certifying the robustness of neural networks against adversarial perturbations. Our proposed techniques significantly improve estimates of certified test-set accuracy by providing tighter bounds on the certified radii. We introduce advanced algorithms for both discrete and continuous domains, demonstrating their effectiveness on the CIFAR-10 and ImageNet datasets. The new methods show considerable improvements over existing approaches, particularly in reducing discrepancies in certified radii estimates. We also explore the impact of various hyperparameters, including sample size, standard deviation, and temperature, on the performance of these methods. Our findings highlight the potential for more efficient certification processes and pave the way for future research on tighter confidence sequences and improved theoretical frameworks. The study concludes with a discussion of potential future directions, including enhanced estimation techniques for discrete domains and further theoretical advancements to bridge the gap between empirical and theoretical performance in randomized smoothing.
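
As a point of reference for what a certified radius is, the sketch below computes the standard randomized-smoothing baseline that tighter estimators are usually compared against: a one-sided Clopper-Pearson lower bound on the top-class probability, converted to an L2 radius as sigma * Phi^{-1}(p_lower). It is not the estimation technique proposed in this paper. The function names and the default alpha are illustrative assumptions.

```python
import math
from statistics import NormalDist


def binom_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def clopper_pearson_lower(k: int, n: int, alpha: float) -> float:
    """One-sided lower confidence bound on a binomial proportion.

    Finds p such that P(X >= k | n, p) = alpha by bisection; the tail
    probability is increasing in p, so bisection is valid.
    """
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_tail(k, n, mid) < alpha:
            lo = mid  # mid is still below the bound
        else:
            hi = mid
    return lo


def certified_radius(k: int, n: int, sigma: float, alpha: float = 0.001) -> float:
    """Baseline L2 radius from k top-class votes out of n noisy samples.

    Returns 0.0 (abstain) when the lower bound does not exceed 1/2.
    """
    p_lower = clopper_pearson_lower(k, n, alpha)
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)
```

For example, `certified_radius(99, 100, sigma=0.25)` returns a positive radius, while `certified_radius(50, 100, sigma=0.25)` abstains with 0.0. A looser lower bound shrinks the radius directly, which is why tighter bounds translate into higher certified accuracy.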

artificial intelligence, machine learning, robustness, (17 more...)

arXiv.org Machine Learning

2503.08801

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Automating Date Format Detection for Data Visualization

Liang, Zixuan

arXiv.org Artificial Intelligence, Jan-9-2025

Data preparation, specifically date parsing, is a significant bottleneck in analytic workflows. To address this, we present two algorithms, one based on minimum entropy and the other on natural language modeling, that automatically derive date formats from string data. These algorithms achieve over 90% accuracy on a large corpus of data columns, streamlining the data preparation process within visualization environments. The minimum entropy approach is particularly fast, providing interactive feedback. Our methods simplify date format extraction, making them suitable for integration into data visualization tools and databases. Recently, the integration of information visualization advances such as Polaris [1] and Spotfire [2] has highlighted the importance of combining computational power with human knowledge for effective data analysis. While computers excel at processing large datasets, people bring substantial domain expertise and the ability to recognize patterns visually [3], [4]. Systems that leverage both human feedback and machine processing prove more successful at extracting meaningful insights from data. Interactive visualization systems have become essential for enabling users to explore data while maintaining their analytic flow.
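
To make the task concrete, here is a minimal sketch of deriving a date format from a column of strings. It is a brute-force baseline that scores a fixed list of candidate formats by how many values they parse, not the minimum-entropy or language-modeling algorithm described in the abstract. The candidate list and the function names are hypothetical.

```python
from datetime import datetime

# Hypothetical candidate list; a real tool would consider many more formats.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y", "%Y%m%d"]


def parse_rate(values, fmt):
    """Fraction of strings in `values` that parse under `fmt`."""
    if not values:
        return 0.0
    ok = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            ok += 1
        except ValueError:
            pass  # this value does not match the candidate format
    return ok / len(values)


def detect_format(values, candidates=CANDIDATE_FORMATS):
    """Return (best_format, parse_rate) over the candidate list.

    Ties go to the earlier candidate, so a column that is ambiguous
    between month-first and day-first resolves to month-first here.
    """
    best = max(candidates, key=lambda fmt: parse_rate(values, fmt))
    return best, parse_rate(values, best)
```

Calling `detect_format(["13/01/2025", "25/12/2024"])` selects the day-first format, because the month-first candidate rejects month 13. Trying every candidate on every value is the cost that a faster, interactive method would need to avoid.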

artificial intelligence, dataset, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.0564

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Efficient Representations for High-Cardinality Categorical Variables in Machine Learning

Liang, Zixuan

arXiv.org Artificial Intelligence, Jan-9-2025

High-cardinality categorical variables pose significant challenges in machine learning, particularly in terms of computational efficiency and model interpretability. Traditional one-hot encoding often results in high-dimensional sparse feature spaces, increasing the risk of overfitting and reducing scalability. This paper introduces novel encoding techniques, including means encoding, low-rank encoding, and multinomial logistic regression encoding, to address these challenges. These methods leverage sufficient representations to generate compact and informative embeddings of categorical data. We conduct rigorous theoretical analyses and empirical validations on diverse datasets, demonstrating significant improvements in model performance and computational efficiency compared to baseline methods. The proposed techniques are particularly effective in domains requiring scalable solutions for large datasets, paving the way for more robust and efficient applications in machine learning.
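
The abstract does not define means encoding, so the following is only a sketch under one assumption: that each category level is replaced by the per-level averages of the numeric covariates. This yields a fixed-width vector regardless of how many levels exist, in contrast to one-hot encoding. The function names and the `default` argument for unseen levels are hypothetical.

```python
from collections import defaultdict


def fit_means_encoding(categories, features):
    """Map each category level to the mean of its numeric feature rows.

    `categories` is a sequence of hashable labels; `features` is a
    parallel sequence of equal-length lists of numbers.
    """
    sums = {}
    counts = defaultdict(int)
    for cat, row in zip(categories, features):
        if cat not in sums:
            sums[cat] = [0.0] * len(row)
        for j, x in enumerate(row):
            sums[cat][j] += x
        counts[cat] += 1
    return {cat: [s / counts[cat] for s in sums[cat]] for cat in sums}


def transform(categories, encoding, default):
    """Replace each label by its encoded vector; unseen labels get `default`."""
    return [encoding.get(cat, default) for cat in categories]
```

With labels `["a", "b", "a"]` and two numeric features per row, the encoding has width two for both levels, and it would keep that width with thousands of levels. In practice the means should be fitted on training data only, to avoid leaking held-out information into the features.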

artificial intelligence, categorical variable, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.05646

Country: North America > United States > Iowa (0.14)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.66)

Industry:

Health & Medicine (1.00)
Banking & Finance > Real Estate (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Harmonizing Metadata of Language Resources for Enhanced Querying and Accessibility

Liang, Zixuan

arXiv.org Artificial Intelligence, Jan-9-2025

This paper addresses the harmonization of metadata from diverse repositories of language resources (LRs). Leveraging linked data and RDF techniques, we integrate data from multiple sources into a unified model based on the DCAT and META-SHARE OWL ontologies. Our methodology supports text-based search, faceted browsing, and advanced SPARQL queries through Linghub, a newly developed portal. Real user queries from the Corpora Mailing List (CML) were evaluated to assess Linghub's capability to satisfy actual user needs. Results indicate that while some limitations persist, many user requests can be successfully addressed. The study highlights significant metadata issues and advocates for adherence to open vocabularies and standards to enhance metadata harmonization. This initial research underscores the importance of API-based access to LRs, promoting machine usability and data subset extraction for specific purposes, paving the way for more efficient and standardized LR utilization.

artificial intelligence, language resource, semantic web, (13 more...)

arXiv.org Artificial Intelligence

2501.05606

Country: Europe (0.14)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Communications > Web > Semantic Web (0.89)
