AITopics | top prediction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Understanding Hidden Computations in Chain-of-Thought Reasoning

Bharadwaj, Aryasomayajula Ram

arXiv.org Artificial Intelligence, Dec-5-2024

Chain-of-Thought (CoT) prompting has significantly enhanced the reasoning abilities of large language models. However, recent studies have shown that models can still perform complex reasoning tasks even when the CoT is replaced with filler (hidden) characters (e.g., "..."), leaving open questions about how models internally process and represent reasoning steps. In this paper, we investigate methods to decode these hidden characters in transformer models trained with filler CoT sequences. By analyzing layer-wise representations using the logit lens method and examining token rankings, we demonstrate that the hidden characters can be recovered without loss of performance. Our findings provide insights into the internal mechanisms of transformer models and open avenues for improving interpretability and transparency in language model reasoning.
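As a rough illustration of the logit lens idea named in this abstract, the sketch below projects an intermediate hidden state onto the vocabulary and ranks the tokens. It is not the paper's code: the vocabulary, the unembedding matrix, and the hidden state are hypothetical toy values, and the final layer normalization that a real logit lens typically applies before unembedding is omitted.

```python
def logit_lens(hidden, unembed):
    """Project a hidden state onto the vocabulary: logits[v] = hidden . unembed[v]."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in unembed]


def ranked_tokens(logits, vocab):
    """Return vocabulary tokens sorted from highest to lowest logit."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [vocab[i] for i in order]


# Hypothetical toy setup: 3-dimensional hidden states and a 4-token vocabulary.
VOCAB = ["...", "7", "3", "True"]
UNEMBED = [
    [1.0, 0.0, 0.0],  # row for "..."
    [0.0, 1.0, 0.0],  # row for "7"
    [0.0, 0.0, 1.0],  # row for "3"
    [0.2, 0.2, 0.0],  # row for "True"
]
# Hypothetical intermediate-layer state at a filler position.
HIDDEN = [2.0, 1.5, 0.1]

LOGITS = logit_lens(HIDDEN, UNEMBED)
RANKING = ranked_tokens(LOGITS, VOCAB)
print(RANKING)
```

In this toy case the filler token ranks first and a non-filler token ranks second, which is the kind of token ranking one would inspect when trying to read a hidden token out of a filler position.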

filler token, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.04537

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.52)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)


Characterizing stable regions in the residual stream of LLMs

Janiak, Jett, Karwowski, Jacek, Mangat, Chatrik Singh, Giglemiani, Giorgi, Petrova, Nora, Heimersheim, Stefan

arXiv.org Artificial Intelligence, Nov-18-2024

We identify stable regions in the residual stream of Transformers, where the model's output remains insensitive to small activation changes, but exhibits high sensitivity at region boundaries. These regions emerge during training and become more defined as training progresses or model size increases. The regions appear to be much larger than previously studied polytopes. Our analysis suggests that these stable regions align with semantic distinctions, where similar prompts cluster within regions, and activations from the same region lead to similar next token predictions. This work provides a promising research direction for understanding the complexity of neural networks, shedding light on training dynamics, and advancing interpretability.

activation, prediction, stable region, (16 more...)

arXiv.org Artificial Intelligence

2409.17113

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)


SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout

Banerjee, Ayan, Mathur, Nityanand, Lladós, Josep, Pal, Umapada, Dutta, Anjan

arXiv.org Artificial Intelligence, Mar-30-2024

Generating VectorArt from text prompts is a challenging vision task, requiring diverse yet realistic depictions of the seen as well as unseen entities. However, existing research has been mostly limited to the generation of single objects, rather than comprehensive scenes comprising multiple elements. In response, this work introduces SVGCraft, a novel end-to-end framework for the creation of vector graphics depicting entire scenes from textual descriptions. Utilizing a pre-trained LLM for layout generation from text prompts, this framework introduces a technique for producing masked latents in specified bounding boxes for accurate object placement. It introduces a fusion mechanism for integrating attention maps and employs a diffusion U-Net for coherent composition, speeding up the drawing process. The resulting SVG is optimized using a pre-trained encoder and LPIPS loss with opacity modulation to maximize similarity. Additionally, this work explores the potential of primitive shapes in facilitating canvas completion in constrained environments. Through both qualitative and quantitative assessments, SVGCraft is demonstrated to surpass prior works in abstraction, recognizability, and detail, as evidenced by its performance metrics (CLIP-T: 0.4563, Cosine Similarity: 0.6342, Confusion: 0.66, Aesthetic: 6.7832). The code will be available at github.com/SVGCraft.

svgcraft, text prompt, top prediction, (15 more...)

arXiv.org Artificial Intelligence

2404.00412

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.46)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)


DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

Xing, Ximing, Wang, Chuang, Zhou, Haitao, Zhang, Jing, Yu, Qian, Xu, Dong

arXiv.org Artificial Intelligence, Jan-15-2024

Even though trained mainly on images, we discover that pretrained diffusion models show impressive power in guiding sketch synthesis. In this paper, we present DiffSketcher, an innovative algorithm that creates vectorized free-hand sketches using natural language input. DiffSketcher is developed based on a pre-trained text-to-image diffusion model. It performs the task by directly optimizing a set of Bézier curves with an extended version of the score distillation sampling (SDS) loss, which allows us to use a raster-level diffusion model as a prior for optimizing a parametric vectorized sketch generator. Furthermore, we explore attention maps embedded in the diffusion model for effective stroke initialization to speed up the generation process. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and essential visual details of the subject drawn. Our experiments show that DiffSketcher achieves greater quality than prior work. The code and demo of DiffSketcher can be found at https://ximinng.github.io/DiffSketcher-project/.
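To make "optimizing a set of Bézier curves" more concrete, here is a minimal sketch of how one stroke can be described by a handful of control points, which are the parameters an optimizer would adjust. It assumes cubic curves (the abstract does not state the degree), and the function name and control points are hypothetical; the SDS loss and rasterization are not shown.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] using Bernstein weights."""
    s = 1.0 - t
    w0, w1, w2, w3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)


# Hypothetical control points for a single arch-shaped stroke.
CONTROL = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]

# Sample the stroke at 5 evenly spaced parameter values.
STROKE = [cubic_bezier(*CONTROL, t=i / 4) for i in range(5)]
MIDPOINT = cubic_bezier(*CONTROL, t=0.5)
print(MIDPOINT)
```

Because every sampled point is a smooth function of the control points, moving the control points changes the drawn stroke continuously, which is what makes gradient-based optimization of such a representation plausible.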

diffsketcher, diffusion model, sketch, (16 more...)

arXiv.org Artificial Intelligence

2306.14685

Country:

Asia > China > Hong Kong (0.04)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
North America > United States > Rocky Mountains (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Media > Photography (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference

Zaman, Kerem, Belinkov, Yonatan

arXiv.org Artificial Intelligence, Jun-4-2023

Most evaluations of attribution methods focus on the English language. In this work, we present a multilingual approach for evaluating attribution methods for the Natural Language Inference (NLI) task in terms of faithfulness and plausibility. First, we introduce a novel cross-lingual strategy to measure faithfulness based on word alignments, which eliminates the drawbacks of erasure-based evaluations. We then perform a comprehensive evaluation of attribution methods, considering different output mechanisms and aggregation methods. Finally, we augment the XNLI dataset with highlight-based explanations, providing a multilingual NLI dataset with highlights, to support future exNLP studies. Our results show that attribution methods performing best for plausibility and faithfulness are different.

attribution method, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2204.05428

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Israel (0.04)
North America > Dominican Republic (0.04)
(7 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Increasing Textual Context Size Boosts Medical Image-Text Matching

Glassberg, Idan, Hope, Tom

arXiv.org Artificial Intelligence, Mar-23-2023

Pretrained image-text matching models, such as OpenAI's CLIP [1], use natural language processing (NLP) approaches to find semantic relations between images and textual descriptions. This emerging technology has seen rapid adoption in the general domain, and increasing interest in the medical domain [2, 3], where medical imaging data often includes images paired with textual descriptions. For example, MIMIC-CXR [4] is a dataset that consists of chest radiographs along with free-text radiology reports. This dataset paved the way for works like BioViL [2], which used the images and the captions provided in the dataset to train an image-text matching model for chest X-rays and chest-related diseases. ROCO [5] is a dataset containing radiology images from publications available in the PubMed biomedical paper repository. ROCO includes several medical imaging modalities beyond X-ray, such as CT, ultrasound, and MRI.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2303.13340

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)
Europe > Switzerland (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.31)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)


Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Zhang, Renrui, Hu, Xiangfei, Li, Bohao, Huang, Siyuan, Deng, Hanqiu, Li, Hongsheng, Qiao, Yu, Gao, Peng

arXiv.org Artificial Intelligence, Mar-3-2023

Visual recognition in low-data regimes requires deep neural networks to learn generalized representations from limited training samples. Recently, CLIP-based methods have shown promising few-shot performance, benefiting from contrastive language-image pre-training. We then ask whether more diverse pre-training knowledge can be cascaded to further assist few-shot representation learning. In this paper, we propose CaFo, a Cascade of Foundation models that incorporates diverse prior knowledge of various pre-training paradigms for better few-shot learning. CaFo incorporates CLIP's language-contrastive knowledge, DINO's vision-contrastive knowledge, DALL-E's vision-generative knowledge, and GPT-3's language-generative knowledge. Specifically, CaFo works by 'Prompt, Generate, then Cache'. First, we leverage GPT-3 to produce textual inputs for prompting CLIP with rich downstream linguistic semantics. Then, we generate synthetic images via DALL-E to expand the few-shot training data without any manual effort. Finally, we introduce a learnable cache model to adaptively blend the predictions from CLIP and DINO. Through such collaboration, CaFo can fully unleash the potential of different pre-training methods and unify them to achieve state-of-the-art few-shot classification performance. Code is available at https://github.com/ZrrSkywalker/CaFo.

artificial intelligence, knowledge, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2303.02151

Country:

Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)


Visual Classification via Description from Large Language Models

Menon, Sachit, Vondrick, Carl

arXiv.org Artificial Intelligence, Dec-1-2022

Vision-language models (VLMs) such as CLIP have shown promising performance on a variety of recognition tasks using the standard zero-shot classification procedure: computing similarity between the query image and the embedded words for each category. By only using the category name, they neglect to make use of the rich context of additional information that language affords. The procedure gives no intermediate understanding of why a category is chosen, and furthermore provides no mechanism for adjusting the criteria used towards this decision. We present an alternative framework for classification with VLMs, which we call classification by description. We ask VLMs to check for descriptive features rather than broad categories: to find a tiger, look for its stripes; its claws; and more. By basing decisions on these descriptors, we can provide additional cues that encourage using the features we want to be used. In the process, we can get a clear idea of what features the model uses to construct its decision; it gains some level of inherent explainability. We query large language models (e.g., GPT-3) for these descriptors to obtain them in a scalable way. Extensive experiments show our framework has numerous advantages beyond interpretability. We show improvements in accuracy on ImageNet across distribution shifts; demonstrate the ability to adapt VLMs to recognize concepts unseen during training; and illustrate how descriptors can be edited to effectively mitigate bias compared to the baseline.
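The idea of scoring a category through its descriptors rather than its name can be sketched as follows. This is an illustration only: the two-dimensional "embeddings" are hypothetical stand-ins for VLM image and text embeddings, and averaging the per-descriptor similarities is an assumption made here, since the abstract does not state how descriptor scores are combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_by_description(image_vec, descriptors_by_class):
    """Score each class by the mean image-descriptor similarity (assumed aggregation)."""
    scores = {}
    for cls, descriptors in descriptors_by_class.items():
        sims = [cosine(image_vec, vec) for vec in descriptors.values()]
        scores[cls] = sum(sims) / len(sims)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy embeddings; a real system would obtain these from a VLM.
IMAGE = [1.0, 0.0]
DESCRIPTORS = {
    "tiger": {"stripes": [1.0, 0.0], "claws": [0.6, 0.8]},
    "hen": {"feathers": [0.0, 1.0], "beak": [0.8, 0.6]},
}

BEST, SCORES = classify_by_description(IMAGE, DESCRIPTORS)
print(BEST, SCORES)
```

Keeping the per-descriptor similarities around (rather than only the final score) is what would let a user see which features drove a decision, or edit the descriptor list to change the criteria.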

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.07183

Country:

Asia > Japan (0.04)
Africa > West Africa (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(6 more...)

Genre: Research Report (0.52)

Industry:

Leisure & Entertainment (0.93)
Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Neural-Symbolic Models for Logical Queries on Knowledge Graphs

Zhu, Zhaocheng, Galkin, Mikhail, Zhang, Zuobai, Tang, Jian

arXiv.org Artificial Intelligence, Sep-6-2022

Answering complex first-order logic (FOL) queries on knowledge graphs is a fundamental task for multi-hop reasoning. Traditional symbolic methods traverse a complete knowledge graph to extract the answers, which provides good interpretation for each step. Recent neural methods learn geometric embeddings for complex queries. These methods can generalize to incomplete knowledge graphs, but their reasoning process is hard to interpret. In this paper, we propose Graph Neural Network Query Executor (GNN-QE), a neural-symbolic model that enjoys the advantages of both worlds. GNN-QE decomposes a complex FOL query into relation projections and logical operations over fuzzy sets, which provides interpretability for intermediate variables. To reason about the missing links, GNN-QE adapts a graph neural network from knowledge graph completion to execute the relation projections, and models the logical operations with product fuzzy logic. Experiments on 3 datasets show that GNN-QE significantly improves over previous state-of-the-art models in answering FOL queries. Meanwhile, GNN-QE can predict the number of answers without explicit supervision, and provide visualizations for intermediate variables.
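The logical operations over fuzzy sets mentioned in this abstract can be illustrated with the standard product fuzzy logic operators: product for conjunction, probabilistic sum for disjunction, and one-minus for negation. The sketch below is not the GNN-QE implementation; the entity names and membership values are hypothetical, and the relation projection step performed by the graph neural network is not shown.

```python
def fuzzy_and(a, b):
    """Conjunction under product fuzzy logic: elementwise product."""
    return [x * y for x, y in zip(a, b)]


def fuzzy_or(a, b):
    """Disjunction under product fuzzy logic: probabilistic sum x + y - x*y."""
    return [x + y - x * y for x, y in zip(a, b)]


def fuzzy_not(a):
    """Negation: complement of each membership value."""
    return [1.0 - x for x in a]


# Hypothetical fuzzy sets over three entities; each value is the degree to
# which the entity satisfies a sub-query (an intermediate variable).
ENTITIES = ["A", "B", "C"]
P = [0.9, 0.5, 0.0]
Q = [0.5, 0.4, 1.0]

P_AND_Q = fuzzy_and(P, Q)
P_OR_Q = fuzzy_or(P, Q)
NOT_P = fuzzy_not(P)
print(dict(zip(ENTITIES, P_AND_Q)))
```

Because every intermediate result is itself a vector of membership values over entities, each step of a decomposed query can be inspected directly, which is the interpretability property the abstract highlights.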

artificial intelligence, fuzzy logic, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2205.10128

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Washington > Spokane County (0.14)
North America > United States > South Carolina > Greenville County (0.14)
(70 more...)

Genre:

Research Report (0.70)
Personal > Honors (0.68)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.57)


2022 Top Predictions for AI in Finance

#artificialintelligence, Jan-14-2022, 00:30:35 GMT

It is no secret that AI has played a major role in the ongoing democratization of investing. My prediction for next year and beyond is that the major growth we've seen in retail investing will continue at a rapid pace – and AI will continue to fuel that growth. AI has helped to level the playing field for investors. Today you don't have to be a high-net-worth (HNW) investor to get personalized financial advice; there is a chatbot for that. These AI-driven chatbots will only continue to get smarter. Machine learning can now sift through a user's various financial accounts and profiles and provide a snapshot of recommended to-dos on a dashboard. This will continue to gain traction in the decade ahead. AI has also helped to simplify the client onboarding process while enhancing the customer experience. Going forward, as the retail investing trend continues to grow, expect AI to play a larger role in risk assessment, risk management, and fraud detection. This will enable businesses to scale and keep up with heavy volatility.

chatbot, investor, top prediction

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Applied AI (0.40)
