code stack
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
Although autoregressive models have achieved promising results on image generation, their unidirectional generation process prevents the resultant images from fully reflecting global contexts. To address this issue, we propose an effective image generation framework, \emph{Draft-and-Revise}, with a \emph{Contextual RQ-Transformer} that considers global contexts during the generation process. As a generalized VQ-VAE, RQ-VAE first represents a high-resolution image as a sequence of discrete code stacks. After code stacks in the sequence are randomly masked, the Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image. We then propose a two-phase decoding procedure, Draft-and-Revise, with which the Contextual RQ-Transformer generates an image while fully exploiting the global contexts of the image during the generation process.
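The mask-and-infill training objective and the two-phase Draft-and-Revise decoding described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the `MASK` sentinel, the `model(seq, i)` callable, and the fixed `mask_ratio` re-masking schedule are all assumptions standing in for the actual transformer over code stacks.

```python
import random

MASK = -1  # hypothetical sentinel marking a masked code stack


def infill(seq, model):
    """Fill every masked position with the model's prediction, conditioned
    on all unmasked positions (i.e., bidirectional/global context)."""
    return [model(seq, i) if tok == MASK else tok for i, tok in enumerate(seq)]


def draft_and_revise(seq_len, model, revise_steps=2, mask_ratio=0.5):
    # Draft phase: start from a fully masked sequence and infill everything.
    seq = infill([MASK] * seq_len, model)
    # Revise phase: repeatedly re-mask a random subset and re-infill it,
    # so each position is eventually re-predicted with the global context
    # of the (mostly complete) draft.
    for _ in range(revise_steps):
        for i in random.sample(range(seq_len), int(seq_len * mask_ratio)):
            seq[i] = MASK
        seq = infill(seq, model)
    return seq
```

With a trained model, `draft_and_revise` would return the code-stack sequence that RQ-VAE's decoder maps back to an image; here any callable of the form `model(seq, i) -> token` can be plugged in to exercise the control flow.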
Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks?
Hokyung Lee, Sumanyu Sharma, Bing Hu
Recent advancements in Large Language Models (LLMs) have significantly increased their use in various real-world applications, including information retrieval and coding assistance [1]. Notably, the dramatic expansion of context window sizes in models like GPT-4 [2], Claude 3 [3], and Gemini-1.5 [4] has broadened the potential applications of these models. To evaluate the retrieval capabilities of these LLMs within large context windows, a series of benchmarks known as Needle-in-a-Haystack (NIAH) [5] has been developed. The NIAH benchmarks [5] typically involve prompting an LLM to retrieve contextual information based on a clue (e.g., a needle) hidden within a large document (e.g., the background). These benchmarks have been effective in evaluating LLMs' ability to retrieve information from large text data in tasks such as text summarization and in the legal and medical domains [6, 7, 8]. NIAH represents important use cases such as finding precedent case law in the legal domain [7] and retrieving information from lengthy electronic health records in the medical domain [8]. Verifying the "faithfulness" of long-text summarization has also been shown to be an important NIAH task in the FABLES dataset [6]. Generating code and programs following provided specifications or requirements is a long-standing challenge in computer science called program synthesis [9].
- Law (0.88)
- Health & Medicine > Health Care Technology > Medical Record (0.54)
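The needle-in-a-haystack setup described above can be sketched as a simple prompt constructor: a needle sentence is inserted at a chosen relative depth within a long background document, and the model is then asked to retrieve it. The function name, prompt wording, and `depth` parameter below are illustrative assumptions, not the benchmark's exact format.

```python
def build_niah_prompt(haystack_sentences, needle, question, depth=0.5):
    """Insert `needle` at a relative `depth` (0.0 = start, 1.0 = end)
    within the haystack, then build a retrieval prompt for an LLM."""
    pos = int(len(haystack_sentences) * depth)
    document = haystack_sentences[:pos] + [needle] + haystack_sentences[pos:]
    context = " ".join(document)
    return f"{context}\n\nQuestion: {question}\nAnswer using only the text above."
```

Sweeping `depth` (and the haystack length) over a grid is what lets NIAH-style benchmarks chart retrieval accuracy as a function of where in the context window the clue sits.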