dynamic filter
Wind Power Scenario Generation based on the Generalized Dynamic Factor Model and Generative Adversarial Network
Cho, Young-ho, Zhu, Hao, Lee, Duehee, Baldick, Ross
For conducting resource adequacy studies, we synthesize multiple long-term wind power scenarios of distributed wind farms simultaneously by using the spatio-temporal features: spatial and temporal correlation, waveforms, marginal and ramp rate distributions of waveforms, power spectral densities, and statistical characteristics. Generating the spatial correlation in scenarios requires the design of common factors for neighboring wind farms and antithetical factors for distant wind farms. The generalized dynamic factor model (GDFM) can extract the common factors through cross spectral density analysis, but it cannot closely imitate waveforms. The generative adversarial network (GAN) can synthesize plausible samples representing the temporal correlation by verifying samples through a fake sample discriminator. To combine the advantages of GDFM and GAN, we use the GAN to provide a filter that extracts dynamic factors with temporal information from the observation data, and we then apply this filter in the GDFM to represent both spatial and frequency correlations of plausible waveforms. Numerical tests on the combination of GDFM and GAN have demonstrated performance improvements over competing alternatives in synthesizing wind power scenarios from Australia, better realizing plausible statistical characteristics of actual wind power compared to alternatives such as the GDFM with a filter synthesized from distributions of actual dynamic filters and the GAN with direct synthesis without dynamic factors.
Resource adequacy means maintaining power system reliability by having sufficient capacity such that, even with failures or variability of resources, the probability of not being able to meet all load is sufficiently small [1]. System operators achieve resource adequacy of a power system by ensuring there is enough generation capacity [2].
In the case of intermittent energy resources, the effective load carrying capacity (ELCC) of the intermittent resource is the equivalent capacity of highly reliable generators that would result in the same probability of not being able to meet all load [3]. For example, the ELCC of wind power can be obtained by simulating power systems with long-term wind power scenarios with realistic ramping rates and marginal distributions [4]. Furthermore, the capacity factor and reserve margin contribution of wind power to the power system reliability can also be obtained by simulating a future power system by using realistic long-term wind power scenarios [5].
FFT-based Dynamic Token Mixer for Vision
Models equipped with multi-head self-attention (MHSA) have achieved notable performance in computer vision. Their computational complexity is quadratic in the number of pixels in the input feature maps, resulting in slow processing, especially when dealing with high-resolution images. New types of token-mixer have been proposed as alternatives to MHSA to circumvent this problem: an FFT-based token-mixer involves global operations similar to MHSA but with lower computational complexity. However, despite its attractive properties, the FFT-based token-mixer has not been carefully examined in terms of its compatibility with the rapidly evolving MetaFormer architecture. Here, we propose a novel token-mixer called Dynamic Filter and novel image recognition models, DFFormer and CDFFormer, to close the gaps above. The results of image classification and downstream tasks, analysis, and visualization show that our models are helpful. Notably, their throughput and memory efficiency when dealing with high-resolution image recognition is remarkable. Our results indicate that Dynamic Filter is one of the token-mixer options that should be seriously considered. The code is available at https://github.com/okojoalg/dfformer
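The idea of an FFT-based token-mixer can be illustrated with a minimal NumPy sketch. It is not the DFFormer implementation: the filter bank `filters`, the projection `proj`, and the pool-then-softmax weighting are assumptions chosen only to show how a frequency-domain product mixes all spatial positions at O(HW log HW) cost, and how the filter can depend on the input.

```python
import numpy as np

def fft_token_mixer(x, filters, proj):
    """Mix tokens globally with an input-dependent frequency-domain filter.

    x:       (H, W, C) real feature map.
    filters: (N, H, W // 2 + 1) complex filter bank (hypothetical parameter).
    proj:    (C, N) real projection to mixing logits (hypothetical parameter).
    """
    h, w, _ = x.shape
    # Input-dependent weights: global average pool -> linear -> softmax.
    logits = x.mean(axis=(0, 1)) @ proj
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Blend the filter bank into a single filter for this input.
    kernel = np.tensordot(weights, filters, axes=1)  # (H, W // 2 + 1)
    # A pointwise product in the frequency domain touches every position.
    spectrum = np.fft.rfft2(x, axes=(0, 1))          # (H, W // 2 + 1, C)
    mixed = spectrum * kernel[..., None]
    return np.fft.irfft2(mixed, s=(h, w), axes=(0, 1))
```

With an all-ones filter bank the mixer reduces to the identity, which is a convenient sanity check before plugging in learned filters.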
ELSA: Enhanced Local Self-Attention for Vision Transformer
Zhou, Jingkai, Wang, Pichao, Wang, Fan, Liu, Qiong, Li, Hao, Jin, Rong
Self-attention is powerful in modeling long-range dependencies, but it is weak in local finer-level feature learning. The performance of local self-attention (LSA) is just on par with convolution and inferior to dynamic filters, which puzzles researchers on whether to use LSA or its counterparts, which one is better, and what makes LSA mediocre. To clarify these, we comprehensively investigate LSA and its counterparts from two sides: channel setting and spatial processing. We find that the devil lies in the generation and application of spatial attention, where relative position embeddings and the neighboring filter application are key factors. Based on these findings, we propose the enhanced local self-attention (ELSA) with Hadamard attention and the ghost head. Hadamard attention introduces the Hadamard product to efficiently generate attention in the neighboring case, while maintaining the high-order mapping. The ghost head combines attention maps with static matrices to increase channel capacity. Experiments demonstrate the effectiveness of ELSA. Without architecture / hyperparameter modification, drop-in replacing LSA with ELSA boosts Swin Transformer by up to +1.4 on top-1 accuracy. ELSA also consistently benefits VOLO from D1 to D5, where ELSA-VOLO-D5 achieves 87.2 on ImageNet-1K without extra training images. In addition, we evaluate ELSA in downstream tasks. ELSA significantly improves the baseline by up to +1.9 box AP / +1.3 mask AP on COCO, and by up to +1.9 mIoU on ADE20K. Code is available at https://github.com/damo-cv/ELSA.
- Asia > China (0.04)
- North America > United States > Indiana > Marion County > Lawrence (0.04)
MOCA: A Modular Object-Centric Approach for Interactive Instruction Following
Singh, Kunal Pratap, Bhambri, Suvaansh, Kim, Byeonghwi, Mottaghi, Roozbeh, Choi, Jonghyun
Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for an AI agent. Recently, an "interactive instruction following" task has been proposed to foster research in reasoning over long instruction sequences that require object interactions in a simulated environment. It involves solving open problems in the vision, language, and navigation literature at each step. To address this multifaceted problem, we propose a modular architecture that decouples the task into visual perception and action policy, and name it MOCA, a Modular Object-Centric Approach. We evaluate our method on the ALFRED benchmark and empirically validate that it outperforms prior art by significant margins in all metrics with good generalization performance (high success rate in unseen environments). Our code is available at https://github.com/gistvision/moca.
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Dynamic Filter: Adaptive Query Processing with the Crowd
Lan, Doren (Harvey Mudd College) | Reed, Katherine (Harvey Mudd College) | Shin, Austin (Harvey Mudd College) | Trushkowsky, Beth (Harvey Mudd College)
Hybrid human-machine query processing systems, such as crowd-powered database systems, aim to broaden the scope of questions users can ask about their data by incorporating human computation to support queries that may be subjective and/or require visual or semantic interpretation. A common type of query involves filtering data by several criteria, some of which need human computation to be evaluated. For example, a query might filter a set of hotels for those that both (1) have great views from the rooms, and (2) have a fitness center. Criteria can differ in the amount of human effort required to decide if data satisfy them, due to each criterion's subjectivity and difficulty. There is potential to reduce crowdsourcing costs by ordering the evaluation of the criteria such that criteria needing more human computation are not processed for data that have not satisfied the less costly criteria. Unfortunately, for queries specified on-the-fly, the information about subjectivity and difficulty is unknown a priori. To overcome this challenge, we present Dynamic Filter, an adaptive query processing algorithm that dynamically changes the order in which criteria are evaluated based on observations while the query is running. Using crowdsourced data from a popular crowdsourcing platform, we show that Dynamic Filter can effectively adapt the processing order and approach the performance of a "clairvoyant" algorithm.
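Reordering criteria from running observations can be sketched as follows. This is a generic heuristic, not the Dynamic Filter algorithm from the paper: it assumes each criterion has a known per-evaluation cost, estimates pass rates as items are processed, and evaluates criteria in increasing order of cost divided by estimated failure rate. The names `adaptive_filter`, `predicates`, and `costs` are hypothetical, and plain functions stand in for crowd tasks.

```python
def adaptive_filter(items, predicates, costs):
    """Return the items passing every predicate and the total cost spent.

    predicates: dict name -> callable(item) -> bool (stand-in for a crowd task).
    costs:      dict name -> cost per evaluation (hypothetical units).
    """
    # Smoothed counts: every predicate starts at an estimated 50% pass rate.
    passed = {name: 1 for name in predicates}
    seen = {name: 2 for name in predicates}
    total_cost = 0.0
    kept = []

    def rank(name):
        # Low cost and a high failure rate both move a predicate earlier.
        fail_rate = 1.0 - passed[name] / seen[name]
        return costs[name] / max(fail_rate, 1e-9)

    for item in items:
        ok = True
        # Re-rank for every item using the observations gathered so far.
        for name in sorted(predicates, key=rank):
            total_cost += costs[name]
            result = predicates[name](item)
            seen[name] += 1
            passed[name] += int(result)
            if not result:
                ok = False
                break  # remaining criteria are never paid for
        if ok:
            kept.append(item)
    return kept, total_cost
```

For instance, filtering `range(20)` with a cheap parity check and a costlier threshold check keeps the same items as evaluating every criterion, while paying for fewer evaluations.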
A Dynamic Window Neural Network for CCG Supertagging
Wu, Huijia (Institute of Automation, Chinese Academy of Sciences) | Zhang, Jiajun (Institute of Automation, Chinese Academy of Sciences) | Zong, Chengqing (Institute of Automation, Chinese Academy of Sciences)
Combinatory Categorial Grammar (CCG) supertagging is the task of assigning a lexical category to each word in a sentence. Almost all previous methods use fixed context window sizes to encode input tokens. However, different tags usually rely on different context window sizes. This motivates us to build a supertagger with a dynamic window approach, which can be treated as an attention mechanism on the local contexts. We find that applying dropout on the dynamic filters is superior to the regular dropout on word embeddings. We use this approach to demonstrate state-of-the-art CCG supertagging performance on the standard test set.
- Asia > Middle East > Jordan (0.05)
- North America > United States (0.04)
- Asia > China (0.04)