Not enough data to create a plot.
Try a different view from the menu above.
Ji, Wei
Dynamic Speed Guidance for CAV Ramp Merging in Non-Cooperative Environment: An On-Site Experiment
Ji, Wei, Ma, Yechi, Cui, Guangzhang, Qin, Xiaotian, Hua, Wei
Ramp merging is a typical application of cooperative intelligent transportation system (C-ITS). Vehicle trajectories perceived by roadside sensors are importation complement to the limited visual field of on-board perception. Vehicle tracking and trajectory denoising algorithm is proposed in this paper to take full advantage of roadside cameras for vehicle trajectory and speed profile estimation. Dynamic speed guidance algorithm is proposed to help on-ramp vehicles to merge into mainline smoothly, even in non-cooperative environment where mainline vehicles are not expected to slow down to accommodate on-ramp vehicles. On-site experiments were taken out in a merging area of Hangzhou Belt Highway to testify our prototype system, and simulation analysis shows our proposed algorithm can achieve significant fuel savings during the ramp merging process.
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
Xiao, Junbin, Yao, Angela, Liu, Zhiyuan, Li, Yicong, Ji, Wei, Chua, Tat-Seng
Video question answering requires models to understand and reason about both complex video and language data to correctly derive answers. Existing efforts focus on designing sophisticated cross-modal interactions to fuse the information from two modalities, while encoding the video and question holistically as frame and word sequences. Despite their success, these methods are essentially revolving around the sequential nature of video- and question-contents, providing little insight to the problem of question-answering and lacking interpretability as well. In this work, we argue that while video is presented in frame sequence, the visual elements (eg, objects, actions, activities and events) are not sequential but rather hierarchical in semantic space. To align with the multi-granular essence of linguistic concepts in language queries, we propose to model video as a conditional graph hierarchy which weaves together visual facts of different granularity in a level-wise manner, with the guidance of corresponding textual cues. Despite the simplicity, our extensive experiments demonstrate the superiority of such conditional hierarchical graph architecture, with clear performance improvements over prior methods and also better generalization across different type of questions. Further analyses also consolidate the model's reliability as it shows meaningful visual-textual evidences for the predicted answers.
Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search
Acerbi, Luigi, Ji, Wei
Computational models in fields such as computational neuroscience are often evaluated via stochastic simulation or numerical approximation. Fitting these models implies a difficult optimization problem over complex, possibly noisy parameter landscapes. Bayesian optimization (BO) has been successfully applied to solving expensive black-box problems in engineering and machine learning. Here we explore whether BO can be applied as a general tool for model fitting. First, we present a novel hybrid BO algorithm, Bayesian adaptive direct search (BADS), that achieves competitive performance with an affordable computational overhead for the running time of typical models. We then perform an extensive benchmark of BADS vs. many common and state-of-the-art nonconvex, derivative-free optimizers, on a set of model-fitting problems with real data and models from six studies in behavioral, cognitive, and computational neuroscience. With default settings, BADS consistently finds comparable or better solutions than other methods, including `vanilla' BO, showing great promise for advanced BO techniques, and BADS in particular, as a general model-fitting tool.