AITopics | Wu, I-Chen

Collaborating Authors

Wu, I-Chen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Sim-to-Real Dense Object Descriptors for Robotic Manipulation

Cao, Hoang-Giang, Zeng, Weihao, Wu, I-Chen

arXiv.org Artificial IntelligenceApr-17-2023

It is crucial to address the following issues for ubiquitous robotics manipulation applications: (a) vision-based manipulation tasks require the robot to visually learn and understand the object with rich information like dense object descriptors; and (b) sim-to-real transfer in robotics aims to close the gap between simulated and real data. In this paper, we present Sim-to-Real Dense Object Nets (SRDONs), a dense object descriptor that not only understands the object via appropriate representation but also maps simulated and real data to a unified feature space with pixel consistency. We proposed an object-to-object matching method for image pairs from different scenes and different domains. This method helps reduce the effort of training data from real-world by taking advantage of public datasets, such as GraspNet. With sim-to-real object representation consistency, our SRDONs can serve as a building block for a variety of sim-to-real manipulation tasks. We demonstrate in experiments that pre-trained SRDONs significantly improve performances on unseen objects and unseen visual environments for various robotic tasks with zero real-world training.

descriptor, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2304.08703

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

A Local-Pattern Related Look-Up Table

Shih, Chung-Chin, Wei, Ting Han, Wu, Ti-Rong, Wu, I-Chen

arXiv.org Artificial IntelligenceDec-22-2022

This paper describes a Relevance-Zone pattern table (RZT) that can be used to replace a traditional transposition table. An RZT stores exact game values for patterns that are discovered during a Relevance-Zone-Based Search (RZS), which is the current state-of-the-art in solving L&D problems in Go. Positions that share the same pattern can reuse the same exact game value in the RZT. The pattern matching scheme for RZTs is implemented using a radix tree, taking into consideration patterns with different shapes. To improve the efficiency of table lookups, we designed a heuristic that prevents redundant lookups. The heuristic can safely skip previously queried patterns for a given position, reducing the overhead to 10% of the original cost. We also analyze the time complexity of the RZT both theoretically and empirically. Experiments show the overhead of traversing the radix tree in practice during lookup remain flat logarithmically in relation to the number of entries stored in the table. Experiments also show that the use of an RZT instead of a traditional transposition table significantly reduces the number of searched nodes on two data sets of 7x7 and 19x19 L&D Go problems.

artificial intelligence, machine learning, radix tree, (17 more...)

arXiv.org Artificial Intelligence

2212.13922

Country: North America > Canada > Alberta (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Go (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

A Novel Approach to Solving Goal-Achieving Problems for Board Games

Shih, Chung-Chin, Wu, Ti-Rong, Wei, Ting Han, Wu, I-Chen

arXiv.org Artificial IntelligenceDec-5-2021

Goal-achieving problems are puzzles that set up a specific situation with a clear objective. An example that is well-studied is the category of life-and-death (L&D) problems for Go, which helps players hone their skill of identifying region safety. Many previous methods like lambda search try null moves first, then derive so-called relevance zones (RZs), outside of which the opponent does not need to search. This paper first proposes a novel RZ-based approach, called the RZ-Based Search (RZS), to solving L&D problems for Go. RZS tries moves before determining whether they are null moves post-hoc. This means we do not need to rely on null move heuristics, resulting in a more elegant algorithm, so that it can also be seamlessly incorporated into AlphaZero's super-human level play in our solver. To repurpose AlphaZero for solving, we also propose a new training method called Faster to Life (FTL), which modifies AlphaZero to entice it to win more quickly. We use RZS and FTL to solve L&D problems on Go, namely solving 68 among 106 problems from a professional L&D book while a previous program solves 11 only. Finally, we discuss that the approach is generic in the sense that RZS is applicable to solving many other goal-achieving problems for board games.

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

2112.02563

Country:

Asia (0.46)
North America > Canada > Alberta (0.28)

Genre:

Research Report > Promising Solution (0.40)
Overview > Innovation (0.40)

Industry: Leisure & Entertainment > Games > Go (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Optimistic Temporal Difference Learning for 2048

Guei, Hung, Chen, Lung-Pin, Wu, I-Chen

arXiv.org Artificial IntelligenceNov-22-2021

Temporal difference (TD) learning and its variants, such as multistage TD (MS-TD) learning and temporal coherence (TC) learning, have been successfully applied to 2048. These methods rely on the stochasticity of the environment of 2048 for exploration. In this paper, we propose to employ optimistic initialization (OI) to encourage exploration for 2048, and empirically show that the learning quality is significantly improved. This approach optimistically initializes the feature weights to very large values. Since weights tend to be reduced once the states are visited, agents tend to explore those states which are unvisited or visited few times. Our experiments show that both TD and TC learning with OI significantly improve the performance. As a result, the network size required to achieve the same performance is significantly reduced. With additional tunings such as expectimax search, multistage learning, and tile-downgrading technique, our design achieves the state-of-the-art performance, namely an average score of 625 377 and a rate of 72% reaching 32768 tiles. In addition, for sufficiently large tests, 65536 tiles are reached at a rate of 0.02%.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TG.2021.3109887

2111.1109

Country:

Europe (1.00)
Asia (0.69)
North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

An Unsupervised Video Game Playstyle Metric via State Discretization

Lin, Chiu-Chou, Chiu, Wei-Chen, Wu, I-Chen

arXiv.org Artificial IntelligenceOct-3-2021

On playing video games, different players usually have their own playstyles. Recently, there have been great improvements for the video game AIs on the playing strength. However, past researches for analyzing the behaviors of players still used heuristic rules or the behavior features with the game-environment support, thus being exhausted for the developers to define the features of discriminating various playstyles. In this paper, we propose the first metric for video game playstyles directly from the game observations and actions, without any prior specification on the playstyle in the target game. Our proposed method is built upon a novel scheme of learning discrete representations that can map game observations into latent discrete states, such that playstyles can be exhibited from these discrete states. Namely, we measure the playstyle distance based on game observations aligned to the same states. We demonstrate high playstyle accuracy of our metric in experiments on some video game platforms, including TORCS, RGSK, and seven Atari games, and for different agents including rule-based AI bots, learning-based AI bots, and human players.

artificial intelligence, computer game, playstyle, (17 more...)

arXiv.org Artificial Intelligence

2110.0095

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Games (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
(2 more...)

Add feedback

Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search

Lan, Li-Cheng, Tsai, Meng-Yu, Wu, Ti-Rong, Wu, I-Chen, Hsieh, Cho-Jui

arXiv.org Artificial IntelligenceDec-14-2020

Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains such as Go and Atari games when combining with deep neural networks (DNNs). When more simulations are executed, MCTS can achieve higher performance but also requires enormous amounts of CPU and GPU resources. However, not all states require a long searching time to identify the best action that the agent can find. For example, in 19x19 Go and NoGo, we found that for more than half of the states, the best action predicted by DNN remains unchanged even after searching 2 minutes. This implies that a significant amount of resources can be saved if we are able to stop the searching earlier when we are confident with the current searching result. In this paper, we propose to achieve this goal by predicting the uncertainty of the current searching status and use the result to decide whether we should stop searching. With our algorithm, called Dynamic Simulation MCTS (DS-MCTS), we can speed up a NoGo agent trained by AlphaZero 2.5 times faster while maintaining a similar winning rate. Also, under the same average simulation count, our method can achieve a 61% winning rate against the original program.

computer game, deep learning, simulation, (20 more...)

arXiv.org Artificial Intelligence

2012.0791

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Sim-To-Real Transfer for Miniature Autonomous Car Racing

Chu, Yeong-Jia Roger, Wei, Ting-Han, Huang, Jin-Bo, Chen, Yuan-Hao, Wu, I-Chen

arXiv.org Artificial IntelligenceNov-11-2020

Sim-to-real, a term that describes where a model is trained in a simulator then transferred to the real world, is a technique that enables faster deep reinforcement learning (DRL) training. However, differences between the simulator and the real world often cause the model to perform poorly in the real world. Domain randomization is a way to bridge the sim-to-real gap by exposing the model to a wide range of scenarios so that it can generalize to real-world situations. However, following domain randomization to train an autonomous car racing model with DRL can lead to undesirable outcomes. Namely, a model trained with randomization tends to run slower; a higher completion rate on the testing track comes at the expense of longer lap times. This paper aims to boost the robustness of a trained race car model without compromising racing lap times. For a training track and a testing track having the same shape (and same optimal paths), but with different lighting, background, etc., we first train a model (teacher model) that overfits the training track, moving along a near optimal path. We then use this model to teach a student model the correct actions along with randomization. With our method, a model with 18.4\% completion rate on the testing track is able to help teach a student model with 52\% completion. Moreover, over an average of 50 trials, the student is able to finish a lap 0.23 seconds faster than the teacher. This 0.23 second gap is significant in tight races, with lap times of about 10 to 12 seconds.

ground transportation, neural network, randomization, (23 more...)

arXiv.org Artificial Intelligence

2011.05617

Country: North America > Canada > Alberta (0.28)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.72)
Information Technology > Robotics & Automation (0.72)
Transportation > Passenger (0.62)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.72)

Add feedback

Multiple Policy Value Monte Carlo Tree Search

Lan, Li-Cheng, Li, Wei, Wei, Ting-Han, Wu, I-Chen

arXiv.org Artificial IntelligenceMay-31-2019

Many of the strongest game playing programs use a combination of Monte Carlo tree search (MCTS) and deep neural networks (DNN), where the DNNs are used as policy or value evaluators. Given a limited budget, such as online playing or during the self-play phase of AlphaZero (AZ) training, a balance needs to be reached between accurate state estimation and more MCTS simulations, both of which are critical for a strong game playing agent. Typically, larger DNNs are better at generalization and accurate evaluation, while smaller DNNs are less costly, and therefore can lead to more MCTS simulations and bigger search trees with the same budget. This paper introduces a new method called the multiple policy value MCTS (MPV-MCTS), which combines multiple policy value neural networks (PV-NNs) of various sizes to retain advantages of each network, where two PV-NNs f_S and f_L are used in this paper. We show through experiments on the game NoGo that a combined f_S and f_L MPV-MCTS outperforms single PV-NN with policy value MCTS, called PV-MCTS. Additionally, MPV-MCTS also outperforms PV-MCTS for AZ training.

deep learning, neural network, simulation, (21 more...)

arXiv.org Artificial Intelligence

1905.13521

Country: North America > United States (0.15)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Go (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Towards Combining On-Off-Policy Methods for Real-World Applications

Hu, Kai-Chun, Pi, Chen-Huan, Wei, Ting Han, Wu, I-Chen, Cheng, Stone, Dai, Yi-Wei, Ye, Wei-Yuan

arXiv.org Machine LearningApr-24-2019

In this paper, we point out a fundamental property of the objective in reinforcement learning, with which we can reformulate the policy gradient objective into a perceptron-like loss function, removing the need to distinguish between on and off policy training. Namely, we posit that it is sufficient to only update a policy $\pi$ for cases that satisfy the condition $A(\frac{\pi}{\mu}-1)\leq0$, where $A$ is the advantage, and $\mu$ is another policy. Furthermore, we show via theoretic derivation that a perceptron-like loss function matches the clipped surrogate objective for PPO. With our new formulation, the policies $\pi$ and $\mu$ can be arbitrarily apart in theory, effectively enabling off-policy training. To examine our derivations, we can combine the on-policy PPO clipped surrogate (which we show to be equivalent with one instance of the new reformation) with the off-policy IMPALA method. We first verify the combined method on the OpenAI Gym pendulum toy problem. Next, we use our method to train a quadrotor position controller in a simulator. Our trained policy is efficient and lightweight enough to perform in a low cost micro-controller at a minimum update rate of 500 Hz. For the quadrotor, we show two experiments to verify our method and demonstrate performance: 1) hovering at a fixed position, and 2) tracking along a specific trajectory. In preliminary trials, we are also able to apply the method to a real-world quadrotor.

artificial intelligence, neural network, trajectory, (18 more...)

arXiv.org Machine Learning

1904.10642

Country:

Oceania > Australia (0.14)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Human vs. Computer Go: Review and Prospect

Lee, Chang-Shing, Wang, Mei-Hui, Yen, Shi-Jim, Wei, Ting-Han, Wu, I-Chen, Chou, Ping-Chiang, Chou, Chun-Hsun, Wang, Ming-Wan, Yang, Tai-Hsiung

arXiv.org Artificial IntelligenceJun-7-2016

The Google DeepMind challenge match in March 2016 was a historic achievement for computer Go development. This article discusses the development of computational intelligence (CI) and its relative strength in comparison with human intelligence for the game of Go. We first summarize the milestones achieved for computer Go from 1998 to 2016. Then, the computer Go programs that have participated in previous IEEE CIS competitions as well as methods and techniques used in AlphaGo are briefly introduced. Commentaries from three high-level professional Go players on the five AlphaGo versus Lee Sedol games are also included. We conclude that AlphaGo beating Lee Sedol is a huge achievement in artificial intelligence (AI) based largely on CI methods. In the future, powerful computer Go programs such as AlphaGo are expected to be instrumental in promoting Go education and AI real-world applications.

alphago, deep learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/MCI.2016.2572559

1606.02032

Country:

Asia > Taiwan (0.18)
Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games > Go (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Games > Go (1.00)

Add feedback