This paper presents a novel scheme, based on a unique combination of genetic algorithms (GAs) and deep learning (DL), for the automatic reconstruction of Portuguese tile panels, a challenging real-world variant of the jigsaw puzzle problem (JPP) with important national heritage implications. Specifically, we introduce an enhanced GA-based puzzle solver, whose integration with a novel DL-based compatibility measure (DLCM) yields state-of-the-art performance, regarding the above application. Current compatibility measures consider typically (the chromatic information of) edge pixels (between adjacent tiles), and help achieve high accuracy for the synthetic JPP variant. However, such measures exhibit rather poor performance when applied to the Portuguese tile panels, which are susceptible to various real-world effects, e.g., monochromatic panels, non-squared tiles, edge degradation, etc. To overcome such difficulties, we have developed a novel DLCM to extract high-level texture/color statistics from the entire tile information. Integrating this measure with our enhanced GA-based puzzle solver, we have demonstrated, for the first time, how to deal most effectively with large-scale real-world problems, such as the Portuguese tile problem. Specifically, we have achieved 82% accuracy for the reconstruction of Portuguese tile panels with unknown piece rotation and puzzle dimension (compared to merely 3.5% average accuracy achieved by the best method known for solving this problem variant). The proposed method outperforms even human experts in several cases, correcting their mistakes in the manual tile assembly.
In this paper we introduce new types of square-piece jigsaw puzzles, where in addition to the unknown location and orientation of each piece, a piece might also need to be flipped. These puzzles, which are associated with a number of real world problems, are considerably harder, from a computational standpoint. Specifically, we present a novel generalized genetic algorithm (GA)-based solver that can handle puzzle pieces of unknown location and orientation (Type 2 puzzles) and (two-sided) puzzle pieces of unknown location, orientation, and face (Type 4 puzzles). To the best of our knowledge, our solver provides a new state-of-the-art, solving previously attempted puzzles faster and far more accurately, handling puzzle sizes that have never been attempted before, and assembling the newly introduced two-sided puzzles automatically and effectively. This paper also presents, among other results, the most extensive set of experimental results, compiled as of yet, on Type 2 puzzles.
Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares. Here we formalize a new type of jigsaw puzzle where the pieces are general convex polygons generated by cutting through a global polygonal shape with an arbitrary number of straight cuts, a generation model inspired by the celebrated Lazy caterer's sequence. We analyze the theoretical properties of such puzzles, including the inherent challenges in solving them once pieces are contaminated with geometrical noise. To cope with such difficulties and obtain tractable solutions, we abstract the problem as a multi-body spring-mass dynamical system endowed with hierarchical loop constraints and a layered reconstruction process. We define evaluation metrics and present experimental results to indicate that such puzzles are solvable completely automatically.
The pictorial jigsaw (PJ) puzzle is a well-known leisure game for humans. Usually, a PJ puzzle game is played by one or several human players face-to-face in the physical space. In this paper, we focus on how to solve PJ puzzles in the cyberspace by a group of physically distributed human players. We propose an approach to solving PJ puzzle by stigmergy-inspired Internet-based human collective intelligence. The core of the approach is a continuously executing loop, named the EIF loop, which consists of three activities: exploration, integration, and feedback. In exploration, each player tries to solve the PJ puzzle alone, without direct interactions with other players. At any time, the result of a player's exploration is a partial solution to the PJ puzzle, and a set of rejected neighboring relation between pieces. The results of all players' exploration are integrated in real time through integration, with the output of a continuously updated collective opinion graph (COG). And through feedback, each player is provided with personalized feedback information based on the current COG and the player's exploration result, in order to accelerate his/her puzzle-solving process. Exploratory experiments show that: (1) supported by this approach, the time to solve PJ puzzle is nearly linear to the reciprocal of the number of players, and shows better scalability to puzzle size than that of face-to-face collaboration for 10-player groups; (2) for groups with 2 to 10 players, the puzzle-solving time decreases 31.36%-64.57% on average, compared with the best single players in the experiments.
Generating music medleys is about finding an optimal permutation of a given set of music clips. Toward this goal, we propose a self-supervised learning task, called the music puzzle game, to train neural network models to learn the sequential patterns in music. In essence, such a game requires machines to correctly sort a few multisecond music fragments. In the training stage, we learn the model by sampling multiple non-overlapping fragment pairs from the same songs and seeking to predict whether a given pair is consecutive and is in the correct chronological order. For testing, we design a number of puzzle games with different difficulty levels, the most difficult one being music medley, which requiring sorting fragments from different songs. On the basis of state-of-the-art Siamese convolutional network, we propose an improved architecture that learns to embed frame-level similarity scores computed from the input fragment pairs to a common space, where fragment pairs in the correct order can be more easily identified. Our result shows that the resulting model, dubbed as the similarity embedding network (SEN), performs better than competing models across different games, including music jigsaw puzzle, music sequencing, and music medley. Example results can be found at our project website, https://remyhuang.github.io/DJnet.