Data Science
KDD-93: Progress and Challenges in Knowledge Discovery in Databases
Piatetsky-Shapiro, Gregory, Matheus, Christopher, Smyth, Padhraic, Uthurusamy, Ramasamy
Over 60 researchers from 10 countries took part in the Third Knowledge Discovery in Databases (KDD) Workshop, held during the Eleventh National Conference on Artificial Intelligence in Washington, D.C. A major trend evident at the workshop was the transition to applications in the core KDD area of discovery of relatively simple patterns in relational databases; the most successful applications are appearing in the areas of greatest need, where the databases are so large that manual analysis is impossible. Progress has been facilitated by the availability of commercial KDD tools for both generic discovery and domain-specific applications such as marketing. At the same time, progress has been slowed by problems such as lack of statistical rigor, overabundance of patterns, and poor integration. Besides applications, the main themes of this workshop were (1) the discovery of dependencies and models and (2) integrated and interactive KDD systems.
Pattern Matching and Discourse Processing in Information Extraction from Japanese Text
Kitani, T., Eriguchi, Y., Hara, M.
Information extraction is the task of automatically picking up information of interest from an unconstrained text. Information of interest is usually extracted in two steps. First, sentence-level processing locates relevant pieces of information scattered throughout the text; second, discourse processing merges coreferential information to generate the output. In the first step, pieces of information are locally identified without recognizing any relationships among them. A keyword search or simple pattern search can achieve this purpose. The second step requires deeper knowledge in order to understand relationships among separately identified pieces of information. Previous information extraction systems focused on the first step, partly because they were not required to link up each piece of information with other pieces. To link the extracted pieces of information and map them onto a structured output format, complex discourse processing is essential. This paper reports on a Japanese information extraction system that merges information using a pattern matcher and discourse processor. Evaluation results show a high level of system performance which approaches human performance.
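The two-step flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the English toy sentences, the three regular expressions, the slot names, and the "attach to the most recently mentioned company" rule are all assumptions made for illustration. The recency rule is only a crude stand-in for real discourse processing.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical slot patterns; each named group matches its slot name.
PATTERNS = {
    "company": re.compile(r"\b(?P<company>[A-Z][a-z]+) (?:Corp|Inc)\b"),
    "location": re.compile(r"\bin (?P<location>[A-Z][a-z]+)"),
    "hires": re.compile(r"\bhire (?P<hires>\d+) workers"),
}

Fragment = Tuple[int, str, str]  # (sentence index, slot name, matched value)


def extract_fragments(sentences: List[str]) -> List[Fragment]:
    """Step 1: locate pieces of information sentence by sentence,
    without relating them to one another."""
    fragments: List[Fragment] = []
    for idx, sentence in enumerate(sentences):
        for slot, pattern in PATTERNS.items():
            for match in pattern.finditer(sentence):
                fragments.append((idx, slot, match.group(slot)))
    return fragments


def merge_fragments(fragments: List[Fragment]) -> List[Dict[str, str]]:
    """Step 2: merge fragments into one record per company.
    Assumption: a fragment belongs to the most recently mentioned company."""
    records: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for idx in sorted({f[0] for f in fragments}):
        in_sentence = [f for f in fragments if f[0] == idx]
        # A company named in this sentence becomes the current referent.
        for _, slot, value in in_sentence:
            if slot == "company":
                current = records.setdefault(value, {"company": value})
        # Remaining slots attach to the current referent, if there is one.
        for _, slot, value in in_sentence:
            if slot != "company" and current is not None:
                current[slot] = value
    return list(records.values())


if __name__ == "__main__":
    text = [
        "Acme Corp will build a plant in Osaka.",
        "It plans to hire 200 workers.",
        "Bolt Inc opened an office in Kyoto.",
    ]
    for record in merge_fragments(extract_fragments(text)):
        print(record)
```

In this toy input, the second sentence names no company, so its hiring figure is merged into the record opened by the first sentence. That linking step is what the first, purely local pass cannot do on its own.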
Analog Cochlear Model for Multiresolution Speech Analysis
Liu, Weimin, Andreou, Andreas G., Goldstein, Moise H., Jr.
The tradeoff between time and frequency resolution is viewed as the fundamental difference between conventional spectrographic analysis and cochlear signal processing for broadband, rapidly changing signals. The model's response exhibits a wavelet-like analysis in the scale domain that preserves good temporal resolution; the frequency of each spectral component in a broadband signal can be accurately determined from the interpeak intervals in the instantaneous firing rates of auditory fibers. Such properties of the cochlear model are demonstrated with natural speech and synthetic complex signals. 1 Introduction As a nonparametric tool, the spectrogram, or short-term Fourier transform, is widely used in analyzing non-stationary signals, such as speech. Usually a window is applied to the running signal and then the Fourier transform is performed. The specific window applied determines the tradeoff between temporal and spectral resolutions of the analysis, as indicated by the uncertainty principle [1].
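The window tradeoff mentioned at the end of this excerpt can be made concrete with a small calculation. The sketch below is an illustration, not part of the paper. It uses the window duration as a rough proxy for temporal resolution and the DFT bin spacing as a rough proxy for spectral resolution. The function name and the 16 kHz sampling rate are assumptions.

```python
from typing import Tuple


def window_resolution(n_samples: int, sample_rate_hz: float) -> Tuple[float, float]:
    """Return (window duration in seconds, DFT bin spacing in Hz)
    for an analysis window of n_samples at the given sampling rate."""
    if n_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("n_samples and sample_rate_hz must be positive")
    duration_s = n_samples / sample_rate_hz      # longer window: coarser in time
    bin_spacing_hz = sample_rate_hz / n_samples  # longer window: finer in frequency
    return duration_s, bin_spacing_hz


if __name__ == "__main__":
    # Assumed sampling rate, chosen only for illustration.
    for n in (64, 256, 1024):
        dt, df = window_resolution(n, 16000.0)
        print(f"N={n:5d}  window={dt * 1e3:6.2f} ms  bin spacing={df:7.2f} Hz")
```

With these two proxies the product of duration and bin spacing is always 1, so narrowing one necessarily widens the other. A fixed-window spectrogram therefore commits to a single compromise across all frequencies.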
3D Object Recognition Using Unsupervised Feature Extraction
Intrator, Nathan, Gold, Joshua I., Bülthoff, Heinrich H., Edelman, Shimon
Intrator (1990) proposed a feature extraction method that is related to recent statistical theory (Huber, 1985; Friedman, 1987), and is based on a biologically motivated model of neuronal plasticity (Bienenstock et al., 1982). This method has been recently applied to feature extraction in the context of recognizing 3D objects from single 2D views (Intrator and Gold, 1991). Here we describe experiments designed to analyze the nature of the extracted features, and their relevance to the theory and psychophysics of object recognition. 1 Introduction Results of recent computational studies of visual recognition (e.g., Poggio and Edelman, 1990) indicate that the problem of recognition of 3D objects can be effectively reformulated in terms of standard pattern classification theory. According to this approach, an object is represented by a few of its 2D views, encoded as clusters in multidimensional space. Recognition of a novel view is then carried out by interpolating among the stored views in the representation space.
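The idea of recognizing a novel view by interpolating among stored views can be sketched as follows. This is an illustrative toy, not the method of the cited studies. The Gaussian weighting, the `sigma` parameter, the two-dimensional feature vectors, and the object names are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def view_similarity(a: Sequence[float], b: Sequence[float], sigma: float = 1.0) -> float:
    """Gaussian similarity between two view vectors (an assumed weighting)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def recognize(
    novel_view: Sequence[float],
    stored_views: Dict[str, List[Sequence[float]]],
    sigma: float = 1.0,
) -> Tuple[str, Dict[str, float]]:
    """Score each object by summing the similarities of the novel view to
    that object's stored views, then return the best label and all scores."""
    scores = {
        name: sum(view_similarity(novel_view, view, sigma) for view in views)
        for name, views in stored_views.items()
    }
    best = max(scores, key=lambda name: scores[name])
    return best, scores


if __name__ == "__main__":
    # Hypothetical stored views: two per object, in a 2D feature space.
    stored = {
        "chair": [(0.0, 0.0), (1.0, 0.0)],
        "lamp": [(5.0, 5.0), (6.0, 5.0)],
    }
    label, scores = recognize((0.5, 0.1), stored)
    print(label, scores)
```

A novel view lying between two stored views of the same object receives support from both. In this toy setting, that shared support is what "interpolating among the stored views" amounts to.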
Knowledge Discovery in Databases: An Overview
Frawley, William J., Piatetsky-Shapiro, Gregory, Matheus, Christopher J.
After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases. The contributors to the AAAI Press book Knowledge Discovery in Databases were excited at the potential benefits of this research. The editors hope that some of this excitement will communicate itself to AI Magazine readers of this article.