The World Wide Web changed the way we live our lives, most notably in the ways we now share, consume and find information. There are many more webpages now than there are people, and links connect these webpages to each other in a giant network that is accessible from your favorite browser.
A downside of this success is that there is now too much information: so much, in fact, that we need machines to intelligently read these webpages and answer our questions. The Semantic Web is a movement and research community that brings together experts from areas such as natural language processing, ontologies, databases, social media, networks, and logic to realize the vision of making the Web machine-readable.
Why is this such a difficult problem? The main reason is that much of the Web, even today, is in a natural language like English or French. These languages are very ambiguous, but we humans have a knack for understanding them due to a variety of factors, not the least of which is our immense store of background knowledge and common sense. Machines are not yet capable of understanding English at the same level as an adult human being, though impressive progress is being made.
To overcome this problem, the Semantic Web presents a vision of the Web as an interlinked network of concepts, relationships, and entities, rather than an interlinked network of ‘natural’ webpages. Intelligent systems, often called ‘agents’, can consume the Semantic Web and answer complex questions that currently require human labor. Semantic Web research also improves search; the Google Knowledge Graph, for example, uses Semantic Web technology to answer some of your questions without your even clicking on a link!
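The network of concepts, relationships, and entities described above can be made concrete with a small sketch. The Semantic Web's underlying data model (RDF) represents knowledge as subject–predicate–object triples; the snippet below is an illustrative toy, not a real Semantic Web stack, and all entity and predicate names are invented for the example.

```python
# Facts about entities stored as (subject, predicate, object) triples,
# the core data model the Semantic Web builds on. Names are made up.
facts = {
    ("Marie_Curie", "bornIn", "Warsaw"),
    ("Marie_Curie", "field", "Physics"),
    ("Warsaw", "capitalOf", "Poland"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in sorted(facts)
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "Where was Marie Curie born?" becomes a pattern match rather than
# a text-understanding problem.
print(query("Marie_Curie", "bornIn", None))
```

This is the sense in which an agent can "answer questions" over the Semantic Web: once knowledge is encoded as structured triples, a question reduces to matching a pattern against a graph instead of parsing ambiguous natural language.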
The Semantic Web Company (SWC) is a provider of software and services in the areas of semantic information management, machine learning, natural language processing, and Linked Data technologies. SWC's PoolParty Semantic Suite software platform is used by large enterprises, government organizations, NPOs, and NGOs around the globe to extract meaning from big data.
Research on semantic web services promises greater interoperability among software agents and web services by enabling content-based automated service discovery and interaction. Although this interoperability is meant to be based on shared ontologies published on the semantic web, services produced and described by different developers may well use different, perhaps partly overlapping, sets of ontologies. Interoperability will therefore depend on ontology mappings and on architectures supporting the associated translation processes. The question we ask is: does the traditional approach of introducing mediator agents to translate messages between requestors and services work in such an open environment? This article reviews some of the processing assumptions made in the development of the semantic web service modeling ontology OWL-S and argues that, as a practical matter, the translation function cannot always be isolated in mediators.
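The mediator idea above can be sketched in a few lines. In this hypothetical example, a requestor and a service describe messages with different, partly overlapping vocabularies, and a mediator rewrites terms via a mapping table; all term names are invented for illustration. The point of the sketch is the failure mode: terms outside the overlap cannot be translated, which is exactly why the translation function cannot always be isolated in a mediator.

```python
# Hypothetical mediator: translate a requestor's message terms into the
# service's vocabulary using a fixed mapping table. All terms are made up.
MAPPING = {                      # requestor term -> service term
    "price": "cost",
    "deliveryDate": "shipDate",
    # "giftWrap" has no counterpart in the service ontology
}

def mediate(message):
    """Rewrite a message into the service vocabulary.

    Terms with no mapping are collected rather than silently dropped;
    partial ontology overlap is precisely where mediation breaks down.
    """
    translated, untranslatable = {}, []
    for term, value in message.items():
        if term in MAPPING:
            translated[MAPPING[term]] = value
        else:
            untranslatable.append(term)
    return translated, untranslatable

request = {"price": 20, "deliveryDate": "2024-06-01", "giftWrap": True}
out, leftover = mediate(request)
```

Here `out` carries the translatable part of the request, while `leftover` flags the terms the mediator cannot handle without help from the requestor or the service themselves.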
EUROLAN, which has been held biennially since 1993, is one of the most significant European summer schools in the area of natural language processing. Each of the EUROLAN sessions has focused on an area of timely interest to researchers in the field; this year's EUROLAN involved students in tutorials and hands-on sessions concerned with semantic web technologies as applied to language processing, ontology creation and use, and consideration of the semantic web's potential and limitations. This year's school was organized by the Faculty of Computer Science at the A. I. Cuza University of Iasi, the Research Institute for Artificial Intelligence at the Romanian Academy in Bucharest, and the Department of Computer Science at Vassar College. It was the most successful in its 10-year history, with 119 registered participants from 23 countries. Hosted by the Romanian Academy, the most prestigious cultural and scientific institution in the country, the event was given significant attention in the media.
In numerous distributed environments, including today's World Wide Web, enterprise data management systems, large science projects, and the emerging semantic web, applications will inevitably use the information described by multiple ontologies and schemas. We organized the Workshop on Semantic Integration at the Second International Semantic Web Conference to bring together different communities working on the issues of enabling integration among different resources. The workshop generated a lot of interest and attracted more than 70 participants. Interoperability among applications depends critically on the ability to map between their ontologies and schemas. Semantic integration issues have now become a key bottleneck in the deployment of a wide variety of information management applications.
The emerging Semantic Web focuses on bringing knowledge representation-like capabilities to Web applications in a Web-friendly way. The ability to put knowledge on the Web, share it, and reuse it through standard Web mechanisms provides new and interesting challenges to artificial intelligence. In this paper, I explore the similarities and differences between the Semantic Web and traditional AI knowledge representation systems, and see if I can validate the analogy "The Semantic Web is to KR as the Web is to hypertext." One motivating observation comes from a tutorial on expert systems written by Robert Engelmore with Edward Feigenbaum in 1993: because of the importance of knowledge in expert systems, and because current knowledge acquisition methods are slow and tedious, much of the future of expert systems depends on breaking the knowledge acquisition bottleneck and on codifying and representing a large knowledge infrastructure.
In the past, many knowledge representation systems failed because they were too monolithic and didn't scale well, whereas other systems failed to have an impact because they were small and isolated. Along with this tradeoff in size, there is also a constant tension between the cost involved in building a larger community that can interoperate through common terms and the cost of the lack of interoperability. The semantic web offers a good compromise between these approaches as it achieves wide-scale communication and interoperability using finite effort and cost. The semantic web is a set of standards for knowledge representation and exchange that is aimed at providing interoperability across applications and organizations. We believe that the gathering success of this technology is not derived from the particular choice of syntax or of logic.
Semantic web technologies (Hitzler, Krötzsch, and Rudolph 2010) are meant to deal with these issues, and indeed since the advent of linked data (Bizer, Heath, and Berners-Lee 2009) a few years ago, they have become central to mainstream semantic web research and development. We can easily understand linked data as being a part of the greater big data landscape, as many of the challenges are the same (Hitzler and Janowicz 2013). The linking component of linked data, however, puts an additional focus on the integration and conflation of data across multiple sources. This issue of AI Magazine is a follow-up to that meeting and contains significantly extended, enhanced, and updated contributions. We summarize the articles in the following paragraphs.
In terms of the envisioned methodologies, those catchphrases tend to emphasize the role of predictive analytics, that is, statistical techniques including data mining and machine learning, as well as supercomputing. Interestingly, however, while this perspective takes the availability of data as a given, it does not answer the question of how one would discover the required data in today's chaotic information universe, how one would understand which data sets can be meaningfully integrated, or how to communicate the results to humans and machines alike. In the following, we argue why the data train needs semantic rails. We point out that making sense of data and gaining new insights work best if inductive and deductive techniques go hand in hand instead of competing over the prerogative of interpretation. Equally important, however, are questions of how to publish data effectively and break up data silos, how to retrieve data, how to enable the exploration of unfamiliar data sets from different domains, how to access provenance information, how to determine whether data sets can be meaningfully reused and integrated, how to prevent data from being misunderstood, how to combine data with processing services and workflows on the fly, and finally how to make data readable and understandable by machines and humans.
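The deductive half of the "inductive and deductive techniques hand in hand" argument can be illustrated with a toy reasoner. The sketch below forward-chains a single transitivity rule over asserted triples until no new facts appear; the place names and the `locatedIn` rule are invented for the example, and a real reasoner would of course operate over standardized vocabularies rather than ad hoc strings.

```python
# Toy deductive reasoning: repeatedly apply a transitivity rule for the
# invented "locatedIn" predicate until a fixed point is reached.
facts = {
    ("Louvre", "locatedIn", "Paris"),
    ("Paris", "locatedIn", "France"),
    ("France", "locatedIn", "Europe"),
}

def forward_chain(triples):
    """Compute the closure of the locatedIn-transitivity rule."""
    derived = set(triples)
    while True:
        new = {
            (a, "locatedIn", d)
            for (a, p1, b) in derived if p1 == "locatedIn"
            for (c, p2, d) in derived if p2 == "locatedIn" and b == c
        } - derived
        if not new:          # fixed point: no rule application adds a fact
            return derived
        derived |= new

closure = forward_chain(facts)
# ("Louvre", "locatedIn", "Europe") is now derivable without ever
# having been stated explicitly.
```

This is the kind of inference that complements statistical techniques: facts that were mined or crowdsourced inductively become more useful once deductive rules make their implicit consequences available for discovery and integration.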
The need for access to and interoperability between these repositories is growing. Research groups need to access their own increasingly diverse data collections. As investigations begin to include results from many different experiments, researchers also need to access and utilize other research groups' data repositories in a single discipline or, more interestingly, in multiple disciplines. Also, it is not simply trained scientists who are interested in accessing scientific data; lay people are becoming interested in looking at trends in scientific data as well, for example, when they become engaged in climate discussions. The promise of the true virtual interconnected heterogeneous distributed international data repository is starting to be realized.
The Twenty-First National Conference on Artificial Intelligence (AAAI-06) will be held July 16-20, 2006, at the Seaport Hotel and World Trade Center in Boston, Massachusetts. The Eighteenth Conference on Innovative Applications of Artificial Intelligence will be collocated with AAAI-06, and will be held July 18-20. The special track on AI and the Web invites technical papers on the use of AI techniques, systems, and concepts involving the Web. The program chairs are especially interested in receiving papers in two active research areas: (1) using text and language analysis to interpret and understand natural language text found on the web and (2) developing and exploiting semantic web languages and systems that explicitly encode knowledge using languages such as RDF and OWL. Papers submitted to this track should report on theoretical or empirical studies that highlight the role of integration of multiple components in achieving intelligent behavior.