Collection
Multimodal Tree Decoder for Table of Contents Extraction in Document Images
Hu, Pengfei, Zhang, Zhenrong, Zhang, Jianshu, Du, Jun, Wu, Jiajia
Table of contents (ToC) extraction aims to extract headings of different levels from documents in order to capture the outline of their contents, which is widely useful for document understanding and information retrieval. Existing works often use hand-crafted features and predefined rule-based functions to detect headings and resolve the hierarchical relationships between them. Both benchmarks and deep-learning-based research remain limited. Accordingly, in this paper, we first introduce a standard dataset, HierDoc, which includes image samples from 650 scientific papers together with their content labels. We then propose a novel end-to-end model for ToC extraction, the multimodal tree decoder (MTD), as a benchmark for HierDoc. The MTD model is mainly composed of three parts: an encoder, a classifier, and a decoder. The encoder fuses the multimodal features of vision, text, and layout information for each entity of the document. The classifier then recognizes and selects the heading entities. Next, a tree-structured decoder parses the hierarchical relationships between the heading entities. To evaluate performance, both tree-edit-distance similarity (TEDS) and F1-Measure are adopted. Our MTD approach achieves an average TEDS of 87.2% and an average F1-Measure of 88.1% on the test set of HierDoc. The code and dataset will be released at: https://github.com/Pengfei-Hu/MTD.
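The two evaluation metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the commonly used definition TEDS = 1 - d / max(|Ta|, |Tb|), where d is a tree edit distance computed elsewhere, and the standard F1 from true-positive, false-positive, and false-negative counts. The function names and parameters are hypothetical.

```python
def teds(edit_distance: float, size_a: int, size_b: int) -> float:
    """Tree-edit-distance similarity, assuming the edit distance is already known.

    size_a and size_b are the node counts of the predicted and reference trees.
    """
    largest = max(size_a, size_b)
    if largest == 0:
        # Two empty trees are treated as identical (an assumption of this sketch).
        return 1.0
    return 1.0 - edit_distance / largest


def f1_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall over detected headings."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

For example, a predicted heading tree of 8 nodes that is 2 edits away from a 10-node reference tree would score 1 - 2/10 under this definition.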
Conversations That Matter: Working with artificial intelligence
"There is no shortage of commentary on what artificial intelligence will do to human jobs. It's easy to find a multiplicity of predictions, prescriptions, or denunciations," says Thomas H. Davenport, one of the co-authors of the book. "It is not so easy, however, to find descriptions of how people work day-to-day with smart machines." Davenport joined a Conversation That Matters about our emerging and ever-expanding relationship with a technology that scares a wide range of people, including Elon Musk and Bill Gates.
Lifelong and Continual Learning Dialogue Systems
Dialogue systems, commonly known as chatbots, have become increasingly popular in recent years because of their widespread use in chit-chat conversations with users and in task-oriented dialogues that accomplish various user tasks. Existing chatbots are usually trained on pre-collected, manually labeled data and/or written with handcrafted rules. Many also use manually compiled knowledge bases (KBs). Their ability to understand natural language is still limited, and they tend to produce many errors, resulting in poor user satisfaction. Typically, engineers must constantly improve them with more labeled data and more manually compiled knowledge. This book introduces the new paradigm of lifelong learning dialogue systems, which endows chatbots with the ability to learn continually by themselves, through self-initiated interactions with their users and working environments. As the systems chat more with users or learn more from external sources, they become more knowledgeable and better at conversing. The book presents the latest developments and techniques for building such continual learning dialogue systems, which continuously learn new language expressions and lexical and factual knowledge from users during conversation and from external sources outside of conversation, acquire new training examples during conversation, and learn conversational skills. Apart from these general topics, existing works on continual learning of some specific aspects of dialogue systems are also surveyed. The book concludes with a discussion of open challenges for future research.
AI
Artificial intelligence (AI) is having a major impact on healthcare. While advances in the sharing and analysis of medical data result in better and earlier diagnoses and more patient-tailored treatments, data management is also affected by trends such as increased patient-centricity (with shared decision making), self-care (e.g., using wearables), and integrated care delivery. The way in which health services are delivered is being revolutionized through the sharing and integration of health data across organizational boundaries. Via AI, researchers can provide new approaches to merge, analyze, and process complex data and gain more actionable insights, understanding, and knowledge at an individual and population level. This Special Issue focuses on how AI is used in healthcare, and on related topics such as data management, data integration, data sharing, patient privacy and bioethical issues.
eBook: Intuitive Machine Learning and Explainable AI - Machine Learning Techniques
By Vincent Granville, Ph.D. Published in September 2022. This book covers the foundations of machine learning, with modern approaches to solving complex problems. Emphasis is on scalability, automation, testing, optimization, and interpretability (explainable AI). For instance, regression techniques, including logistic and Lasso, are presented as a single method, without using advanced linear algebra. There is no need to learn 50 versions when one does it all and more.
The 35th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems – Conference Report
IEA/AIE 2022, the 35th edition of the International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, was hosted in hybrid mode from July 19th to July 22nd, 2022, in Kitakyushu, Japan. IEA/AIE is an annual conference dedicated to advances in the theory and applications of artificial intelligence; it started in 1988 and has been hosted in over twenty countries. IEA/AIE 2022 was organized in cooperation with the Association for the Advancement of Artificial Intelligence (AAAI) and the ACM Special Interest Group on Artificial Intelligence (SIGAI), and received the support of Springer, the International Society of Applied Intelligence (ISAI), Kitakyushu city, Universiti Teknologi Malaysia, i-SOMET Inc., and several other international organizations. The conference had a main track and five special sessions on emerging topics in applied intelligence: Spatiotemporal Big Data Analytics (SBDA 2022), Intelligent Systems and e-Applications (ISeA 2022), Collective Intelligence in Social Media (CISM 2022), Multi-Agent Systems and Metaheuristics for Complex Problems (MASMCP 2022), and Intelligent Knowledge Engineering in Decision Making Systems (IKEDS 2022). All submissions were peer-reviewed by at least three reviewers following a double-blind process.
Group Recommender Systems: An Introduction (SpringerBriefs in Electrical and Computer Engineering), by Alexander Felfernig, Ludovico Boratto, Martin Stettinger, and Marko Tkalčič (ISBN 9783319750668)
Alexander Felfernig has been a full professor at the Graz University of Technology (Austria) since March 2009 and received his PhD in Computer Science from the University of Klagenfurt. He directs the Applied Software Engineering (ASE) research group. His research interests include configuration systems, recommender systems, model-based diagnosis, software requirements engineering, different aspects of human decision making, and knowledge acquisition methods. In these areas, he is engaged in national research projects as well as in several European Union projects. Alexander Felfernig has published numerous papers in renowned international conferences and journals (e.g., AI Magazine, Artificial Intelligence, IEEE Transactions on Engineering Management, IEEE Intelligent Systems, Journal of Electronic Commerce) and is a co-author of the book "Recommender Systems" published by Cambridge University Press.
Computer Vision - Richard Szeliski
As humans, we perceive the three-dimensional structure of the world around us with apparent ease. Think of how vivid the three-dimensional percept is when you look at a vase of flowers sitting on the table next to you. You can tell the shape and translucency of each petal through the subtle patterns of light and shading that play across its surface and effortlessly segment each flower from the background of the scene (Figure 1.1). Looking at a framed group portrait, you can easily count (and name) all of the people in the picture and even guess at their emotions from their facial appearance. Perceptual psychologists have spent decades trying to understand how the visual system works and, even though they can devise optical illusions to tease apart some of its principles (Figure 1.3), a complete solution to this puzzle remains elusive (Marr 1982; Palmer 1999; Livingstone 2008).