SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation
Jiang, Junfeng, Dong, Chengzhang, Kurohashi, Sadao, Aizawa, Akiko
Dialogue segmentation is a crucial task for dialogue systems, allowing a better understanding of conversational texts. Despite recent progress in unsupervised dialogue segmentation methods, their performance is limited by the lack of explicit supervision signals for training. Furthermore, precisely defining segmentation points in conversations remains a challenging problem, which increases the difficulty of collecting manual annotations. In this paper, we provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues and release a large-scale supervised dataset called SuperDialseg, containing 9,478 dialogues based on two prevalent document-grounded dialogue corpora and inheriting their useful dialogue-related annotations. Moreover, we provide a benchmark of 18 models across five categories for the dialogue segmentation task, together with several appropriate evaluation metrics. Empirical studies show that supervised learning is extremely effective on in-domain datasets and that models trained on SuperDialseg achieve good generalization on out-of-domain data. Additionally, we conducted human verification on the test set, and the Kappa score confirmed the quality of our automatically constructed dataset. We believe our work is an important step forward in the field of dialogue segmentation. Our code and data are available at: https://github.com/Coldog2333/SuperDialseg.
- North America > Dominican Republic (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- (14 more...)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.67)
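The abstract mentions evaluating dialogue segmentation with "proper evaluation metrics"; a standard choice for linear segmentation is the P_k error metric. A minimal sketch (conventions, such as the default window size, are assumptions and may differ from the benchmark's exact implementation):

```python
def p_k(reference, hypothesis, k=None):
    """Sketch of the P_k error metric for linear segmentation.

    `reference` and `hypothesis` are lists of boundary labels, one per
    utterance: 1 if a segment ends at that position, else 0.  A window
    of size k slides over the sequence and we count disagreements about
    whether its two endpoints fall in the same segment.
    """
    n = len(reference)
    if k is None:
        # Conventional choice: half the mean reference segment length.
        k = max(1, round(n / (2 * max(1, sum(reference)))))
    errors = 0
    for i in range(n - k):
        # Endpoints are in the same segment iff no boundary lies between them.
        ref_same = sum(reference[i:i + k]) == 0
        hyp_same = sum(hypothesis[i:i + k]) == 0
        errors += ref_same != hyp_same
    return errors / (n - k)
```

Lower is better: a hypothesis identical to the reference scores 0, while near-miss boundaries are penalized less harshly than under exact boundary matching.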
Take a Break in the Middle: Investigating Subgoals towards Hierarchical Script Generation
Li, Xinze, Cao, Yixin, Chen, Muhao, Sun, Aixin
Goal-oriented script generation is a new task of generating a list of steps that fulfill a given goal. In this paper, we propose to extend the task from the perspective of cognitive theory. Instead of a simple flat structure, the steps are typically organized hierarchically: humans often decompose a complex task into subgoals, each of which can be further decomposed into steps. To establish the benchmark, we contribute a new dataset, propose several baseline methods, and set up evaluation metrics. Both automatic and human evaluation verify the high quality of the dataset, as well as the effectiveness of incorporating subgoals into hierarchical script generation. Furthermore, we also design and evaluate a model to discover subgoals, and find that decomposing goals is somewhat more difficult than summarizing from segmented steps.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Singapore (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (10 more...)
- Research Report (1.00)
- Workflow (0.88)
- Consumer Products & Services (0.93)
- Health & Medicine > Consumer Health (0.67)
- Education (0.67)
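The goal → subgoal → step hierarchy described in the abstract can be captured with a small data structure. A hypothetical sketch (the class names and the cooking example are invented for illustration, not taken from the paper):

```python
from dataclasses import dataclass, field


@dataclass
class Subgoal:
    title: str
    steps: list = field(default_factory=list)


@dataclass
class Script:
    goal: str
    subgoals: list = field(default_factory=list)

    def flatten(self):
        """Linearize the hierarchy back into a flat step list."""
        return [step for sg in self.subgoals for step in sg.steps]


# Hypothetical example: a complex goal decomposed into two subgoals.
script = Script(
    goal="Make pasta",
    subgoals=[
        Subgoal("Prepare ingredients", ["Boil water", "Salt the water"]),
        Subgoal("Cook", ["Add pasta", "Drain after 10 minutes"]),
    ],
)
```

The `flatten` method recovers the flat script of earlier formulations, which makes it easy to compare flat and hierarchical generation on the same data.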
Are uGLAD? Time will tell!
Imani, Shima, Shrivastava, Harsh
We frequently encounter multiple time series that are temporally correlated in our surroundings, such as EEG data used to examine alterations in brain activity or sensors that monitor body movements. Segmentation of multivariate time series data is a technique for identifying meaningful patterns or changes in the time series that can signal a shift in the system's behavior. However, most segmentation algorithms have been designed primarily for univariate time series, and their performance on multivariate data remains largely unsatisfactory, making this a challenging problem. In this work, we introduce a novel approach for multivariate time series segmentation using conditional independence (CI) graphs. CI graphs are probabilistic graphical models that represent the partial correlations between the nodes. We propose a domain-agnostic multivariate segmentation framework `$\texttt{tGLAD}$' which draws a parallel between the CI graph nodes and the variables of the time series. Applying a graph recovery model $\texttt{uGLAD}$ to a short interval of the time series yields a CI graph that shows the partial correlations among the variables. We extend this idea to the entire time series by using a sliding window to create a batch of time intervals and then running a single $\texttt{uGLAD}$ model in multitask learning mode to recover all the CI graphs simultaneously. As a result, we obtain a corresponding temporal CI graph representation. We then design first-order and second-order trajectory-tracking algorithms to study the evolution of these graphs across distinct intervals. Finally, an `Allocation' algorithm is used to determine a suitable segmentation of the temporal graph sequence. $\texttt{tGLAD}$ provides a competitive time complexity of $O(N)$ for settings where the number of variables $D<
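The sliding-window pipeline in the abstract (per-window graph recovery, then first-order trajectory tracking) can be illustrated end to end. A simplified sketch, assuming plain Pearson correlation graphs in place of uGLAD's recovered CI graphs and a quantile threshold in place of the paper's `Allocation` algorithm:

```python
import numpy as np


def segment_multivariate(ts, window=50, stride=25, quantile=0.9):
    """Sketch of a tGLAD-style segmentation pipeline.

    ts: array of shape (T, D).  For each sliding window we build a (D, D)
    correlation graph (a stand-in for uGLAD's CI graph), track the
    first-order change between consecutive graphs, and flag window
    boundaries where that change is unusually large.
    Returns boundary indices in window units.
    """
    T, D = ts.shape
    graphs = []
    for start in range(0, T - window + 1, stride):
        chunk = ts[start:start + window]
        graphs.append(np.corrcoef(chunk.T))  # one (D, D) graph per window
    graphs = np.array(graphs)
    # First-order trajectory: Frobenius distance between consecutive graphs.
    d1 = np.linalg.norm(np.diff(graphs, axis=0), axis=(1, 2))
    # Flag changes above a high quantile as segmentation boundaries.
    return np.where(d1 > np.quantile(d1, quantile))[0] + 1
```

On data whose inter-variable correlation structure changes partway through, the largest graph-to-graph jumps cluster around the regime change, which is exactly the signal the trajectory-tracking step exploits.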
Visual Subtitle Feature Enhanced Video Outline Generation
Lv, Qi, Cao, Ziqiang, Xie, Wenrui, Wang, Derui, Wang, Jingwen, Hu, Zhiwei, Zhang, Tangkun, Yuan, Ba, Li, Yuanhang, Cao, Min, Li, Wenjie, Li, Sujian, Fu, Guohong
With the tremendously increasing number of videos, there is great demand for techniques that help people quickly navigate to the video segments they are interested in. However, current work on video understanding mainly focuses on video content summarization, while little effort has been made to explore the structure of a video. Inspired by textual outline generation, we introduce a novel video understanding task, namely video outline generation (VOG). The task is defined to contain two sub-tasks: (1) first segmenting the video according to its content structure and then (2) generating a heading for each segment. To learn and evaluate VOG, we annotate a 10k+ dataset, called DuVOG. Specifically, we use OCR tools to recognize the subtitles of videos. Annotators are then asked to divide the subtitles into chapters and title each chapter. In videos, highlighted text tends to be the headline, since it is more likely to attract attention. We therefore propose a Visual Subtitle feature Enhanced video outline generation model (VSENet), which takes as input the textual subtitles together with their visual font sizes and positions. We treat the VOG task as a sequence tagging problem that extracts the spans where the headings are located and then rewrites them to form the final outlines. Furthermore, based on the similarity between video outlines and textual outlines, we use a large number of articles with chapter headings to pretrain our model. Experiments on DuVOG show that our model largely outperforms other baseline methods, achieving an F1-score of 77.1 at the video segmentation level and a ROUGE-L F0.5 score of 85.0 at the headline generation level.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Beijing > Beijing (0.04)
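The sequence-tagging formulation above reduces heading extraction to decoding tagged spans from a token sequence. A minimal BIO-style decoder as a sketch (the tag scheme is a common convention; the paper's exact label set is not specified here):

```python
def extract_spans(tags):
    """Decode (start, end) spans from a BIO tag sequence.

    `tags` is a per-token list of "B" (begin span), "I" (inside), or
    "O" (outside).  `end` is exclusive, so tokens[start:end] is the span
    that would later be rewritten into a heading.
    """
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:       # close the previous span first
                spans.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:               # span running to the sequence end
        spans.append((start, len(tags)))
    return spans
```

In the VOG setting, each extracted span marks a chapter boundary region, so segmentation and heading extraction fall out of the same tag sequence.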
A streaming feature-based compression method for data from instrumented infrastructure
Gregory, Alastair, Lau, Din-Houn, Tessier, Alex, Zhang, Pan
An increasing number of civil engineering applications utilise data acquired from infrastructure instrumented with sensing devices. These data play an important role in monitoring the response of such structures to excitation and in evaluating structural health. In this paper we seek to monitor pedestrian events (such as a person walking) on a footbridge using strain and acceleration data. The rate of data acquisition and the number of sensing devices make the storage and analysis of these data a computational challenge. We introduce a streaming method to compress the sensor data whilst preserving the key patterns and features (unique to the different sensor types) corresponding to pedestrian events. Numerical demonstrations of the methodology on data obtained from strain sensors and accelerometers on the pedestrian footbridge show the trade-off between compression and accuracy during and between periods of pedestrian events.
- North America > United States > Texas (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
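The idea of compressing a stream while preserving event features can be illustrated with a simple one-pass filter. This is only a sketch of the general technique, not the paper's method: it keeps samples that deviate strongly from a running baseline (an event proxy) plus a periodic sample so quiet periods remain reconstructible; the threshold and smoothing factor are invented parameters:

```python
def compress_stream(samples, threshold=3.0, keep_every=100):
    """One-pass, feature-preserving compression of a sensor stream.

    Keeps a sample when it deviates from an exponentially weighted
    running baseline by more than `threshold` running standard
    deviations, plus one sample every `keep_every` readings.
    Returns a list of (index, value) pairs so timing is preserved.
    """
    mean, var, kept = 0.0, 1.0, []
    alpha = 0.01  # smoothing factor for the running baseline
    for i, x in enumerate(samples):
        std = var ** 0.5
        if abs(x - mean) > threshold * std or i % keep_every == 0:
            kept.append((i, x))
        # Update the running mean and variance after the decision.
        mean = (1 - alpha) * mean + alpha * x
        var = (1 - alpha) * var + alpha * (x - mean) ** 2
    return kept
```

The trade-off mentioned in the abstract appears directly in the parameters: a lower `threshold` or smaller `keep_every` retains more fidelity at the cost of a weaker compression ratio.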
Segmentation of Offline Handwritten Bengali Script
Basu, Subhadip, Chaudhuri, Chitrita, Kundu, Mahantapas, Nasipuri, Mita, Basu, Dipak K.
Character segmentation is one of the most important decision processes in optical character recognition (OCR). Isolating individual alphabetic characters in the script image often makes a decisive contribution to the success rate of the overall system. An OCR system may be designed for either online or off-line use. Online OCR systems collect input data by recording the order of strokes made by the writer on an electronic bit-pad, while off-line OCR systems do so by recording a pixel-by-pixel digital image of the entire writing with a digital scanner. OCR has a wide field of application, covering handwritten document transcription, automatic mail address recognition, machine processing of bank checks, faxes, etc. Off-line OCR of handwritten words has long been an active area of research. Some important contributions so far made in this field involve analyses of English texts [1], [2], [3], [5], Chinese script [6] and Arabic characters [9]. Against this background of research, the present work considers Bengali script and develops suitable techniques for off-line OCR of it.
- Asia > India > West Bengal > Kolkata (0.06)
- Asia > Bangladesh (0.04)
Recognition-based Segmentation of On-Line Hand-printed Words
Schenkel, M., Weissman, H., Guyon, I., Nohl, C., Henderson, D.
The input strings consist of a time-ordered sequence of X-Y coordinates, punctuated by pen-lifts. The methods were designed to work in "run-on mode" where there is no constraint on the spacing between characters. While both methods use a neural network recognition engine and a graph-algorithmic post-processor, their approaches to segmentation are quite different. The first method, which we call INSEC (for input segmentation), uses a combination of heuristics to identify particular pen-lifts as tentative segmentation points. The second method, which we call OUTSEC (for output segmentation), relies on the empirically trained recognition engine for both recognizing characters and identifying relevant segmentation points. 1 INTRODUCTION We address the problem of writer-independent recognition of hand-printed words from an 80,000-word English dictionary. Several levels of difficulty in the recognition of hand-printed words are illustrated in figure 1. The examples were extracted from our databases (table 1). Except in the cases of boxed or clearly spaced characters, segmenting characters independently of the recognition process yields poor recognition performance. This has motivated us to explore recognition-based segmentation techniques.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
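The INSEC idea of flagging certain pen-lifts as tentative segmentation points can be sketched with one simple spatial heuristic. The gap rule and its parameters below are assumptions for illustration, not the paper's actual heuristic combination:

```python
def tentative_cuts(strokes, gap_ratio=0.5):
    """Flag pen-lifts as tentative segmentation points (INSEC-style sketch).

    Each stroke is a list of (x, y) points between two pen-lifts.  The
    pen-lift after stroke i is flagged when the horizontal gap to the
    next stroke exceeds `gap_ratio` times the median stroke width, which
    tolerates run-on writing with irregular character spacing.
    """
    widths = sorted(max(x for x, _ in s) - min(x for x, _ in s) for s in strokes)
    median_w = widths[len(widths) // 2] or 1.0  # guard against zero width
    cuts = []
    for i in range(len(strokes) - 1):
        right = max(x for x, _ in strokes[i])
        next_left = min(x for x, _ in strokes[i + 1])
        if next_left - right > gap_ratio * median_w:
            cuts.append(i)  # tentative cut between stroke i and i + 1
    return cuts
```

Because the cuts are only tentative, a recognition engine and post-processor downstream can still accept or reject each candidate, which is what distinguishes this from hard pre-segmentation.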