Visual-Textual Video Synopsis Generation


Title: Visual-Textual Video Synopsis Generation.
Name(s): Sharghi Karganroodi, Aidean, Author
Shah, Mubarak, Committee Chair
Da Vitoria Lobo, Niels, Committee Co-Chair
Rahnavard, Nazanin, Committee Member
Atia, George, Committee Member
University of Central Florida, Degree Grantor
Type of Resource: text
Date Issued: 2019
Publisher: University of Central Florida
Language(s): English
Abstract/Description: In this dissertation, we tackle the problem of automatic video summarization. Automatic summarization techniques enable faster browsing and indexing of large video databases. However, due to the inherent subjectivity of the task, no single video summarizer fits all users unless it adapts to each user's individual needs. To address this issue, we introduce a fresh view on the task called "query-focused" extractive video summarization. We develop a supervised model that takes as input a video and a user's preference in the form of a query, and creates a summary video by selecting key shots from the original video. We model the problem as subset selection via a determinantal point process (DPP), a stochastic point process that assigns a probability to every subset of a given ground set (a standard formulation is sketched in the note after this record). Next, we develop a second model that exploits the capabilities of memory networks within this framework and concomitantly reduces the level of supervision required to train it. To automatically evaluate system summaries, we contend that a good metric for video summarization should focus on the semantic information that humans can perceive rather than on visual features or temporal overlaps. To this end, we collect dense per-video-shot concept annotations, compile a new dataset, and propose an efficient evaluation method defined over those annotations. To enable better summarization of videos, we improve the sequential DPP (SeqDPP) in two respects. In terms of learning, we propose a large-margin algorithm to address the exposure bias common to many sequence-to-sequence learning methods. In terms of modeling, we integrate a new probabilistic distribution into SeqDPP; the resulting model accepts user input about the expected length of the summary. We conclude this dissertation by developing a framework that generates a textual synopsis for a video, thus enabling users to quickly browse a large video database without watching the videos.
Identifier: CFE0007862 (IID), ucf:52756 (fedora)
Note(s): 2019-12-01
Ph.D.
Engineering and Computer Science
Doctoral
This record was generated from author-submitted information.
Subject(s): Video summarization -- Query-focused summarization -- Text synopsis generation
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0007862
Restrictions on Access: public 2019-12-15
Host Institution: UCF
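
Note on the model class: the determinantal point process named in the abstract assigns each subset of a ground set a probability proportional to a kernel determinant. The formulation below is the standard one from the DPP literature, not taken from this record; the symbols L, I, S, and Y are conventional notation.

\[
P(Y = S) = \frac{\det(L_S)}{\det(L + I)}, \qquad S \subseteq \mathcal{Y},
\]

where L is a positive semidefinite kernel matrix over the ground set \(\mathcal{Y}\) (here, the video shots), \(L_S\) is the principal submatrix indexed by S, and I is the identity matrix. Subsets of mutually similar items yield smaller determinants, so the DPP naturally favors diverse selections. The sequential variant (SeqDPP) partitions the video into consecutive segments and factorizes the selection over time, conditioning each segment's DPP on the shots chosen in the previous segment:

\[
P(Y_1, \ldots, Y_T) = P(Y_1) \prod_{t=2}^{T} P(Y_t \mid Y_{t-1}),
\]

which preserves diversity locally across segment boundaries while keeping inference tractable on long videos.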
