
Smart Surveillance for Smart City

EasyChair Preprint 3519, version 1

6 pages
Date: May 30, 2020

Abstract

In recent years, video surveillance technology has become pervasive in every sphere. Manually generating descriptions of videos requires considerable time and labor, and important aspects of a video are sometimes overlooked in human-written summaries. The present work is an attempt at automated description generation for surveillance video.
The proposed method consists of extracting key-frames from a surveillance video, detecting objects in the key-frames, generating natural language (English) descriptions of the key-frames, and finally summarizing those descriptions.
The key-frames are identified based on a mean square error ratio. Object detection in a key-frame is performed using a Region-based Convolutional Neural Network (R-CNN). We used a Long Short-Term Memory (LSTM) network to generate captions from the key-frames. Translation Error Rate (TER) is used to identify and remove duplicate event descriptions.
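The abstract does not give the exact key-frame selection criterion, so the following is a minimal sketch of one common interpretation: compute the mean squared error between each frame and the last selected key-frame, and keep frames whose MSE exceeds a threshold. The `threshold` value here is an illustrative assumption, not a figure from the paper.

```python
import numpy as np

def mse(frame_a, frame_b):
    """Mean squared error between two equally sized grayscale frames."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    return float(np.mean(diff ** 2))

def select_key_frames(frames, threshold=500.0):
    """Keep the first frame, then every frame whose MSE to the last
    selected key-frame exceeds the threshold (a scene-change proxy).
    The threshold is an assumed tuning parameter."""
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        if mse(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys
```

In practice the threshold would be tuned per camera, since lighting noise alone can produce a nonzero MSE between otherwise identical frames.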
Tf-idf is used to rank the event descriptions generated from a video, and the top-ranked description is returned as the system-generated summary of the video.
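The abstract does not specify the exact tf-idf weighting scheme, so the sketch below shows one plausible reading: treat each event description as a document, score it by the sum of its terms' tf-idf weights, and return the descriptions in rank order. The tokenization and scoring details here are assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_rank(descriptions):
    """Rank event descriptions by the sum of tf-idf weights of their terms,
    treating each description as one document. Returns indices into
    `descriptions`, highest-scoring first."""
    docs = [d.lower().split() for d in descriptions]
    n = len(docs)
    # Document frequency: in how many descriptions each word appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        # Words shared by all descriptions get idf = log(1) = 0,
        # so distinctive descriptions rank higher.
        score = sum((cnt / len(doc)) * math.log(n / df[w])
                    for w, cnt in tf.items())
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)
```

Under this scheme, a description containing rare event words outranks descriptions built from words common to the whole video.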
We evaluated our proposed approach on the MSVD dataset, and the system produces a Bilingual Evaluation Understudy (BLEU) score of 46.83.
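The TER-based duplicate removal mentioned above can be sketched as follows. This is a simplified TER (word-level edit distance without the shift operation of full TER, divided by reference length); the deduplication threshold is an assumed parameter, not a value from the paper.

```python
def ter(hypothesis, reference):
    """Simplified Translation Error Rate: word-level edit distance
    (insertions, deletions, substitutions; shifts omitted) divided
    by the reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / max(n, 1)

def deduplicate(descriptions, ter_threshold=0.3):
    """Drop any description whose TER to an already kept description is
    below the threshold, i.e. the two are nearly identical."""
    kept = []
    for desc in descriptions:
        if all(ter(desc, k) >= ter_threshold for k in kept):
            kept.append(desc)
    return kept
```

A TER of 0 means the two descriptions are word-for-word identical; near-zero values flag near-duplicate event descriptions for removal.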

Keyphrases: Content-Based Video Retrieval, Image Frame, Smart City, Smart Surveillance, Key Frame, Key Frame Extraction, Microsoft Video Description, Object Detection, Pattern Recognition, Real-Time Object Detection, Video Description Corpus, Video Summarization

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:3519,
  author    = {Atanu Mandal and Sudip Kumar Naskar},
  title     = {Smart Surveillance for Smart City},
  howpublished = {EasyChair Preprint 3519},
  year      = {2020}}