Multimedia Technologies

Recent advances in multimedia technologies allow video data to be captured and stored on relatively inexpensive computers. The information highways have made a huge amount of video data publicly available. However, without appropriate search techniques, this data is hardly usable. Users want to query by content rather than by the text associated with the video data. Until now, videos have mostly been retrieved from very large collections by keyword: given a text query, the system returns a list of approximately relevant videos by matching the input text against the text documents associated with the video shots. Users are usually interested only in the top-ranked portion of the returned results, so search engines must be accurate at the top of the ranking. Here, a content-based video searching, re-ranking and retrieval framework is proposed to improve the efficiency and accuracy of search engines. Video retrieval using query by image has not been successful because it does not place the most relevant results at the top. In the proposed method, the input is a video clip, and high-quality content-based video retrieval is achieved by discovering the temporal patterns in the video contents. On the basis of the discovered temporal patterns, an efficient indexing technique and an effective sequence matching technique are integrated to reduce the computation cost and to raise the retrieval accuracy, respectively. The method enhances content-based video retrieval in terms of both efficiency and effectiveness.
Keywords: Content-based video retrieval, Fast Pattern Index tree, Indexing, Index Tree, Re-ranking, Video Segmentation

Recently, multimedia data has been growing very rapidly and requires a large amount of storage. Retrieving a particular video from a large store is a very difficult task, whatever type of query input is used. Manual video analysis, indexing and searching become difficult because the databases are unstructured. Recently developed content-based video retrieval systems offer advanced capabilities for searching videos by color, layout, texture, motion, shape and edge features. To help users find and retrieve relevant information effectively and to facilitate new and better ways of entertainment, advanced technologies need to be developed for indexing, browsing, filtering, searching and updating the vast amount of information available in video [2].
To free video retrieval from the limitations of text-based retrieval, content-based video retrieval (CBVR) has long attracted researchers' attention. Without having to identify suitable query terms, users can obtain the videos they want by submitting a video clip they are interested in.
An innovative method can achieve high-quality content-based video retrieval by mining the temporal patterns in the video contents. It has two main components: a pattern-based index for efficient retrieval, namely the fast-pattern-index tree, and a search strategy for effective retrieval, namely pattern-based search.
Near-duplicate videos may be uploaded many times by many different users, so the efficient identification of near-duplicate videos on the web is an important issue for video management. Watching a large number of videos to grasp the important information quickly is a big challenge, and the evolution of an entire event is not directly observable by simply watching these videos. Content-based video retrieval has a wide range of applications, such as customer domain applications and quick databases. Most existing content-based video retrieval systems focus on video analysis, visual feature extraction and support for query by text or query by image. Only a few CBVR systems disclose their underlying video database indexing structure.
Conventional query-by-text retrieval cannot satisfy users' requirements for finding the desired videos effectively; hence, content-based video retrieval is regarded as one of the most practical solutions for improving retrieval quality and effectiveness. Video retrieval using query by image is not successful in associating videos with the user's interest either.
Knowledge discovery from huge amounts of multimedia data is called multimedia mining. For multimedia mining, compound and complex multimedia data are usually organized into multimedia repositories by conceptualizing techniques known as classification and annotation. This organization makes applications such as browsing of video folders, remote instruction, digital museums, news event analysis, video surveillance and education easier.
Search engines have been widely used as the platform for knowledge discovery from the web. The most natural way to facilitate text-based video retrieval is manual annotation. However, manual annotation is expensive because of the massive amount of video content. A considerable number of past studies have addressed automated semantic annotation of videos using techniques such as decision trees, hidden Markov models (HMM), k-nearest neighbors (KNN), association mining and support vector machines (SVM). Through automatically generated descriptions of videos, the user's interest and the videos can be associated semantically [2].
Video content analysis involves analyzing video data and storing it according to its content. It is the capability of automatically analyzing video to detect and determine temporal events that span several images. Video analysis includes:
1) Background Subtraction: Detect foreground regions, such as objects, and determine whether people are present in the image, using a statistical model of the scene seen by the camera. Foreground and background are separated by comparing consecutive key frames.
2) Object Tracking: Track each foreground region and handle occlusions, merging and splitting of regions, either automatically or with manual effort.
3) Event Reasoning: Determine which regions represent people and which represent stationary objects, and differentiate the behavior of the two.
4) Graphical User Interface: Display the video input and the event diagnosis produced by the system.
5) Indexing and Retrieval: Divide the video clip into multiple shots, which is called shot segmentation, and retrieve the most relevant video sequences based on user queries.
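Background subtraction, the first analysis step listed above, can be illustrated with a minimal sketch. It assumes gray-level frames stored as nested lists and uses a running average as a very simple statistical background model; the function names, the blending factor `alpha` and the threshold are hypothetical choices for this example and are not taken from the surveyed systems.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into the running-average background model."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark a pixel as foreground (1) when it differs strongly from the background."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 2x3 scene: a bright object appears in the middle column.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [12, 210, 10]]

mask = foreground_mask(background, frame)
updated = update_background(background, frame)
```

The mask isolates the middle column as foreground, while the slowly updated background absorbs gradual changes such as lighting drift.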
Video segmentation divides the video into small units and includes shot boundary detection, key frame extraction, scene segmentation and audio extraction.
Video segmentation begins with shot boundary detection, which divides the whole video into a number of temporal segments called shots. A shot may be defined as a continuous sequence of frames generated by a single non-stop camera operation. Methods for shot boundary detection usually first extract visual features from each frame, then measure similarities between frames using the extracted features, and finally detect shot boundaries between frames that are dissimilar. The techniques for video segmentation are as follows:
Shot Boundary Detection
Shot boundary detection approaches are classified into two types: 1) the threshold-based approach detects shot boundaries by comparing the difference between frames with a predefined threshold; 2) the statistical learning-based approach treats shot boundary detection as a classification task in which frames are classified as shot changes or not.
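As a rough illustration of the threshold-based approach, the sketch below compares the gray-level histograms of consecutive frames and reports a shot boundary wherever the difference exceeds a threshold. The frame representation (a non-empty flat list of 0-255 gray values), the function names, the bin count and the threshold value are assumptions made for this example.

```python
def gray_histogram(frame, bins=8):
    """Normalized histogram of a frame given as a flat list of 0-255 gray values."""
    hist = [0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    total = len(frame)
    return [count / total for count in hist]


def histogram_difference(h1, h2):
    """L1 distance between two normalized histograms, scaled into [0, 1]."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2


def detect_shot_boundaries(frames, threshold=0.5):
    """Return every index i at which frame i starts a new shot."""
    histograms = [gray_histogram(f) for f in frames]
    boundaries = []
    for i in range(1, len(histograms)):
        if histogram_difference(histograms[i - 1], histograms[i]) > threshold:
            boundaries.append(i)
    return boundaries


# Toy video: three dark frames followed by three bright frames.
dark = [10, 20, 30, 15]
bright = [200, 220, 240, 210]
frames = [dark, dark, dark, bright, bright, bright]
boundaries = detect_shot_boundaries(frames)
```

A fixed threshold works for abrupt cuts like this one; gradual transitions such as fades usually need an adaptive threshold or the learning-based approach.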

Key Frame Extraction
The features used for key frame extraction include color histograms, edges, shapes, texture and optical flow. Current approaches to key frame extraction are classified into three categories: sequential comparison-based, global comparison-based and reference frame-based.
In the sequential comparison-based approach, each frame is compared in turn with the previously extracted key frame until a frame that is very different from that key frame is found; this frame becomes the next key frame. A color histogram is used to measure the difference between the current frame and the previous key frame.
The global comparison-based approach uses the global differences between the frames in a shot and distributes key frames by minimizing a predefined objective function.
The reference frame-based approach generates a reference frame and then extracts key frames by comparing the frames in the shot with the reference frame [1].
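The sequential comparison-based approach can be sketched as follows. The example assumes that each frame of a shot is already represented by a normalized color histogram, that the first frame of the shot is taken as the first key frame, and that an L1 distance with a fixed threshold decides when a frame is "very different"; these choices and the function names are illustrative only.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def sequential_key_frames(histograms, threshold=0.5):
    """Return the indices of the key frames of one shot."""
    if not histograms:
        return []
    key_indices = [0]            # assumption: the first frame is a key frame
    last_key = histograms[0]
    for i in range(1, len(histograms)):
        # Compare the current frame with the most recent key frame only.
        if l1_distance(last_key, histograms[i]) > threshold:
            key_indices.append(i)
            last_key = histograms[i]
    return key_indices


# Toy shot: five frames described by 3-bin color histograms.
shot = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
    [0.1, 0.2, 0.7],
]
key_frames = sequential_key_frames(shot)
```

Because every frame is compared only with the latest key frame, the method needs a single pass over the shot, but the number of key frames it produces depends strongly on the threshold.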
Feature extraction takes the output of video segmentation, that is, the segmented video clips, and extracts features from it. It is a time-consuming task in CBVR. Key frame features are classified as color-based, texture-based, shape-based and edge-based. Color-based features include color histograms, color moments, color correlograms and mixtures of Gaussian models; the image can also be split into 8×8 blocks to capture local color information [1].

The video is divided into shots, which are useful for video clustering and video retrieval.
Color Feature: Color is one of the most significant features of an image. It is not very sensitive to scale changes and shows strong robustness. Color features include the color histogram, dominant color and average luminance. The YUV color space represents color with one luminance component (Y) and two chrominance components (U and V). For a large-scale image dataset, a fast search method first converts the RGB color space to the perceptual HSV space and then quantizes it into m colors; a color set is defined as a selection of colors from this quantized color space. The color set feature vector is binary, so a binary tree can be constructed for fast search.
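The conversion from RGB to HSV and the construction of a binary color set vector can be sketched with Python's standard `colorsys` module. The quantization into 8 hue, 2 saturation and 2 value levels (m = 32), the `min_fraction` cut-off and the function names are assumptions chosen for this example.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 8, 2, 2   # assumed quantization, m = 32 colors


def quantize_pixel(r, g, b):
    """Map an 8-bit RGB pixel to the index of its quantized HSV color."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    h_idx = min(int(h * H_BINS), H_BINS - 1)
    s_idx = min(int(s * S_BINS), S_BINS - 1)
    v_idx = min(int(v * V_BINS), V_BINS - 1)
    return (h_idx * S_BINS + s_idx) * V_BINS + v_idx


def color_set(pixels, min_fraction=0.1):
    """Binary vector: bit i is 1 when color i covers at least min_fraction of the pixels."""
    m = H_BINS * S_BINS * V_BINS
    counts = [0] * m
    for r, g, b in pixels:
        counts[quantize_pixel(r, g, b)] += 1
    return [1 if c / len(pixels) >= min_fraction else 0 for c in counts]


# Toy image: 60% pure red pixels and 40% pure blue pixels.
pixels = [(255, 0, 0)] * 6 + [(0, 0, 255)] * 4
vector = color_set(pixels)
```

Since the vector contains only zeros and ones, two images can be compared bit by bit, which is what makes tree-based indexing of color sets practical.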
Texture Feature: In texture feature extraction, the image is divided into a set of objects. A co-occurrence matrix is constructed by counting pairs of image pixels at a given direction and distance. However, these statistical characteristics have disadvantages such as high computation cost and complexity.
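A gray-level co-occurrence matrix for one direction and distance can be built as in the following sketch. The image is assumed to be a small nested list of already quantized gray levels, the offset `(dx, dy)` encodes direction and distance, and the contrast statistic is shown as one example of a texture measure derived from the matrix; all names are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often gray level i is followed by gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Contrast statistic: squared gray-level difference weighted by pair frequency."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# 4x4 image with four gray levels; pairs are counted toward the right-hand neighbor.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence_matrix(image, levels=4)
texture_contrast = contrast(glcm)
```

The cost noted in the text is visible here: one matrix of size levels × levels has to be filled for every direction and distance of interest, and each requires a full pass over the image.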
Shape Feature: Shape analysis needs a proper image segmentation algorithm to separate the different objects in an image. A shape representation should be invariant to translation, rotation and scaling. Shape representations fall into two categories: those based on the boundary of the shape, such as Fourier descriptors, and those based on the region, such as moment invariants.
Edge Feature: Edges are extracted from the image and used to compare it with similar images. If a match is found, the related videos are added to the media player for playback. Edges include the outer structure of a body, object outlines and so on, which are used for further comparison [5].
In the existing system, video retrieval is carried out by querying with text input or with images. When the search uses a text query, it retrieves each and every video that matches the text. Querying by text or image consumes more time in the indexing stage and may also return irrelevant results, which increases processing overhead.
To implement an efficient content-based video retrieval system, we first have to analyze the input (query) video while also maintaining the database. This is done by surveying the patterns present in the video. The techniques for video retrieval are as follows:

Key-Frame-Based Retrieval: A video is composed of a set of sequential images, text and audio. For content-based video retrieval, a query video can convey richer content information to a search system than a query image. The key frames of the query video are compared sequentially with the key frames of the target videos to find the relevant videos. The computation cost is so high that users cannot tolerate the long response time.
Sliding-Window-Based Retrieval: Specialized distance functions are used to match videos by calculating the similarities between the shots of the query and those of the target videos with the longest common subsequence (LCS) matching technique. This technique finds the longest subsequence of frames common to the two sequences. Hence, the temporal similarity between two video clips can be derived from the LCS measure. Unfortunately, the computation cost of sequential visual feature comparisons is very high, which makes sliding-window-based retrieval less effective.
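The LCS measure can be computed with the standard dynamic programming algorithm, sketched below. The example assumes that each shot has already been mapped to a discrete symbol (for instance a cluster label of its key frame) and that similarity is normalized by the query length; both are illustrative assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def temporal_similarity(query, target):
    """Fraction of the query that appears, in order, in the target."""
    if not query:
        return 0.0
    return lcs_length(query, target) / len(query)


# Shots are represented by symbolic labels; A, C, D occur in the same order in both clips.
query = ["A", "B", "C", "D"]
target = ["X", "A", "C", "Y", "D", "B"]
similarity = temporal_similarity(query, target)
```

The table has one cell per pair of elements, so the cost grows with the product of the two sequence lengths, which is the source of the high computation cost mentioned above when many target videos must be scanned.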

Cluster-Based Retrieval: Cluster analysis plays an important role in exploring the underlying structure of a given dataset. It is experiment oriented and depends on the content of the data available in the video. The number of clusters is becoming larger and larger for many applications, such as video retrieval and image classification. The complexity of clustering algorithms is proportional to the number of clusters, so computation time becomes a severe scalability problem as the number of clusters grows. The k-means algorithm cannot be applied directly to very large datasets because of its time complexity with respect to the input size [3].
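A minimal k-means sketch makes the scalability concern concrete. It assumes feature vectors given as tuples of floats, initializes the centroids with the first k points for determinism, and runs a fixed number of iterations; these are simplifications for illustration, not the configuration of any surveyed system.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Plain k-means; returns the cluster label of each point and the centroids."""
    centroids = [list(p) for p in points[:k]]   # assumed initialization: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: n * k distance computations per iteration.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Two well-separated groups of 2-D feature vectors.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, k=2)
```

Every iteration compares each of the n points with each of the k centroids, so the work grows with both the dataset size and the number of clusters.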
