Cross-Media Video Classification, Indexing, Retrieval and Visualization
Challenging Problems:
Recent advances in high-performance video compression, storage and communication
technologies present an extraordinary opportunity to enable
evidence-based multimedia medical education by showing suitable medical
video clips in class. With real clinical examples documented in the relevant
medical video clips, medical instructors will have more capability and flexibility to
explain medical concepts, principles and diagnostic skills in class.
As large-scale collections of medical education videos become available, there is an
urgent need for methods that classify and access medical education videos at the
semantic level, so that medical instructors can select the most relevant medical
video clips from such large-scale video collections quickly and easily.
In spite of some recent research progress, video classification and retrieval at the
semantic level are still open problems with many unsolved challenging issues:
(a) Multi-Modal Query Concept Specification:
There are three widely accepted approaches to specify query concepts: (1)
video examples; (2) keywords; (3) browsing. Each approach
represents a useful way of accessing a video database, but currently they all have
limitations.
(b) Automatic Video Concept Detection: Because of the semantic gap between
low-level visual features and high-level concepts, it is very hard to detect video
concepts automatically, especially higher-level video concepts with large variations.
(c) Video Visualization for Query Evaluation: Video shots are used for video
indexing, retrieval and displaying the query results, but users may be more
interested in identifying the most relevant longer video clips that give complete
descriptions of certain events.
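To make issue (c) concrete, the sketch below shows one simple way shot-level results could be grouped into longer clips. It is only an illustration and not the method proposed in this project: the `Shot` type, the `merge_relevant_shots` function, and the `threshold` and `max_gap` parameters are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """A retrieved video shot (hypothetical structure for illustration)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    score: float  # assumed relevance score in [0, 1]


def merge_relevant_shots(shots, threshold=0.5, max_gap=2.0):
    """Group relevant shots that are close in time into longer clips.

    Shots scoring below `threshold` are dropped. A relevant shot is merged
    into the previous clip when it starts no more than `max_gap` seconds
    after that clip ends. Returns a list of (start, end) tuples.
    """
    clips = []
    for shot in sorted(shots, key=lambda s: s.start):
        if shot.score < threshold:
            continue
        if clips and shot.start - clips[-1][1] <= max_gap:
            # Extend the current clip to cover this shot.
            clips[-1] = (clips[-1][0], max(clips[-1][1], shot.end))
        else:
            # Start a new clip.
            clips.append((shot.start, shot.end))
    return clips


example_shots = [
    Shot(0.0, 5.0, 0.9),
    Shot(5.0, 9.0, 0.8),
    Shot(9.0, 12.0, 0.1),
    Shot(30.0, 40.0, 0.7),
]
example_clips = merge_relevant_shots(example_shots)
```

With these example values, the first two shots merge into one clip, the low-scoring third shot is dropped, and the last shot forms its own clip. A real system would need a relevance model and an event-level notion of completeness, which this sketch does not attempt.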
Research Focus:
This project will tackle these challenging problems in the specific domain of nursing
education video, but we will also test and extend our algorithms for news videos.
The proposed research includes:
develop a new framework for video content representation that is able to characterize
the middle-level semantics of video contents;
incorporate domain knowledge to improve video classifier training while significantly
reducing its cost and complexity;
integrate video visualization for fast decision making and query result evaluation.