Abstract
Surgical workflow recognition has numerous potential medical applications,
such as the automatic indexing of surgical video databases and the optimization
of real-time operating room scheduling. As a result, phase
recognition has been studied in the context of several kinds of surgeries, such
as cataract, neurological, and laparoscopic surgeries. In the literature, two
types of features are typically used to perform this task: visual features and
tool usage signals. However, the visual features used are mostly handcrafted.
Furthermore, the tool usage signals are usually collected via a manual
annotation process or by using additional equipment. In this paper, we propose
a novel method for phase recognition that uses a convolutional neural network
(CNN) to automatically learn features from cholecystectomy videos and that
relies solely on visual information. Previous studies have shown
that tool signals can provide valuable information for the phase
recognition task. Thus, we present a novel CNN architecture, called EndoNet,
that is designed to carry out the phase recognition and tool presence detection
tasks in a multi-task manner. To the best of our knowledge, this is the first
work proposing to use a CNN for multiple recognition tasks on laparoscopic
videos. Extensive experimental comparisons to other methods show that EndoNet
yields state-of-the-art results for both tasks.
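
The multi-task idea can be illustrated with a minimal sketch of a joint training objective, assuming a shared network that emits one vector of phase scores (exactly one phase per frame) and one vector of tool scores (any number of tools per frame). The abstract does not specify the loss, so the function names, the `tool_weight` parameter, and the choice of softmax cross-entropy plus per-tool sigmoid cross-entropy are all assumptions made for illustration, not EndoNet's actual formulation.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Multi-class loss for one frame: -log softmax(logits)[target_index]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def binary_cross_entropy(logits, targets):
    """Multi-label loss: one independent sigmoid per tool, summed over tools."""
    total = 0.0
    for z, t in zip(logits, targets):
        # softplus(z) - t * z is the stable form of sigmoid cross-entropy
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - t * z
    return total


def multi_task_loss(phase_logits, phase_label, tool_logits, tool_labels,
                    tool_weight=1.0):
    """Hypothetical joint objective: phase loss plus weighted tool loss."""
    phase_loss = softmax_cross_entropy(phase_logits, phase_label)
    tool_loss = binary_cross_entropy(tool_logits, tool_labels)
    return phase_loss + tool_weight * tool_loss


# Illustrative values only: 3 phases and 2 tools for a single frame.
loss = multi_task_loss([2.0, 0.5, -1.0], 0, [1.5, -2.0], [1, 0])
```

Because both terms are computed from the outputs of one network, gradients from the tool presence term would also shape the features used for phase recognition, which is the usual motivation for training the two tasks together.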
Description
[1602.03012] EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos