[1] KRANZFELDER M, SCHNEIDER A, FIOLKA A, et al. Reliability of sensor-based real-time workflow recognition in laparoscopic cholecystectomy[J]. International Journal of Computer Assisted Radiology and Surgery, 2014, 9(6): 941-948.
[2] MAIER-HEIN L, VEDULA S S, SPEIDEL S, et al. Surgical data science for next-generation interventions[J]. Nature Biomedical Engineering, 2017, 1(9): 691-696.
[3] MASCAGNI P, VARDAZARYAN A, ALAPATT D, et al. Artificial intelligence for surgical safety: automatic assessment of the critical view of safety in laparoscopic cholecystectomy using deep learning[J]. Annals of Surgery, 2022, 275(5): 955-961.
[4] AVCI A, BOSCH S, MARIN-PERIANU M, et al. Activity recognition using inertial sensing for healthcare, wellbeing and sports applications: a survey[C]//23rd International Conference on Architecture of Computing Systems 2010. VDE, 2010: 1-10.
[5] AL HAJJ H, LAMARD M, CONZE P H, et al. Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks[J]. Medical Image Analysis, 2018, 47: 203-218.
[6] GARCIA-PERAZA-HERRERA L C, LI W, FIDON L, et al. ToolNet: holistically-nested real-time segmentation of robotic surgical tools[C]//2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2017: 5717-5722.
[7] YU T, MUTTER D, MARESCAUX J, et al. Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition[C]//International Conference on Information Processing in Computer-Assisted Interventions (IPCAI), Rennes, France, June 2019.
[8] ZISIMOPOULOS O, FLOUTY E, LUENGO I, et al. DeepPhase: surgical phase recognition in cataracts videos[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2018: 265-272.
[9] RAMESH S, DALL’ALBA D, GONZALEZ C, et al. Multi-task temporal convolutional networks for joint recognition of surgical phases and steps in gastric bypass procedures[J]. International Journal of Computer Assisted Radiology and Surgery, 2021, 16(7): 1111-1119.
[10] SAHU M, SZENGEL A, MUKHOPADHYAY A, et al. Surgical phase recognition by learning phase transitions[J]. Current Directions in Biomedical Engineering, 2020, 6(1).
[11] DIPIETRO R, AHMIDI N, MALPANI A, et al. Segmenting and classifying activities in robot-assisted surgery with recurrent neural networks[J]. International Journal of Computer Assisted Radiology and Surgery, 2019, 14(11): 2005-2020.
[12] KITAGUCHI D, TAKESHITA N, MATSUZAKI H, et al. Real-time automatic surgical phase recognition in laparoscopic sigmoidectomy using the convolutional neural network-based deep learning approach[J]. Surgical Endoscopy, 2020, 34(11): 4924-4931.
[13] LOUKAS C, GEORGIOU E. Smoke detection in endoscopic surgery videos: a first step towards retrieval of semantic events[J]. The International Journal of Medical Robotics and Computer Assisted Surgery, 2015, 11(1): 80-94.
[14] ZAPPELLA L, BÉJAR B, HAGER G, et al. Surgical gesture classification from video and kinematic data[J]. Medical Image Analysis, 2013, 17(7): 732-745.
[15] BLUM T, FEUSSNER H, NAVAB N. Modeling and segmentation of surgical workflow from laparoscopic video[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2010: 400-407.
[16] LALYS F, RIFFAUD L, BOUGET D, et al. A framework for the recognition of high-level surgical tasks from video images for cataract surgeries[J]. IEEE Transactions on Biomedical Engineering, 2011, 59(4): 966-976.
[17] JIN Y, DOU Q, CHEN H, et al. SV-RCNet: workflow recognition from surgical videos using recurrent convolutional network[J]. IEEE Transactions on Medical Imaging, 2017, 37(5): 1114-1126.
[18] FUNKE I, BODENSTEDT S, OEHME F, et al. Using 3D convolutional neural networks to learn spatiotemporal features for automatic surgical gesture recognition in video[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019: 467-475.
[19] CZEMPIEL T, PASCHALI M, KEICHER M, et al. TeCNO: surgical phase recognition with multi-stage temporal convolutional networks[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020: 343-352.
[20] YI F, YANG Y, JIANG T. Not end-to-end: explore multi-stage architecture for online surgical phase recognition[C]//Proceedings of the Asian Conference on Computer Vision. 2022: 2613-2628.
[21] DING X, LI X. Exploring segment-level semantics for online phase recognition from surgical videos[J]. IEEE Transactions on Medical Imaging, 2022, 41(11): 3309-3319.
[22] PARK M, OH S, JEONG T, et al. Multi-stage temporal convolutional network with moment loss and positional encoding for surgical phase recognition[J]. Diagnostics, 2023, 13(1): 107.
[23] TWINANDA A P, SHEHATA S, MUTTER D, et al. EndoNet: a deep architecture for recognition tasks on laparoscopic videos[J]. IEEE Transactions on Medical Imaging, 2016, 36(1): 86-97.
[24] JIN Y, LI H, DOU Q, et al. Multi-task recurrent convolutional network with correlation loss for surgical video analysis[J]. Medical Image Analysis, 2020, 59: 101572.
[25] NAKAWALA H, BIANCHI R, PESCATORI L E, et al. "Deep-Onto" network for surgical workflow and context recognition[J]. International Journal of Computer Assisted Radiology and Surgery, 2019, 14(4): 685-696.
[26] QI B, QIN X, LIU J, et al. A deep architecture for surgical workflow recognition with edge information[C]//2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2019: 1358-1364.
[27] CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//International Conference on Machine Learning. PMLR, 2020: 1597-1607.
[28] HE K, FAN H, WU Y, et al. Momentum contrast for unsupervised visual representation learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 9729-9738.
[29] CARON M, MISRA I, MAIRAL J, et al. Unsupervised learning of visual features by contrasting cluster assignments[J]. Advances in Neural Information Processing Systems, 2020, 33: 9912-9924.
[30] CARON M, TOUVRON H, MISRA I, et al. Emerging properties in self-supervised vision transformers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 9650-9660.
[31] HIRSCH R, CARON M, COHEN R, et al. Self-supervised learning for endoscopic video analysis[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2023: 569-578.
[32] FUNKE I, JENKE A, MEES S T, et al. Temporal coherence-based self-supervised learning for laparoscopic workflow analysis[C]//International Workshop on Computer-Assisted and Robotic Endoscopy. Springer, 2018: 85-93.
[33] SHAO S, PEI Z, CHEN W, et al. Self-supervised learning for monocular depth estimation on minimally invasive surgery scenes[C]//2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021: 7159-7165.
[34] PATHAK D, KRAHENBUHL P, DONAHUE J, et al. Context encoders: Feature learning by inpainting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2016: 2536-2544.
[35] DING X, LIU Z, LI X. Free lunch for surgical video understanding by distilling self-supervisions[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2022: 365-375.
[36] RAMESH S, SRIVASTAV V, ALAPATT D, et al. Dissecting self-supervised learning methods for surgical computer vision[J]. Medical Image Analysis, 2023, 88: 102844.
[37] CIREŞAN D, MEIER U, SCHMIDHUBER J. Multi-column deep neural networks for image classification[C]//2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012: 3642-3649.
[38] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25.
[39] SATO I, NISHIMURA H, YOKOI K. APAC: Augmented PAttern Classification with Neural Networks[J/OL]. CoRR, 2015, abs/1505.03229. http://arxiv.org/abs/1505.03229.
[40] TOKOZUME Y, USHIKU Y, HARADA T. Between-class learning for image classification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 5486-5494.
[41] GUO H, MAO Y, ZHANG R. Mixup as locally linear out-of-manifold regularization[C]//Proceedings of the AAAI Conference on Artificial Intelligence: volume 33. 2019: 3714-3722.
[42] DEVRIES T, TAYLOR G W. Improved Regularization of Convolutional Neural Networks with Cutout[J/OL]. CoRR, 2017, abs/1708.04552. http://arxiv.org/abs/1708.04552.
[43] YUN S, HAN D, OH S J, et al. CutMix: regularization strategy to train strong classifiers with localizable features[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 6023-6032.
[44] HUANG Z, WANG H, XING E P, et al. Self-challenging improves cross-domain generalization[C]//Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II. Springer, 2020: 124-140.
[45] WANG S, YU L, LI C, et al. Learning from extrinsic and intrinsic supervisions for domain generalization[C]//Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IX. Springer, 2020: 159-176.
[46] XU Q, ZHANG R, ZHANG Y, et al. A Fourier-based framework for domain generalization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 14383-14392.
[47] XU Q, ZHANG R, WU Y Y, et al. SimDE: a simple domain expansion approach for single-source domain generalization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 4797-4807.
[48] GANIN Y, USTINOVA E, AJAKAN H, et al. Domain-adversarial training of neural networks[J]. Journal of Machine Learning Research, 2016, 17(59): 1-35.
[49] LI Y, TIAN X, GONG M, et al. Deep domain generalization via conditional invariant adversarial networks[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 624-639.
[50] HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network[J/OL]. CoRR, 2015, abs/1503.02531. http://arxiv.org/abs/1503.02531.
[51] GOU J, YU B, MAYBANK S J, et al. Knowledge distillation: a survey[J]. International Journal of Computer Vision, 2021, 129(6): 1789-1819.
[52] SONG L, WU J, YANG M, et al. Handling difficult labels for multi-label image classification via uncertainty distillation[C]//Proceedings of the 29th ACM International Conference on Multimedia. 2021: 2410-2419.
[53] XU J, HUANG S, ZHOU F, et al. Boosting multi-label image classification with complementary parallel self-distillation[C/OL]//DE RAEDT L. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22. International Joint Conferences on Artificial Intelligence Organization, 2022: 1495-1501. https://doi.org/10.24963/ijcai.2022/208.
[54] LIU Y, SHENG L, SHAO J, et al. Multi-label image classification via knowledge distillation from weakly-supervised detection[C]//Proceedings of the 26th ACM International Conference on Multimedia. 2018: 700-708.
[55] NWOYE C I, ALAPATT D, YU T, et al. CholecTriplet2021: a benchmark challenge for surgical action triplet recognition[J]. Medical Image Analysis, 2023, 86: 102803.
[56] SUN C, SHRIVASTAVA A, SINGH S, et al. Revisiting unreasonable effectiveness of data in deep learning era[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 843-852.
[57] ZHANG H, CISSÉ M, DAUPHIN Y N, et al. mixup: Beyond Empirical Risk Minimization[J/OL]. CoRR, 2017, abs/1710.09412. http://arxiv.org/abs/1710.09412.
[58] HONG M, CHOI J, KIM G. StyleMix: separating content and style for enhanced data augmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 14862-14870.
[59] DAI R, DAS S, KAHATAPITIYA K, et al. MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 20041-20051.
[60] PADOY N. Machine and deep learning for workflow recognition during surgery[J]. Minimally Invasive Therapy & Allied Technologies, 2019, 28(2): 82-90.
[61] MAIER-HEIN L, VEDULA S S, SPEIDEL S, et al. Surgical data science for next-generation interventions[J]. Nature Biomedical Engineering, 2017, 1(9): 691-696.
[62] TANWANI A K, SERMANET P, YAN A, et al. Motion2vec: Semi-supervised representation learning from surgical videos[C]//2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020: 2174-2181.
[63] HONG S, LEE J, PARK B, et al. Rethinking Generalization Performance of Surgical Phase Recognition with Expert-Generated Annotations[J/OL]. CoRR, 2021, abs/2110.11626. https://arxiv.org/abs/2110.11626.
[64] JING L, TIAN Y. Self-supervised visual feature learning with deep neural networks: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(11): 4037-4058.
[65] XU M, ISLAM M, LIM C M, et al. Learning domain adaptation with model calibration for surgical report generation in robotic surgery[C]//2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021: 12350-12356.
[66] STAUDER R, OSTLER D, KRANZFELDER M, et al. The TUM LapChole dataset for the M2CAI 2016 workflow challenge[J/OL]. CoRR, 2016, abs/1610.09278. http://arxiv.org/abs/1610.09278.
[67] WAGNER M, MÜLLER-STICH B P, KISILENKO A, et al. Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark[J]. Medical Image Analysis, 2023, 86: 102770.
[68] LI D, YANG Y, SONG Y Z, et al. Deeper, broader and artier domain generalization[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 5542-5550.
[69] VAN DEN OORD A, LI Y, VINYALS O. Representation Learning with Contrastive Predictive Coding[J/OL]. CoRR, 2018, abs/1807.03748. http://arxiv.org/abs/1807.03748.
[70] BUENO-BENITO E B, VECINO B T, DIMICCOLI M. Leveraging triplet loss for unsupervised action segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 4921-4929.
[71] PASZKE A, GROSS S, MASSA F, et al. PyTorch: an imperative style, high-performance deep learning library[M]//Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 2019: 8024-8035.
[72] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
[73] FARHA Y A, GALL J. MS-TCN: multi-stage temporal convolutional network for action segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3575-3584.
[74] RIVOIR D, FUNKE I, SPEIDEL S. On the pitfalls of batch normalization for end-to-end video learning: A study on surgical workflow analysis[J]. Medical Image Analysis, 2024: 103126.
[75] VERCAUTEREN T, UNBERATH M, PADOY N, et al. CAI4CAI: the rise of contextual artificial intelligence in computer-assisted interventions[J]. Proceedings of the IEEE, 2019, 108(1): 198-214.
[76] PADOY N, BLUM T, AHMADI S A, et al. Statistical modeling and recognition of surgical workflow[J]. Medical Image Analysis, 2012, 16(3): 632-641.
[77] NWOYE C I, GONZALEZ C, YU T, et al. Recognition of instrument-tissue interactions in endoscopic videos via action triplets[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020: 364-374.
[78] NWOYE C I, YU T, GONZALEZ C, et al. Rendezvous: Attention mechanisms for the recognition of surgical action triplets in endoscopic videos[J]. Medical Image Analysis, 2022, 78: 102433.
[79] YAMLAHI A, TRAN T N, GODAU P, et al. Self-distillation for surgical action recognition[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2023: 637-646.
[80] SHARMA S, NWOYE C I, MUTTER D, et al. Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition[J]. International Journal of Computer Assisted Radiology and Surgery, 2023: 1-7.
[81] CHENG Y, LIU L, WANG S, et al. Why deep surgical models fail?: revisiting surgical action triplet recognition through the lens of robustness[C]//International Workshop on Trustworthy Machine Learning for Healthcare. Springer, 2023: 177-189.
[82] LI L, LI X, DING S, et al. SIRNet: fine-grained surgical interaction recognition[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 4212-4219.
[83] XIANG L, DING G, HAN J. Learning from multiple experts: self-paced knowledge distillation for long-tailed classification[C]//Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2020: 247-263.
[84] SCHMID F, KOUTINI K, WIDMER G. Efficient large-scale audio tagging via transformer-to-CNN knowledge distillation[C]//ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023: 1-5.
[85] LIU S, ZHANG L, YANG X, et al. Query2Label: A Simple Transformer Way to Multi-Label Classification[J/OL]. CoRR, 2021, abs/2107.10834. https://arxiv.org/abs/2107.10834.
[86] NWOYE C I, PADOY N. Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets[J/OL]. CoRR, 2022, abs/2204.05235. https://arxiv.org/abs/2204.05235.
[87] CUI J, LIU S, TIAN Z, et al. ResLT: residual learning for long-tailed recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(3): 3695-3706.