南方科技大学知识苑(SUSTech KC): Deep Neural Networks on Genetic Motif Discovery: the Interpretability and Identifiability Issues

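Title	Deep Neural Networks on Genetic Motif Discovery: the Interpretability and Identifiability Issues
Name	张宇
Name (pinyin)	ZHANG Yu
Student ID	11756001
Degree type	Doctoral
Degree discipline	Computer Science
Supervisor	唐珂 (TANG Ke)
Supervisor's affiliation	Department of Computer Science and Engineering
External supervisor	Peter Tino
External supervisor's affiliation	University of Birmingham
Thesis defence date	2022-03-30
Thesis submission date	2022-07-01
Degree-granting institution	University of Birmingham
Degree-granting location	Birmingham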
Abstract	Deep neural networks have achieved great success in a wide range of research fields and real-world applications. However, because they are black-box models, their dramatic performance gains come at the cost of model interpretability. This is a major concern, especially in domains that are safety-critical or have ethical and legal requirements (e.g., avoiding algorithmic discrimination). In other settings, such as computational genomics, interpretability may help scientists gain new "knowledge" learnt by the neural networks; neural-network-based genetic motif discovery is one such field. This naturally leads to another question: can current neural-network-based motif discovery methods identify the underlying motifs from the data, and how robust and reliable are they? In other words, we are interested in the motif identifiability problem. In this thesis, we first conduct a comprehensive review of current neural network interpretability research and propose a novel unified taxonomy which, to the best of our knowledge, provides the most comprehensive and clear categorisation of the existing approaches. We then formally study the motif identifiability problem in the context of neural-network-based motif discovery (i.e., if we only have access to the predictive performance of a neural network, which is a black box, how well can we recover the underlying "true" motifs by interpreting the learnt model?). Systematic controlled experiments show that although accurate models tend to recover the underlying motifs better, motif identifiability (a measure of the similarity between true motifs and learnt motifs) still varies over a wide range. Moreover, the over-complexity (without overfitting) of a high-accuracy model (e.g., using 128 kernels when 16 kernels are already sufficient) may harm motif identifiability. We therefore propose a robust neural-network-based motif discovery workflow that addresses these issues, and verify it on both synthetic and real-world datasets. Finally, we propose probabilistic kernels in place of conventional convolutional kernels and study whether it is better to learn probabilistic motifs directly in the neural network rather than to rely on post hoc interpretation. Experiments show that although probabilistic kernels have some merits (e.g., stable output), their performance does not match that of classic convolutional kernels under the same network setting (i.e., the same number of kernels).
Keywords	Deep Learning; Motif Discovery; Interpretability; Identifiability
Language	English
Training category	Joint training
Year of enrolment	2017
Year of degree award	2022-07
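Source	Manual submission
Output type	Thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/347870
Collection	College of Engineering_Department of Computer Science and Engineering
Recommended citation (GB/T 7714)	Zhang Y. Deep Neural Networks on Genetic Motif Discovery: the Interpretability and Identifiability Issues[D]. Birmingham: University of Birmingham, 2022.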

Files in this item
File name/size	Document type	Version type	Access type	Licence
11756001-张宇-计算机科学与工程 (11669KB)	--	--	Restricted access	--