Motif Discovery in DNA Sequences Using Scaled Conjugate Gradient Neural Networks
DOI:
https://doi.org/10.32792/jeps.v13i1.248
Keywords:
Bioinformatics, Data Mining, Deoxyribonucleic Acid (DNA), Motif Discovery, Artificial Neural Networks (ANNs), SCG
Abstract
Finding motifs in DNA sequences is an ongoing challenge and an essential step in bioinformatics.
Addressing it requires considerable data analysis, driven by technical advances in the field.
Artificial Neural Networks (ANNs) are increasingly used, particularly for motif identification and
genomic analysis. To find motifs in DNA sequences, this work uses the Scaled Conjugate Gradient
(SCG) algorithm, a supervised learning algorithm for feed-forward neural networks. SCG uses a fully
automated step-size scaling mechanism that avoids the time-consuming line search in each training
iteration. Here it is used to train code patterns for motif discovery by minimizing a multivariate
global error function that depends on the network weights. Code patterns of between 4 and 509 bases
were trained and then searched for in a database of 2,227,382 bases. Experiments were run with
different numbers of hidden layers; ten hidden layers gave the best results, with a training
percentage of 100%. Compared with other supervised learning neural network algorithms (One Step
Secant, Gradient Descent, Bayesian Regularization, and BFGS Quasi-Newton), SCG produced higher
accuracy (100%) and took less time during the training and testing phases.
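
The step-size scaling idea in the abstract can be illustrated with a minimal sketch of the SCG iteration, following the step structure of Møller's 1993 description. It is not the authors' implementation: the function names (scg_minimize, quad_error, quad_grad), the default constants, and the toy two-weight quadratic error function are assumptions chosen only so the example is small and checkable. A real network would supply its own error function and gradient.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scg_minimize(f, grad, w, sigma=1e-4, lam=1e-6, tol=1e-6, max_iter=200):
    """Minimise f starting from w with SCG-style steps (illustrative sketch)."""
    n = len(w)
    w = list(w)
    r = [-g for g in grad(w)]   # negative gradient
    p = list(r)                 # first search direction
    lam_bar = 0.0
    success = True
    delta = 0.0
    s = None
    for k in range(1, max_iter + 1):
        if math.sqrt(dot(r, r)) < tol:
            break
        p_sq = dot(p, p)
        if success:
            # Curvature along p from a finite difference of gradients,
            # used in place of a line search.
            sigma_k = sigma / math.sqrt(p_sq)
            g_step = grad([wi + sigma_k * pi for wi, pi in zip(w, p)])
            g_here = grad(w)
            s = [(a - b) / sigma_k for a, b in zip(g_step, g_here)]
            delta = dot(p, s)
        # Scale the curvature estimate with the parameter lam.
        delta += (lam - lam_bar) * p_sq
        if delta <= 0:
            # Force a positive curvature estimate by raising lam.
            lam_bar = 2.0 * (lam - delta / p_sq)
            delta = -delta + lam * p_sq
            lam = lam_bar
        mu = dot(p, r)
        alpha = mu / delta      # step size, computed without a line search
        w_new = [wi + alpha * pi for wi, pi in zip(w, p)]
        # How well the local quadratic model predicted the error reduction.
        comparison = 2.0 * delta * (f(w) - f(w_new)) / (mu * mu)
        if comparison >= 0:
            r_new = [-g for g in grad(w_new)]
            lam_bar = 0.0
            success = True
            if k % n == 0:
                p = list(r_new)  # periodic restart
            else:
                beta = (dot(r_new, r_new) - dot(r_new, r)) / mu
                p = [rn + beta * pi for rn, pi in zip(r_new, p)]
            w, r = w_new, r_new
            if comparison >= 0.75:
                lam *= 0.25      # model is good: trust it more
        else:
            lam_bar = lam
            success = False      # reject the step and retry with larger lam
        if comparison < 0.25:
            lam += delta * (1.0 - comparison) / p_sq
    return w


# Toy error function (an assumption for illustration):
# E(w) = 0.5 * w^T A w - b^T w with A = [[3, 1], [1, 2]] and b = [1, 1].
def quad_error(w):
    x, y = w
    return 0.5 * (3 * x * x + 2 * x * y + 2 * y * y) - (x + y)


def quad_grad(w):
    x, y = w
    return [3 * x + y - 1, x + 2 * y - 1]


w_min = scg_minimize(quad_error, quad_grad, [0.0, 0.0])
print(w_min)
```

For this toy function the exact minimizer solves 3x + y = 1 and x + 2y = 1, that is x = 0.2 and y = 0.4, so the printed weights should lie close to those values. The point of the sketch is that each step size comes from the scaled curvature estimate delta, not from a search along the direction p.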
License
The authors understand that the copyright of the articles shall be assigned to the Journal of Education for Pure Science (JEPS), University of Thi-Qar, as publisher of the journal.
Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms and any other similar reproductions, as well as translations. The reproduction of any part of this journal, its storage in databases and its transmission by any form or media, such as electronic, electrostatic and mechanical copies, photocopies, recordings, magnetic media, etc., will be allowed only with written permission from the Journal of Education for Pure Science (JEPS), University of Thi-Qar.
The Journal of Education for Pure Science (JEPS), University of Thi-Qar, the Editors and the Advisory International Editorial Board make every effort to ensure that no wrong or misleading data, opinions or statements are published in the journal. In any case, the contents of the articles and advertisements published in the Journal of Education for Pure Science (JEPS), University of Thi-Qar, are the sole and exclusive responsibility of their respective authors and advertisers.