Motif Discovery in DNA Sequences Using Scaled Conjugate Gradient Neural Networks
DOI: https://doi.org/10.32792/jeps.v13i1.248
Keywords: Bioinformatics, Data Mining, Deoxyribonucleic Acid (DNA), Motif Discovery, Artificial Neural Networks (ANNs), SCG
Abstract
Finding motifs in DNA sequences is an ongoing challenge and an essential step in bioinformatics.
Technical advances in the field mean that addressing it requires considerable data analysis.
Artificial Neural Networks (ANNs) are increasingly used, particularly for motif
identification and genomic analysis. To find motifs in DNA sequences, this work proposes a
supervised learning algorithm for feed-forward neural networks, the Scaled Conjugate Gradient
(SCG) algorithm. SCG uses a fully automated step-size scaling mechanism that avoids the
time-consuming line search otherwise performed in each training iteration. In this work the
algorithm is used for motif discovery: it trains on code patterns by minimizing a multivariate
global error function that depends on the network weights. The network was trained on many code
patterns with lengths between 4 and 509 bases and then used to find them in a database of
2,227,382 bases. Many experiments were run with different numbers of hidden layers; we found that
ten hidden layers gave the best results, with a training percentage of 100%.
Compared with other supervised learning algorithms for neural networks (One Step Secant, Gradient
Descent, Bayesian Regularization, and BFGS Quasi-Newton), we found that the SCG algorithm produced
higher accuracy (100%) and required less time during the training and testing phases.
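The step-size scaling described in the abstract can be illustrated with a minimal sketch of the scaled conjugate gradient iteration, following the steps published by Møller (1993). This is not the authors' implementation: the function name `scg_minimize`, the parameters `sigma0`, `lam`, and `tol`, and the small quadratic error function that stands in for a network's error are all assumptions chosen for illustration.

```python
import numpy as np

def scg_minimize(f, grad, w, max_iter=200, sigma0=1e-4, lam=1e-6, tol=1e-6):
    """Minimise f from starting point w; illustrative SCG sketch, no line search."""
    w = np.asarray(w, dtype=float).copy()
    n = w.size
    r = -grad(w)            # negative gradient at the current weights
    p = r.copy()            # current conjugate search direction
    lam_bar = 0.0           # "raised" scale parameter used after a failed step
    success = True
    delta = 0.0
    for k in range(1, max_iter + 1):
        if np.linalg.norm(r) < tol:
            break
        p_sq = float(p @ p)
        if success:
            # Curvature along p estimated from a finite difference of gradients.
            sigma = sigma0 / np.sqrt(p_sq)
            s = (grad(w + sigma * p) - grad(w)) / sigma
            delta = float(p @ s)
        # Scale the curvature estimate with the current lambda.
        delta += (lam - lam_bar) * p_sq
        if delta <= 0:
            # Force a positive curvature estimate by raising lambda.
            lam_bar = 2.0 * (lam - delta / p_sq)
            delta = -delta + lam * p_sq
            lam = lam_bar
        mu = float(p @ r)
        alpha = mu / delta                      # step size, no line search
        w_new = w + alpha * p
        # How well the local quadratic model predicted the actual decrease.
        comparison = 2.0 * delta * (f(w) - f(w_new)) / mu**2
        if comparison >= 0:
            r_new = -grad(w_new)
            lam_bar = 0.0
            success = True
            if k % n == 0:
                p_next = r_new.copy()           # periodic restart
            else:
                beta = float(r_new @ r_new - r_new @ r) / mu
                p_next = r_new + beta * p
            w, r, p = w_new, r_new, p_next
            if comparison >= 0.75:
                lam *= 0.25                     # model is good: trust it more
        else:
            lam_bar = lam                       # step rejected: keep w, rescale
            success = False
        if comparison < 0.25:
            lam += delta * (1.0 - comparison) / p_sq
    return w

# Hypothetical error function E(w) = 0.5 w^T A w - b^T w standing in for a
# network error; its exact minimiser solves A w = b, i.e. w = [0.2, 0.4].
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda w: 0.5 * w @ A @ w - b @ w
grad = lambda w: A @ w - b

w_opt = scg_minimize(f, grad, np.zeros(2))
print(w_opt)
```

In a neural network setting, `f` would be the training error as a function of the flattened weight vector and `grad` its backpropagated gradient. The scale parameter `lam` plays the role that a line search plays in other conjugate gradient methods, which is why each iteration costs only a small, fixed number of gradient and error evaluations.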
Copyright Policy
Authors retain copyright of their articles published in the Journal of Education for Pure Science (JEPS).
By submitting their work, authors grant the journal a non-exclusive license to publish, distribute, and archive the article in all formats and media.
License
All articles published in JEPS are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
This license permits unrestricted use, distribution, and reproduction in any medium, provided that the original author(s) and the source are properly credited.
Author Rights
Authors have the right to:
- Share their articles on personal websites, institutional repositories, and academic platforms
- Reuse their work in future research and publications
- Distribute the published version without restriction
Journal Rights
The journal retains the right to:
- Publish and archive the articles
- Include them in indexing and archiving systems such as LOCKSS and CLOCKSS
- Promote and disseminate the published work
Responsibility
The contents of all articles are the sole responsibility of the authors. The journal, editors, and editorial board are not responsible for any errors, opinions, or statements expressed in the published articles.
Open Access Statement
JEPS provides immediate open access to its content, supporting the principle that making research freely available to the public enhances global knowledge exchange.
This work is licensed under a Creative Commons Attribution 4.0 International License.
https://creativecommons.org/licenses/by/4.0/