Decimal Digits Recognition from Lip Movement Using GoogleNet network
DOI: https://doi.org/10.32792/jeps.v12i2.195
Keywords: Viola-Jones, GoogleNet
Abstract
Lip reading is a visual way to communicate through the movement of the lips, especially useful for the hearing impaired and for people in noisy environments such as stadiums and airports. Lip reading is not easy: it faces many difficulties when capturing video of a speaker, including lighting, rotation, the speaker's position, and varying skin colors, so researchers are constantly looking for new lip-reading techniques.
The main objective of this paper is to design and implement an effective system for identifying decimal digits from lip movement. The proposed system consists of two stages. In the preprocessing stage, the face and mouth region are detected using the Viola-Jones algorithm, and the lip frames are extracted and stored in a temporary folder. In the second stage, each lip frame is fed into a GoogleNet neural network, where features are extracted in the convolutional layers and then classified. The results were convincing: we obtained an accuracy of 87% on a database of 35 videos of seven males and two females, containing 21,501 lip images.
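The preprocessing stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes OpenCV's stock Viola-Jones Haar cascade for frontal faces, and approximates the mouth region as the lower third of each detected face box (the exact lip-localization rule used in the paper is not specified). All function and file names here are illustrative.

```python
# Sketch of Viola-Jones-based lip-frame extraction (assumption: opencv-python
# is installed; the lower-third mouth heuristic is illustrative, not the paper's).
from typing import Tuple


def mouth_roi(face: Tuple[int, int, int, int]) -> Tuple[int, int, int, int]:
    """Approximate the mouth region as the lower third of a face box (x, y, w, h)."""
    x, y, w, h = face
    return (x, y + 2 * h // 3, w, h // 3)


def extract_lip_frames(video_path: str, out_dir: str) -> int:
    """Detect faces frame by frame and save cropped lip images; returns the count."""
    import os
    import cv2  # assumption: OpenCV available with bundled Haar cascades

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for face in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
            x, y, w, h = mouth_roi(tuple(face))
            cv2.imwrite(os.path.join(out_dir, f"lip_{saved:06d}.png"),
                        frame[y:y + h, x:x + w])
            saved += 1
    cap.release()
    return saved
```

The saved crops would then be resized to GoogleNet's 224x224 input size and passed to the network for feature extraction and classification.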
Copyright Policy
Authors retain copyright of their articles published in the Journal of Education for Pure Science (JEPS).
By submitting their work, authors grant the journal a non-exclusive license to publish, distribute, and archive the article in all formats and media.
License
All articles published in JEPS are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
This license permits unrestricted use, distribution, and reproduction in any medium, provided that the original author(s) and the source are properly credited.
Author Rights
Authors have the right to:
- Share their articles on personal websites, institutional repositories, and academic platforms
- Reuse their work in future research and publications
- Distribute the published version without restriction
Journal Rights
The journal retains the right to:
- Publish and archive the articles
- Include them in indexing and archiving systems such as LOCKSS and CLOCKSS
- Promote and disseminate the published work
Responsibility
The contents of all articles are the sole responsibility of the authors. The journal, editors, and editorial board are not responsible for any errors, opinions, or statements expressed in the published articles.
Open Access Statement
JEPS provides immediate open access to its content, supporting the principle that making research freely available to the public enhances global knowledge exchange.
This work is licensed under a Creative Commons Attribution 4.0 International License.
https://creativecommons.org/licenses/by/4.0/