News Classification by N-Gram and Machine Learning Algorithms
DOI:
https://doi.org/10.32792/jeps.v12i2.202
Keywords:
Multinomial Naïve Bayes, Decision Tree, N-gram
Abstract
News is information obtained from sources such as television, the internet, newspapers, and
magazines. Online news is published in very large volumes, which makes it difficult for users to find
the information that matches their preferences. In this paper, news articles are categorized so that
items in a specific category can be retrieved quickly and easily. The BBC news data set was used, with
its five categories: sports, politics, business, technology, and entertainment. The Multinomial Naïve
Bayes (MNB) and decision tree (DT) classification algorithms were applied to the data set after
features were extracted from it using the n-gram method. Multinomial Naïve Bayes outperformed the
decision tree, reaching an accuracy of 98.2%.
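The pipeline summarized in the abstract (n-gram feature extraction followed by a Multinomial Naïve Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes word-level unigrams and bigrams, add-one (Laplace) smoothing, and whitespace tokenization, none of which the abstract specifies. The names `word_ngrams`, `MultinomialNB`, `alpha`, and `n_max` are hypothetical, the training sentences are invented toy data rather than BBC articles, and the decision tree is not covered.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class MultinomialNB:
    """Multinomial naive Bayes over n-gram counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0, n_max=2):
        self.alpha = alpha  # smoothing constant (assumed, not from the paper)
        self.n_max = n_max  # longest n-gram to extract (assumed)

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)        # documents per class
        self.gram_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = word_ngrams(text, self.n_max)
            self.gram_counts[label].update(grams)
            self.vocab.update(grams)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # N-grams never seen in training are ignored.
        grams = [g for g in word_ngrams(text, self.n_max) if g in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.gram_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for g in grams:
                score += math.log((counts[g] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training sentences (not taken from the BBC data set).
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "shares fell as the market closed",
    "the bank raised interest rates",
]
train_labels = ["sport", "sport", "business", "business"]

model = MultinomialNB(alpha=1.0, n_max=2).fit(train_texts, train_labels)
prediction = model.predict("the striker won the match")
```

Scores are accumulated in log space so that multiplying many small probabilities does not underflow. A sketch this small says nothing about the accuracy reported in the abstract.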
License
The authors understand that the copyright of the articles shall be assigned to Journal of education for Pure Science (JEPS), University of Thi-Qar as publisher of the journal.
Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms and any other similar reproductions, as well as translations. The reproduction of any part of this journal, its storage in databases and its transmission by any form or media, such as electronic, electrostatic and mechanical copies, photocopies, recordings, magnetic media, etc., will be allowed only with written permission from Journal of education for Pure Science (JEPS), University of Thi-Qar.
Journal of education for Pure Science (JEPS), University of Thi-Qar, the Editors and the Advisory International Editorial Board make every effort to ensure that no wrong or misleading data, opinions or statements are published in the journal. In any case, the contents of the articles and advertisements published in the Journal of education for Pure Science (JEPS), University of Thi-Qar are the sole and exclusive responsibility of their respective authors and advertisers.