News Classification by N-Gram and Machine Learning Algorithms
DOI:
https://doi.org/10.32792/jeps.v12i2.202
Keywords:
Multinomial Naïve Bayes, Decision Tree, N-gram
Abstract
News is information obtained from different sources such as television, the internet, newspapers and
magazines. Online news is published in very large volumes, and with so many articles available it is
challenging for users to find the information that matches their preferences. In this paper,
news articles are categorized so that a specific category can be retrieved quickly and easily. The BBC news
data set was used, with its five categories: sports, politics, business, technology and entertainment. The classification
algorithms Multinomial Naïve Bayes (MNB) and decision tree (DT) were applied to the news data set
after features were extracted from it using the n-gram method. The Multinomial Naïve Bayes algorithm
outperformed the decision tree, reaching an accuracy of 98.2%.
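The pipeline summarized in the abstract (n-gram feature extraction followed by a Multinomial Naïve Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper `word_ngrams`, the class `TinyMultinomialNB`, the choice of unigrams plus bigrams, the add-one smoothing, and the four toy sentences are all assumptions made for illustration, and the decision tree comparison is omitted.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class TinyMultinomialNB:
    """Multinomial Naive Bayes over n-gram counts, add-one smoothing (assumed)."""

    def fit(self, docs, labels, n_max=2):
        self.n_max = n_max
        self.class_docs = Counter(labels)           # documents per class
        self.feature_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            grams = word_ngrams(doc, n_max)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for gram in word_ngrams(doc, self.n_max):
                if gram in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not the BBC data set).
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "shares fell as the market closed",
    "the bank raised interest rates",
]
train_labels = ["sport", "sport", "business", "business"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict("the striker won the match"))
print(model.predict("interest rates and shares fell"))
```

On a real corpus, a library implementation with a held-out test split would be the usual choice; the sketch only shows how n-gram counts feed the class scores.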
Copyright Policy
Authors retain copyright of their articles published in the Journal of Education for Pure Science (JEPS).
By submitting their work, authors grant the journal a non-exclusive license to publish, distribute, and archive the article in all formats and media.
License
All articles published in JEPS are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
This license permits unrestricted use, distribution, and reproduction in any medium, provided that the original author(s) and the source are properly credited.
Author Rights
Authors have the right to:
- Share their articles on personal websites, institutional repositories, and academic platforms
- Reuse their work in future research and publications
- Distribute the published version without restriction
Journal Rights
The journal retains the right to:
- Publish and archive the articles
- Include them in indexing and archiving systems such as LOCKSS and CLOCKSS
- Promote and disseminate the published work
Responsibility
The contents of all articles are the sole responsibility of the authors. The journal, editors, and editorial board are not responsible for any errors, opinions, or statements expressed in the published articles.
Open Access Statement
JEPS provides immediate open access to its content, supporting the principle that making research freely available to the public enhances global knowledge exchange.
https://creativecommons.org/licenses/by/4.0/