Digitization and Archiving of Company Invoices using Deep Learning and Text Recognition-Processing Techniques
Keywords
Optical Character Recognition, Image Preprocessing, Document Digitization, Artificial Intelligence, Deep Learning
Abstract
It is increasingly important to transfer official documents such as invoices, dispatch notes, and receipts into digital environments and to establish correct semantic relationships between their contents. However, understanding and processing these documents is difficult and requires significant time and effort. In recent years, the use of deep learning, image preprocessing, text detection, and optical character recognition (OCR) technologies has made this process easier. For text recognition and processing techniques to produce accurate results, however, documents must be clean and readable. In addition, manual digitization is time-consuming, tiring, error-prone, and costly, and these difficulties must be reduced. The aim of this study is to digitize and archive scanned invoices and similar official documents using current artificial intelligence technologies, thereby making the most effective use of time, cost, and human resources. The dataset used in the study includes 10,000 ".jpg" image files and 10,000 ".xml" data files. The model trained with the ResNet-50 architecture detects text with accuracy rates of up to 97% on randomly selected images from the dataset. Whereas a person can process an average of 2,112 documents per month, the trained model is predicted to process 108,000 documents per month. With the developed method, businesses can quickly digitize and archive official documents such as invoices, dispatch notes, and receipts. Future studies could develop new methods that produce better results using larger and more diverse datasets.
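The two throughput figures in the abstract imply a speed-up of roughly 51 times over manual processing. The short Python sketch below only reproduces that arithmetic; the constant and function names are illustrative and do not come from the study, and the 108,000 figure is the authors' prediction rather than a measured value.

```python
# Figures quoted in the abstract (documents per month).
HUMAN_DOCS_PER_MONTH = 2_112
MODEL_DOCS_PER_MONTH = 108_000  # predicted by the authors, not measured here


def speedup(model_rate: float, human_rate: float) -> float:
    """Return how many times more documents the model handles than a person."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return model_rate / human_rate


ratio = speedup(MODEL_DOCS_PER_MONTH, HUMAN_DOCS_PER_MONTH)
print(f"Predicted model throughput is about {ratio:.1f}x the manual rate")
```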
Published: 2023-12-28
Issue: Vol. 2 No. 4 (2023)
Section: Research Articles
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All papers should be submitted electronically. All submitted manuscripts must be original work that is not under submission at another journal or under consideration for publication in another form, such as a monograph or chapter of a book. Authors of submitted papers are obligated not to submit their paper for publication elsewhere until an editorial decision is rendered on their submission. Further, authors of accepted papers are prohibited from publishing the results in other publications that appear before the paper is published in the Journal unless they receive approval for doing so from the Editor-In-Chief.
IMIENS open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets readers share and adapt the material, provided that they give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.