Systematic review of spell-checkers for highly inflectional languages
- Published: 14 November 2019
- Volume 53, pages 4051–4092 (2020)
- Shashank Singh 1 &
- Shailendra Singh 1
The performance of word processors, search engines and social media platforms relies heavily on spell-checkers, grammar checkers and similar tools. Spell-checkers are language tools that break down text to check for spelling errors and caution the user when an unintentional misspelling occurs. In the area of spell-checking, an exhaustive study covering aspects such as strengths, limitations, handled errors and performance, along with evaluation parameters, is still lacking. Spell-checkers for different languages are available in the literature; they share similar characteristics but differ in design. This study follows the guidelines of systematic literature review and applies them to the field of spell-checking. The steps of the systematic review are applied to 130 selected articles published in leading journals, premier conferences and workshops on spell-checking of different inflectional languages. These steps include framing the research questions, selecting research articles, applying inclusion/exclusion criteria and extracting the relevant information from the selected articles. The literature on spell-checking is divided into key sub-areas according to language, and each sub-area is then described based on the technique used. The articles are analyzed against a set of criteria to reach conclusions. This article suggests how techniques from other domains such as morphology, part-of-speech tagging, chunking, stemming and hash tables can be used in the development of spell-checkers. It also highlights the major challenges faced by researchers, along with future areas of research in the field of spell-checking.
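To make the terms dictionary lookup and edit distance concrete, the following is a minimal Python sketch of non-word error detection with suggestion ranking. It is an illustration only and is not taken from the reviewed article: the function names `levenshtein` and `check_word`, the toy `LEXICON`, and the `max_distance` threshold are all assumptions. A checker for a highly inflectional language would typically also need morphological analysis, which is omitted here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def check_word(word, lexicon, max_distance=2):
    """Dictionary lookup first; if the word is unknown, return lexicon
    entries within max_distance edits, closest first.
    Returns (is_known, suggestions)."""
    if word in lexicon:
        return True, []
    scored = sorted((levenshtein(word, w), w) for w in lexicon)
    return False, [w for d, w in scored if d <= max_distance]


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"spell", "checker", "language", "error"}

print(check_word("spell", LEXICON))
print(check_word("langauge", LEXICON))
```

A lookup of this kind only flags non-word errors. A real-word error, where the misspelling is itself a valid word, passes the dictionary check and needs context-based methods instead.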
Abbreviations
Finite state machine
Dictionary lookup method
Morphological analysis
Edit distance
Minimum edit distance
Unicode splitting
Character-based long short-term memory
Soundex method
Levenshtein edit distance
Confusion set
Reverse minimum edit distance
Direct dictionary lookup method
Edit distance method
Phonetic encoding method
Finite state representation
State table method
Finite state automata
Partition around medoid clustering
Double metaphone encoding
Word frequency
Sound and shape similarity
Reverse edit distance method
Tree-based algorithm
Parts of speech
Hidden Markov model
Graphical user interface
Finite state transition
Unknown word handling
Unknown proper noun handling
Application programming interface
Constituent word
Memory based language model
Finite state transition model
Dictionary approach
Canti check
Crowd sourcing
Acknowledgements
The authors thank the reviewers for their insightful comments. The authors would also like to thank the Ministry of Electronics and IT, Government of India, for providing a fellowship under Grant Number: PhD-MLA-4 (69)/2015-16 (Visvesvaraya PhD Scheme for Electronics and IT) to pursue Ph.D. work.
Author information
Authors and affiliations.
Department of Computer Science and Engineering, Punjab Engineering College (Deemed to be University), Chandigarh, India
Shashank Singh & Shailendra Singh
Corresponding author
Correspondence to Shashank Singh.
About this article
Singh, S., Singh, S. Systematic review of spell-checkers for highly inflectional languages. Artif Intell Rev 53, 4051–4092 (2020). https://doi.org/10.1007/s10462-019-09787-4
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10462-019-09787-4
- Spell-check
- Non-word errors
- Real-word errors
- Dictionary lookup
- Edit-distance
- Recurrent neural network (RNN)
Index Terms
Applied computing
Arts and humanities
Language translation
Document management and text processing
Computing methodologies
Artificial intelligence
Natural language processing
Language resources
Publisher: Kluwer Academic Publishers, United States
- Corpus ID: 14232632
Spell Checking Techniques in NLP: A Survey
- Nikhil Gupta, Pratistha Mathur
- Published 2012
- Computer Science, Linguistics
Accommodations Toolkit
Spell check: research.
This fact sheet on spell check is part of the Accommodations Toolkit published by the National Center on Educational Outcomes (NCEO). It summarizes information and research findings on spell check as an accommodation [1]. This toolkit also contains a summary of states’ accessibility policies for spell check.
What is spell check? Spell check is a software feature that identifies possible misspellings and either autocorrects them or suggests possible corrections (Cullen et al., 2008; MacArthur, 1999). It is sometimes referred to as spell checker, spelling checker, or spelling assistance. Spell check can help students correct spelling errors while spending less time on the mechanics of spelling, which allows them to concentrate more broadly on developing ideas or content in the writing process (MacArthur, 1999).
What are the research findings on who should use this accommodation? Spell check has been used for students with various disabilities in the elementary grades (Finch & Finch, 2013) and secondary grades (Finizio, 2008; Koretz & Hamilton, 2001). According to research findings, most of the students who receive this accommodation have specific learning disabilities (SLD) (Finizio, 2008; Koretz & Hamilton, 2001).
What are the research findings on implementation of spell check? No studies were identified on the implementation of spell check. Three studies examined how frequently spell check was assigned or used:
- Two studies examined how frequently students received the spell check accommodation, and both found that spell check was one of the least frequently assigned accommodations at the elementary (Finch & Finch, 2013) and secondary (Koretz & Hamilton, 2001) levels.
- Finizio (2008) examined the match relationship between instructional accommodations and state assessment accommodations documented in the individualized education programs (IEPs) of secondary students with various disabilities, most of whom had SLD. The results indicated that spell checking was mostly used as an instructional accommodation and not generally used on assessments.
What perceptions do students and teachers have about spell check? No studies were found that examined student or teacher perceptions of spell check as an assessment accommodation.
What have we learned overall? Research studies found that spell check is one of the least frequently assigned assessment accommodations, though it may be used more often during instruction. It is used for elementary and secondary students with various disabilities, and is most frequently provided to students with SLD. No studies were identified that examined the effect of spell check on student performance. Research is needed on the effect of spell check on the performance of students with different disabilities, including English learners with disabilities. Likewise, there is a need to explore teacher and student perceptions of the spell check accommodation.
- Cullen, J., Richards, S., & Frank, C. L. (2008). Using software to enhance the writing skills of students with special needs. Journal of Special Education Technology, 23(2), 33–44. https://doi.org/10.1177/016264340802300203
- Finch, W. H., & Finch, M. E. H. (2013). Differential item functioning analysis using a multilevel Rasch mixture model: Investigating the impact of disability status and receipt of testing accommodations. Journal of Applied Measurement, 15(2), 133–151. http://jampress.org/
- Finizio, N. J., II. (2008). The relationship between instructional and assessment accommodations on student IEPs in a single urban school district (Publication No. 3313763) [Doctoral dissertation, University of Massachusetts Boston]. ProQuest Dissertations and Theses Global.
- Koretz, D., & Hamilton, L. (2001). The performance of students with disabilities on New York’s Revised Regents Comprehensive Examination in English (CSE Technical Report No. 540). Center for the Study of Evaluation (CRESST), UCLA. http://www.rand.org/content/dam/rand/pubs/drafts/2008/DRU2608.pdf
- MacArthur, C. A. (1999). Word prediction for students with severe spelling problems. Learning Disability Quarterly, 22(3), 158–172. https://doi.org/10.2307/1511283
Attribution
All rights reserved. Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:
- Goldstone, L., Lazarus, S. S., Hendrickson, K., Rogers, C., & Hinkle, A. R. (2022). Spell check: Research (NCEO Accommodations Toolkit #27a). National Center on Educational Outcomes.
The Center is supported through a Cooperative Agreement (#H326G210002) with the Research to Practice Division, Office of Special Education Programs, U.S. Department of Education. The Center is affiliated with the Institute on Community Integration at the College of Education and Human Development, University of Minnesota. Consistent with EDGAR §75.62, the contents of this report were developed under the Cooperative Agreement from the U.S. Department of Education, but do not necessarily represent the policy or opinions of the U.S. Department of Education or Offices within it. Readers should not assume endorsement by the federal government. Project Officer: David Egnor
The Effect of Spell-Checker Features on Spelling Competence among EFL Learners: An Empirical Study ... Our survey selected papers about spelling correction indexed in Scopus and Web of Science ...
Out of these seven papers, one is related to Malayalam word generation, two are related to morphological analysis, and the remaining four propose spell-checking techniques. The first three papers answer research questions Q3, Q4 and Q5 along with Q9 and Q10, whereas the other four papers, which propose the spell-checking techniques, are able ...
Interestingly, spell-checker provided better ... This paper contributes to systematically reviewing 19 eligible related studies through a step-by-step protocol of identification, screening ...
Furthermore, the paper discusses the existing work in the domain of Sindhi spell checker software and identifies that there is no Sindhi spell checker add-in for widely used Microsoft Word ...
Index Terms: spell checker, auto-correct, n-grams, tokenizer, context-aware, real-time
I. INTRODUCTION
Spell checker and correction is a well-known and well-researched problem in Natural Language Processing [1]-[4]. However, most state-of-the-art research has been done on spell checkers for English [5], [6]. Some systems might be extended
under the "Web-Scale Spell-Checking Research Project - WSSCRP2011"
Abstract: In computing, spell checking is the process of detecting and sometimes providing spelling suggestions for incorrectly spelled words in a text. Basically, a spell checker is a computer program that uses a dictionary of words to perform spell checking.
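The dictionary-based detection described above can be illustrated with a minimal sketch. The word list `LEXICON`, the regex tokenizer, and the function names are assumptions made for illustration; they are not taken from the cited project, and a real checker would load a full lexicon and handle inflected forms.

```python
import re

# Hypothetical toy lexicon (an assumption); a real checker would load a full word list.
LEXICON = {"a", "spell", "checker", "uses", "dictionary", "of", "words"}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def find_nonwords(text, lexicon=LEXICON):
    """Return the tokens that are absent from the lexicon, in order of appearance."""
    return [token for token in tokenize(text) if token not in lexicon]


if __name__ == "__main__":
    print(find_nonwords("A spell cheker uses a dictionry of words"))
```

A lookup of this kind flags only non-word errors; a valid word used in the wrong place passes unnoticed, which is why context-aware methods are studied separately.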
Spell checkers have been an area of research since the 1960s (Kukich, ... In this paper, we are interested in Amazigh spelling correction, based on the combination of the Damerau-Levenshtein algorithm and N-grams. ... A spell checker essentially proceeds in two stages: the detection and the correction of spelling errors.
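As a rough illustration of the two stages (detection, then correction), the sketch below ranks candidates by the optimal string alignment distance, a restricted form of Damerau-Levenshtein that counts an adjacent transposition as one edit. It is not the cited paper's method: the N-gram component is omitted, and the function names, the `max_distance` threshold, and the English toy lexicon are all assumptions.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "teh" -> "the"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word, lexicon, max_distance=2):
    """Stage 1: detect a non-word. Stage 2: rank lexicon entries by distance."""
    if word in lexicon:
        return []  # the word is known, so nothing to correct
    scored = sorted((osa_distance(word, entry), entry) for entry in lexicon)
    return [entry for dist, entry in scored if dist <= max_distance]


if __name__ == "__main__":
    toy_lexicon = {"spelling", "spell", "checker"}  # hypothetical word list
    print(suggest("speling", toy_lexicon))
```

Scanning the whole lexicon for every unknown word is quadratic per comparison and linear in lexicon size, so practical systems usually narrow the candidate set first, for example with an N-gram index.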
International Journal of Scientific and Research Publications, Volume 5, Issue 4, April 2015, ISSN 2250-3153, www.ijsrp.org. SPELL CHECKER. Vibhakti V. Bhaire, Ashiki A. Jadhav, Pradnya A. Pashte, Mr. Magdum P.G. Computer Engineering, Rajendra Mane College of Engineering and Technology ... IEEE Paper-SSCS: A Smart Spell Checker System ...
Pawan Kumar and others, Design and Implementation of NLP-based Spell Checker for the Tamil Language, published Nov 9, 2020.
Recently, "paper checker" applications that tout features far beyond basic spell checking have emerged. Most at least provide sentence-by-sentence grammar feedback. Some apps even boast the ability to pinpoint tricky voice and style concerns (like, for instance, inappropriate use of the passive voice) and give on-the-fly suggestions for ...
Keywords: Spell Checker, OCR, OCR-generated text, Confusion Matrix, N-gram Analysis. ... This paper proposes a novel technique for resolving post-processing errors that occur in Telugu OCR, using word-level Unicode Approximation Models (UAM) through a mapper module. ... Research aimed at correcting words in text has focused ...
This paper discusses both approaches and their roles in various applications of spell checkers in Indian languages. Spell checkers are among the basic tools that still need to be developed for Indian languages. A spell checker is a software tool that identifies and corrects spelling mistakes in a text. Spell checkers can be combined with other applications or distributed individually.
This paper addresses the latter, more challenging sub-task, which takes a sentence and outputs the EDUs for that particular sentence. (1) Saturday, he amended his ...
Spell Checker as an Expert System", Journal of Computing and Information Technology-ICCIT, 2004. [2] Dustin Boswell, "Language Models for Spelling Correction",
Do not solely rely on your computer's spell-check—it will not get everything! ... Edited version: "I have to write a research paper for my class about extreme sports, and all I know about the subject is that I'm interested in it." The two highlighted portions are independent clauses. They are connected by the appropriate conjunction "and ...