A Copy detection Method for Malayalam Text Documents using N-grams Model

Dyuthi/Manakin Repository

dc.contributor.author Sumam, Mary Idicula
dc.contributor.author Bindu, Baby Thomas
dc.contributor.author Sindhu, L
dc.date.accessioned 2014-07-18T05:07:37Z
dc.date.available 2014-07-18T05:07:37Z
dc.date.issued 2013-02-09
dc.identifier.uri http://dyuthi.cusat.ac.in/purl/4104
dc.description.abstract In this paper a method of copy detection in short Malayalam text passages is proposed. Given two passages, one as the source text and the other as the suspected copy, the method determines whether the second passage is a plagiarized version of the source text. An algorithm for plagiarism detection using the n-gram model for word retrieval is developed, and tri-grams are found to be the best model for comparing Malayalam text. Based on the probability and resemblance measures calculated from the n-gram comparison, the text is categorized against a threshold. Texts are compared using variable-length n-grams (n = {2, 3, 4}). The experiments show that the trigram model gives acceptable performance on average at an affordable cost in terms of complexity. en_US
dc.description.sponsorship Cochin University Of Science And Technology en_US
dc.language.iso en en_US
dc.subject Copy detection en_US
dc.subject N-gram Model en_US
dc.subject Bi-gram en_US
dc.subject Tri-gram en_US
dc.subject Malayalam en_US
dc.subject Plagiarism en_US
dc.title A Copy detection Method for Malayalam Text Documents using N-grams Model en_US
dc.type Article en_US
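
The n-gram comparison described in the abstract can be illustrated with a minimal Python sketch. The record does not define the paper's probability or resemblance measures, so this sketch assumes a Jaccard-style resemblance over word n-gram sets, whitespace tokenization, and an arbitrary threshold of 0.5. The function names `word_ngrams`, `resemblance`, and `is_copy` are hypothetical and are not taken from the paper.

```python
def word_ngrams(text, n):
    """Return the set of word n-grams in text.

    Assumption: words are separated by whitespace, which is a
    simplification for Malayalam and not the paper's tokenizer.
    """
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(source, suspect, n=3):
    """Jaccard overlap of the two n-gram sets (assumed measure).

    Returns 0.0 when either passage is too short to form an n-gram.
    """
    a = word_ngrams(source, n)
    b = word_ngrams(suspect, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_copy(source, suspect, n=3, threshold=0.5):
    """Flag the suspect passage when resemblance reaches the threshold.

    The threshold value is illustrative only; the record does not
    state the value used in the paper.
    """
    return resemblance(source, suspect, n) >= threshold


if __name__ == "__main__":
    src = "the quick brown fox jumps over the lazy dog"
    sus = "the quick brown fox leaps over the lazy dog"
    # Compare with the n-gram lengths named in the abstract.
    for n in (2, 3, 4):
        print(n, resemblance(src, sus, n), is_copy(src, sus, n))
```

Larger values of n make the comparison stricter, because a single changed word breaks every n-gram that spans it. This is one plausible reason a mid-sized n such as 3 could balance sensitivity and cost, though the paper's own reasoning is not given in this record.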


Files in this item

Files Size Format View Description
A Copy detectio ... ntsusing N-grams Model.pdf 493.5Kb PDF View/Open pdf
