Domain Ontology Based Similarity and Analysis in Higher Education
Keywords: Text Reuse, Semantics, Domain Ontology, N-Gram
Text reuse is the process of creating documents from existing ones. It is common and increasing; examples include copying text from websites or reusing text in free blogs, submitting someone else’s work as one’s own, cutting and pasting from sources without documenting them, “borrowing” media without documentation, publishing on the web without the creators’ permission, and providing false documentation or data. Detecting text reuse has been studied in a range of tasks and applications. It is traditionally detected by computing the similarity between a source text and a possibly reused text. Text similarity methods have been proposed that calculate similarity from surface-level or semantic features, but detecting similarity at the concept level is difficult, and conceptual similarity is still problematic. Semantic text reuse, in which parts of a source text are reused with similar words or phrases, and its detection are receiving attention within the research community. An ontology is an explicit formal specification of the terms in a domain and the relations among them. Ontologies are commonly used for information extraction in the information retrieval domain, but they are used only to a limited extent for concept mapping and extraction. To deal with cases of text reuse in which the documents have been paraphrased, DOBS (Domain Ontology Based Similarity) is proposed for detecting conceptual or topical similarity between two documents.
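The contrast between surface-level and concept-level similarity can be illustrated with a minimal Python sketch. It is not the DOBS method itself, whose details are not given here. The Jaccard measure, the word bigrams, the function names, and the tiny higher-education ontology (`TOY_ONTOLOGY`) are all assumptions made for illustration only.

```python
import re

def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(words, n=2):
    """Return the set of word n-grams (as tuples) in a token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def surface_similarity(source, candidate, n=2):
    """Surface-level similarity: overlap of word n-grams."""
    return jaccard(ngrams(tokens(source), n), ngrams(tokens(candidate), n))

# Hypothetical toy ontology (an assumption, not from the paper):
# each domain term is mapped to the concept it denotes.
TOY_ONTOLOGY = {
    "lecturer": "Teacher", "professor": "Teacher", "instructor": "Teacher",
    "student": "Learner", "pupil": "Learner", "learner": "Learner",
    "exam": "Assessment", "test": "Assessment", "quiz": "Assessment",
    "course": "Course", "module": "Course",
}

def concepts(text, ontology=TOY_ONTOLOGY):
    """Map the terms found in the text to their ontology concepts."""
    return {ontology[w] for w in tokens(text) if w in ontology}

def concept_similarity(source, candidate, ontology=TOY_ONTOLOGY):
    """Concept-level similarity: overlap of the concepts each text mentions."""
    return jaccard(concepts(source, ontology), concepts(candidate, ontology))

source = "The professor gave each student an exam at the end of the course."
paraphrase = "When the module finished, every pupil sat a test set by the lecturer."

print("bigram overlap: ", surface_similarity(source, paraphrase))
print("concept overlap:", concept_similarity(source, paraphrase))
```

In this example the paraphrase shares no word bigram with the source, so the surface measure finds no overlap, while both sentences map to the same four concepts in the toy ontology. A realistic system would need a full domain ontology, term normalization such as stemming, and some use of the relations between concepts, none of which this sketch attempts.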