PRETO: A high-performance text mining tool for preprocessing Turkish texts
dc.contributor.author | Tunali, V. | |
dc.contributor.author | Bilgin, T.T. | |
dc.date.accessioned | 2024-07-12T21:40:38Z | |
dc.date.available | 2024-07-12T21:40:38Z | |
dc.date.issued | 2012 | en_US |
dc.department | [To be determined] | en_US |
dc.description | 13th International Conference on Computer Systems and Technologies, CompSysTech 2012 -- 22 June 2012 through 23 June 2012 -- Ruse -- 93756 | en_US |
dc.description.abstract | Text documents are usually unstructured and written in natural language. To apply conventional data mining techniques to text documents, a preprocessing operation is indispensable. In this paper, we introduce PRETO, a cross-platform, powerful, and scalable tool developed specifically for preprocessing Turkish texts, with a wide range of preprocessing options such as stemming, stopword filtering, statistical term filtering, and n-gram generation. We demonstrate the performance and scalability of PRETO with experiments on large document collections. Copyright ©2012 ACM. | en_US |
dc.identifier.doi | 10.1145/2383276.2383297 | |
dc.identifier.endpage | 140 | en_US |
dc.identifier.isbn | 9.78145E+12 | |
dc.identifier.scopus | 2-s2.0-84869002711 | en_US |
dc.identifier.scopusquality | N/A | en_US |
dc.identifier.startpage | 134 | en_US |
dc.identifier.uri | https://doi.org/10.1145/2383276.2383297 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12415/7411 | |
dc.indekslendigikaynak | Scopus | |
dc.language.iso | en | en_US |
dc.relation.ispartof | ACM International Conference Proceeding Series | en_US |
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.snmz | KY08754 | |
dc.subject | Data Mining | en_US |
dc.subject | Natural Language Processing | en_US |
dc.subject | Text Mining | en_US |
dc.subject | Text Preprocessing | en_US |
dc.title | PRETO: A high-performance text mining tool for preprocessing Turkish texts | en_US |
dc.type | Conference Object | |
dspace.entity.type | Publication |