The Basic Techniques in Text Mining using Open-source Tools

This presentation is part of the WikiSym + OpenSym 2013 program.

Jun Iio

There are many text mining tools provided commercially and non-commercially. However, the elementary text-based analysis can be done with basic Unix commands, shell-scripts, and small program of scripting languages, instead of using such extensive software. This paper introduces the basic techniques for text mining, using combination of a set of standard commands, small code, and generic tools provided as the open-source software. The target of the analysis are sixty-seven articles written by one author in a relay column since 1998. Several text-based analyses reveals a trend of interest moved within about fifteen years. In addition, at the end of this paper, the results of text-based analysis are compared with that of non-text-based analysis and the efficiency of non-parametric analysis is discussed.

A PDF file will be made available on August 5, 2013, through the WikiSym + OpenSym 2013 conference proceedings.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.