% tm: /trunk/tm/man/TermDocMatrix.Rd
% Revision 726, Sun Apr 8 00:31:22 2007 UTC, by feinerer
% Added a lot of examples to the manuals.
\title{Term-document matrix}
\description{
  Constructs a term-document matrix.
}
\usage{
\S4method{TermDocMatrix}{TextDocCol}(object, weighting = "tf",
  stemming = FALSE, language = "english", minWordLength = 3,
  minDocFreq = 1, stopwords = NULL)
}
\arguments{
  \item{object}{a text document collection.}
  \item{weighting}{the weighting mode for the term-document
    matrix. Possible settings are
    \itemize{
      \item \code{tf} Term frequency
      \item \code{tf-idf} Term frequency-inverse document frequency
      \item \code{bin} Binary frequency
      \item \code{logical} Similar to binary frequency but with Boolean values
    }
  }
  \item{stemming}{if \code{TRUE}, words are stemmed before the
    term-document matrix is built.}
  \item{language}{the language, which determines the stemming rules.}
  \item{minWordLength}{words shorter than this number of characters are
    discarded for the term-document matrix.}
  \item{minDocFreq}{words that appear less often in documents than this
    number are discarded for the term-document matrix.}
  \item{stopwords}{either a plain text file with all stopwords or a
    Boolean value. In the latter case the default stopwords for the
    documents' language are used.}
}
\value{
  An S4 object of class \code{TermDocMatrix}, which extends the class
  \code{matrix} and contains a term-document matrix. The following slot
  contains useful information:

  \describe{
    \item{Weighting}{The weighting mode applied to the term-document matrix.}
  }
}
\examples{
TermDocMatrix(crude, weighting = "tf-idf", stopwords = TRUE)
}
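% The four weighting modes can be illustrated with a small base R sketch
% that does not call the package. The toy counts, the layout (documents
% as rows, terms as columns) and the textbook formula tf * log2(N / df)
% for tf-idf are all assumptions made for illustration; the weighting
% actually computed by TermDocMatrix may differ in detail.

```r
# Toy counts: rows are documents, columns are terms (layout is an assumption).
counts <- matrix(c(2, 0, 1,
                   0, 3, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"),
                                 c("oil", "price", "market")))

# "tf": raw term frequencies, unchanged.
tf <- counts

# "bin": 1 where a term occurs in a document, 0 otherwise.
bin <- (counts > 0) * 1

# "logical": the same information as TRUE/FALSE values.
logical_m <- counts > 0

# "tf-idf": textbook variant tf * log2(N / df); an assumption, not
# necessarily the formula used by the package.
n_docs   <- nrow(counts)
doc_freq <- colSums(counts > 0)   # number of documents containing each term
idf      <- log2(n_docs / doc_freq)
tf_idf   <- sweep(counts, 2, idf, `*`)
```

% In this sketch a term that occurs in every document (here "market")
% receives a tf-idf weight of zero, which is why tf-idf is often
% preferred over plain tf when common terms should be down-weighted.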
\author{Ingo Feinerer}