[tm] Log of /pkg/R/matrix.R

Revision 1404
Modified Tue Feb 17 18:04:22 2015 UTC (4 years, 4 months ago) by feinerer
File length: 13487 byte(s)
Diff to previous 1384
Avoid (rather expensive) structure()

Revision 1384
Modified Sun Jun 1 07:59:56 2014 UTC (5 years ago) by feinerer
File length: 13513 byte(s)
Diff to previous 1381
Improve handling of empty matrices

Revision 1381
Modified Tue May 27 18:56:32 2014 UTC (5 years ago) by khornik
File length: 13492 byte(s)
Diff to previous 1329
Re-add undocumented and unexported CategorizedDocumentTermMatrix().

Revision 1329
Modified Tue Apr 15 17:16:03 2014 UTC (5 years, 2 months ago) by feinerer
File length: 13100 byte(s)
Diff to previous 1321
Synchronize print() appearance with NLP

Revision 1321
Modified Mon Apr 7 08:21:12 2014 UTC (5 years, 2 months ago) by feinerer
File length: 13045 byte(s)
Diff to previous 1320
Remove undocumented and unexported CategorizedDocumentTermMatrix

Revision 1320
Modified Sun Apr 6 07:05:45 2014 UTC (5 years, 2 months ago) by feinerer
File length: 13437 byte(s)
Diff to previous 1313
Use words() as default tokenizer in termFreq()

Revision 1313
Modified Sun Mar 30 09:28:00 2014 UTC (5 years, 2 months ago) by feinerer
File length: 13348 byte(s)
Diff to previous 1312
content() and as.list() now give the full documents

Revision 1312
Modified Sat Mar 29 09:35:44 2014 UTC (5 years, 2 months ago) by feinerer
File length: 13609 byte(s)
Diff to previous 1310
Simplify corpus metadata and PCorpus metadata storage

Revision 1310
Modified Wed Mar 26 19:23:13 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13486 byte(s)
Diff to previous 1307
Remove text repository, various improvements and bug fixes

Revision 1307
Modified Tue Mar 25 12:15:51 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13496 byte(s)
Diff to previous 1306
Redesign corpora

Revision 1306
Modified Tue Mar 25 08:37:05 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13501 byte(s)
Diff to previous 1304
Improve writeCorpus, use lower case in internal data structures

Revision 1304
Modified Mon Mar 24 14:02:54 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13509 byte(s)
Diff to previous 1300
Bug fix and simplification

Revision 1300
Modified Fri Mar 21 14:30:05 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13569 byte(s)
Diff to previous 1299
Redesign text documents

This is a major change and causes fallout. Soon to be fixed ...

Revision 1299
Modified Fri Mar 21 09:45:14 2014 UTC (5 years, 3 months ago) by feinerer
File length: 13476 byte(s)
Diff to previous 1278
Use setNames() instead of structure(..., names)

Revision 1278
Modified Sun Jan 5 19:07:33 2014 UTC (5 years, 5 months ago) by feinerer
File length: 13457 byte(s)
Diff to previous 1276
Remove trailing spaces

Revision 1276
Modified Sun Jan 5 15:27:12 2014 UTC (5 years, 5 months ago) by feinerer
File length: 13481 byte(s)
Diff to previous 1275
Allow multiple and non-existing terms in findAssocs() (suggested by Christian Buchta)

Revision 1275
Modified Sun Jan 5 12:28:16 2014 UTC (5 years, 5 months ago) by feinerer
File length: 13346 byte(s)
Diff to previous 1274
Some more sanity checks

Revision 1274
Modified Sun Jan 5 10:51:18 2014 UTC (5 years, 5 months ago) by feinerer
File length: 13295 byte(s)
Diff to previous 1273
More sanity checks

Revision 1273
Modified Sun Jan 5 08:42:02 2014 UTC (5 years, 5 months ago) by feinerer
File length: 13275 byte(s)
Diff to previous 1272
Some sanity checks

Revision 1272
Modified Sat Dec 28 09:59:02 2013 UTC (5 years, 5 months ago) by feinerer
File length: 13064 byte(s)
Diff to previous 1270
Make corlimit inclusive; clarify return value

Revision 1270
Modified Tue Dec 24 08:37:26 2013 UTC (5 years, 6 months ago) by feinerer
File length: 13063 byte(s)
Diff to previous 1269
Check for duplicated terms

Revision 1269
Modified Tue Dec 24 08:13:34 2013 UTC (5 years, 6 months ago) by feinerer
File length: 13029 byte(s)
Diff to previous 1268
Check corlimit (avoids costly cor() if corlimit is missing)

Revision 1268
Modified Wed Dec 18 16:37:48 2013 UTC (5 years, 6 months ago) by feinerer
File length: 12999 byte(s)
Diff to previous 1266
Show label for single result item, do not export findAssocs.matrix()

Revision 1266
Modified Sun Dec 15 09:14:16 2013 UTC (5 years, 6 months ago) by feinerer
File length: 12954 byte(s)
Diff to previous 1254
Allow multiple terms for findAssocs(), make it more efficient on sparse matrices

Revision 1254
Modified Sat Sep 7 08:45:50 2013 UTC (5 years, 9 months ago) by feinerer
File length: 12809 byte(s)
Diff to previous 1246
Avoid tm::

Revision 1246
Modified Tue Aug 20 15:40:28 2013 UTC (5 years, 10 months ago) by khornik
File length: 12826 byte(s)
Diff to previous 1227
Simplify method dispatch.

Revision 1227
Modified Sun Jun 16 08:37:10 2013 UTC (6 years ago) by feinerer
File length: 12875 byte(s)
Diff to previous 1223
Use package parallel instead of Rmpi and snow

Revision 1223
Modified Sat Jun 15 10:59:56 2013 UTC (6 years ago) by feinerer
File length: 12973 byte(s)
Diff to previous 1220
Handle (but warn about) invalid/empty document IDs in term-document matrix construction

Revision 1220
Modified Tue Jun 11 08:37:43 2013 UTC (6 years ago) by feinerer
File length: 12844 byte(s)
Diff to previous 1210
Use SnowballC instead of Snowball and RWeka

Revision 1210
Modified Tue Jan 22 18:40:48 2013 UTC (6 years, 5 months ago) by khornik
File length: 12868 byte(s)
Diff to previous 1207
Make nDocs()/nTerms() and Docs()/Terms() generic with methods for TDMs
and DTMs, and ensure that Docs()/Terms() returns a character vector with
length the number of documents and terms, respectively.

Revision 1207
Modified Sat Jan 12 15:27:20 2013 UTC (6 years, 5 months ago) by khornik
File length: 12550 byte(s)
Diff to previous 1206
Add as.DocumentTermMatrix() method for textcnt objects.

Revision 1206
Modified Fri Jan 11 20:15:37 2013 UTC (6 years, 5 months ago) by khornik
File length: 12424 byte(s)
Diff to previous 1203
Add TermDocumentMatrix() method for textcnt objects.

Revision 1203
Modified Fri Jan 11 19:43:37 2013 UTC (6 years, 5 months ago) by khornik
File length: 12376 byte(s)
Diff to previous 1200
Improve formals for c() methods.

Revision 1200
Modified Fri Dec 14 15:07:38 2012 UTC (6 years, 6 months ago) by feinerer
File length: 12619 byte(s)
Diff to previous 1181
Ensure dimnames of type character when generating a simple_triplet_matrix

Revision 1181
Modified Thu Mar 8 11:22:56 2012 UTC (7 years, 3 months ago) by feinerer
File length: 12601 byte(s)
Diff to previous 1173
Performance improvement as suggested by Milan Bouchet-Valat

Revision 1173
Modified Mon Jan 16 15:05:22 2012 UTC (7 years, 5 months ago) by feinerer
File length: 12544 byte(s)
Diff to previous 1168
Process tolower and tokenize options first in termFreq()

Revision 1168
Modified Wed Jan 11 10:35:44 2012 UTC (7 years, 5 months ago) by feinerer
File length: 12461 byte(s)
Diff to previous 1167
Fix processing of user provided stopwords

Revision 1167
Modified Fri Dec 23 09:44:33 2011 UTC (7 years, 6 months ago) by feinerer
File length: 12430 byte(s)
Diff to previous 1162
Fix invalid handling of control[1] argument to termFreq()

Revision 1162
Modified Wed Dec 7 06:36:02 2011 UTC (7 years, 6 months ago) by feinerer
File length: 12364 byte(s)
Diff to previous 1160
Fix argument pass over

Revision 1160
Modified Tue Dec 6 15:15:01 2011 UTC (7 years, 6 months ago) by feinerer
File length: 12354 byte(s)
Diff to previous 1159
Remove debug output

Revision 1159
Modified Tue Dec 6 15:11:45 2011 UTC (7 years, 6 months ago) by feinerer
File length: 12379 byte(s)
Diff to previous 1153
Make termFreq() sensitive to the order of control options

Revision 1153
Modified Thu Nov 17 15:45:31 2011 UTC (7 years, 7 months ago) by feinerer
File length: 11957 byte(s)
Diff to previous 1151
Add SMART stopword list

Revision 1151
Modified Thu Nov 17 14:21:49 2011 UTC (7 years, 7 months ago) by feinerer
File length: 11948 byte(s)
Diff to previous 1150
Add generalized bounds checking

Revision 1150
Modified Tue Nov 15 15:37:17 2011 UTC (7 years, 7 months ago) by feinerer
File length: 11802 byte(s)
Diff to previous 1146
Document MC_tokenizer(), scan_tokenizer(), and getTokenizers()

Revision 1146
Modified Mon Oct 10 06:44:29 2011 UTC (7 years, 8 months ago) by khornik
File length: 11771 byte(s)
Diff to previous 1145
Add CategorizedDocumentTermMatrix().

Revision 1145
Modified Tue Aug 30 18:05:19 2011 UTC (7 years, 9 months ago) by feinerer
File length: 11379 byte(s)
Diff to previous 1135
Add class label for term frequencies and corresponding c() and as.TermDocumentMatrix() implementation

Revision 1135
Modified Fri Apr 15 06:18:54 2011 UTC (8 years, 2 months ago) by khornik
File length: 10622 byte(s)
Diff to previous 1133
Export and document Blei et al reader.

Revision 1133
Modified Thu Apr 14 16:26:25 2011 UTC (8 years, 2 months ago) by khornik
File length: 10210 byte(s)
Diff to previous 1131
Cannot set names on NULL dimnames.

Revision 1131
Modified Thu Apr 14 09:52:31 2011 UTC (8 years, 2 months ago) by feinerer
File length: 10220 byte(s)
Diff to previous 1129
Assume that all arguments have the same weighting when c()ing term-document matrices

Revision 1129
Modified Wed Apr 13 08:29:00 2011 UTC (8 years, 2 months ago) by feinerer
File length: 10063 byte(s)
Diff to previous 1128
Use as.TermDocumentMatrix()

Revision 1128
Modified Fri Apr 8 17:36:10 2011 UTC (8 years, 2 months ago) by khornik
File length: 10411 byte(s)
Diff to previous 1116
Add functionality for obtaining DTMs and TDMs from t/f matrices coercible
to simple triplet matrices.

Revision 1116
Modified Thu Feb 3 09:11:03 2011 UTC (8 years, 4 months ago) by feinerer
File length: 7545 byte(s)
Diff to previous 1108
Stem after stopword removal.

Revision 1108
Modified Fri Oct 22 18:32:47 2010 UTC (8 years, 8 months ago) by feinerer
File length: 7545 byte(s)
Diff to previous 1106
Change Weighting from list element to attribute, access documents by name

Revision 1106
Modified Mon Oct 18 09:25:40 2010 UTC (8 years, 8 months ago) by khornik
File length: 7496 byte(s)
Diff to previous 1103
Simplify using MC tokenizer.

Revision 1103
Modified Sat Oct 16 10:01:45 2010 UTC (8 years, 8 months ago) by feinerer
File length: 7425 byte(s)
Diff to previous 1100
Fix invalid variable name

Revision 1100
Modified Thu Oct 14 10:35:14 2010 UTC (8 years, 8 months ago) by feinerer
File length: 7423 byte(s)
Diff to previous 1097
Check bounds globally instead of per document (report and bug fix by Thomas Zapf-Schramm)

Revision 1097
Modified Tue Aug 31 05:45:37 2010 UTC (8 years, 9 months ago) by feinerer
File length: 7411 byte(s)
Diff to previous 1093
Fix

Revision 1093
Modified Mon Aug 23 18:36:33 2010 UTC (8 years, 10 months ago) by feinerer
File length: 7426 byte(s)
Diff to previous 1080
Allow removePunctuation parameter for termFreq() to be a function or a list

Revision 1080
Modified Thu Jun 17 13:47:05 2010 UTC (9 years ago) by feinerer
File length: 7177 byte(s)
Diff to previous 1073
Use all words from a dictionary when tabulating against it in a term-document matrix

Revision 1073
Modified Fri May 28 12:32:46 2010 UTC (9 years, 1 month ago) by feinerer
File length: 7120 byte(s)
Diff to previous 1068
Use IETF language tags for language codes

Revision 1068
Modified Wed May 5 10:09:47 2010 UTC (9 years, 1 month ago) by feinerer
File length: 7120 byte(s)
Diff to previous 1067
Improve stem completion.

Revision 1067
Modified Sun Apr 11 06:38:47 2010 UTC (9 years, 2 months ago) by feinerer
File length: 7274 byte(s)
Diff to previous 1058
Use match() instead of %in%

Revision 1058
Modified Mon Mar 15 17:15:04 2010 UTC (9 years, 3 months ago) by feinerer
File length: 7256 byte(s)
Diff to previous 1054
Names of dimnames are preserved in newer slam versions automatically.

Revision 1054
Modified Fri Mar 12 15:56:30 2010 UTC (9 years, 3 months ago) by feinerer
File length: 7299 byte(s)
Diff to previous 1039
Restore names of dimnames after subsetting.

Revision 1039
Modified Fri Jan 22 13:01:33 2010 UTC (9 years, 5 months ago) by feinerer
File length: 7256 byte(s)
Diff to previous 1038
Add stemDocument.character().

Revision 1038
Modified Fri Jan 15 12:12:41 2010 UTC (9 years, 5 months ago) by feinerer
File length: 7283 byte(s)
Diff to previous 1036
Extract more meta data from Reuters Corpus Volume 1 data set.

Revision 1036
Modified Thu Jan 14 09:09:43 2010 UTC (9 years, 5 months ago) by stefan7th
File length: 7281 byte(s)
Diff to previous 1033
pf improvement of c.TermDoc

Revision 1033
Modified Sat Jan 9 09:33:54 2010 UTC (9 years, 5 months ago) by feinerer
File length: 7245 byte(s)
Diff to previous 1027
Clean up and prepare for CRAN release.

Revision 1027
Modified Fri Dec 11 12:44:59 2009 UTC (9 years, 6 months ago) by stefan7th
File length: 9817 byte(s)
Diff to previous 1026
improved c() method for TDMs (thanks to Kurt)

Revision 1026
Modified Fri Dec 11 10:31:42 2009 UTC (9 years, 6 months ago) by feinerer
File length: 8904 byte(s)
Diff to previous 1025
Fix c.TermDocumentMatrix().

Revision 1025
Modified Fri Dec 11 08:56:22 2009 UTC (9 years, 6 months ago) by feinerer
File length: 8017 byte(s)
Diff to previous 1024
Register S3 document classes to be recognized by S4 methods.

Revision 1024
Modified Wed Dec 2 11:06:52 2009 UTC (9 years, 6 months ago) by feinerer
File length: 8014 byte(s)
Diff to previous 1023
Improved c() method for term-document matrices.

Revision 1023
Modified Wed Nov 25 06:08:20 2009 UTC (9 years, 7 months ago) by feinerer
File length: 7044 byte(s)
Diff to previous 1022
Add option to termFreq() to remove punctuation characters.

Revision 1022
Modified Thu Nov 19 21:33:19 2009 UTC (9 years, 7 months ago) by feinerer
File length: 6929 byte(s)
Diff to previous 1019
Added a combine method for merging multiple term-document matrices.

Revision 1019
Modified Mon Nov 16 08:20:55 2009 UTC (9 years, 7 months ago) by feinerer
File length: 6243 byte(s)
Diff to previous 1018
Use whitespace oriented tokenizer instead of AlphabeticTokenizer (from RWeka) as default.

Revision 1018
Modified Sun Nov 15 15:53:49 2009 UTC (9 years, 7 months ago) by feinerer
File length: 6101 byte(s)
Diff to previous 1015
Fix bug in removeWords(). Refactoring of term-document matrix constructor. Clean up of defunct functions.

Revision 1015
Modified Sat Nov 7 11:15:19 2009 UTC (9 years, 7 months ago) by feinerer
File length: 5980 byte(s)
Diff to previous 987
Avoid prefixes from named documents when building a term-document matrix.

Revision 987
Modified Wed Sep 2 17:54:45 2009 UTC (9 years, 9 months ago) by feinerer
File length: 5958 byte(s)
Diff to previous 985
Update documentation.

Revision 985
Modified Thu Aug 27 18:09:05 2009 UTC (9 years, 10 months ago) by feinerer
File length: 6103 byte(s)
Diff to previous 963
Use S3 instead of S4 class system.

Revision 963
Modified Mon Jun 29 07:01:19 2009 UTC (9 years, 11 months ago) by feinerer
File length: 6947 byte(s)
Diff to previous 962
Rename SCorpus to VCorpus (Volatile Corpus).

Revision 962
Modified Sun Jun 28 15:52:33 2009 UTC (9 years, 11 months ago) by feinerer
File length: 6947 byte(s)
Diff to previous 954
Fix documentation.

Revision 954
Modified Wed May 27 18:33:32 2009 UTC (10 years, 1 month ago) by feinerer
File length: 6932 byte(s)
Diff to previous 950
Handle empty matrices gracefully.

Revision 950
Modified Thu May 14 15:17:18 2009 UTC (10 years, 1 month ago) by feinerer
File length: 6863 byte(s)
Diff to previous 946
Experimental FCorpus (fast corpus).

Revision 946
Modified Wed May 13 18:07:35 2009 UTC (10 years, 1 month ago) by feinerer
File length: 6833 byte(s)
Diff to previous 945
A lot of major improvements (see NEWS).

Revision 945
Modified Mon May 4 10:57:01 2009 UTC (10 years, 1 month ago) by feinerer
File length: 5722 byte(s)
Diff to previous 941
Export some simple_triplet_matrix functions.

Revision 941
Modified Mon Apr 27 15:36:43 2009 UTC (10 years, 2 months ago) by feinerer
File length: 6217 byte(s)
Copied from: pkg/R/termdocmatrix.R revision 940
Diff to previous 938
Create two distinct classes for term-document and document-term matrices.

Revision 938
Modified Sat Apr 25 19:05:50 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 5566 byte(s)
Diff to previous 937
Get rid of Matrix package dependency.

Revision 937
Modified Thu Apr 16 21:09:49 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 8551 byte(s)
Diff to previous 933
Documentation update. Remove some require() calls.

Revision 933
Modified Sun Apr 12 08:26:53 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 8587 byte(s)
Diff to previous 929
Refactoring and clean up.

Revision 929
Modified Thu Apr 9 06:22:21 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 9056 byte(s)
Diff to previous 928
Always use Snowball for stemming.

Revision 928
Modified Sat Apr 4 18:27:35 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 9362 byte(s)
Diff to previous 927
Update documentation.

Revision 927
Modified Sat Apr 4 12:16:12 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 9369 byte(s)
Diff to previous 926
Minor fixes and documentation updates.

Revision 926
Modified Sat Apr 4 06:50:02 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 9350 byte(s)
Diff to previous 925
tmReduce() allows combining multiple maps into one transformation.

Revision 925
Modified Fri Apr 3 17:39:44 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 9353 byte(s)
Diff to previous 924
Removed TermDocMatrix. Use DocumentTermMatrix or TermDocumentMatrix instead.

Revision 924
Modified Fri Apr 3 15:41:48 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 11173 byte(s)
Diff to previous 923
Improve weighting functions.

Revision 923
Modified Fri Apr 3 08:07:20 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 11090 byte(s)
Diff to previous 922
Further work on new TermDocumentMatrix.

Revision 922
Modified Tue Mar 31 16:41:02 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 8492 byte(s)
Diff to previous 918
Fix invalid slot access in subset method for TermDocumentMatrix.

Revision 918
Modified Fri Mar 27 13:45:36 2009 UTC (10 years, 3 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 8292 byte(s)
Diff to previous 893
Start to work on new TermDocumentMatrix and DocumentTermMatrix representations.

Revision 893
Modified Mon Mar 2 21:58:13 2009 UTC (10 years, 3 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 6488 byte(s)
Diff to previous 886
Suppress pointless loading message in vignette.

Revision 886
Modified Thu Jan 29 22:47:34 2009 UTC (10 years, 4 months ago) by feinerer
Original Path: pkg/R/termdocmatrix.R
File length: 6467 byte(s)
Diff to previous 885
Speed up package loading (Depends -> Suggests).

Revision 885
Modified Thu Jan 29 09:34:44 2009 UTC (10 years, 4 months ago) by stefan7th
Original Path: pkg/R/termdocmatrix.R
File length: 6386 byte(s)
Diff to previous 884
moved package to /pkg

Revision 884
Modified Wed Jan 28 10:24:27 2009 UTC (10 years, 4 months ago) by stefan7th
Original Path: pkg/tm/R/termdocmatrix.R
File length: 6386 byte(s)
Diff to previous 877
R-Forge transition completed

Revision 877
Modified Tue Dec 16 11:31:47 2008 UTC (10 years, 6 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6386 byte(s)
Diff to previous 870
Sort row indices when generating a term-document matrix (fixes a problem with the Matrix package).

Revision 870
Modified Mon Nov 10 15:29:22 2008 UTC (10 years, 7 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6327 byte(s)
Diff to previous 869
Fix documentation and codoc mismatches.

Revision 869
Modified Sat Nov 8 09:16:37 2008 UTC (10 years, 7 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6321 byte(s)
Diff to previous 865
Sources now have a Length slot. Knowing the length in advance makes corpus construction a lot faster (~ 8 times faster).

Revision 865
Modified Sun Aug 3 13:20:22 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6303 byte(s)
Diff to previous 861
Introduce name abbreviations for weighting functions.

Revision 861
Modified Thu Jul 24 09:55:09 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6284 byte(s)
Diff to previous 857
tmIndex(), tmFilter(), tmMap(), and TermDocMatrix() now use a MPI cluster if available.

Revision 857
Modified Tue Jul 8 16:01:47 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6145 byte(s)
Diff to previous 846
Removed tm-internal. Better (consistent) naming for dictionary functions.

Revision 846
Modified Sat Apr 26 08:11:14 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6399 byte(s)
Diff to previous 844
Added as.matrix() method for TermDocMatrix.

Revision 844
Modified Fri Apr 25 12:59:45 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6263 byte(s)
Diff to previous 843
Improved findFreqTerms.

Revision 843
Modified Fri Apr 25 12:31:51 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6306 byte(s)
Diff to previous 823
Added Dublin Core documentation.

Revision 823
Modified Wed Feb 6 13:47:59 2008 UTC (11 years, 4 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6286 byte(s)
Diff to previous 818
Added removeNumbers transformation.

Revision 818
Modified Wed Jan 30 12:58:12 2008 UTC (11 years, 4 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6145 byte(s)
Diff to previous 817
Ensure that a sparse matrix is always built (there was a minor bug when using the dictionary argument).

Revision 817
Modified Wed Jan 30 11:25:20 2008 UTC (11 years, 4 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6161 byte(s)
Diff to previous 816
Ensure that dimnames are always set correctly when generating a TermDocMatrix.

Revision 816
Modified Thu Jan 24 14:36:41 2008 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6168 byte(s)
Diff to previous 806
Renamed TextDocCol to Corpus, and Corpus to Content.

Revision 806
Modified Wed Jan 2 10:29:14 2008 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6170 byte(s)
Diff to previous 796
Modular TermDocMatrix constructor is now default.

Revision 796
Modified Tue Nov 6 15:22:34 2007 UTC (11 years, 7 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 8487 byte(s)
Diff to previous 790
Correct processing of empty documents.

Revision 790
Modified Sun Oct 21 08:27:13 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 8361 byte(s)
Diff to previous 788
Exported termFreq to NAMESPACE. New modular constructor for TermDocMatrix (called TermDocMatrix2 at the moment).

Revision 788
Modified Sun Oct 14 12:16:26 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 7178 byte(s)
Diff to previous 787
Weighting functions for TermDocMatrix.

Revision 787
Modified Sun Oct 14 10:08:36 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 7296 byte(s)
Diff to previous 785
Renamed textvector to termFreq. Modified return value of termFreq (now integer vector of frequencies with names of terms).

Revision 785
Modified Sat Oct 13 10:46:28 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 7539 byte(s)
Diff to previous 782
Added plot function for term-document matrices.

Revision 782
Modified Sun Sep 30 11:35:10 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 7039 byte(s)
Diff to previous 781
Use new textvector function supporting modules.

Revision 781
Modified Sun Sep 30 09:13:51 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 8264 byte(s)
Diff to previous 774
Prototype for new modularized term-document matrix constructor.

Revision 774
Modified Sat Jul 21 16:25:54 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 6147 byte(s)
Diff to previous 770
Added convenience methods for term-document matrices.

Revision 770
Modified Tue Jul 17 12:41:04 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 5487 byte(s)
Diff to previous 765
Improved TermDocMatrix's efficiency. Kudos to Christian Buchta.

Revision 765
Modified Fri Jul 13 15:53:45 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 5310 byte(s)
Diff to previous 760
See ChangeLog.

Revision 760 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jun 21 22:40:15 2007 UTC (12 years ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 5186 byte(s)
Diff to previous 758
require() uses the quietly option to suppress loading messages.

Revision 758 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jun 13 02:25:36 2007 UTC (12 years ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 5170 byte(s)
Diff to previous 752
Added dictionary support.

Revision 752 - (view) (download) (annotate) - [select for diffs]
Modified Sat May 19 22:39:04 2007 UTC (12 years, 1 month ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 4645 byte(s)
Diff to previous 748
Small bug fix in textvector(). Added new function removeSparseTerms().

Revision 748 - (view) (download) (annotate) - [select for diffs]
Modified Fri May 4 18:52:42 2007 UTC (12 years, 1 month ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 4051 byte(s)
Diff to previous 746
findFreqTerms now operates (very) efficiently on (big) sparse matrices. Thanks to Martin Maechler.

Revision 746 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 23 14:05:17 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 4068 byte(s)
Diff to previous 745
Small bugfix in textvector function. Recreated vignette PDF.

Revision 745 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 23 00:57:26 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 4042 byte(s)
Diff to previous 744
Fixed dimnames in sparse matrix. Updated date in DESCRIPTION.

Revision 744 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 23 00:35:10 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3910 byte(s)
Diff to previous 721
TermDocMatrix is now built by direct stepwise insertion, i.e., we save a lot of memory on construction.

Revision 721 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 21 13:54:43 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3678 byte(s)
Diff to previous 719
Simplified sFilter.

Revision 719 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 18 09:24:47 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3678 byte(s)
Diff to previous 718
Improved database support.

Revision 718 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 16 12:55:16 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3594 byte(s)
Diff to previous 717
We now use sparse matrices.

Revision 717 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 16 11:13:04 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3566 byte(s)
Diff to previous 716
Added Language slot to text documents. Refactored TextDocCol constructor.

Revision 716 - (view) (download) (annotate) - [select for diffs]
Modified Thu Mar 15 17:22:39 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3461 byte(s)
Diff to previous 713
Some improvements for TermDocMatrix.

Revision 713 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 14 13:44:11 2007 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3604 byte(s)
Diff to previous 702
Added Snowball support. Added function returning stopwords (English, German, French).

Revision 702 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 9 09:39:33 2007 UTC (12 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3502 byte(s)
Diff to previous 698
wordStem now explicitly uses Rstem namespace.

Revision 698 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 6 17:05:44 2007 UTC (12 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3495 byte(s)
Diff to previous 696
Changes due to Kurt's review.

Revision 696 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jan 5 15:04:53 2007 UTC (12 years, 5 months ago) by hornik
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3507 byte(s)
Diff to previous 694
Avoid non-standard eval (makes codetools happier).

Revision 694 - (view) (download) (annotate) - [select for diffs]
Modified Sun Dec 31 14:47:46 2006 UTC (12 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3505 byte(s)
Diff to previous 693
Implemented improvements based upon comments by David.

Revision 693 - (view) (download) (annotate) - [select for diffs]
Modified Fri Dec 22 13:21:30 2006 UTC (12 years, 6 months ago) by feinerer
Original Path: trunk/tm/R/termdocmatrix.R
File length: 3345 byte(s)
Diff to previous 78
Renamed textmin to tm directory since the package name changed.

Revision 78 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 29 14:56:36 2006 UTC (12 years, 6 months ago) by zeileis
Original Path: trunk/textmin/R/termdocmatrix.R
File length: 3345 byte(s)
Diff to previous 67
removed old repos structure, now only R packages

Revision 67 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 1 17:29:59 2006 UTC (12 years, 7 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3345 byte(s)
Diff to previous 66
See ChangeLog

Revision 66 - (view) (download) (annotate) - [select for diffs]
Modified Tue Oct 31 22:03:33 2006 UTC (12 years, 7 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3343 byte(s)
Diff to previous 63
See ChangeLog.

Revision 63 - (view) (download) (annotate) - [select for diffs]
Modified Thu Oct 26 14:59:09 2006 UTC (12 years, 8 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3351 byte(s)
Diff to previous 62
See ChangeLog.

Revision 62 - (view) (download) (annotate) - [select for diffs]
Modified Tue Oct 24 10:08:58 2006 UTC (12 years, 8 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3355 byte(s)
Diff to previous 53
See ChangeLog.

Revision 53 - (view) (download) (annotate) - [select for diffs]
Modified Thu Aug 24 13:06:50 2006 UTC (12 years, 10 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3271 byte(s)
Diff to previous 52
See ChangeLog for changes.

Revision 52 - (view) (download) (annotate) - [select for diffs]
Modified Sat Aug 12 12:43:39 2006 UTC (12 years, 10 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 3686 byte(s)
Diff to previous 51
Various updates. See ChangeLog and diff source code.

Revision 51 - (view) (download) (annotate) - [select for diffs]
Modified Mon Aug 7 12:14:09 2006 UTC (12 years, 10 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 2395 byte(s)
Diff to previous 50
Various changes due to new layout.

Revision 50 - (view) (download) (annotate) - [select for diffs]
Modified Mon Aug 7 08:55:57 2006 UTC (12 years, 10 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 2395 byte(s)
Diff to previous 48
Corrected TermDocMatrix and NAMESPACE.

Revision 48 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jul 13 13:47:31 2006 UTC (12 years, 11 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 2395 byte(s)
Diff to previous 47
Clean up of old stuff.

Revision 47 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jul 10 12:22:35 2006 UTC (12 years, 11 months ago) by feinerer
Original Path: trunk/R/textmin/R/termdocmatrix.R
File length: 2355 byte(s)
Diff to previous 46
Renamed tm to textmin directory.

Revision 46 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 5 18:08:41 2006 UTC (12 years, 11 months ago) by meyer
Original Path: trunk/R/tm/R/termdocmatrix.R
File length: 2355 byte(s)
Diff to previous 45
move

Revision 45 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 5 17:27:29 2006 UTC (12 years, 11 months ago) by meyer
Original Path: trunk/R/trunk/tm/R/termdocmatrix.R
File length: 2355 byte(s)
Diff to previous 42
move in subdir

Revision 42 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 1 08:42:26 2006 UTC (12 years, 11 months ago) by feinerer
Original Path: trunk/R/trunk/R/termdocmatrix.R
File length: 2355 byte(s)
Diff to previous 33
Changed S4 method signatures.

Revision 33 - (view) (download) (annotate) - [select for diffs]
Modified Thu Dec 15 13:29:17 2005 UTC (13 years, 6 months ago) by feinerer
Original Path: trunk/R/trunk/R/termdocmatrix.R
File length: 2321 byte(s)
Diff to previous 26
See ChangeLog

Revision 26 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 3 15:20:17 2005 UTC (13 years, 6 months ago) by feinerer
Original Path: trunk/R/trunk/R/termdocmatrix.R
File length: 2824 byte(s)
Diff to previous 25
See ChangeLog

Revision 25 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 30 18:53:50 2005 UTC (13 years, 6 months ago) by feinerer
Original Path: trunk/R/trunk/R/termdocmatrix.R
File length: 2822 byte(s)
Copied from: trunk/R/trunk/R/textmatrix.R revision 24
Diff to previous 22
See ChangeLog

Revision 22 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 19 16:58:34 2005 UTC (13 years, 7 months ago) by feinerer
Original Path: trunk/R/trunk/R/textmatrix.R
File length: 2822 byte(s)
Diff to previous 21
See ChangeLog

Revision 21 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 19 10:23:19 2005 UTC (13 years, 7 months ago) by feinerer
Original Path: trunk/R/trunk/R/textmatrix.R
File length: 2652 byte(s)
Diff to previous 20
See ChangeLog

Revision 20 - (view) (download) (annotate) - [select for diffs]
Modified Tue Nov 8 16:40:52 2005 UTC (13 years, 7 months ago) by feinerer
Original Path: trunk/R/trunk/R/textmatrix.R
File length: 1843 byte(s)
Diff to previous 16
See ChangeLog

Revision 16 - (view) (download) (annotate) - [select for diffs]
Added Fri Oct 7 09:42:57 2005 UTC (13 years, 8 months ago) by feinerer
Original Path: trunk/R/trunk/R/textmatrix.R
File length: 1351 byte(s)
Textmatrix code runs. Simple k-means text clustering (similarity based upon word frequencies) works.
