[tm] Log of /pkg/DESCRIPTION

Revision 1445
Modified Sun Oct 9 09:30:58 2016 UTC by feinerer
File length: 950 byte(s)
Speed up termFreq(), general cleanup

- Avoid parallel::mclapply()
- Use custom .table()
- Use rep.int(), rep_len() and lengths()
- Fix typos
- Shorten overlong lines
- Consistent formatting
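
For readers unfamiliar with the base R helpers named in this entry, the following minimal sketch shows what each one does. The input data and variable names are made up for illustration and are not taken from the termFreq() source.

```r
# Hypothetical tokenized documents (illustrative data only).
docs <- list(d1 = c("text", "mining"), d2 = character(0), d3 = "corpus")

# lengths(): element-wise lengths of a list, replacing sapply(docs, length).
n <- lengths(docs)

# rep.int(): repeat each document name once per token it contains.
doc_ids <- rep.int(names(docs), times = n)

# rep_len(): recycle a vector to an exact target length.
ones <- rep_len(1L, length(doc_ids))
```

Note that lengths() requires R 3.2.0 or later.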

Revision 1443
Modified Mon Aug 22 11:26:41 2016 UTC by feinerer
File length: 960 byte(s)
Process all arguments in tm_map.SimpleCorpus()

Revision 1442
Modified Sat Aug 6 17:19:22 2016 UTC by feinerer
File length: 960 byte(s)
Recheck local bounds after stemming in TermDocumentMatrix.SimpleCorpus()

Revision 1441
Modified Sat Aug 6 16:46:33 2016 UTC by feinerer
File length: 960 byte(s)
Simplify termFreq()

- Return table instead of named integer vector (avoids internal conversion)
- Always skip terms with a zero count

Revision 1440
Modified Sat Jul 30 06:34:57 2016 UTC by feinerer
File length: 960 byte(s)
Corpus() now chooses between SimpleCorpus and VCorpus based on its arguments

Revision 1439
Modified Sun Jul 17 07:02:40 2016 UTC by feinerer
File length: 964 byte(s)
Polish TermDocumentMatrix.SimpleCorpus()

Revision 1438
Modified Sat Jul 16 18:32:59 2016 UTC by feinerer
File length: 964 byte(s)
Use Rcpp for efficient term-document matrix construction from a SimpleCorpus

Revision 1437
Modified Wed Jul 13 19:23:49 2016 UTC by feinerer
File length: 938 byte(s)
Add SimpleCorpus

SimpleCorpus provides a corpus which is optimized for the most common usage
scenario: importing plain texts from files in a directory or directly from a
vector in R, preprocessing and transforming the texts, and finally exporting
them to a term-document matrix. The aim is to boost performance and minimize
memory pressure. It loads all documents into memory, and is designed for
medium-sized to large data sets.
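
The usage scenario described above can be sketched roughly as follows. This is a minimal illustration that assumes a recent tm release (0.7 or later) is installed; the sample texts are invented, and the two preprocessing steps are arbitrary choices rather than anything prescribed by this revision.

```r
library(tm)

# Hypothetical input: three short plain texts held in a character vector.
texts <- c("Text mining in R.",
           "A corpus holds documents.",
           "Mining a corpus of documents!")

# Import: build a SimpleCorpus directly from the vector.
corpus <- SimpleCorpus(VectorSource(texts))

# Preprocess and transform the texts.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)

# Export to a term-document matrix (terms in rows, documents in columns).
tdm <- TermDocumentMatrix(corpus)
```

To read from files in a directory instead, DirSource() can take the place of VectorSource().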

Revision 1435
Modified Wed Nov 18 09:53:21 2015 UTC by feinerer
File length: 938 byte(s)
Provide inspect.TextDocument() as shorthand for writeLines(as.character())

Revision 1433
Modified Thu Jul 2 10:52:03 2015 UTC by feinerer
File length: 936 byte(s)
Avoid simplification to ensure that the result is a named list

Revision 1432
Modified Wed Jul 1 19:17:31 2015 UTC by feinerer
File length: 936 byte(s)
Update NEWS

Revision 1431
Modified Wed Jul 1 10:32:42 2015 UTC by khornik
File length: 938 byte(s)
Improve namespace.

Revision 1430
Modified Tue Jun 9 12:34:44 2015 UTC by feinerer
File length: 922 byte(s)
Highlight the character representation of documents

Revision 1426
Modified Wed May 6 16:43:10 2015 UTC by feinerer
File length: 922 byte(s)
Avoid overlong line

Revision 1421
Modified Mon May 4 19:21:11 2015 UTC by feinerer
File length: 922 byte(s)
Do not require() Rgraphviz

Revision 1420
Modified Mon May 4 19:04:00 2015 UTC by feinerer
File length: 913 byte(s)
Accept NLP::Span_Tokenizer

Revision 1419
Modified Sat May 2 17:23:47 2015 UTC by feinerer
File length: 913 byte(s)
Sync format()/print() with NLP

Revision 1417
Modified Tue Apr 28 18:02:42 2015 UTC by feinerer
File length: 913 byte(s)
Mark scan_tokenizer() and MC_tokenizer() as NLP::Token_Tokenizer

Revision 1413
Modified Sat Apr 4 08:21:38 2015 UTC by feinerer
File length: 911 byte(s)
Correctly process words being truncations of others

Revision 1409
Modified Fri Feb 27 16:10:18 2015 UTC by feinerer
File length: 911 byte(s)
Add as.VCorpus.list()

Revision 1406
Modified Mon Feb 23 17:21:49 2015 UTC by feinerer
File length: 911 byte(s)
Add readTagged(): a reader for text documents containing POS-tagged words

Revision 1401
Modified Wed Jan 28 18:38:56 2015 UTC by feinerer
File length: 911 byte(s)
Sync documentation with code (log2 vs. log)

Reported by Marcus Spies.

Revision 1399
Modified Wed Jan 21 15:31:33 2015 UTC by feinerer
File length: 911 byte(s)
Show TOPICS categories

Reported by Diego M. Barreiro Fandiño.

Revision 1397
Modified Fri Sep 12 19:30:27 2014 UTC by feinerer
File length: 911 byte(s)
Add open() and close() for sources

Useful for sources with complex or expensive setup, e.g., database connections
or file handles.

Revision 1390
Modified Fri Jun 6 12:37:33 2014 UTC by feinerer
File length: 909 byte(s)
Ensure data types

Revision 1384
Modified Sun Jun 1 07:59:56 2014 UTC by feinerer
File length: 909 byte(s)
Improve handling of empty matrices

Revision 1379
Modified Tue May 27 17:55:29 2014 UTC by feinerer
File length: 909 byte(s)
Provide names<-() for VCorpus and PCorpus

Revision 1376
Modified Wed May 21 14:36:35 2014 UTC by feinerer
File length: 909 byte(s)
Remove names() from Source API

Revision 1375
Modified Tue May 20 18:21:27 2014 UTC by feinerer
File length: 909 byte(s)
Do not force author to be a person object

Suggested by Milan Bouchet-Valat.

Revision 1369
Modified Tue Apr 29 07:42:53 2014 UTC by feinerer
File length: 909 byte(s)
Fallback to English if meta(doc, "language") is invalid

Revision 1365
Modified Mon Apr 28 14:02:53 2014 UTC by feinerer
File length: 909 byte(s)
Fix and improve documentation, suggest tm.lexicon.GeneralInquirer

Revision 1360
Modified Fri Apr 25 18:10:01 2014 UTC by khornik
File length: 881 byte(s)
Add info on additional repository hosting Rcampdf.

Revision 1358
Modified Thu Apr 24 07:43:38 2014 UTC by feinerer
File length: 831 byte(s)
Document content_transformer()

Revision 1348
Modified Tue Apr 22 07:09:41 2014 UTC by feinerer
File length: 831 byte(s)
Provide as.VCorpus() generic

Revision 1345
Modified Sun Apr 20 16:48:32 2014 UTC by feinerer
File length: 831 byte(s)
Update NEWS

Revision 1336
Modified Sat Apr 19 08:59:39 2014 UTC by feinerer
File length: 831 byte(s)
Implement and describe Source API

Revision 1332
Modified Fri Apr 18 09:00:55 2014 UTC by feinerer
File length: 824 byte(s)
Update TextDocument documentation

Revision 1323
Modified Sat Apr 12 17:16:38 2014 UTC by feinerer
File length: 824 byte(s)
Use tools::find_gs_cmd()

Revision 1320
Modified Sun Apr 6 07:05:45 2014 UTC by feinerer
File length: 817 byte(s)
Use words() as default tokenizer in termFreq()

Revision 1319
Modified Wed Apr 2 18:03:37 2014 UTC by feinerer
File length: 817 byte(s)
Provide words.PlainTextDocument(), clean NAMESPACE

Revision 1316
Modified Mon Mar 31 14:41:41 2014 UTC by feinerer
File length: 817 byte(s)
Remove dissimilarity() (a trivial wrapper around proxy::dist())

Revision 1309
Modified Wed Mar 26 09:15:04 2014 UTC by feinerer
File length: 824 byte(s)
Move content and meta generics to package NLP

Revision 1307
Modified Tue Mar 25 12:15:51 2014 UTC by feinerer
File length: 808 byte(s)
Redesign corpora

Revision 1300
Modified Fri Mar 21 14:30:05 2014 UTC by feinerer
File length: 808 byte(s)
Redesign text documents

This is a major change and causes fallout. Soon to be fixed ...

Revision 1299
Modified Fri Mar 21 09:45:14 2014 UTC by feinerer
File length: 813 byte(s)
Use setNames() instead of structure(..., names)
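
As a small illustration of the two idioms this entry contrasts, the sketch below builds the same named vector both ways. The vectors are invented for the example and do not come from the package source.

```r
# Hypothetical term counts (illustrative data only).
terms <- c("corpus", "mining", "text")
counts <- c(1L, 2L, 3L)

# Attach names through structure().
a <- structure(counts, names = terms)

# Attach names through setNames() from the stats package.
b <- setNames(counts, terms)

# Both idioms produce the same named integer vector.
same <- identical(a, b)
```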

Revision 1297
Modified Thu Mar 20 18:43:22 2014 UTC by feinerer
File length: 806 byte(s)
Redesign sources

Revision 1295
Modified Tue Feb 25 10:54:41 2014 UTC by feinerer
File length: 806 byte(s)
Export pGetElem.URISource

Revision 1294
Modified Sun Feb 23 07:41:45 2014 UTC by feinerer
File length: 806 byte(s)
Avoid spurious duplicate results

Revision 1293
Modified Thu Feb 20 14:39:33 2014 UTC by khornik
File length: 804 byte(s)
Commit to trigger rebuild.

Revision 1292
Modified Tue Jan 28 16:31:18 2014 UTC by feinerer
File length: 804 byte(s)
Process three letter codes; based on Kurt's contribution

Revision 1289
Modified Mon Jan 13 16:23:06 2014 UTC by khornik
File length: 804 byte(s)
Need R >= 3.0.0.

Revision 1279
Modified Tue Jan 7 06:29:58 2014 UTC by feinerer
File length: 805 byte(s)
Improve documentation, update ChangeLog, prepare for CRAN release

Revision 1268
Modified Wed Dec 18 16:37:48 2013 UTC by feinerer
File length: 806 byte(s)
Show label for single result item, do not export findAssocs.matrix()

Revision 1266
Modified Sun Dec 15 09:14:16 2013 UTC by feinerer
File length: 806 byte(s)
Allow multiple terms for findAssocs(), make it more efficient on sparse matrices

Revision 1261
Modified Fri Sep 27 09:37:35 2013 UTC by feinerer
File length: 806 byte(s)
Allow multiple URIs for URISource, default to vectorized sources, simplify eoi()

Revision 1258
Modified Fri Sep 20 12:15:42 2013 UTC by feinerer
File length: 806 byte(s)
Remove GmaneSource() and readGmane(), simplify readers, improve documentation

Revision 1257
Modified Thu Sep 19 10:48:07 2013 UTC by feinerer
File length: 806 byte(s)
Export Source constructor, extend documentation

Revision 1255
Modified Wed Sep 11 07:30:06 2013 UTC by feinerer
File length: 806 byte(s)
Rename tm_tag_score() to tm_term_score()

Revision 1254
Modified Sat Sep 7 08:45:50 2013 UTC by feinerer
File length: 806 byte(s)
Avoid tm::

Revision 1253
Modified Fri Aug 30 10:03:09 2013 UTC by feinerer
File length: 806 byte(s)
Remove getFilters(), searchFullText(), and tm_intersect() (use grep() instead)

Revision 1252
Modified Mon Aug 26 14:00:31 2013 UTC by feinerer
File length: 806 byte(s)
Report non-existent or non-readable files

Revision 1251
Modified Wed Aug 21 08:44:25 2013 UTC by feinerer
File length: 806 byte(s)
Document readPDF() rewrite

Revision 1247
Modified Tue Aug 20 16:45:26 2013 UTC by feinerer
File length: 806 byte(s)
Suggest Rcampdf

Revision 1245
Modified Tue Aug 20 07:48:15 2013 UTC by feinerer
File length: 797 byte(s)
Suggest Rpoppler

Revision 1243
Modified Mon Aug 19 09:37:32 2013 UTC by feinerer
File length: 787 byte(s)
Interface several PDF extraction engines (draft)

Revision 1242
Modified Mon Aug 19 05:33:57 2013 UTC by feinerer
File length: 787 byte(s)
Do not register VCorpus and PlainTextDocument as S4 classes anymore

Revision 1240
Modified Sun Aug 18 13:18:28 2013 UTC by khornik
File length: 796 byte(s)
Mention pdf_info.ps copyright.

Revision 1238
Modified Fri Aug 9 08:49:58 2013 UTC by feinerer
File length: 653 byte(s)
Switch to GPL-3

Revision 1234
Modified Thu Jul 25 17:45:00 2013 UTC by feinerer
File length: 658 byte(s)
Report NA instead of error for no completions in prevalent heuristic, reformatting

Revision 1231
Modified Wed Jul 10 06:51:26 2013 UTC by feinerer
File length: 658 byte(s)
Use pdfinfo command line tool

Revision 1228
Modified Mon Jun 17 08:28:18 2013 UTC by feinerer
File length: 644 byte(s)
s/Suggests/Imports/ for parallel package

Revision 1227
Modified Sun Jun 16 08:37:10 2013 UTC by feinerer
File length: 644 byte(s)
Use package parallel instead of Rmpi and snow

Revision 1226
Modified Sun Jun 16 07:38:58 2013 UTC by feinerer
File length: 646 byte(s)
Document SnowballC switch in NEWS

Revision 1223
Modified Sat Jun 15 10:59:56 2013 UTC by feinerer
File length: 648 byte(s)
Handle (but warn about) invalid/empty document IDs in term-document matrix construction

Revision 1220
Modified Tue Jun 11 08:37:43 2013 UTC by feinerer
File length: 648 byte(s)
Use SnowballC instead of Snowball and RWeka

Revision 1216
Modified Thu Apr 11 12:05:53 2013 UTC by feinerer
File length: 654 byte(s)
Document UCP change

Revision 1211
Modified Mon Jan 28 08:40:24 2013 UTC by feinerer
File length: 654 byte(s)
Update version and date for CRAN release

Revision 1205
Modified Fri Jan 11 19:43:40 2013 UTC by khornik
File length: 654 byte(s)
Update version and date.

Revision 1201
Modified Fri Dec 14 15:08:35 2012 UTC by feinerer
File length: 654 byte(s)
Update Version as well

Revision 1200
Modified Fri Dec 14 15:07:38 2012 UTC by feinerer
File length: 652 byte(s)
Ensure dimnames of type character when generating a simple_triplet_matrix

Revision 1199
Modified Mon Dec 10 14:37:54 2012 UTC by feinerer
File length: 652 byte(s)
Document right to left folding in tm_reduce

Revision 1198
Modified Tue Dec 4 13:19:31 2012 UTC by feinerer
File length: 652 byte(s)
Prepare for CRAN release

Revision 1197
Modified Tue Dec 4 12:54:31 2012 UTC by feinerer
File length: 654 byte(s)
Update version and date

Revision 1195
Modified Mon Nov 26 10:10:07 2012 UTC by feinerer
File length: 654 byte(s)
Make termFreq() more visible in TermDocumentMatrix() documentation

Revision 1194
Modified Fri Nov 2 15:15:03 2012 UTC by feinerer
File length: 654 byte(s)
Ensure data types for document creation

Revision 1191
Modified Wed Oct 3 17:31:39 2012 UTC by feinerer
File length: 654 byte(s)
Gracefully handle empty columns and rows in weighting functions

Revision 1189
Modified Thu Aug 16 05:31:22 2012 UTC by feinerer
File length: 654 byte(s)
Update Authors@R

Revision 1188
Modified Fri Jul 27 08:47:50 2012 UTC by feinerer
File length: 589 byte(s)
Allow more simultaneous (stop)words in removeWords()

Revision 1181
Modified Thu Mar 8 11:22:56 2012 UTC by feinerer
File length: 589 byte(s)
Performance improvement as suggested by Milan Bouchet-Valat

Revision 1176
Modified Fri Feb 3 07:22:32 2012 UTC by feinerer
File length: 589 byte(s)
Prepare for CRAN minor patch release

Revision 1175
Modified Wed Feb 1 06:08:02 2012 UTC by feinerer
File length: 589 byte(s)
Readers can now set the document language

Revision 1174
Modified Mon Jan 23 09:55:47 2012 UTC by feinerer
File length: 589 byte(s)
Add Catalan stopwords

Revision 1173
Modified Mon Jan 16 15:05:22 2012 UTC by feinerer
File length: 589 byte(s)
Process tolower and tokenize options first in termFreq()

Revision 1169
Modified Sat Jan 14 11:32:38 2012 UTC by feinerer
File length: 589 byte(s)
Simplify XMLSource; Use vignettes directory

Revision 1168
Modified Wed Jan 11 10:35:44 2012 UTC by feinerer
File length: 589 byte(s)
Fix processing of user provided stopwords

Revision 1167
Modified Fri Dec 23 09:44:33 2011 UTC by feinerer
File length: 589 byte(s)
Fix invalid handling of control[1] argument to termFreq()

Revision 1166
Modified Sat Dec 17 10:32:05 2011 UTC by feinerer
File length: 587 byte(s)
Prepare for CRAN Christmas release

Revision 1164
Modified Mon Dec 12 06:42:28 2011 UTC by feinerer
File length: 658 byte(s)
Map empty input to 'porter'

Revision 1161
Modified Wed Dec 7 06:10:32 2011 UTC by feinerer
File length: 657 byte(s)
Add option to removePunctuation() to preserve intra-word dashes

Revision 1159
Modified Tue Dec 6 15:11:45 2011 UTC by feinerer
File length: 657 byte(s)
Make termFreq() sensitive to the order of control options

Revision 1157
Modified Thu Nov 17 17:20:31 2011 UTC by feinerer
File length: 657 byte(s)
Depend on R >= 2.14.0

Revision 1155
Modified Thu Nov 17 16:53:26 2011 UTC by feinerer
File length: 657 byte(s)
Use tools:::pdf_info() instead of external pdfinfo tool

Revision 1153
Modified Thu Nov 17 15:45:31 2011 UTC by feinerer
File length: 671 byte(s)
Add SMART stopword list

Revision 1151
Modified Thu Nov 17 14:21:49 2011 UTC by feinerer
File length: 671 byte(s)
Add generalized bounds checking

Revision 1150
Modified Tue Nov 15 15:37:17 2011 UTC by feinerer
File length: 569 byte(s)
Document MC_tokenizer(), scan_tokenizer(), and getTokenizers()

Revision 1149
Modified Fri Nov 4 15:48:50 2011 UTC by feinerer
File length: 569 byte(s)
Export and document c.term_frequency() and as.TermDocumentMatrix.term_frequency()

Revision 1142
Modified Tue Aug 30 10:16:29 2011 UTC by feinerer
File length: 569 byte(s)
Documentation for weighting schemata in SMART notation

Revision 1139
Modified Wed Aug 24 15:21:02 2011 UTC by feinerer
File length: 569 byte(s)
Raise error if no stopwords are available for requested language

Revision 1136
Modified Fri May 27 11:50:39 2011 UTC by feinerer
File length: 569 byte(s)
Improve SMART weighting (still buggy)

Revision 1135
Modified Fri Apr 15 06:18:54 2011 UTC by khornik
File length: 567 byte(s)
Export and document Blei et al reader.

Revision 1128
Modified Fri Apr 8 17:36:10 2011 UTC by khornik
File length: 567 byte(s)
Add functionality for obtaining DTMs and TDMs from t/f matrices coercible
to simple triplet matrices.

Revision 1122
Modified Sun Feb 20 07:38:31 2011 UTC by feinerer
File length: 567 byte(s)
Use document language for stemDocument().

Revision 1121
Modified Thu Feb 17 17:13:45 2011 UTC by feinerer
File length: 569 byte(s)
Bug fix. Use language argument for stemDocument().

Revision 1117
Modified Fri Feb 4 20:44:37 2011 UTC by feinerer
File length: 569 byte(s)
Sources now store strings and connections instead of unevaluated calls. Improve documentation.

Revision 1114
Modified Fri Nov 26 14:05:54 2010 UTC by feinerer
File length: 569 byte(s)
Allow init and exit hooks for readers

Revision 1113
Modified Thu Nov 11 15:22:22 2010 UTC by feinerer
File length: 569 byte(s)
First draft of words()

Revision 1110
Modified Fri Oct 29 13:59:52 2010 UTC by feinerer
File length: 569 byte(s)
Add OpenOffice reader, getTokenizers() lists available tokenizers

Revision 1108
Modified Fri Oct 22 18:32:47 2010 UTC by feinerer
File length: 569 byte(s)
Change Weighting from list element to attribute, access documents by name

Revision 1107
Modified Mon Oct 18 09:26:16 2010 UTC by khornik
File length: 569 byte(s)
Improve code/docs for system requirements.

Revision 1102
Modified Sat Oct 16 10:01:09 2010 UTC by feinerer
File length: 486 byte(s)
Access documents by their document ID

Revision 1101
Modified Thu Oct 14 13:03:25 2010 UTC by feinerer
File length: 486 byte(s)
Update NEWS

Revision 1098
Modified Mon Sep 27 13:44:38 2010 UTC by khornik
File length: 486 byte(s)
New release.

Revision 1093
Modified Mon Aug 23 18:36:33 2010 UTC by feinerer
File length: 486 byte(s)
Allow removePunctuation parameter for termFreq() to be a function or a list

Revision 1092
Modified Mon Aug 23 18:19:53 2010 UTC by feinerer
File length: 486 byte(s)
Add SystemRequirements for antiword, pdfinfo, and pdftotext

Revision 1091 - (view) (download) (annotate) - [select for diffs]
Modified Thu Aug 19 08:22:22 2010 UTC (8 years, 7 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1084 , to selected 1121
Prepare for new CRAN release

Revision 1084 - (view) (download) (annotate) - [select for diffs]
Modified Fri Aug 6 21:47:23 2010 UTC (8 years, 7 months ago) by feinerer
File length: 393 byte(s)
Diff to previous 1080 , to selected 1121
Remove convert_UTF_8() (use enc2utf8() instead)

Revision 1080 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jun 17 13:47:05 2010 UTC (8 years, 9 months ago) by feinerer
File length: 393 byte(s)
Diff to previous 1075 , to selected 1121
Use all words from a dictionary when tabulating against it in a term-document matrix

Revision 1075 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jun 2 17:52:04 2010 UTC (8 years, 9 months ago) by feinerer
File length: 393 byte(s)
Diff to previous 1073 , to selected 1121
Plotting functions for Zipf's and Heaps' law

Revision 1073 - (view) (download) (annotate) - [select for diffs]
Modified Fri May 28 12:32:46 2010 UTC (8 years, 9 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1070 , to selected 1121
Use IETF language tags for language codes

Revision 1070 - (view) (download) (annotate) - [select for diffs]
Modified Tue May 18 08:58:22 2010 UTC (8 years, 10 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1068 , to selected 1121
Use element names as document IDs if provided by a source

Revision 1068 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 5 10:09:47 2010 UTC (8 years, 10 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1067 , to selected 1121
Improve stem completion.

Revision 1067 - (view) (download) (annotate) - [select for diffs]
Modified Sun Apr 11 06:38:47 2010 UTC (8 years, 11 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1063 , to selected 1121
Use match() instead of %in%

Revision 1063 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 9 10:36:39 2010 UTC (8 years, 11 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1062 , to selected 1121
Sources can now provide document names

Revision 1062 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 7 17:25:20 2010 UTC (8 years, 11 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 1061 , to selected 1121
content_or_meta utility function

Revision 1061 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 19 11:41:37 2010 UTC (9 years ago) by feinerer
File length: 392 byte(s)
Diff to previous 1059 , to selected 1121
Extract TOPICS, LEWISSPLIT, CGISPLIT, and OLDID meta tags from Reuters-21578 documents

Revision 1059 - (view) (download) (annotate) - [select for diffs]
Modified Mon Mar 15 21:34:29 2010 UTC (9 years ago) by feinerer
File length: 392 byte(s)
Diff to previous 1055 , to selected 1121
Depend on recent slam version.

Revision 1055 - (view) (download) (annotate) - [select for diffs]
Modified Mon Mar 15 15:55:02 2010 UTC (9 years ago) by feinerer
File length: 391 byte(s)
Diff to previous 1054 , to selected 1121
First attempt for weightings using SMART notation.

Revision 1054 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 12 15:56:30 2010 UTC (9 years ago) by feinerer
File length: 391 byte(s)
Diff to previous 1048 , to selected 1121
Restore names of dimnames after subsetting.

Revision 1048 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 3 06:14:10 2010 UTC (9 years ago) by feinerer
File length: 391 byte(s)
Diff to previous 1047 , to selected 1121
Add General Inquirer example for sentiment analysis.

Revision 1047 - (view) (download) (annotate) - [select for diffs]
Modified Fri Feb 26 15:08:01 2010 UTC (9 years ago) by feinerer
File length: 391 byte(s)
Diff to previous 1042 , to selected 1121
Avoid Internet access for examples in the documentation.

Revision 1042 - (view) (download) (annotate) - [select for diffs]
Modified Fri Feb 19 07:26:44 2010 UTC (9 years, 1 month ago) by feinerer
File length: 389 byte(s)
Diff to previous 1041 , to selected 1121
Prepare for new CRAN release.

Revision 1041 - (view) (download) (annotate) - [select for diffs]
Modified Thu Feb 18 06:15:15 2010 UTC (9 years, 1 month ago) by feinerer
File length: 391 byte(s)
Diff to previous 1040 , to selected 1121
Added new stem completion heuristics. Improved plot function for term-document matrices.

Revision 1040 - (view) (download) (annotate) - [select for diffs]
Modified Sat Feb 6 10:33:03 2010 UTC (9 years, 1 month ago) by feinerer
File length: 391 byte(s)
Diff to previous 1039 , to selected 1121
Depend on R (>= 2.10.0).

Revision 1039 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jan 22 13:01:33 2010 UTC (9 years, 2 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1038 , to selected 1121
Add stemDocument.character().

Revision 1038 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jan 15 12:12:41 2010 UTC (9 years, 2 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1035 , to selected 1121
Extract more meta data from Reuters Corpus Volume 1 data set.

Revision 1035 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 14 08:59:43 2010 UTC (9 years, 2 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1034 , to selected 1121
Add readRCV1asPlain reader.

Revision 1034 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 12 16:47:41 2010 UTC (9 years, 2 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1033 , to selected 1121
Be careful with names attribute.

Revision 1033 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 9 09:33:54 2010 UTC (9 years, 2 months ago) by feinerer
File length: 388 byte(s)
Diff to previous 1032 , to selected 1121
Clean up and prepare for CRAN release.

Revision 1032 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 7 12:09:51 2010 UTC (9 years, 2 months ago) by stefan7th
File length: 389 byte(s)
Diff to previous 1030 , to selected 1121
changelog, new version

Revision 1030 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 5 17:38:58 2010 UTC (9 years, 2 months ago) by dmeyer
File length: 389 byte(s)
Diff to previous 1029 , to selected 1121
rowSums -> row_sums due to change in slam

Revision 1029 - (view) (download) (annotate) - [select for diffs]
Modified Tue Dec 22 13:40:25 2009 UTC (9 years, 3 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1026 , to selected 1121
Use encoding argument in URISource.

Revision 1026 - (view) (download) (annotate) - [select for diffs]
Modified Fri Dec 11 10:31:42 2009 UTC (9 years, 3 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1025 , to selected 1121
Fix c.TermDocumentMatrix().

Revision 1025 - (view) (download) (annotate) - [select for diffs]
Modified Fri Dec 11 08:56:22 2009 UTC (9 years, 3 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1023 , to selected 1121
Register S3 document classes to be recognized by S4 methods.

Revision 1023 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 25 06:08:20 2009 UTC (9 years, 4 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1022 , to selected 1121
Add option to termFreq() to remove punctuation characters.

Revision 1022 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 19 21:33:19 2009 UTC (9 years, 4 months ago) by feinerer
File length: 390 byte(s)
Diff to previous 1020 , to selected 1121
Added a combine method for merging multiple term-document matrices.

Revision 1020 - (view) (download) (annotate) - [select for diffs]
Modified Tue Nov 17 09:16:13 2009 UTC (9 years, 4 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1019 , to selected 1121
Use \dontrun{} in plot.TermDocumentMatrix \examples{} section in the hope that CRAN Mac OS X builds do not fail any longer due to missing Rgraphviz dependencies.

Revision 1019 - (view) (download) (annotate) - [select for diffs]
Modified Mon Nov 16 08:20:55 2009 UTC (9 years, 4 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1018 , to selected 1121
Use whitespace oriented tokenizer instead of AlphabeticTokenizer (from RWeka) as default.

Revision 1018 - (view) (download) (annotate) - [select for diffs]
Modified Sun Nov 15 15:53:49 2009 UTC (9 years, 4 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1017 , to selected 1121
Fix bug in removeWords(). Refactoring of term-document matrix constructor. Clean up of defunct functions.

Revision 1017 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 12 16:18:54 2009 UTC (9 years, 4 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1015 , to selected 1121
Improve DirSource().

Revision 1015 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 7 11:15:19 2009 UTC (9 years, 4 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1014 , to selected 1121
Avoid prefixes from named documents when building a term-document matrix.

Revision 1014 - (view) (download) (annotate) - [select for diffs]
Modified Tue Oct 27 15:14:55 2009 UTC (9 years, 4 months ago) by feinerer
File length: 379 byte(s)
Diff to previous 1013 , to selected 1121
Update version for CRAN upload.

Revision 1013 - (view) (download) (annotate) - [select for diffs]
Modified Wed Oct 21 12:34:39 2009 UTC (9 years, 5 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1011 , to selected 1121
Improve regular expressions in removeWords().

Revision 1011 - (view) (download) (annotate) - [select for diffs]
Modified Mon Oct 19 12:20:43 2009 UTC (9 years, 5 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1010 , to selected 1121
Allow lower case Dublin Core tags.

Revision 1010 - (view) (download) (annotate) - [select for diffs]
Modified Fri Oct 9 12:48:37 2009 UTC (9 years, 5 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1009 , to selected 1121
Use xmlChildren().

Revision 1009 - (view) (download) (annotate) - [select for diffs]
Modified Sat Oct 3 07:00:48 2009 UTC (9 years, 5 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1007 , to selected 1121
Fix typo.

Revision 1007 - (view) (download) (annotate) - [select for diffs]
Modified Tue Sep 15 18:02:44 2009 UTC (9 years, 6 months ago) by feinerer
File length: 381 byte(s)
Diff to previous 1004 , to selected 1121
Fix generated file names.

Revision 1004 - (view) (download) (annotate) - [select for diffs]
Modified Tue Sep 8 10:28:28 2009 UTC (9 years, 6 months ago) by feinerer
File length: 377 byte(s)
Diff to previous 1003 , to selected 1121
Improve vignette.

Revision 1003 - (view) (download) (annotate) - [select for diffs]
Modified Tue Sep 8 06:00:14 2009 UTC (9 years, 6 months ago) by feinerer
File length: 377 byte(s)
Diff to previous 1001 , to selected 1121
Remove extra LICENCE file, as we want GPL.

Revision 1001 - (view) (download) (annotate) - [select for diffs]
Modified Mon Sep 7 20:23:44 2009 UTC (9 years, 6 months ago) by feinerer
File length: 392 byte(s)
Diff to previous 996 , to selected 1121
Add copyright and licence statements.

Revision 996 - (view) (download) (annotate) - [select for diffs]
Modified Mon Sep 7 08:27:30 2009 UTC (9 years, 6 months ago) by feinerer
File length: 372 byte(s)
Diff to previous 993 , to selected 1121
Small fix in meta().

Revision 993 - (view) (download) (annotate) - [select for diffs]
Modified Sun Sep 6 17:51:08 2009 UTC (9 years, 6 months ago) by feinerer
File length: 372 byte(s)
Diff to previous 988 , to selected 1121
Update NEWS.

Revision 988 - (view) (download) (annotate) - [select for diffs]
Modified Fri Sep 4 12:27:12 2009 UTC (9 years, 6 months ago) by feinerer
File length: 372 byte(s)
Diff to previous 987 , to selected 1121
Update documentation.

Revision 987 - (view) (download) (annotate) - [select for diffs]
Modified Wed Sep 2 17:54:45 2009 UTC (9 years, 6 months ago) by feinerer
File length: 386 byte(s)
Diff to previous 986 , to selected 1121
Update documentation.

Revision 986 - (view) (download) (annotate) - [select for diffs]
Modified Tue Sep 1 15:33:30 2009 UTC (9 years, 6 months ago) by feinerer
File length: 386 byte(s)
Diff to previous 985 , to selected 1121
Further changes due to S3 class system.

Revision 985 - (view) (download) (annotate) - [select for diffs]
Modified Thu Aug 27 18:09:05 2009 UTC (9 years, 6 months ago) by feinerer
File length: 386 byte(s)
Diff to previous 981 , to selected 1121
Use S3 instead of S4 class system.

Revision 981 - (view) (download) (annotate) - [select for diffs]
Modified Fri Aug 7 09:04:37 2009 UTC (9 years, 7 months ago) by feinerer
File length: 399 byte(s)
Diff to previous 973 , to selected 1121
Factor out mail handling functionality to tm.plugin.mail package.

Revision 973 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 4 08:10:25 2009 UTC (9 years, 8 months ago) by feinerer
File length: 399 byte(s)
Diff to previous 972 , to selected 1121
Rename readNewsgroup to readMail.

Revision 972 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jul 3 16:16:59 2009 UTC (9 years, 8 months ago) by feinerer
File length: 399 byte(s)
Diff to previous 969 , to selected 1121
Move removeCitation, removeMultipart, and removeSignature to the tau package.

Revision 969 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jun 30 09:31:12 2009 UTC (9 years, 8 months ago) by feinerer
File length: 395 byte(s)
Diff to previous 968 , to selected 1121
Imports slam (instead of Depends).

Revision 968 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jun 30 07:08:54 2009 UTC (9 years, 8 months ago) by feinerer
File length: 387 byte(s)
Diff to previous 963 , to selected 1121
Remove internal tm functions provided by slam.

Revision 963 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jun 29 07:01:19 2009 UTC (9 years, 8 months ago) by feinerer
File length: 376 byte(s)
Diff to previous 962 , to selected 1121
Rename SCorpus to VCorpus (Volatile Corpus).

Revision 962 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jun 28 15:52:33 2009 UTC (9 years, 8 months ago) by feinerer
File length: 376 byte(s)
Diff to previous 960 , to selected 1121
Fix documentation.

Revision 960 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jun 26 17:43:45 2009 UTC (9 years, 8 months ago) by feinerer
File length: 376 byte(s)
Diff to previous 959 , to selected 1121
Add slam dependency and readReut21578XMLasPlain reader.

Revision 959 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jun 17 18:22:35 2009 UTC (9 years, 9 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 958 , to selected 1121
Fix character(0) handling in stemDoc().

Revision 958 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jun 13 06:06:42 2009 UTC (9 years, 9 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 957 , to selected 1121
Code cleanup.

Revision 957 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jun 12 12:47:57 2009 UTC (9 years, 9 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 956 , to selected 1121
Pretty print.

Revision 956 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jun 12 06:41:09 2009 UTC (9 years, 9 months ago) by gruen
File length: 370 byte(s)
Diff to previous 954 , to selected 1121
year 2009

Revision 954 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 27 18:33:32 2009 UTC (9 years, 9 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 952 , to selected 1121
Handle empty matrices gracefully.

Revision 952 - (view) (download) (annotate) - [select for diffs]
Modified Mon May 18 13:43:01 2009 UTC (9 years, 10 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 950 , to selected 1121
Further work on FCorpus integration.

Revision 950 - (view) (download) (annotate) - [select for diffs]
Modified Thu May 14 15:17:18 2009 UTC (9 years, 10 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 946 , to selected 1121
Experimental FCorpus (fast corpus).

Revision 946 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 13 18:07:35 2009 UTC (9 years, 10 months ago) by feinerer
File length: 370 byte(s)
Diff to previous 945 , to selected 1121
A lot of major improvements (see NEWS).

Revision 945 - (view) (download) (annotate) - [select for diffs]
Modified Mon May 4 10:57:01 2009 UTC (9 years, 10 months ago) by feinerer
File length: 461 byte(s)
Diff to previous 942 , to selected 1121
Export some simple_triplet_matrix functions.

Revision 942 - (view) (download) (annotate) - [select for diffs]
Modified Tue Apr 28 11:02:24 2009 UTC (9 years, 10 months ago) by feinerer
File length: 459 byte(s)
Diff to previous 941 , to selected 1121
Adapt tf-idf to new matrix format.

Revision 941 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 27 15:36:43 2009 UTC (9 years, 10 months ago) by feinerer
File length: 459 byte(s)
Diff to previous 938 , to selected 1121
Create two distinct classes for term-document and document-term matrices.

Revision 938 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 25 19:05:50 2009 UTC (9 years, 11 months ago) by feinerer
File length: 459 byte(s)
Diff to previous 937 , to selected 1121
Get rid of Matrix package dependency.

Revision 937 - (view) (download) (annotate) - [select for diffs]
Modified Thu Apr 16 21:09:49 2009 UTC (9 years, 11 months ago) by feinerer
File length: 469 byte(s)
Diff to previous 930 , to selected 1121
Documentation update. Remove some require() calls.

Revision 930 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 11 08:49:37 2009 UTC (9 years, 11 months ago) by feinerer
File length: 469 byte(s)
Diff to previous 929 , to selected 1121
Fix code/documentation mismatch in vignette.

Revision 929 - (view) (download) (annotate) - [select for diffs]
Modified Thu Apr 9 06:22:21 2009 UTC (9 years, 11 months ago) by feinerer
File length: 469 byte(s)
Diff to previous 928 , to selected 1121
Always use Snowball for stemming.

Revision 928 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 4 18:27:35 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 926 , to selected 1121
Update documentation.

Revision 926 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 4 06:50:02 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 923 , to selected 1121
tmReduce() allows combining multiple maps into one transformation.

Revision 923 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 3 08:07:20 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 922 , to selected 1121
Further work on new TermDocumentMatrix.

Revision 922 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 31 16:41:02 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 919 , to selected 1121
Fix invalid slot access in subset method for TermDocumentMatrix.

Revision 919 - (view) (download) (annotate) - [select for diffs]
Modified Sat Mar 28 17:13:29 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 917 , to selected 1121
Finished vignette 'Extensions: How to Handle Custom File Formats'.

Revision 917 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 27 11:55:45 2009 UTC (9 years, 11 months ago) by feinerer
File length: 480 byte(s)
Diff to previous 915 , to selected 1121
Update documentation.

Revision 915 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 25 20:04:35 2009 UTC (10 years ago) by feinerer
File length: 480 byte(s)
Diff to previous 914 , to selected 1121
Improve readCustom().

Revision 914 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 24 20:10:57 2009 UTC (10 years ago) by feinerer
File length: 480 byte(s)
Diff to previous 909 , to selected 1121
Use readXML() for readGmane().

Revision 909 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 22 12:45:59 2009 UTC (10 years ago) by feinerer
File length: 480 byte(s)
Diff to previous 904 , to selected 1121
Sources now can be vectorized.

Revision 904 - (view) (download) (annotate) - [select for diffs]
Modified Sat Mar 21 08:15:11 2009 UTC (10 years ago) by feinerer
File length: 480 byte(s)
Diff to previous 900 , to selected 1121
No longer try to start a MPI cluster in .onLoad().

Revision 900 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 20 16:50:27 2009 UTC (10 years ago) by feinerer
File length: 480 byte(s)
Diff to previous 895 , to selected 1121
Add URL to DESCRIPTION. Use Reduce() function.

Revision 895 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 10 17:59:34 2009 UTC (10 years ago) by feinerer
File length: 442 byte(s)
Diff to previous 886 , to selected 1121
Add pattern and ignore.case arguments to DirSource constructor.

Revision 886 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 29 22:47:34 2009 UTC (10 years, 1 month ago) by feinerer
File length: 442 byte(s)
Diff to previous 885 , to selected 1121
Speed up package loading (Depends -> Suggests).

Revision 885 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 29 09:34:44 2009 UTC (10 years, 1 month ago) by stefan7th
File length: 436 byte(s)
Copied from: pkg/tm/DESCRIPTION revision 884
Diff to previous 884 , to selected 1121
moved package to /pkg

Revision 884 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jan 28 10:24:27 2009 UTC (10 years, 1 month ago) by stefan7th
Original Path: pkg/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 882 , to selected 1121
R-Forge transition completed

Revision 882 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 8 15:35:49 2009 UTC (10 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 881 , to selected 1121
The readNewsgroup() reader function can now be configured for

Revision 881 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 20 09:06:13 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 434 byte(s)
Diff to previous 877 , to selected 1121
Fix off-by-one error in convertMboxEml() function.

Revision 877 - (view) (download) (annotate) - [select for diffs]
Modified Tue Dec 16 11:31:47 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 875 , to selected 1121
Sort row indices when generating a term-document matrix (fixes a problem with the Matrix package).

Revision 875 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 6 13:25:03 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 874 , to selected 1121
Fixed non-standard call evaluation.

Revision 874 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 29 16:24:45 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 873 , to selected 1121
New URISource.

Revision 873 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 27 08:26:53 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 872 , to selected 1121
Code refactoring for sources.

Revision 872 - (view) (download) (annotate) - [select for diffs]
Modified Tue Nov 25 16:36:08 2008 UTC (10 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 436 byte(s)
Diff to previous 870 , to selected 1121
Use tryCatch() to handle misconfigured Rmpi installations more gracefully.

Revision 870 - (view) (download) (annotate) - [select for diffs]
Modified Mon Nov 10 15:29:22 2008 UTC (10 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 434 byte(s)
Diff to previous 869 , to selected 1121
Fix documentation and codoc mismatches.

Revision 869 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 8 09:16:37 2008 UTC (10 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 434 byte(s)
Diff to previous 868 , to selected 1121
Sources now have a Length slot. Knowing the length in advance makes corpus construction a lot faster (~ 8 times faster).

Revision 868 - (view) (download) (annotate) - [select for diffs]
Modified Mon Nov 3 16:43:04 2008 UTC (10 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 434 byte(s)
Diff to previous 866 , to selected 1121
Add Rmpi to Suggests of tm.

Revision 866 - (view) (download) (annotate) - [select for diffs]
Modified Sun Nov 2 09:11:00 2008 UTC (10 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 428 byte(s)
Diff to previous 865 , to selected 1121
Fixed variable binding warning and signature mismatch in documentation.

Revision 865 - (view) (download) (annotate) - [select for diffs]
Modified Sun Aug 3 13:20:22 2008 UTC (10 years, 7 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 428 byte(s)
Diff to previous 862 , to selected 1121
Introduce name abbreviations for weighting functions.

Revision 862 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jul 24 10:41:25 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 428 byte(s)
Diff to previous 861 , to selected 1121
Use namespace.

Revision 861 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jul 24 09:55:09 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 434 byte(s)
Diff to previous 860 , to selected 1121
tmIndex(), tmFilter(), tmMap(), and TermDocMatrix() now use a MPI cluster if available.

Revision 860 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jul 18 05:05:20 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 424 byte(s)
Diff to previous 857 , to selected 1121
Removed some forgotten debug printout.

Revision 857 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jul 8 16:01:47 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 856 , to selected 1121
Removed tm-internal. Better (consistent) naming for dictionary functions.

Revision 856 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jun 6 11:45:39 2008 UTC (10 years, 9 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 854 , to selected 1121
Improved meta data extraction from Reuters Corpus Volume 1 documents.

Revision 854 - (view) (download) (annotate) - [select for diffs]
Modified Sun May 25 13:15:06 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 853 , to selected 1121
searchFullText is now the default function used for tmFilter and tmIndex.

Revision 853 - (view) (download) (annotate) - [select for diffs]
Modified Sun May 18 13:09:35 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 852 , to selected 1121
Improved stem completion. Some documentation fixes.

Revision 852 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 14 14:35:32 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 850 , to selected 1121
Minor documentation fix.

Revision 850 - (view) (download) (annotate) - [select for diffs]
Modified Thu May 1 16:37:19 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 429 byte(s)
Diff to previous 849 , to selected 1121
Removed Encoding tag in DESCRIPTION.

Revision 849 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 30 06:05:48 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 848 , to selected 1121
Removed PDF example from vignette to avoid R CMD check warnings under Windows.

Revision 848 - (view) (download) (annotate) - [select for diffs]
Modified Tue Apr 29 16:51:43 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 443 byte(s)
Diff to previous 847 , to selected 1121
Improved vignette.

Revision 847 - (view) (download) (annotate) - [select for diffs]
Modified Sun Apr 27 16:16:47 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 443 byte(s)
Diff to previous 843 , to selected 1121
Improved manuals.

Revision 843 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 25 12:31:51 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 839 , to selected 1121
Added Dublin Core documentation.

Revision 839 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 23 12:35:01 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 837 , to selected 1121
Added documentation for VectorSource.

Revision 837 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 23 09:16:25 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 836 , to selected 1121
Improved show methods.

Revision 836 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 19 17:08:07 2008 UTC (10 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 834 , to selected 1121
Improved meta data handling. Added coerce method from list to corpus. Updated CITATION file.

Revision 834 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 26 13:57:07 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 833 , to selected 1121
Commented out faulty code parts (relevant under Windows) in vignette.

Revision 833 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 21 10:55:11 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 445 byte(s)
Diff to previous 832 , to selected 1121
Included improvements suggested by Christian Buchta. Added CITATION file.

Revision 832 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 12 12:59:48 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 831 , to selected 1121
Added VectorSource.

Revision 831 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 12 09:10:46 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 829 , to selected 1121
Fixed bug in [[<- (reported by Christian Buchta).

Revision 829 - (view) (download) (annotate) - [select for diffs]
Modified Mon Mar 10 22:55:39 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 828 , to selected 1121
First version of working lazy mapping.

Revision 828 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 9 07:47:15 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 827 , to selected 1121
Some preliminary code for lazy mapping.

Revision 827 - (view) (download) (annotate) - [select for diffs]
Modified Mon Feb 25 16:52:55 2008 UTC (11 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 825 , to selected 1121
Small bug fix.

Revision 825 - (view) (download) (annotate) - [select for diffs]
Modified Sat Feb 23 09:47:28 2008 UTC (11 years, 1 month ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 358 byte(s)
Diff to previous 824 , to selected 1121
Update documentation: language codes should be in ISO 639-1 format.

Revision 824 - (view) (download) (annotate) - [select for diffs]
Modified Sun Feb 10 09:58:43 2008 UTC (11 years, 1 month ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 383 byte(s)
Diff to previous 822 , to selected 1121
Documentation update.

Revision 822 - (view) (download) (annotate) - [select for diffs]
Modified Wed Feb 6 13:06:15 2008 UTC (11 years, 1 month ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 379 byte(s)
Diff to previous 820 , to selected 1121
Renamed completeStems to stemCompletion (suggested by David Meyer).

Revision 820 - (view) (download) (annotate) - [select for diffs]
Modified Fri Feb 1 10:05:21 2008 UTC (11 years, 1 month ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 379 byte(s)
Diff to previous 816 , to selected 1121
Documentation update.

Revision 816 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 24 14:36:41 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 379 byte(s)
Diff to previous 813 , to selected 1121
Renamed TextDocCol to Corpus, and Corpus to Content.

Revision 813 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 22 18:46:13 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 810 , to selected 1121
New function meta() for consistent access to meta data of document collections, repositories, and texts.

Revision 810 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jan 21 17:14:06 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 808 , to selected 1121
Better support for encodings.

Revision 808 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jan 13 16:18:27 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 807 , to selected 1121
Fixed bug regarding default reader selection when no reader argument is given.

Revision 807 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 5 10:35:53 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 806 , to selected 1121
CSVSource now uses read.csv instead of scan internally.

Revision 806 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jan 2 10:29:14 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 805 , to selected 1121
Modular TermDocMatrix constructor is now default.

Revision 805 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 1 14:10:40 2008 UTC (11 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 802 , to selected 1121
Added function (getReaders) returning all available reader functions.

Revision 802 - (view) (download) (annotate) - [select for diffs]
Modified Sun Dec 2 09:28:41 2007 UTC (11 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 384 byte(s)
Diff to previous 799 , to selected 1121
See ChangeLog.

Revision 799 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 29 11:05:23 2007 UTC (11 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 368 byte(s)
Diff to previous 796 , to selected 1121
Better handling of empty arguments in TextDocCol. Exported readDOC.

Revision 796 - (view) (download) (annotate) - [select for diffs]
Modified Tue Nov 6 15:22:34 2007 UTC (11 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 368 byte(s)
Diff to previous 795 , to selected 1121
Correct processing of empty documents.

Revision 795 - (view) (download) (annotate) - [select for diffs]
Modified Sat Oct 27 09:14:35 2007 UTC (11 years, 4 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 368 byte(s)
Diff to previous 790 , to selected 1121
Updated documentation

Revision 790 - (view) (download) (annotate) - [select for diffs]
Modified Sun Oct 21 08:27:13 2007 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 367 byte(s)
Diff to previous 785 , to selected 1121
Exported termFreq to NAMESPACE. New modular constructor for TermDocMatrix (called TermDocMatrix2 at the moment).

Revision 785 - (view) (download) (annotate) - [select for diffs]
Modified Sat Oct 13 10:46:28 2007 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 367 byte(s)
Diff to previous 780 , to selected 1121
Added plot function for term-document matrices.

Revision 780 - (view) (download) (annotate) - [select for diffs]
Modified Sat Sep 29 13:24:17 2007 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 777 , to selected 1121
Added three transformations often used for e-mail analyses.

Revision 777 - (view) (download) (annotate) - [select for diffs]
Modified Tue Aug 28 07:19:12 2007 UTC (11 years, 6 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 776 , to selected 1121
Function generators are now real S4 classes instead of S3 attributes.

Revision 776 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jul 29 15:27:41 2007 UTC (11 years, 7 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 775 , to selected 1121
Removed manual pdftotext and pdfinfo checks (the system call gives a warning anyway).

Revision 775 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 28 13:57:02 2007 UTC (11 years, 7 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 774 , to selected 1121
Added conversion (asPlain) from StructuredTextDocuments to PlainTextDocuments.

Revision 774 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 21 16:25:54 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 773 , to selected 1121
Added convenience methods for term-document matrices.

Revision 773 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 21 12:05:08 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 772 , to selected 1121
Vignette: readPDF is only called if pdftotext and pdfinfo are installed.

Revision 772 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jul 20 14:00:58 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 356 byte(s)
Diff to previous 771 , to selected 1121
Updated TODO list.

Revision 771 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jul 19 07:59:20 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 354 byte(s)
Diff to previous 770 , to selected 1121
Updated version for new CRAN release.

Revision 770 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jul 17 12:41:04 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 354 byte(s)
Diff to previous 769 , to selected 1121
Improved TermDocMatrix's efficiency. Kudos to Christian Buchta.

Revision 769 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jul 15 16:31:59 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 340 byte(s)
Diff to previous 766 , to selected 1121
Fixed bug in tmUpdate.

Revision 766 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 14 08:46:23 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 340 byte(s)
Diff to previous 765 , to selected 1121
Added PDF reader based on pdftotext and pdfinfo.

Revision 765 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jul 13 15:53:45 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 340 byte(s)
Diff to previous 764 , to selected 1121
See ChangeLog.

Revision 764 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 11 17:36:17 2007 UTC (11 years, 8 months ago) by hornik
Original Path: trunk/tm/DESCRIPTION
File length: 332 byte(s)
Diff to previous 763 , to selected 1121
Canonicalize license info.

Revision 763 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 11 11:56:44 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 340 byte(s)
Diff to previous 762 , to selected 1121
Changed from cba to new proxy package for computing (dis)similarities.

Revision 762 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 11 06:46:17 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 761 , to selected 1121
Updated vignette.

Revision 761 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jul 10 14:59:57 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 760 , to selected 1121
Updated vignette.

Revision 760 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jun 21 22:40:15 2007 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 757 , to selected 1121
require() uses the quietly option to suppress loading messages.

Revision 757 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jun 7 17:41:56 2007 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 756 , to selected 1121
Added classes for Reuters21578 XML and RCV1 documents.

Revision 756 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jun 6 17:12:11 2007 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 755 , to selected 1121
Fixed some typos in vignette.

Revision 755 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jun 3 17:20:40 2007 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 754 , to selected 1121
Added replaceWords function.

Revision 754 - (view) (download) (annotate) - [select for diffs]
Modified Tue May 22 18:11:22 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 752 , to selected 1121
Fixed documentation.

Revision 752 - (view) (download) (annotate) - [select for diffs]
Modified Sat May 19 22:39:04 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 751 , to selected 1121
Small bug fix in textvector(). Added new function removeSparseTerms().

Revision 751 - (view) (download) (annotate) - [select for diffs]
Modified Tue May 15 18:01:43 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 750 , to selected 1121
Fixed documentation for tmUpdate.

Revision 750 - (view) (download) (annotate) - [select for diffs]
Modified Fri May 11 16:46:15 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 749 , to selected 1121
Fixed documentation.

Revision 749 - (view) (download) (annotate) - [select for diffs]
Modified Tue May 8 17:26:09 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 748 , to selected 1121
StructuredTextDocument inherits from TextDocument.

Revision 748 - (view) (download) (annotate) - [select for diffs]
Modified Fri May 4 18:52:42 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 747 , to selected 1121
findFreqTerms now operates (very) efficiently on (big) sparse matrices. Thanks to Martin Maechler.

Revision 747 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 27 18:16:53 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 745 , to selected 1121
Removed dbDisconnect calls since deprecated by last filehash release.

Revision 745 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 23 00:57:26 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 336 byte(s)
Diff to previous 741 , to selected 1121
Fixed dimnames in sparse matrix. Updated date in DESCRIPTION.

Revision 741 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 21 18:35:16 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 336 byte(s)
Diff to previous 732 , to selected 1121
Switched back to filehash instead of filehashSQLite.

Revision 732 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 11 18:11:54 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 342 byte(s)
Diff to previous 730 , to selected 1121
Added stopwords for various languages.

Revision 730 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 11 02:15:10 2007 UTC (11 years, 11 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 344 byte(s)
Diff to previous 716 , to selected 1121
Updated documentation.

Revision 716 - (view) (download) (annotate) - [select for diffs]
Modified Thu Mar 15 17:22:39 2007 UTC (12 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 338 byte(s)
Diff to previous 713 , to selected 1121
Some improvements for TermDocMatrix.

Revision 713 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 14 13:44:11 2007 UTC (12 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 330 byte(s)
Diff to previous 712 , to selected 1121
Added Snowball support. Added function returning stopwords (English, German, French).

Revision 712 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 4 15:18:36 2007 UTC (12 years ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 320 byte(s)
Diff to previous 711 , to selected 1121
Started to implement database support to optimize RAM usage, i.e., minimize RAM demand if necessary.

Revision 711 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 30 13:03:55 2007 UTC (12 years, 1 month ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 310 byte(s)
Diff to previous 708 , to selected 1121
Fixed bug in documentation.

Revision 708 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jan 22 10:34:12 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 310 byte(s)
Diff to previous 704 , to selected 1121
Fixed bug in documentation.

Revision 704 - (view) (download) (annotate) - [select for diffs]
Modified Fri Jan 12 10:05:15 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 310 byte(s)
Diff to previous 702 , to selected 1121
Update to version 0.1-1.

Revision 702 - (view) (download) (annotate) - [select for diffs]
Modified Tue Jan 9 09:39:33 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 308 byte(s)
Diff to previous 698 , to selected 1121
wordStem now explicitly uses Rstem namespace.

Revision 698 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 6 17:05:44 2007 UTC (12 years, 2 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 297 byte(s)
Diff to previous 693 , to selected 1121
Changes due to Kurt's review.

Revision 693 - (view) (download) (annotate) - [select for diffs]
Modified Fri Dec 22 13:21:30 2006 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/tm/DESCRIPTION
File length: 297 byte(s)
Diff to previous 690 , to selected 1121
Renamed textmin to tm directory since the package name changed.

Revision 690 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 16 17:22:56 2006 UTC (12 years, 3 months ago) by feinerer
Original Path: trunk/textmin/DESCRIPTION
File length: 297 byte(s)
Diff to previous 78 , to selected 1121
Renamed package to 'tm'. Updated documentation (man) for CRAN release.

Revision 78 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 29 14:56:36 2006 UTC (12 years, 3 months ago) by zeileis
Original Path: trunk/textmin/DESCRIPTION
File length: 306 byte(s)
Diff to previous 67 , to selected 1121
removed old repos structure, now only R packages

Revision 67 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 1 17:29:59 2006 UTC (12 years, 4 months ago) by feinerer
Original Path: trunk/R/textmin/DESCRIPTION
File length: 306 byte(s)
Diff to previous 63 , to selected 1121
See ChangeLog

Revision 63 - (view) (download) (annotate) - [select for diffs]
Modified Thu Oct 26 14:59:09 2006 UTC (12 years, 5 months ago) by feinerer
Original Path: trunk/R/textmin/DESCRIPTION
File length: 315 byte(s)
Diff to previous 47 , to selected 1121
See ChangeLog.

Revision 47 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jul 10 12:22:35 2006 UTC (12 years, 8 months ago) by feinerer
Original Path: trunk/R/textmin/DESCRIPTION
File length: 310 byte(s)
Copied from: trunk/R/trunk/DESCRIPTION revision 44
Diff to previous 46 , to selected 1121
Renamed tm to textmin directory.

Revision 46 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 5 18:08:41 2006 UTC (12 years, 8 months ago) by meyer
Original Path: trunk/R/tm/DESCRIPTION
File length: 310 byte(s)
Copied from: trunk/R/trunk/DESCRIPTION revision 44
Diff to previous 45 , to selected 1121
move

Revision 45 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jul 5 17:27:29 2006 UTC (12 years, 8 months ago) by meyer
Original Path: trunk/R/trunk/tm/DESCRIPTION
File length: 310 byte(s)
Copied from: trunk/R/trunk/DESCRIPTION revision 44
Diff to previous 28 , to selected 1121
move in subdir

Revision 28 - (view) (download) (annotate) - [select for diffs]
Modified Tue Dec 6 13:46:33 2005 UTC (13 years, 3 months ago) by feinerer
Original Path: trunk/R/trunk/DESCRIPTION
File length: 310 byte(s)
Diff to previous 25 , to selected 1121
See ChangeLog

Revision 25 - (view) (download) (annotate) - [select for diffs]
Modified Wed Nov 30 18:53:50 2005 UTC (13 years, 3 months ago) by feinerer
Original Path: trunk/R/trunk/DESCRIPTION
File length: 294 byte(s)
Diff to previous 16 , to selected 1121
See ChangeLog

Revision 16 - (view) (download) (annotate) - [select for diffs]
Added Fri Oct 7 09:42:57 2005 UTC (13 years, 5 months ago) by feinerer
Original Path: trunk/R/trunk/DESCRIPTION
File length: 272 byte(s)
Diff to selected 1121
Textmatrix code runs. Simple k-means text clustering (similarity based upon word frequencies) works.
