[tm] Log of /pkg/R/doc.R

Revision 1445
Modified Sun Oct 9 09:30:58 2016 UTC (2 years, 4 months ago) by feinerer
File length: 3095 byte(s)
Diff to previous 1437
Speed up termFreq(), general cleanup

- Avoid parallel::mclapply()
- Use custom .table()
- Use rep.int(), rep_len() and lengths()
- Fix typos
- Shorten overlong lines
- Consistent formatting

Revision 1437
Modified Wed Jul 13 19:23:49 2016 UTC (2 years, 7 months ago) by feinerer
File length: 3088 byte(s)
Diff to previous 1435
Add SimpleCorpus

SimpleCorpus provides a corpus which is optimized for the most common usage
scenario: importing plain texts from files in a directory or directly from a
vector in R, preprocessing and transforming the texts, and finally exporting
them to a term-document matrix. The aim is to boost performance and minimize
memory pressure. It loads all documents into memory, and is designed for
medium-sized to large data sets.

Revision 1435
Modified Wed Nov 18 09:53:21 2015 UTC (3 years, 3 months ago) by feinerer
File length: 3069 byte(s)
Diff to previous 1429
Provide inspect.TextDocument() as shorthand for writeLines(as.character())

Revision 1429
Modified Sat May 23 08:45:52 2015 UTC (3 years, 9 months ago) by feinerer
File length: 2952 byte(s)
Diff to previous 1428
nchar() on a whole XML text document is not accurate

Revision 1428
Modified Thu May 14 07:37:54 2015 UTC (3 years, 9 months ago) by feinerer
File length: 2956 byte(s)
Diff to previous 1423
Report one character count for a whole document

Revision 1423
Modified Mon May 4 19:37:55 2015 UTC (3 years, 9 months ago) by feinerer
File length: 2951 byte(s)
Diff to previous 1419
inspect() is not part of the TextDocument API

Use as.character() and content() to access the document instead.

Revision 1419
Modified Sat May 2 17:23:47 2015 UTC (3 years, 9 months ago) by feinerer
File length: 3068 byte(s)
Diff to previous 1415
Sync format()/print() with NLP

Revision 1415
Modified Sat Apr 4 08:54:44 2015 UTC (3 years, 10 months ago) by feinerer
File length: 2955 byte(s)
Diff to previous 1404
Replace meta.TextDocument() with implementations for subclasses

Revision 1404
Modified Tue Feb 17 18:04:22 2015 UTC (4 years ago) by feinerer
File length: 2585 byte(s)
Diff to previous 1387
Avoid (rather expensive) structure()

Revision 1387
Modified Tue Jun 3 16:00:51 2014 UTC (4 years, 8 months ago) by feinerer
File length: 2646 byte(s)
Diff to previous 1385
Fix signature

Revision 1385
Modified Tue Jun 3 15:25:57 2014 UTC (4 years, 8 months ago) by feinerer
File length: 2626 byte(s)
Diff to previous 1380
Stay with scan_tokenizer

Revision 1380
Modified Tue May 27 18:53:47 2014 UTC (4 years, 8 months ago) by feinerer
File length: 2670 byte(s)
Diff to previous 1371
Order arguments by name

Revision 1371
Modified Wed Apr 30 07:16:28 2014 UTC (4 years, 9 months ago) by feinerer
File length: 2670 byte(s)
Diff to previous 1366
Recognize meta argument, check for id and language in readXML

Revision 1366
Modified Mon Apr 28 14:48:37 2014 UTC (4 years, 9 months ago) by feinerer
File length: 2656 byte(s)
Diff to previous 1346
Support to set a named list as meta in PlainTextDocument and XMLTextDocument

Revision 1346
Modified Mon Apr 21 08:21:14 2014 UTC (4 years, 10 months ago) by feinerer
File length: 2514 byte(s)
Diff to previous 1332
Order arguments by name, use class() in print()

Revision 1332
Modified Fri Apr 18 09:00:55 2014 UTC (4 years, 10 months ago) by feinerer
File length: 2516 byte(s)
Diff to previous 1330
Update TextDocument documentation

Revision 1330
Modified Tue Apr 15 17:20:50 2014 UTC (4 years, 10 months ago) by feinerer
File length: 2292 byte(s)
Diff to previous 1329
Remove extra line

Revision 1329
Modified Tue Apr 15 17:16:03 2014 UTC (4 years, 10 months ago) by feinerer
File length: 2294 byte(s)
Diff to previous 1328
Synchronize print() appearance with NLP

Revision 1328
Modified Tue Apr 15 09:46:28 2014 UTC (4 years, 10 months ago) by feinerer
File length: 2226 byte(s)
Diff to previous 1320
Rearrange, update manual

Revision 1320
Modified Sun Apr 6 07:05:45 2014 UTC (4 years, 10 months ago) by feinerer
File length: 1780 byte(s)
Diff to previous 1319
Use words() as default tokenizer in termFreq()

Revision 1319
Modified Wed Apr 2 18:03:37 2014 UTC (4 years, 10 months ago) by feinerer
File length: 1771 byte(s)
Diff to previous 1309
Provide words.PlainTextDocument(), clean NAMESPACE

Revision 1309
Modified Wed Mar 26 09:15:04 2014 UTC (4 years, 11 months ago) by feinerer
File length: 1688 byte(s)
Diff to previous 1307
Move content and meta generics to package NLP

Revision 1307
Modified Tue Mar 25 12:15:51 2014 UTC (4 years, 11 months ago) by feinerer
File length: 1804 byte(s)
Diff to previous 1300
Redesign corpora

Revision 1300
Modified Fri Mar 21 14:30:05 2014 UTC (4 years, 11 months ago) by feinerer
File length: 1928 byte(s)
Diff to previous 1242
Redesign text documents

This is a major change and causes fallout. Soon to be fixed ...

Revision 1242
Modified Mon Aug 19 05:33:57 2013 UTC (5 years, 6 months ago) by feinerer
File length: 3179 byte(s)
Diff to previous 1194
Do not register VCorpus and PlainTextDocument as S4 classes anymore

Revision 1194
Modified Fri Nov 2 15:15:03 2012 UTC (6 years, 3 months ago) by feinerer
File length: 3525 byte(s)
Diff to previous 1034
Ensure data types for document creation

Revision 1034
Modified Tue Jan 12 16:47:41 2010 UTC (9 years, 1 month ago) by feinerer
File length: 3408 byte(s)
Diff to previous 1025
Be careful with names attribute.

Revision 1025
Modified Fri Dec 11 08:56:22 2009 UTC (9 years, 2 months ago) by feinerer
File length: 3400 byte(s)
Diff to previous 988
Register S3 document classes to be recognized by S4 methods.

Revision 988
Modified Fri Sep 4 12:27:12 2009 UTC (9 years, 5 months ago) by feinerer
File length: 3054 byte(s)
Diff to previous 985
Update documentation.

Revision 985
Modified Thu Aug 27 18:09:05 2009 UTC (9 years, 5 months ago) by feinerer
File length: 2099 byte(s)
Diff to previous 964
Use S3 instead of S4 class system.

Revision 964
Modified Mon Jun 29 08:26:04 2009 UTC (9 years, 7 months ago) by feinerer
File length: 249 byte(s)
Diff to previous 957
Improve documentation.

Revision 957
Modified Fri Jun 12 12:47:57 2009 UTC (9 years, 8 months ago) by feinerer
File length: 237 byte(s)
Diff to previous 950
Pretty print.

Revision 950
Modified Thu May 14 15:17:18 2009 UTC (9 years, 9 months ago) by feinerer
File length: 229 byte(s)
Copied from: pkg/R/plaintextdoc.R revision 949
Diff to previous 885
Experimental FCorpus (fast corpus).

Revision 885
Modified Thu Jan 29 09:34:44 2009 UTC (10 years ago) by stefan7th
Original Path: pkg/R/plaintextdoc.R
File length: 150 byte(s)
Diff to previous 884
moved package to /pkg

Revision 884
Modified Wed Jan 28 10:24:27 2009 UTC (10 years ago) by stefan7th
Original Path: pkg/tm/R/plaintextdoc.R
File length: 150 byte(s)
Diff to previous 837
R-Forge transition completed

Revision 837
Modified Wed Apr 23 09:16:25 2008 UTC (10 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/plaintextdoc.R
File length: 150 byte(s)
Diff to previous 816
Improved show methods.

Revision 816
Modified Thu Jan 24 14:36:41 2008 UTC (11 years, 1 month ago) by feinerer
Original Path: trunk/tm/R/plaintextdoc.R
File length: 141 byte(s)
Diff to previous 725
Renamed TextDocCol to Corpus, and Corpus to Content.

Revision 725
Modified Fri Apr 6 01:10:28 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/plaintextdoc.R
File length: 140 byte(s)
Diff to previous 722
Updated parts of the documentation.

Revision 722
Added Sun Apr 1 15:53:58 2007 UTC (11 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/plaintextdoc.R
File length: 144 byte(s)
Prettyprint summary, print method for plain text docs, removePunctuation.
