Log of /pkg/R/source.R

Revision 1481 - (view) (download) (annotate) - [select for diffs]
Modified Sat May 20 10:28:00 2017 UTC (16 months ago) by feinerer
File length: 8088 byte(s)
Diff to previous 1461 , to selected 1390
Support TIF for DataframeSource

See Text Interchange Formats (TIF, https://github.com/ropensci/tif) and
readtext (https://github.com/kbenoit/readtext).

Revision 1461 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 14 14:51:22 2017 UTC (20 months ago) by feinerer
File length: 7800 byte(s)
Diff to previous 1445 , to selected 1390
Implement [ and [[ for selected sources

Neither [ nor [[ is considered part of the API; both are provided as a
convenience. Moreover, providing them is considered good practice, as sources
typically report a length.

Revision 1445 - (view) (download) (annotate) - [select for diffs]
Modified Sun Oct 9 09:30:58 2016 UTC (23 months, 1 week ago) by feinerer
File length: 7331 byte(s)
Diff to previous 1408 , to selected 1390
Speed up termFreq(), general cleanup

- Avoid parallel::mclapply()
- Use custom .table()
- Use rep.int(), rep_len() and lengths()
- Fix typos
- Shorten overlong lines
- Consistent formatting

Revision 1408 - (view) (download) (annotate) - [select for diffs]
Modified Mon Feb 23 20:55:55 2015 UTC (3 years, 6 months ago) by feinerer
File length: 7327 byte(s)
Diff to previous 1407 , to selected 1390
Fix typos, extend NAMESPACE

Revision 1407 - (view) (download) (annotate) - [select for diffs]
Modified Mon Feb 23 20:38:08 2015 UTC (3 years, 6 months ago) by feinerer
File length: 7286 byte(s)
Diff to previous 1404 , to selected 1390
Add ZipSource

Revision 1404 - (view) (download) (annotate) - [select for diffs]
Modified Tue Feb 17 18:04:22 2015 UTC (3 years, 7 months ago) by feinerer
File length: 5185 byte(s)
Diff to previous 1398 , to selected 1390
Avoid (rather expensive) structure()

Revision 1398 - (view) (download) (annotate) - [select for diffs]
Modified Sat Sep 13 10:57:04 2014 UTC (4 years ago) by feinerer
File length: 5197 byte(s)
Diff to previous 1397 , to selected 1390
Ensure generic/method consistency, sync vignette

Revision 1397 - (view) (download) (annotate) - [select for diffs]
Modified Fri Sep 12 19:30:27 2014 UTC (4 years ago) by feinerer
File length: 5179 byte(s)
Diff to previous 1390
Add open() and close() for sources

Useful for sources with complex or expensive setup, e.g., database connections
or file handles.

Revision 1390 - (view) (download) (annotate) - [selected]
Modified Fri Jun 6 12:37:33 2014 UTC (4 years, 3 months ago) by feinerer
File length: 5126 byte(s)
Diff to previous 1376
Ensure data types

Revision 1376 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 21 14:36:35 2014 UTC (4 years, 4 months ago) by feinerer
File length: 5272 byte(s)
Diff to previous 1357 , to selected 1390
Remove names() from Source API

Revision 1357 - (view) (download) (annotate) - [select for diffs]
Modified Thu Apr 24 06:33:35 2014 UTC (4 years, 4 months ago) by feinerer
File length: 5626 byte(s)
Diff to previous 1346 , to selected 1390
Simplify SimpleSource() default arguments

Revision 1346 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 21 08:21:14 2014 UTC (4 years, 5 months ago) by feinerer
File length: 5646 byte(s)
Diff to previous 1336 , to selected 1390
Order arguments by name, use class() in print()

Revision 1336 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 19 08:59:39 2014 UTC (4 years, 5 months ago) by feinerer
File length: 5686 byte(s)
Diff to previous 1334 , to selected 1390
Implement and describe Source API

Revision 1334 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 18 12:29:08 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6125 byte(s)
Diff to previous 1331 , to selected 1390
Update Source documentation (not yet finished)

Revision 1331 - (view) (download) (annotate) - [select for diffs]
Modified Wed Apr 16 08:10:18 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6211 byte(s)
Diff to previous 1326 , to selected 1390
Proper handling of file:// URIs

Revision 1326 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 14 14:45:20 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6215 byte(s)
Diff to previous 1325 , to selected 1390
Default to encoding of the current locale

Revision 1325 - (view) (download) (annotate) - [select for diffs]
Modified Mon Apr 14 14:39:29 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6236 byte(s)
Diff to previous 1308 , to selected 1390
Use encoding argument for conversion via iconv()

Revision 1308 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 25 15:02:15 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6177 byte(s)
Diff to previous 1307 , to selected 1390
Bug fixes. More to come ...

Revision 1307 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 25 12:15:51 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6177 byte(s)
Diff to previous 1306 , to selected 1390
Redesign corpora

Revision 1306 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 25 08:37:05 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6169 byte(s)
Diff to previous 1303 , to selected 1390
Improve writeCorpus, use lower case in internal data structures

Revision 1303 - (view) (download) (annotate) - [select for diffs]
Modified Mon Mar 24 12:48:11 2014 UTC (4 years, 5 months ago) by feinerer
File length: 6196 byte(s)
Diff to previous 1298 , to selected 1390
Binary reader

Revision 1298 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 21 09:30:02 2014 UTC (4 years, 6 months ago) by feinerer
File length: 5739 byte(s)
Diff to previous 1297 , to selected 1390
Check for valid mode

Revision 1297 - (view) (download) (annotate) - [select for diffs]
Modified Thu Mar 20 18:43:22 2014 UTC (4 years, 6 months ago) by feinerer
File length: 5385 byte(s)
Diff to previous 1285 , to selected 1390
Redesign sources

Revision 1285 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 11 09:31:10 2014 UTC (4 years, 8 months ago) by feinerer
File length: 5068 byte(s)
Diff to previous 1274 , to selected 1390
Simplify checks

Revision 1274 - (view) (download) (annotate) - [select for diffs]
Modified Sun Jan 5 10:51:18 2014 UTC (4 years, 8 months ago) by feinerer
File length: 5071 byte(s)
Diff to previous 1261 , to selected 1390
More sanity checks

Revision 1261 - (view) (download) (annotate) - [select for diffs]
Modified Fri Sep 27 09:37:35 2013 UTC (4 years, 11 months ago) by feinerer
File length: 4202 byte(s)
Diff to previous 1258 , to selected 1390
Allow multiple URIs for URISource, default to vectorized sources, simplify eoi()

Revision 1258 - (view) (download) (annotate) - [select for diffs]
Modified Fri Sep 20 12:15:42 2013 UTC (5 years ago) by feinerer
File length: 4277 byte(s)
Diff to previous 1257 , to selected 1390
Remove GmaneSource() and readGmane(), simplify readers, improve documentation

Revision 1257 - (view) (download) (annotate) - [select for diffs]
Modified Thu Sep 19 10:48:07 2013 UTC (5 years ago) by feinerer
File length: 4488 byte(s)
Diff to previous 1252 , to selected 1390
Export Source constructor, extend documentation

Revision 1252 - (view) (download) (annotate) - [select for diffs]
Modified Mon Aug 26 14:00:31 2013 UTC (5 years ago) by feinerer
File length: 4184 byte(s)
Diff to previous 1190 , to selected 1390
Report non-existent or non-readable files

Revision 1190 - (view) (download) (annotate) - [select for diffs]
Modified Thu Sep 6 14:14:46 2012 UTC (6 years ago) by feinerer
File length: 4114 byte(s)
Diff to previous 1184 , to selected 1390
Convert factors to characters as processing factors piecewise is extremely slow

Revision 1184 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jul 7 08:49:29 2012 UTC (6 years, 2 months ago) by feinerer
File length: 4035 byte(s)
Diff to previous 1179 , to selected 1390
Make the same assumptions on the input encoding as readLines() does

Revision 1179 - (view) (download) (annotate) - [select for diffs]
Modified Mon Feb 20 17:37:40 2012 UTC (6 years, 7 months ago) by feinerer
File length: 4021 byte(s)
Diff to previous 1169 , to selected 1390
Use xmlParse() to work with C-level nodes

Revision 1169 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 14 11:32:38 2012 UTC (6 years, 8 months ago) by feinerer
File length: 4005 byte(s)
Diff to previous 1154 , to selected 1390
Simplify XMLSource; Use vignettes directory

Revision 1154 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 17 16:22:14 2011 UTC (6 years, 10 months ago) by feinerer
File length: 4252 byte(s)
Diff to previous 1117 , to selected 1390
Allow class argument for internal Source constructor

Revision 1117 - (view) (download) (annotate) - [select for diffs]
Modified Fri Feb 4 20:44:37 2011 UTC (7 years, 7 months ago) by feinerer
File length: 4315 byte(s)
Diff to previous 1070 , to selected 1390
Sources now store strings and connections instead of unevaluated calls. Improve documentation.

Revision 1070 - (view) (download) (annotate) - [select for diffs]
Modified Tue May 18 08:58:22 2010 UTC (8 years, 4 months ago) by feinerer
File length: 4435 byte(s)
Diff to previous 1065 , to selected 1390
Use element names as document IDs if provided by a source

Revision 1065 - (view) (download) (annotate) - [select for diffs]
Modified Sat Apr 10 07:38:57 2010 UTC (8 years, 5 months ago) by feinerer
File length: 4330 byte(s)
Diff to previous 1063 , to selected 1390
Use row.names() instead of rownames() as suggested by Kurt Hornik

Revision 1063 - (view) (download) (annotate) - [select for diffs]
Modified Fri Apr 9 10:36:39 2010 UTC (8 years, 5 months ago) by feinerer
File length: 4329 byte(s)
Diff to previous 1033 , to selected 1390
Sources can now provide document names

Revision 1033 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 9 09:33:54 2010 UTC (8 years, 8 months ago) by feinerer
File length: 4251 byte(s)
Diff to previous 1031 , to selected 1390
Clean up and prepare for CRAN release.

Revision 1031 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 7 12:02:43 2010 UTC (8 years, 8 months ago) by stefan7th
File length: 4250 byte(s)
Diff to previous 1029 , to selected 1390
Updated DirSource to better handle a large set of files

Revision 1029 - (view) (download) (annotate) - [select for diffs]
Modified Tue Dec 22 13:40:25 2009 UTC (8 years, 9 months ago) by feinerer
File length: 4218 byte(s)
Diff to previous 1018 , to selected 1390
Use encoding argument in URISource.

Revision 1018 - (view) (download) (annotate) - [select for diffs]
Modified Sun Nov 15 15:53:49 2009 UTC (8 years, 10 months ago) by feinerer
File length: 4195 byte(s)
Diff to previous 1017 , to selected 1390
Fix bug in removeWords(). Refactoring of term-document matrix constructor. Clean up of defunct functions.

Revision 1017 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 12 16:18:54 2009 UTC (8 years, 10 months ago) by feinerer
File length: 4431 byte(s)
Diff to previous 1010 , to selected 1390
Improve DirSource().

Revision 1010 - (view) (download) (annotate) - [select for diffs]
Modified Fri Oct 9 12:48:37 2009 UTC (8 years, 11 months ago) by feinerer
File length: 4367 byte(s)
Diff to previous 988 , to selected 1390
Use xmlChildren().

Revision 988 - (view) (download) (annotate) - [select for diffs]
Modified Fri Sep 4 12:27:12 2009 UTC (9 years ago) by feinerer
File length: 4421 byte(s)
Diff to previous 986 , to selected 1390
Update documentation.

Revision 986 - (view) (download) (annotate) - [select for diffs]
Modified Tue Sep 1 15:33:30 2009 UTC (9 years ago) by feinerer
File length: 4420 byte(s)
Diff to previous 985 , to selected 1390
Further changes due to S3 class system.

Revision 985 - (view) (download) (annotate) - [select for diffs]
Modified Thu Aug 27 18:09:05 2009 UTC (9 years ago) by feinerer
File length: 4455 byte(s)
Diff to previous 946 , to selected 1390
Use S3 instead of S4 class system.

Revision 946 - (view) (download) (annotate) - [select for diffs]
Modified Wed May 13 18:07:35 2009 UTC (9 years, 4 months ago) by feinerer
File length: 6769 byte(s)
Diff to previous 943 , to selected 1390
A lot of major improvements (see NEWS).

Revision 943 - (view) (download) (annotate) - [select for diffs]
Modified Tue Apr 28 20:28:50 2009 UTC (9 years, 4 months ago) by feinerer
File length: 7957 byte(s)
Diff to previous 939 , to selected 1390
Fix.

Revision 939 - (view) (download) (annotate) - [select for diffs]
Modified Sun Apr 26 07:04:11 2009 UTC (9 years, 4 months ago) by feinerer
File length: 7960 byte(s)
Diff to previous 934 , to selected 1390
Rename readCustom to readTabular.

Revision 934 - (view) (download) (annotate) - [select for diffs]
Modified Sun Apr 12 13:18:24 2009 UTC (9 years, 5 months ago) by feinerer
File length: 8058 byte(s)
Diff to previous 917 , to selected 1390
Make DataframeSource vectorized.

Revision 917 - (view) (download) (annotate) - [select for diffs]
Modified Fri Mar 27 11:55:45 2009 UTC (9 years, 5 months ago) by feinerer
File length: 7842 byte(s)
Diff to previous 914 , to selected 1390
Update documentation.

Revision 914 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 24 20:10:57 2009 UTC (9 years, 5 months ago) by feinerer
File length: 7842 byte(s)
Diff to previous 911 , to selected 1390
Use readXML() for readGmane().

Revision 911 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 22 17:55:16 2009 UTC (9 years, 6 months ago) by feinerer
File length: 7789 byte(s)
Diff to previous 910 , to selected 1390
New XMLSource class for arbitrary XML files.

Revision 910 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 22 16:33:30 2009 UTC (9 years, 6 months ago) by feinerer
File length: 10486 byte(s)
Diff to previous 909 , to selected 1390
CSVSource is defunct.

Revision 909 - (view) (download) (annotate) - [select for diffs]
Modified Sun Mar 22 12:45:59 2009 UTC (9 years, 6 months ago) by feinerer
File length: 11817 byte(s)
Diff to previous 908 , to selected 1390
Sources now can be vectorized.

Revision 908 - (view) (download) (annotate) - [select for diffs]
Modified Sat Mar 21 18:22:48 2009 UTC (9 years, 6 months ago) by feinerer
File length: 10631 byte(s)
Diff to previous 895 , to selected 1390
Added reader which can be customized via user-defined mappings.

Revision 895 - (view) (download) (annotate) - [select for diffs]
Modified Tue Mar 10 17:59:34 2009 UTC (9 years, 6 months ago) by feinerer
File length: 10630 byte(s)
Diff to previous 886 , to selected 1390
Add pattern and ignore.case arguments to DirSource constructor.

Revision 886 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 29 22:47:34 2009 UTC (9 years, 7 months ago) by feinerer
File length: 10510 byte(s)
Diff to previous 885 , to selected 1390
Speed up package loading (Depends -> Suggests).

Revision 885 - (view) (download) (annotate) - [select for diffs]
Modified Thu Jan 29 09:34:44 2009 UTC (9 years, 7 months ago) by stefan7th
File length: 10358 byte(s)
Diff to previous 884 , to selected 1390
moved package to /pkg

Revision 884 - (view) (download) (annotate) - [select for diffs]
Modified Wed Jan 28 10:24:27 2009 UTC (9 years, 7 months ago) by stefan7th
Original Path: pkg/tm/R/source.R
File length: 10358 byte(s)
Diff to previous 876 , to selected 1390
R-Forge transition completed

Revision 876 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 6 15:58:01 2008 UTC (9 years, 9 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 10358 byte(s)
Diff to previous 875 , to selected 1390
New DataframeSource.

Revision 875 - (view) (download) (annotate) - [select for diffs]
Modified Sat Dec 6 13:25:03 2008 UTC (9 years, 9 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 9891 byte(s)
Diff to previous 874 , to selected 1390
Fixed non-standard call evaluation.

Revision 874 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 29 16:24:45 2008 UTC (9 years, 9 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 10419 byte(s)
Diff to previous 873 , to selected 1390
New URISource.

Revision 873 - (view) (download) (annotate) - [select for diffs]
Modified Thu Nov 27 08:26:53 2008 UTC (9 years, 9 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 9040 byte(s)
Diff to previous 869 , to selected 1390
Code refactoring for sources.

Revision 869 - (view) (download) (annotate) - [select for diffs]
Modified Sat Nov 8 09:16:37 2008 UTC (9 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 10361 byte(s)
Diff to previous 866 , to selected 1390
Sources now have a Length slot. Knowing the length in advance makes corpus construction a lot faster (~ 8 times faster).

Revision 866 - (view) (download) (annotate) - [select for diffs]
Modified Sun Nov 2 09:11:00 2008 UTC (9 years, 10 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 10004 byte(s)
Diff to previous 848 , to selected 1390
Fixed variable binding warning and signature mismatch in documentation.

Revision 848 - (view) (download) (annotate) - [select for diffs]
Modified Tue Apr 29 16:51:43 2008 UTC (10 years, 4 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 9890 byte(s)
Diff to previous 832 , to selected 1390
Improved vignette.

Revision 832 - (view) (download) (annotate) - [select for diffs]
Modified Wed Mar 12 12:59:48 2008 UTC (10 years, 6 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 9781 byte(s)
Diff to previous 810 , to selected 1390
Added VectorSource.

Revision 810 - (view) (download) (annotate) - [select for diffs]
Modified Mon Jan 21 17:14:06 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 8606 byte(s)
Diff to previous 807 , to selected 1390
Better support for encodings.

Revision 807 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 5 10:35:53 2008 UTC (10 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 8044 byte(s)
Diff to previous 723 , to selected 1390
CSVSource now uses read.csv instead of scan internally.

Revision 723 - (view) (download) (annotate) - [select for diffs]
Modified Sun Apr 1 16:12:26 2007 UTC (11 years, 5 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 8060 byte(s)
Diff to previous 698 , to selected 1390
Now each source has its own default reader.

Revision 698 - (view) (download) (annotate) - [select for diffs]
Modified Sat Jan 6 17:05:44 2007 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 7805 byte(s)
Diff to previous 694 , to selected 1390
Changes due to Kurt's review.

Revision 694 - (view) (download) (annotate) - [select for diffs]
Modified Sun Dec 31 14:47:46 2006 UTC (11 years, 8 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 7817 byte(s)
Diff to previous 693 , to selected 1390
Implemented improvements based upon comments by David.

Revision 693 - (view) (download) (annotate) - [select for diffs]
Modified Fri Dec 22 13:21:30 2006 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/tm/R/source.R
File length: 7912 byte(s)
Diff to previous 689 , to selected 1390
Renamed textmin to tm directory since the package name changed.

Revision 689 - (view) (download) (annotate) - [select for diffs]
Added Fri Dec 8 14:21:46 2006 UTC (11 years, 9 months ago) by feinerer
Original Path: trunk/textmin/R/source.R
File length: 7912 byte(s)
Diff to selected 1390
Implemented changes as proposed at the Forschungsklausur on 01.12.2006.
