Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
pkg/ChangeLog revision 1017, Thu Nov 12 16:18:54 2009 UTC

1    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/source.R (DirSource): Default to working directory if no path
4            is specified.
5    
6    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/source.R (DirSource): Stop on empty directories.
9    
10    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
13            named documents.
14    
15    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
16    
17            * R/transform.R (removeWords): Improve regular expressions.
18    
19    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/meta.R (DublinCore): Allow lower case tags.
22    
23    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
24    
25            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
26            instead of x$children.
27    
28    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
29    
30            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
31    
32    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
33    
34            * R/: Use S3 instead of S4 class system.
35    
36    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/reader.R (readMail): Moved to tm.plugin.mail package.
39    
40    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
41    
42            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
43            postings are basically e-mails with some extra headers.
44    
45    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
46    
47            * R/transform.R: Move convertMboxEml, removeCitation,
48            removeMultipart, and removeSignature to the tm.plugin.mail package
49            since they are mainly utility functions (for handling e-mails) and
50            not very framework specific.
51    
52    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
53    
54            * man/: Fix documentation.
55    
56    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
57    
58            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
59            plain text document instead of an XML document for texts of the
60            Reuters-21578 dataset.
61    
62            * R/sparse.R: Removed since the slam package is now available on
63            CRAN.
64    
65            * DESCRIPTION (Depends): Add slam package.
66    
67    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
68    
69            * R/transform.R (stemDoc): Fix character(0) handling.
70    
71    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
72    
73            * R/doc.R (show): Pretty print.
74    
75    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
78            gracefully.
79    
80    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
81    
82            * R/corpus.R: Make corpus virtual. Implement corpus with standard
83            and permanent storage semantics.
84    
85            * DESCRIPTION: New major release. A *lot* of improvements.
86    
87    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
88    
89            * NAMESPACE: Export some simple_triplet_matrix functions.
90    
91    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/weight.R: Adapt tf-idf to new matrix format.
94    
95    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/matrix.R: Create two distinct classes for term-document and
98            document-term matrices.
99    
100    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/termdocmatrix.R: No longer use Matrix package. This reduces
103            package start-up time significantly.
104    
105    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
106    
107            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
108    
109    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
110    
111            * R/transform.R (tmReduce): Combine multiple maps into one
112            transformation.
113    
114    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/weight.R: Remove weightLogical since it does not return a
117            dgCMatrix.
118    
119            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
120            or TermDocumentMatrix instead.
121    
122    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
123    
124            * inst/doc/extensions.Rnw: Finished vignette.
125    
126    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
127    
128            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
129            DocumentTermMatrix representations.
130    
131    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
132    
133            * R/reader.R (readXML): New reader for arbitrary XML files.
134    
135    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
136    
137            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
138            (XMLSource): New XMLSource class for arbitrary XML files.
139            (Source): New slot Vectorized.
140    
141    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/reader.R (readTabular): Experimental reader for tabular data
144            structures which can be customized via user-defined mappings.
145    
146            * R/reader.R: Always use UTC time zone.
147    
148            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
149    
150    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/reader.R (readDOC): Options can be passed over to antiword.
153    
154            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
155            pdftotext.
156    
157    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
158    
159            * R/source.R (DirSource): Add pattern and ignore.case arguments
160            which are internally passed over to list.files().
161    
162    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
163    
164            * inst/doc/tm.Rnw: Suppress pointless loading message.
165    
166    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
167    
168            * DESCRIPTION: Speed up package loading (via moving packages not
169            strictly necessary for normal operation to Suggests instead of
170            Depends).
171    
172    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
173    
174            * R/reader.R (readNewsgroup): The date format is now configurable.
175    
176    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
179    
180    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
181    
182            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
183    
184    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/source.R (DataframeSource): New source class for data frames.
187    
188            * R/source.R: Fixed non-standard call evaluation.
189    
190    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
191    
192            * R/source.R (URISource): New source class for a single document.
193    
194    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/source.R: Refactoring.
197    
198    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
199    
200            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
201            Rmpi installations more gracefully.
202    
203    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
204    
205            * R/source.R (Source): Add Length slot.
206    
207    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/AAA.R: Unify duplicated .onLoad function.
210    
211    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
212    
213            * DESCRIPTION (Suggests): Added Rmpi.
214    
215    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
216    
217            * R/source.R (getElem): Fix 'no visible binding' warning.
218    
219            * man/WeightFunction.Rd: Fix signature.
220    
221    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
222    
223            * R/weight.R: Introduce name abbreviations for weighting functions.
224    
225    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
226    
227            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
228    
229            * R/cluster.R: Provide convenience functions for using an MPI
230            cluster.
231    
232            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
233            available.
234    
235            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
236            available.
237    
238    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
239    
240            * R/textdoccol.R (lapply): Removed debug print out.
241    
242    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
243    
244            * R/reader.R (readRCV1): Improved meta data extraction from
245            Reuters Corpus Volume 1 documents.
246    
247    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
248    
249            * R/transform.R: Ensure that all mappings preserve multiline
250            structures.
251    
252    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
253    
254            * R/filter.R: Every filter now has an attribute indicating whether
255            it should be applied at document level (doclevel).
256    
257            * R/textdoccol.R (tmFilter): Set searchFullText as new default
258            filter.
259    
260    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/transform.R (replacePatterns): Replaced removeWords by
263            replacePatterns. Suggested by Christian Buchta.
264    
265            * R/textdoccol.R (inspect): Improved formatting.
266    
267    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
268    
269            * inst/CITATION: Updated JSS article information.
270    
271            * R/textdoccol.R (setAs): Added coerce method from list to
272            corpus.
273    
274            * R/meta.R (meta): Improved meta data handling.
275    
276    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
277    
278            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
279            Christian Buchta.
280    
281            * inst/CITATION: Added template to include JSS article reference.
282    
283    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
284    
285            * R/textdoccol.R (tmMap): Introduced lazy mapping.
286    
287            * R/source.R: Added VectorSource.
288    
289    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
290    
291            * man/: Language codes should be in ISO 639-1 format.
292    
293            * R/textdoccol.R (asPlain): Preserve local meta data.
294    
295    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * R/textdoccol.R (writeCorpus): Function for writing a corpus
298            containing plain text documents to disk.
299    
300    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
301    
302            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
303            always set correctly.
304    
305            * R/textdoccol.R: Set load = TRUE as default for load on demand
306            since in most cases this is the wanted behaviour.
307    
308    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
309    
310            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
311    
312            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
313    
314    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/meta.R (meta): New function for consistent access to meta data
317            of document collections, repositories, and texts.
318    
319    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/: Better support for encodings.
322    
323    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
324    
325            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
326            selection when no reader argument is given.
327    
328    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
329    
330            * R/source.R (CSVSource): Now uses read.csv instead of scan
331            internally.
332    
333    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * R/reader.R (getReaders): Returns available reader functions.
336    
337            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
338            as default.
339    
340    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
341    
342            * R/stopwords.R (stopwords): Shortened code, removed codetools
343            variable warnings.
344    
345            * man/: Documentation for showMeta, added an example for tmMap.
346    
347            * inst/doc/tm.Rnw: Updated vignette, comments on MS Word reader,
348            some minor typos fixed.
349    
350    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * R/aobjects.R (showMeta): Added method for pretty printing a
353            text document's meta data.
354    
355    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/textdoccol.R (TextDocCol): Better handling of empty
358            arguments.
359    
360            * NAMESPACE: Exported readDOC.
361    
362            * man/completeStems.Rd: Added an example.
363    
364    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * R/stopwords.R (stopwords): Look up .dat files at every
367            call. Allows users to modify stopword .dat files interactively.
368    
369    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * R/termdocmatrix.R (termFreq): Correct processing of empty
372            documents.
373    
374    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * man/: Updated documentation.
377    
378    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/complete.R (completeStems): Completes (heuristically) word
381            stems.
382    
383            * R/termdocmatrix.R (TermDocMatrix2): New modular
384            constructor.
385    
386            * NAMESPACE: Exported termFreq.
387    
388    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
389    
390            * R/reader.R (readDOC): Added MS Word reader (using antiword).
391    
392    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
393    
394            * R/weight.R: Weighting functions for TermDocMatrix.
395    
396    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
397    
398            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
399            functions for accessing dimension, column, and row names.
400    
401            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
402    
403    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
406    
407    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
410    
411    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/reader.R (readPDF): Removed manual checks for pdftotext and
414            pdfinfo. The system call gives a warning anyway.
415    
416    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * R/textdoccol.R (asPlain): Conversion from
419            StructuredTextDocuments to PlainTextDocuments.
420    
421    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
424            for accessing term-document matrices.
425    
426            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
427            are installed.
428    
429    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
430    
431            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
432            Christian Buchta.
433    
434    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
435    
436            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
437    
438    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
439    
440            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
441    
442            * R/reader.R (readPDF): Added PDF reader.
443    
444    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
447    
448            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
449    
450            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
451    
452            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
453    
454    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/distmeasure.R (dissimilarity): Replaced dists call from
457            package cba by new dist call from package proxy.
458    
459    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
462    
463    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/termdocmatrix.R: require() uses the quietly option to suppress
466            loading messages.
467    
468    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * R/dictionary.R: Added dictionary support.
471    
472    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
475            documents. This simplifies some functions, e.g., asPlain.
476    
477    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * inst/doc/tm.Rnw: Fixed some typos in vignette.
480    
481    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * R/textdoccol.R (replaceWords): Added method to replace a set of
484            words by a single word. Useful for synonyms.
485    
486    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
487    
488            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
489    
490    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
491    
492            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
493            vectors. Thanks to Ariel Maguyon for his error report.
494            (removeSparseTerms): New function to remove columns from a
495            term-document matrix exceeding a sparse factor.
496    
497    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
500    
501    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * man/sFilter.Rd: Corrected documentation on statement format (use
504            '==' instead of '=').
505    
506    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * R/aobjects.R (StructuredTextDocument): Inherits from
509            TextDocument.
510    
511    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
514            on sparse matrices as proposed by Martin Maechler.
515    
516    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
517    
518            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
519            \pkg{filehash} version deprecates them.
520    
521    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
522    
523            * R/termdocmatrix.R (textvector): Stemming is now performed before
524            erasing stopwords.
525            (weightMatrix): Adapted to handle sparse matrices.
526            (TermDocMatrix): Sparse matrix is now efficiently built by
527            direct stepwise insertion of row values into it.
528    
529    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
530    
531            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
532            due to ongoing problems. For our purposes the latter is as useful
533            as the replaced package.
534    
535    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
538    
539            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
540    
541    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
542    
543            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
544            languages with available stopwords.
545    
546    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
547    
548            * inst/doc/tm.Rnw: Minor corrections in the vignette.
549    
550    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
551    
552            * DESCRIPTION: Update to version 0.2, since a lot of new features
553            have been integrated.
554    
555            * inst/stopwords: Updated existing stopwords and added stopwords
556            for various other languages.
557    
558    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
559    
560            * man/: Updated documentation.
561    
562            * Work/testDb.R: Script to test database stuff.
563    
564            * R/: Fixed various database related bugs. Seems to be rather
565            useable now, i.e., consider as alpha status for now.
566    
567    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
568    
569            * R/: Fixed some bugs related to database support.
570    
571    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
572    
573            * man/: Added a lot of examples to the manuals.
574    
575    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
576    
577            * man/: Updated parts of the documentation.
578    
579            * R/textdoccol.R (asPlain): Added conversion from newsgroup
580            documents to plain text documents.
581    
582    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
583    
584            * R/textdoccol.R: Finished experimental database support. Not yet
585            intensively tested.
586    
587            * R/source.R: Now each source has a default reader.
588    
589            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
590            class anymore.
591    
592            * R/plaintextdoc.R: Custom show method for plain text documents.
593    
594            * R/aobjects.R: Added a class for structured text documents.
595    
596            * R/reader.R: Replaced remaining \code{parser} occurrences with
597            \code{reader}.
598    
599            * R/textdoccol.R (summary): Indent tags.
600    
601            * R/textdoccol.R (removePunctuation): Transform method to remove
602            punctuation marks.
603    
604    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
605    
606            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
607            using prescindMeta().
608    
609    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
610    
611            * R/textdoccol.R: Improved database support.
612    
613    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
614    
615            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
616    
617            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
618            language code.
619    
620            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
621            into parserControl argument.
622    
623            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
624    
625    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * Work/tmDataSetup.R: The datasets acq and crude can now be
628            created on the fly.
629    
630            * R/stopwords.R: Introduced a function returning the stopwords for
631            a given language (English, German and French at the moment).
632    
633            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
634            otherwise falls back to Snowball package.
635    
636    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
637    
638            * man/dissimilarity-methods.Rd: Make clear that any method offered
639            by "dists" from package "cba" can be used.
640    
641    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
642    
643            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
644            to Kurt's LaTeX suggestion. Removed points and underscores in
645            variable names for consistent naming.
646    
647            * DESCRIPTION: Update to version 0.1-2.
648    
649            * man/TextRepository.Rd: Fixed bug in documentation.
650    
651    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
652    
653            * DESCRIPTION: Update to version 0.1-1.
654    
655    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
656    
657            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
658            wordStem.
659    
660    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
661    
662            * R/: Changes due to Kurt's review.
663    
664    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
665    
666            * R/: Implemented improvements based upon comments by David
667            Meyer.
668    
669    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
670    
671            * inst/doc/: Rewrote vignette.
672    
673            * man/: Improved documentation.
674    
675    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
676    
677            * man/: Updated documentation.
678    
679            * DESCRIPTION: Changed package name to "tm". Updated version to
680            0.1 for first CRAN release.
681    
682            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
683            list archive example.
684    
685            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
686            archive example.
687    
688            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
689            from (several mails per box) mbox format to (single mail per file)
690            eml format.
691    
692    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
693    
694            * data/crude.rda: Rebuilt.
695    
696            * data/acq.rda: Rebuilt.
697    
698            * R/reader.R: Factored out reader and parser methods from
699            textdoccol.R.
700    
701            * R/source.R: Factored out Source methods from aobjects.R and
702            textdoccol.R.
703            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
704            feeds.
705    
706            * R/textdoccol.R (DirSource): Added support for recursive
707            traversal of directories.
708    
709    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
710    
711            * R/textdoccol.R ([[): Loads the document corpus automatically
712            into memory upon access.
713            (tm_transform, tm_filter): Removed several checks whether the
714            document is already loaded ([[ ensures this now).
715            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
716            mailing list archive.
717    
718    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
719    
720            * R/aobjects.R (TextDocument): Is now a virtual class.
721            (Source): Is now a virtual class.
722    
723    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
724    
725            * R/textdoccol.R (c): Support for an arbitrary number of document
726            collections.
727    
728    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
729    
730            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
731            append_meta and remove_meta.
732    
733            * R/textdoccol.R: Removed modify_metadata method.
734    
735            * R/textrepo.R: Removed modify_metadata method.
736    
737            * R/textdoccol.R (remove_meta): Supports removal of document
738            collection metadata and document (= in data frame) metadata.
739    
740    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
741    
742            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
743    
744            * data/crude.rda: Rebuilt.
745    
746            * data/acq.rda: Rebuilt.
747    
748            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
749    
750            * R/textdoccol.R ([): Bug fix for subsetting a document
751            collection's data frame.
752    
753    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
754    
755            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
756            to s_filter.
757    
758            * R/textdoccol.R: Local text documents' metadata can now be copied
759            to a document collection's data frame with prescind_meta.
760    
761    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
762    
763            * R/: Text documents' slot metadata is now accessible in s_filter.
764    
765            * R/: Rewrote s_filter function (has still some restrictions).
766    
767    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
768    
769            * R/: Various fixes in handling metadata.
770    
771            * R/: Added update mechanism for text document collections.
772    
773    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
774    
775            * R/: Merging of document collections now creates a binary tree
776            for reconstructing merged document collections.
777    
778            * R/: Redesign of metadata for document collections.
779    
780    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
781    
782            * R/: Messages now use \code{ngettext}.
783    
784    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
785    
786            * R/: Added functions for modifying and removing metadata.
787    
788    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
789    
790            * man/: Updated some documentation.
791    
792            * R/: Corrected some connection issues.
793    
794            * inst/doc: Worked on the vignette.
795    
796    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
797    
798            * inst/: Added texts and started vignette.
799    
800            * R/: Final changes based upon David's comments.
801    
802    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
803    
804            * NAMESPACE: Corrected exports (generic methods need exportMethods
805            directives!).
806    
807    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
808    
809            * R/: Modified the TextDocCol constructor and various parsers. It
810            is now modular and supports various file formats via plugins (see
811            the new "Source" class).
812    
813    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
814    
815            * man/: Revised documentation after previous code changes.
816    
817    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
818    
819            * R/: Remaining changes as discussed with David.
820    
821    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
822    
823            * R/: Some changes as suggested by David. The rest will follow
824            within the next days.
825    
826    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * man/: Finished documentation.
829    
830    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
831    
832            * man/: Wrote some documentation.
833    
834    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
835    
836            * R/: Further syntactic sugar in form of additional assignment and
837            accessor methods.
838    
839    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
840    
841            * R/: Syntactic sugar in form of "length", "show" and "summary"
842            operators.
843    
844    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
845    
846            * R/: Diverse updates. Mainly on default operators ("[" or "c")
847            and dissimilarities.
848    
849    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * R/: Added similarity functions.
852    
853            * data/: Added English stopwords.
854    
855    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * data/: Examples compiled for new features.
858    
859            * R/: Changes due to new structure.
860    
861            * NAMESPACE: Corrected namespace to reflect new structure.
862    
863            * R/termdocmatrix.R: Adapted for new naming scheme.
864    
865    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
866    
867            * R/textdoccol.R: Adapted code for new class structure. Wrote
868            several transform and filter functions operating on text document
869            collections (alias text document databases).
870    
871            * R/aobjects.R: Adapted class structure with inheritance,
872            repositories and additional meta data. Loading files on demand is
873            now possible.
874    
875    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
876    
877            * R/: Some cosmetic cleanups.
878    
879            * inst/: Removed vignette on clustering. That and much more is now
880            described in the JSS paper on text mining. Based upon that
881            article an elaborated vignette will be incorporated in the future.
882    
883    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * R/: Updated generic S4 methods to comply with signature changes
886            in newer versions of R (> 2.3).
887    
888    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
889    
890            * ext/R/importRIS.R: Automatic RIS import is now possible.
891    
892    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * R/textdoccol.R: Added RIS HTML input format.
895    
896    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
897    
898            * R/textdoccol.R: Removed bug that caused invalid text document
899            collections when handling many input files.
900    
901    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
902    
903            * R/textdoccol.R: Restructured and extended file import
