[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
pkg/ChangeLog revision 1032, Thu Jan 7 12:09:51 2010 UTC
1    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
2    
3            * R/source.R (DirSource): Improved implementation especially when
4            handling many (>1M) files.
5    
6    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/source.R (getElem.URISource): Use encoding argument.
9    
10    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/doc.R (setOldClass): Register S3 document classes to be
13            recognized by S4 methods.
14    
15    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
16    
17            * R/matrix.R (termFreq): Add option to remove punctuation
18            characters.
19    
20    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
21    
22            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
23            merging multiple term-document matrices.
24    
25    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
26    
27            * R/corpus.R (setOldClass): Register S3 corpus classes to be
28            recognized by S4 methods.
29    
30            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
31            that CRAN Mac OS X builds do not fail any longer.
32    
33    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
34    
35            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
36            of RWeka::AlphabeticTokenizer() as default.
37    
38    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
41            caused words at the beginning or the end of a line not to be removed. Do
42            not delete whitespace anymore.
43    
44    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
45    
46            * R/source.R (DirSource): Default to working directory if no path
47            is specified.
48    
49    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
50    
51            * R/source.R (DirSource): Stop on empty directories.
52    
53    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
54    
55            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
56            named documents.
57    
58    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
59    
60            * R/transform.R (removeWords): Improve regular expressions.
61    
62    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/meta.R (DublinCore): Allow lower case tags.
65    
66    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
67    
68            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
69            instead of x$children.
70    
71    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
72    
73            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
74    
75    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/: Use S3 instead of S4 class system.
78    
79    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
80    
81            * R/reader.R (readMail): Moved to tm.plugin.mail package.
82    
83    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
84    
85            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
86            postings are basically e-mails with some extra headers.
87    
88    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
89    
90            * R/transform.R: Move convertMboxEml, removeCitation,
91            removeMultipart, and removeSignature to the tm.plugin.mail package
92            since they are mainly utility functions (for handling e-mails) and
93            not very framework specific.
94    
95    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
96    
97            * man/: Fix documentation.
98    
99    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
100    
101            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
102            plain text document instead of an XML document for texts of the
103            Reuters-21578 dataset.
104    
105            * R/sparse.R: Removed since the slam package is now available on
106            CRAN.
107    
108            * DESCRIPTION (Depends): Add slam package.
109    
110    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/transform.R (stemDoc): Fix character(0) handling.
113    
114    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/doc.R (show): Pretty print.
117    
118    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
119    
120            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
121            gracefully.
122    
123    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
124    
125            * R/corpus.R: Make corpus virtual. Implement corpus with standard
126            and permanent storage semantics.
127    
128            * DESCRIPTION: New major release. A *lot* of improvements.
129    
130    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
131    
132            * NAMESPACE: Export some simple_triplet_matrix functions.
133    
134    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
135    
136            * R/weight.R: Adapt tf-idf to new matrix format.
137    
138    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
139    
140            * R/matrix.R: Create two distinct classes for term-document and
141            document-term matrices.
142    
143    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
144    
145            * R/termdocmatrix.R: No longer use Matrix package. This reduces
146            package start-up time significantly.
147    
148    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
149    
150            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
151    
152    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
153    
154            * R/transform.R (tmReduce): Combine multiple maps into one
155            transformation.
156    
157    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
158    
159            * R/weight.R: Remove weightLogical since it does not return a
160            dgCMatrix.
161    
162            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
163            or TermDocumentMatrix instead.
164    
165    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
166    
167            * inst/doc/extensions.Rnw: Finished vignette.
168    
169    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
170    
171            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
172            DocumentTermMatrix representations.
173    
174    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
175    
176            * R/reader.R (readXML): New reader for arbitrary XML files.
177    
178    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
179    
180            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
181            (XMLSource): New XMLSource class for arbitrary XML files.
182            (Source): New slot Vectorized.
183    
184    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/reader.R (readTabular): Experimental reader for tabular data
187            structures which can be customized via user-defined mappings.
188    
189            * R/reader.R: Always use UTC time zone.
190    
191            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
192    
193    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
194    
195            * R/reader.R (readDOC): Options can be passed over to antiword.
196    
197            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
198            pdftotext.
199    
200    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
201    
202            * R/source.R (DirSource): Add pattern and ignore.case arguments
203            which are internally passed over to list.files().
204    
205    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
206    
207            * inst/doc/tm.Rnw: Suppress pointless loading message.
208    
209    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
210    
211            * DESCRIPTION: Speed up package loading (via moving packages not
212            strictly necessary for normal operation to Suggests instead of
213            Depends).
214    
215    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
216    
217            * R/reader.R (readNewsgroup): The date format is now configurable.
218    
219    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
220    
221            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
222    
223    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
224    
225            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
226    
227    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
228    
229            * R/source.R (DataframeSource): New source class for data frames.
230    
231            * R/source.R: Fixed non-standard call evaluation.
232    
233    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
234    
235            * R/source.R (URISource): New source class for a single document.
236    
237    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
238    
239            * R/source.R: Refactoring.
240    
241    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
242    
243            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
244            Rmpi installations more gracefully.
245    
246    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
247    
248            * R/source.R (Source): Add Length slot.
249    
250    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
251    
252            * R/AAA.R: Unify duplicated .onLoad function.
253    
254    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
255    
256            * DESCRIPTION (Suggests): Added Rmpi.
257    
258    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
259    
260            * R/source.R (getElem): Fix 'no visible binding' warning.
261    
262            * man/WeightFunction.Rd: Fix signature.
263    
264    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
265    
266            * R/weight.R: Introduce name abbreviations for weighting functions.
267    
268    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
269    
270            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
271    
272            * R/cluster.R: Provide convenience functions for using an MPI
273            cluster.
274    
275            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
276            available.
277    
278            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
279            available.
280    
281    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
282    
283            * R/textdoccol.R (lapply): Removed debug print out.
284    
285    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
286    
287            * R/reader.R (readRCV1): Improved meta data extraction from
288            Reuters Corpus Volume 1 documents.
289    
290    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
291    
292            * R/transform.R: Ensure that all mappings preserve multiline
293            structures.
294    
295    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * R/filter.R: Every filter now has an attribute indicating whether
298            it should be applied at document level (doclevel).
299    
300            * R/textdoccol.R (tmFilter): Set searchFullText as new default
301            filter.
302    
303    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
304    
305            * R/transform.R (replacePatterns): Replaced removeWords by
306            replacePatterns. Suggested by Christian Buchta.
307    
308            * R/textdoccol.R (inspect): Improved formatting.
309    
310    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * inst/CITATION: Updated JSS article information.
313    
314            * R/textdoccol.R (setAs): Added coerce method from list to
315            corpus.
316    
317            * R/meta.R (meta): Improved meta data handling.
318    
319    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
322            Christian Buchta.
323    
324            * inst/CITATION: Added template to include JSS article reference.
325    
326    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * R/textdoccol.R (tmMap): Introduced lazy mapping.
329    
330            * R/source.R: Added VectorSource.
331    
332    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
333    
334            * man/: Language codes should be in ISO 639-1 format.
335    
336            * R/textdoccol.R (asPlain): Preserve local meta data.
337    
338    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * R/textdoccol.R (writeCorpus): Function for writing a corpus
341            containing plain text documents to disk.
342    
343    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
344    
345            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
346            always set correctly.
347    
348            * R/textdoccol.R: Set load = TRUE as default for load on demand
349            since in most cases this is the wanted behaviour.
350    
351    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
354    
355            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
356    
357    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
358    
359            * R/meta.R (meta): New function for consistent access to meta data
360            of document collections, repositories, and texts.
361    
362    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
363    
364            * R/: Better support for encodings.
365    
366    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
367    
368            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
369            selection when no reader argument is given.
370    
371    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
372    
373            * R/source.R (CSVSource): Now uses read.csv instead of scan
374            internally.
375    
376    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/reader.R (getReaders): Returns available reader functions.
379    
380            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
381            as default.
382    
383    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * R/stopwords.R (stopwords): Shortened code, removed codetools
386            variable warnings.
387    
388            * man/: Documentation for showMeta, added an example for tmMap.
389    
390            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
391            some minor typos fixed.
392    
393    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * R/aobjects.R (showMeta): Added method for pretty printing a
396            text document's meta data.
397    
398    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/textdoccol.R (TextDocCol): Better handling of empty
401            arguments.
402    
403            * NAMESPACE: Exported readDOC.
404    
405            * man/completeStems.Rd: Added an example.
406    
407    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/stopwords.R (stopwords): Look up .dat files at every
410            call. Allows users to modify stopword .dat files interactively.
411    
412    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/termdocmatrix.R (termFreq): Correct processing of empty
415            documents.
416    
417    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
418    
419            * man/: Updated documentation.
420    
421    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * R/complete.R (completeStems): Completes (heuristically) word
424            stems.
425    
426            * R/termdocmatrix.R (TermDocMatrix2): New modular
427            constructor.
428    
429            * NAMESPACE: Exported termFreq.
430    
431    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * R/reader.R (readDOC): Added MS Word reader (using antiword).
434    
435    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * R/weight.R: Weighting functions for TermDocMatrix.
438    
439    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
442            functions for accessing dimension, column, and row names.
443    
444            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
445    
446    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
449    
450    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
453    
454    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/reader.R (readPDF): Removed manual checks for pdftotext and
457            pdfinfo. The system call gives a warning anyway.
458    
459    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * R/textdoccol.R (asPlain): Conversion from
462            StructuredTextDocuments to PlainTextDocuments.
463    
464    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
465    
466            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
467            for accessing term-document matrices.
468    
469            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
470            are installed.
471    
472    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
475            Christian Buchta.
476    
477    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
480    
481    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
484    
485            * R/reader.R (readPDF): Added PDF reader.
486    
487    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
490    
491            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
492    
493            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
494    
495            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
496    
497    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/distmeasure.R (dissimilarity): Replaced dists call from
500            package cba by new dist call from package proxy.
501    
502    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
505    
506    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * R/termdocmatrix.R: require() uses the quietly option to suppress
509            loading messages.
510    
511    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * R/dictionary.R: Added dictionary support.
514    
515    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
516    
517            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
518            documents. This simplifies some functions, e.g., asPlain.
519    
520    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
521    
522            * inst/doc/tm.Rnw: Fixed some typos in vignette.
523    
524    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
525    
526            * R/textdoccol.R (replaceWords): Added method to replace a set of
527            words by a single word. Useful for synonyms.
528    
529    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
530    
531            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
532    
533    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
534    
535            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
536            vectors. Thanks to Ariel Maguyon for his error report.
537            (removeSparseTerms): New function to remove columns from a
538            term-document matrix exceeding a sparse factor.
539    
540    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
543    
544    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
545    
546            * man/sFilter.Rd: Corrected documentation on statement format (use
547            '==' instead of '=').
548    
549    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
550    
551            * R/aobjects.R (StructuredTextDocument): Inherits from
552            TextDocument.
553    
554    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
555    
556            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
557            on sparse matrices as proposed by Martin Maechler.
558    
559    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
560    
561            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
562            \pkg{filehash} version deprecates them.
563    
564    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
565    
566            * R/termdocmatrix.R (textvector): Stemming is now performed before
567            erasing stopwords.
568            (weightMatrix): Adapted to handle sparse matrices.
569            (TermDocMatrix): Sparse matrix is now efficiently built by
570            direct stepwise insertion of row values into it.
571    
572    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
573    
574            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
575            due to ongoing problems. For our purposes the latter is as useful
576            as the replaced package.
577    
578    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
579    
580            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
581    
582            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
583    
584    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
585    
586            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
587            languages with available stopwords.
588    
589    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
590    
591            * inst/doc/tm.Rnw: Minor corrections in the vignette.
592    
593    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
594    
595            * DESCRIPTION: Update to version 0.2, since a lot of new features
596            have been integrated.
597    
598            * inst/stopwords: Updated existing stopwords and added stopwords
599            for various other languages.
600    
601    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
602    
603            * man/: Updated documentation.
604    
605            * Work/testDb.R: Script to test database stuff.
606    
607            * R/: Fixed various database related bugs. Seems to be rather
608            usable now, i.e., consider as alpha status for now.
609    
610    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
611    
612            * R/: Fixed some bugs related to database support.
613    
614    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
615    
616            * man/: Added a lot of examples to the manuals.
617    
618    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
619    
620            * man/: Updated parts of the documentation.
621    
622            * R/textdoccol.R (asPlain): Added conversion from newsgroup
623            documents to plain text documents.
624    
625    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * R/textdoccol.R: Finished experimental database support. Not yet
628            intensively tested.
629    
630            * R/source.R: Now each source has a default reader.
631    
632            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
633            class anymore.
634    
635            * R/plaintextdoc.R: Custom show method for plain text documents.
636    
637            * R/aobjects.R: Added a class for structured text documents.
638    
639            * R/reader.R: Replaced remaining \code{parser} occurrences with
640            \code{reader}.
641    
642            * R/textdoccol.R (summary): Indent tags.
643    
644            * R/textdoccol.R (removePunctuation): Transform method to remove
645            punctuation marks.
646    
647    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
648    
649            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
650            using prescindMeta().
651    
652    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * R/textdoccol.R: Improved database support.
655    
656    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
657    
658            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
659    
660            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
661            language code.
662    
663            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
664            into parserControl argument.
665    
666            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
667    
668    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
669    
670            * Work/tmDataSetup.R: The datasets acq and crude can now be
671            created on the fly.
672    
673            * R/stopwords.R: Introduced a function returning the stopwords for
674            a given language (English, German and French at the moment).
675    
676            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
677            otherwise falls back to Snowball package.
678    
679    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
680    
681            * man/dissimilarity-methods.Rd: Make clear that any method offered
682            by "dists" from package "cba" can be used.
683    
684    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
687            to Kurt's LaTeX suggestion. Removed points and underscores in
688            variable names for consistent naming.
689    
690            * DESCRIPTION: Update to version 0.1-2.
691    
692            * man/TextRepository.Rd: Fixed bug in documentation.
693    
694    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
695    
696            * DESCRIPTION: Update to version 0.1-1.
697    
698    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
699    
700            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
701            wordStem.
702    
703    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
704    
705            * R/: Changes due to Kurt's review.
706    
707    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
708    
709            * R/: Implemented improvements based upon comments by David
710            Meyer.
711    
712    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
713    
714            * inst/doc/: Rewrote vignette.
715    
716            * man/: Improved documentation.
717    
718    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
719    
720            * man/: Updated documentation.
721    
722            * DESCRIPTION: Changed package name to "tm". Updated version to
723            0.1 for first CRAN release.
724    
725            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
726            list archive example.
727    
728            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
729            archive example.
730    
731            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
732            from (several mails per box) mbox format to (single mail per file)
733            eml format.
734    
735    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
736    
737            * data/crude.rda: Rebuilt.
738    
739            * data/acq.rda: Rebuilt.
740    
741            * R/reader.R: Factored out reader and parser methods from
742            textdoccol.R.
743    
744            * R/source.R: Factored out Source methods from aobjects.R and
745            textdoccol.R.
746            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
747            feeds.
748    
749            * R/textdoccol.R (DirSource): Added support for recursive
750            traversal of directories.
751    
752    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
753    
754            * R/textdoccol.R ([[): Loads the document corpus automatically
755            into memory upon access.
756            (tm_transform, tm_filter): Removed several checks whether the
757            document is already loaded ([[ ensures this now).
758            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
759            mailing list archive.
760    
761    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
762    
763            * R/aobjects.R (TextDocument): Is now a virtual class.
764            (Source): Is now a virtual class.
765    
766    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
767    
768            * R/textdoccol.R (c): Support for an arbitrary number of document
769            collections.
770    
771    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
772    
773            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
774            append_meta and remove_meta.
775    
776            * R/textdoccol.R: Removed modify_metadata method.
777    
778            * R/textrepo.R: Removed modify_metadata method.
779    
780            * R/textdoccol.R (remove_meta): Supports removal of document
781            collection metadata and document (= in data frame) metadata.
782    
783    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
784    
785            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
786    
787            * data/crude.rda: Rebuilt.
788    
789            * data/acq.rda: Rebuilt.
790    
791            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
792    
793            * R/textdoccol.R ([): Bug fix for subsetting a document
794            collection's data frame.
795    
796    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
797    
798            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
799            to s_filter.
800    
801            * R/textdoccol.R: Local text documents' metadata can now be copied
802            to a document collection's data frame with prescind_meta.
803    
804    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
805    
806            * R/: Text documents' slot metadata is now accessible in s_filter.
807    
808            * R/: Rewrote s_filter function (still has some restrictions).
809    
810    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
811    
812            * R/: Various fixes in handling metadata.
813    
814            * R/: Added update mechanism for text document collections.
815    
816    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
817    
818            * R/: Merging of document collections now creates a binary tree
819            for reconstructing merged document collections.
820    
821            * R/: Redesign of metadata for document collections.
822    
823    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * R/: Messages now use \code{ngettext}.
826    
827    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
828    
829            * R/: Added functions for modifying and removing metadata.
830    
831    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * man/: Updated some documentation.
834    
835            * R/: Corrected some connection issues.
836    
837            * inst/doc: Worked on the vignette.
838    
839    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
840    
841            * inst/: Added texts and started vignette.
842    
843            * R/: Final changes based upon David's comments.
844    
845    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
846    
847            * NAMESPACE: Corrected exports (generic methods need exportMethods
848            directives!).
849    
850    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
851    
852            * R/: Modified the TextDocCol constructor and various parsers. It
853            is now modular and supports various file formats via plugins (see
854            the new "Source" class).
855    
856    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
857    
858            * man/: Revised documentation after previous code changes.
859    
860    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
861    
862            * R/: Remaining changes as discussed with David.
863    
864    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
865    
866            * R/: Some changes as suggested by David. The rest will follow
867            within the next few days.
868    
869    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
870    
871            * man/: Finished documentation.
872    
873    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
874    
875            * man/: Wrote some documentation.
876    
877    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
878    
879            * R/: Further syntactic sugar in form of additional assignment and
880            accessor methods.
881    
882    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
883    
884            * R/: Syntactic sugar in form of "length", "show" and "summary"
885            operators.
886    
887    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/: Diverse updates. Mainly on default operators ("[" or "c")
890            and dissimilarities.
891    
892    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * R/: Added similarity functions.
895    
896            * data/: Added English stopwords.
897    
898    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
899    
900            * data/: Examples compiled for new features.
901    
902            * R/: Changes due to new structure.
903    
904            * NAMESPACE: Corrected namespace to reflect new structure.
905    
906            * R/termdocmatrix.R: Adapted for new naming scheme.
907    
908    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
909    
910            * R/textdoccol.R: Adapted code for new class structure. Wrote
911            several transform and filter functions operating on text document
912            collections (alias text document databases).
913    
914            * R/aobjects.R: Adapted class structure with inheritance,
915            repositories and additional metadata. Loading files on demand is
916            now possible.
917    
918    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
919    
920            * R/: Some cosmetic cleanups.
921    
922            * inst/: Removed vignette on clustering. That and much more is now
923            described in the JSS paper on text mining. Based upon that
924            article an elaborated vignette will be incorporated in the future.
925    
926    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
927    
928            * R/: Updated generic S4 methods to comply with signature changes
929            in newer versions of R (> 2.3).
930    
931    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
932    
933            * ext/R/importRIS.R: Automatic RIS import is now possible.
934    
935    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
936    
937            * R/textdoccol.R: Added RIS HTML input format.
938    
939    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
940    
941            * R/textdoccol.R: Removed bug that caused invalid text document
942            collections when handling many input files.
943    
944    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
945    
946            * R/textdoccol.R: Restructured and extended file import
