Diff of /pkg/ChangeLog

Diff from trunk/R/trunk/ChangeLog revision 28 (Tue Dec 6 13:46:33 2005 UTC) to pkg/ChangeLog revision 1034 (Tue Jan 12 16:47:41 2010 UTC)
# Line 1  Line 1 
1    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/doc.R (`Content<-`): Be careful with names attribute.
4    
5    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
6    
7            * R/source.R (DirSource): Improved implementation especially when
8            handling many (>1M) files.
9    
10    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/source.R (getElem.URISource): Use encoding argument.
13    
14    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/doc.R (setOldClass): Register S3 document classes to be
17            recognized by S4 methods.
18    
19    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/matrix.R (termFreq): Add option to remove punctuation
22            characters.
23    
24    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
25    
26            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
27            merging multiple term-document matrices.
28    
29    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/corpus.R (setOldClass): Register S3 corpus classes to be
32            recognized by S4 methods.
33    
34            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
35            that CRAN Mac OS X builds do not fail any longer.
36    
37    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
38    
39            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
40            of RWeka::AlphabeticTokenizer() as default.
41    
42    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
45            caused words at the beginning or the end of a line not to be removed. Do
46            not delete whitespace anymore.
47    
48    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
49    
50            * R/source.R (DirSource): Default to working directory if no path
51            is specified.
52    
53    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
54    
55            * R/source.R (DirSource): Stop on empty directories.
56    
57    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
58    
59            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
60            named documents.
61    
62    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/transform.R (removeWords): Improve regular expressions.
65    
66    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
67    
68            * R/meta.R (DublinCore): Allow lower case tags.
69    
70    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
71    
72            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
73            instead of x$children.
74    
75    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
78    
79    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
80    
81            * R/: Use S3 instead of S4 class system.
82    
83    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
84    
85            * R/reader.R (readMail): Moved to tm.plugin.mail package.
86    
87    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
88    
89            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
90            postings are basically e-mails with some extra headers.
91    
92    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
93    
94            * R/transform.R: Move convertMboxEml, removeCitation,
95            removeMultipart, and removeSignature to the tm.plugin.mail package
96            since they are mainly utility functions (for handling e-mails) and
97            not very framework specific.
98    
99    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
100    
101            * man/: Fix documentation.
102    
103    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
104    
105            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
106            plain text document instead of an XML document for texts of the
107            Reuters-21578 dataset.
108    
109            * R/sparse.R: Removed since the slam package is now available on
110            CRAN.
111    
112            * DESCRIPTION (Depends): Add slam package.
113    
114    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/transform.R (stemDoc): Fix character(0) handling.
117    
118    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
119    
120            * R/doc.R (show): Pretty print.
121    
122    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
123    
124            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
125            gracefully.
126    
127    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
128    
129            * R/corpus.R: Make corpus virtual. Implement corpus with standard
130            and permanent storage semantics.
131    
132            * DESCRIPTION: New major release. A *lot* of improvements.
133    
134    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
135    
136            * NAMESPACE: Export some simple_triplet_matrix functions.
137    
138    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
139    
140            * R/weight.R: Adapt tf-idf to new matrix format.
141    
142    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
143    
144            * R/matrix.R: Create two distinct classes for term-document and
145            document-term matrices.
146    
147    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
148    
149            * R/termdocmatrix.R: No longer use Matrix package. This reduces
150            package start-up time significantly.
151    
152    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
153    
154            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
155    
156    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
157    
158            * R/transform.R (tmReduce): Combine multiple maps into one
159            transformation.
160    
161    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
162    
163            * R/weight.R: Remove weightLogical since it does not return a
164            dgCMatrix.
165    
166            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
167            or TermDocumentMatrix instead.
168    
169    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
170    
171            * inst/doc/extensions.Rnw: Finished vignette.
172    
173    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
174    
175            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
176            DocumentTermMatrix representations.
177    
178    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
179    
180            * R/reader.R (readXML): New reader for arbitrary XML files.
181    
182    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
185            (XMLSource): New XMLSource class for arbitrary XML files.
186            (Source): New slot Vectorized.
187    
188    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
189    
190            * R/reader.R (readTabular): Experimental reader for tabular data
191            structures which can be customized via user-defined mappings.
192    
193            * R/reader.R: Always use UTC time zone.
194    
195            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
196    
197    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
198    
199            * R/reader.R (readDOC): Options can be passed over to antiword.
200    
201            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
202            pdftotext.
203    
204    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
205    
206            * R/source.R (DirSource): Add pattern and ignore.case arguments
207            which are internally passed over to list.files().
208    
209    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
210    
211            * inst/doc/tm.Rnw: Suppress pointless loading message.
212    
213    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
214    
215            * DESCRIPTION: Speed up package loading (via moving packages not
216            strictly necessary for normal operation to Suggests instead of
217            Depends).
218    
219    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
220    
221            * R/reader.R (readNewsgroup): The date format is now configurable.
222    
223    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
224    
225            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
226    
227    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
228    
229            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
230    
231    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
232    
233            * R/source.R (DataframeSource): New source class for data frames.
234    
235            * R/source.R: Fixed non-standard call evaluation.
236    
237    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
238    
239            * R/source.R (URISource): New source class for a single document.
240    
241    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
242    
243            * R/source.R: Refactoring.
244    
245    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
246    
247            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
248            Rmpi installations more gracefully.
249    
250    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
251    
252            * R/source.R (Source): Add Length slot.
253    
254    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
255    
256            * R/AAA.R: Unify duplicated .onLoad function.
257    
258    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
259    
260            * DESCRIPTION (Suggests): Added Rmpi.
261    
262    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
263    
264            * R/source.R (getElem): Fix 'no visible binding' warning.
265    
266            * man/WeightFunction.Rd: Fix signature.
267    
268    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
269    
270            * R/weight.R: Introduce name abbreviations for weighting functions.
271    
272    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
273    
274            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
275    
276            * R/cluster.R: Provide convenience functions for using an MPI
277            cluster.
278    
279            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
280            available.
281    
282            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
283            available.
284    
285    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
286    
287            * R/textdoccol.R (lapply): Removed debug print out.
288    
289    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
290    
291            * R/reader.R (readRCV1): Improved meta data extraction from
292            Reuters Corpus Volume 1 documents.
293    
294    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
295    
296            * R/transform.R: Ensure that all mappings preserve multiline
297            structures.
298    
299    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * R/filter.R: Every filter now has an attribute indicating whether
302            it should be applied at the document level (doclevel).
303    
304            * R/textdoccol.R (tmFilter): Set searchFullText as new default
305            filter.
306    
307    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
308    
309            * R/transform.R (replacePatterns): Replaced removeWords by
310            replacePatterns. Suggested by Christian Buchta.
311    
312            * R/textdoccol.R (inspect): Improved formatting.
313    
314    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * inst/CITATION: Updated JSS article information.
317    
318            * R/textdoccol.R (setAs): Added coerce method from list to
319            corpus.
320    
321            * R/meta.R (meta): Improved meta data handling.
322    
323    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
324    
325            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
326            Christian Buchta.
327    
328            * inst/CITATION: Added template to include JSS article reference.
329    
330    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * R/textdoccol.R (tmMap): Introduced lazy mapping.
333    
334            * R/source.R: Added VectorSource.
335    
336    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
337    
338            * man/: Language codes should be in ISO 639-1 format.
339    
340            * R/textdoccol.R (asPlain): Preserve local meta data.
341    
342    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * R/textdoccol.R (writeCorpus): Function for writing a corpus
345            containing plain text documents to disk.
346    
347    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
348    
349            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
350            always set correctly.
351    
352            * R/textdoccol.R: Set load = TRUE as default for load on demand
353            since in most cases this is the wanted behaviour.
354    
355    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
358    
359            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
360    
361    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
362    
363            * R/meta.R (meta): New function for consistent access to meta data
364            of document collections, repositories, and texts.
365    
366    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
367    
368            * R/: Better support for encodings.
369    
370    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
373            selection when no reader argument is given.
374    
375    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
376    
377            * R/source.R (CSVSource): Now uses read.csv instead of scan
378            internally.
379    
380    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * R/reader.R (getReaders): Returns available reader functions.
383    
384            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
385            as default.
386    
387    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * R/stopwords.R (stopwords): Shortened code, removed codetools
390            variable warnings.
391    
392            * man/: Documentation for showMeta, added an example for tmMap.
393    
394            * inst/doc/tm.Rnw: Updated vignette, comments on MS Word reader,
395            some minor typos fixed.
396    
397    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
398    
399            * R/aobjects.R (showMeta): Added method for pretty printing a
400            text document's meta data.
401    
402    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/textdoccol.R (TextDocCol): Better handling of empty
405            arguments.
406    
407            * NAMESPACE: Exported readDOC.
408    
409            * man/completeStems.Rd: Added an example.
410    
411    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/stopwords.R (stopwords): Look up .dat files at every
414            call. Allows users to modify stopword .dat files interactively.
415    
416    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * R/termdocmatrix.R (termFreq): Correct processing of empty
419            documents.
420    
421    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * man/: Updated documentation.
424    
425    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/complete.R (completeStems): Completes (heuristically) word
428            stems.
429    
430            * R/termdocmatrix.R (TermDocMatrix2): New modular
431            constructor.
432    
433            * NAMESPACE: Exported termFreq.
434    
435    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * R/reader.R (readDOC): Added MS Word reader (using antiword).
438    
439    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/weight.R: Weighting functions for TermDocMatrix.
442    
443    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
444    
445            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
446            functions for accessing dimension, column, and row names.
447    
448            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
449    
450    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
453    
454    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
457    
458    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
459    
460            * R/reader.R (readPDF): Removed manual checks for pdftotext and
461            pdfinfo. The system call gives a warning anyway.
462    
463    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/textdoccol.R (asPlain): Conversion from
466            StructuredTextDocuments to PlainTextDocuments.
467    
468    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
471            for accessing term-document matrices.
472    
473            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
474            are installed.
475    
476    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
477    
478            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
479            Christian Buchta.
480    
481    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
484    
485    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
486    
487            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
488    
489            * R/reader.R (readPDF): Added PDF reader.
490    
491    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
492    
493            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
494    
495            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
496    
497            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
498    
499            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
500    
501    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * R/distmeasure.R (dissimilarity): Replaced dists call from
504            package cba by new dist call from package proxy.
505    
506    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
509    
510    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
511    
512            * R/termdocmatrix.R: require() uses the quietly option to suppress
513            loading messages.
514    
515    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
516    
517            * R/dictionary.R: Added dictionary support.
518    
519    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
522            documents. This simplifies some functions, e.g., asPlain.
523    
524    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
525    
526            * inst/doc/tm.Rnw: Fixed some typos in vignette.
527    
528    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
529    
530            * R/textdoccol.R (replaceWords): Added method to replace a set of
531            words by a single word. Useful for synonyms.
532    
533    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
534    
535            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
536    
537    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
538    
539            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
540            vectors. Thanks to Ariel Maguyon for his error report.
541            (removeSparseTerms): New function to remove columns from a
542            term-document matrix exceeding a sparse factor.
543    
544    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
545    
546            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
547    
548    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
549    
550            * man/sFilter.Rd: Corrected documentation on statement format (use
551            '==' instead of '=').
552    
553    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
554    
555            * R/aobjects.R (StructuredTextDocument): Inherits from
556            TextDocument.
557    
558    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
559    
560            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
561            on sparse matrices as proposed by Martin Maechler.
562    
563    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
564    
565            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
566            latest \pkg{filehash} version deprecates them.
567    
568    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
569    
570            * R/termdocmatrix.R (textvector): Stemming is now performed before
571            erasing stopwords.
572            (weightMatrix): Adapted to handle sparse matrices.
573            (TermDocMatrix): Sparse matrix is now efficiently built by
574            direct stepwise insertion of row values into it.
575    
576    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
577    
578            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
579            due to ongoing problems. For our purposes the latter is as useful
580            as the replaced package.
581    
582    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
583    
584            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
585    
586            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
587    
588    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
589    
590            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
591            languages with available stopwords.
592    
593    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
594    
595            * inst/doc/tm.Rnw: Minor corrections in the vignette.
596    
597    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
598    
599            * DESCRIPTION: Update to version 0.2, since a lot of new features
600            have been integrated.
601    
602            * inst/stopwords: Updated existing stopwords and added stopwords
603            for various other languages.
604    
605    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
606    
607            * man/: Updated documentation.
608    
609            * Work/testDb.R: Script to test database stuff.
610    
611            * R/: Fixed various database-related bugs. Seems to be rather
612            usable now, i.e., consider it alpha status for now.
613    
614    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
615    
616            * R/: Fixed some bugs related to database support.
617    
618    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
619    
620            * man/: Added a lot of examples to the manuals.
621    
622    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
623    
624            * man/: Updated parts of the documentation.
625    
626            * R/textdoccol.R (asPlain): Added conversion from newsgroup
627            documents to plain text documents.
628    
629    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
630    
631            * R/textdoccol.R: Finished experimental database support. Not yet
632            intensively tested.
633    
634            * R/source.R: Now each source has a default reader.
635    
636            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
637            class anymore.
638    
639            * R/plaintextdoc.R: Custom show method for plain text documents.
640    
641            * R/aobjects.R: Added a class for structured text documents.
642    
643            * R/reader.R: Replaced remaining \code{parser} occurrences with
644            \code{reader}.
645    
646            * R/textdoccol.R (summary): Indent tags.
647    
648            * R/textdoccol.R (removePunctuation): Transform method to remove
649            punctuation marks.
650    
651    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
652    
653            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
654            using prescindMeta().
655    
656    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
657    
658            * R/textdoccol.R: Improved database support.
659    
660    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
661    
662            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
663    
664            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
665            language code.
666    
667            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
668            into parserControl argument.
669    
670            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
671    
672    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
673    
674            * Work/tmDataSetup.R: The datasets acq and crude can now be
675            created on the fly.
676    
677            * R/stopwords.R: Introduced a function returning the stopwords for
678            a given language (English, German and French at the moment).
679    
680            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
681            otherwise falls back to Snowball package.
682    
683    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * man/dissimilarity-methods.Rd: Make clear that any method offered
686            by "dists" from package "cba" can be used.
687    
688    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
689    
690            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes bug according
691            to Kurt's LaTeX suggestion. Removed points and underscores in
692            variable names for consistent naming.
693    
694            * DESCRIPTION: Update to version 0.1-2.
695    
696            * man/TextRepository.Rd: Fixed bug in documentation.
697    
698    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
699    
700            * DESCRIPTION: Update to version 0.1-1.
701    
702    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
703    
704            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
705            wordStem.
706    
707    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
708    
709            * R/: Changes due to Kurt's review.
710    
711    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
712    
713            * R/: Implemented improvements based upon comments by David
714            Meyer.
715    
716    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
717    
718            * inst/doc/: Rewrote vignette.
719    
720            * man/: Improved documentation.
721    
722    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
723    
724            * man/: Updated documentation.
725    
726            * DESCRIPTION: Changed package name to "tm". Updated version to
727            0.1 for first CRAN release.
728    
729            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
730            list archive example.
731    
732            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
733            archive example.
734    
735            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
736            from (several mails per box) mbox format to (single mail per file)
737            eml format.
738    
739    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
740    
741            * data/crude.rda: Rebuilt.
742    
743            * data/acq.rda: Rebuilt.
744    
745            * R/reader.R: Factored out reader and parser methods from
746            textdoccol.R.
747    
748            * R/source.R: Factored out Source methods from aobjects.R and
749            textdoccol.R.
750            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
751            feeds.
752    
753            * R/textdoccol.R (DirSource): Added support for recursive
754            traversal of directories.
755    
756    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
757    
758            * R/textdoccol.R ([[): Loads the document corpus automatically
759            into memory upon access.
760            (tm_transform, tm_filter): Removed several checks whether the
761            document is already loaded ([[ ensures this now).
762            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
763            mailing list archive.
764    
765    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
766    
767            * R/aobjects.R (TextDocument): Is now a virtual class.
768            (Source): Is now a virtual class.
769    
770    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
771    
772            * R/textdoccol.R (c): Support for an arbitrary number of document
773            collections.
774    
775    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
776    
777            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
778            append_meta and remove_meta.
779    
780            * R/textdoccol.R: Removed modify_metadata method.
781    
782            * R/textrepo.R: Removed modify_metadata method.
783    
784            * R/textdoccol.R (remove_meta): Supports removal of document
785            collection metadata and document (= in data frame) metadata.
786    
787    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
788    
789            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
790    
791            * data/crude.rda: Rebuilt.
792    
793            * data/acq.rda: Rebuilt.
794    
795            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
796    
797            * R/textdoccol.R ([): Bug fix for subsetting a document
798            collection's data frame.
799    
800    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
801    
802            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
803            to s_filter.
804    
805            * R/textdoccol.R: Local text documents' metadata can now be copied
806            to a document collection's data frame with prescind_meta.
807    
808    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * R/: Text documents' slot metadata is now accessible in s_filter.
811    
812            * R/: Rewrote s_filter function (still has some restrictions).
813    
814    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
815    
816            * R/: Various fixes in handling metadata.
817    
818            * R/: Added update mechanism for text document collections.
819    
820    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
821    
822            * R/: Merging of document collections now creates a binary tree
823            for reconstructing merged document collections.
824    
825            * R/: Redesign of metadata for document collections.
826    
827    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
828    
829            * R/: Messages now use \code{ngettext}.
830    
831    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/: Added functions for modifying and removing metadata.
834    
835    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * man/: Updated some documentation.
838    
839            * R/: Corrected some connection issues.
840    
841            * inst/doc: Worked on the vignette.
842    
843    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
844    
845            * inst/: Added texts and started vignette.
846    
847            * R/: Final changes based upon David's comments.
848    
849    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * NAMESPACE: Corrected exports (generic methods need exportMethods
852            directives!).
853    
854    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
855    
856            * R/: Modified the TextDocCol constructor and various parsers. It
857            is now modular and supports various file formats via plugins (see
858            the new "Source" class).
859    
860    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
861    
862            * man/: Revised documentation after previous code changes.
863    
864    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
865    
866            * R/: Remaining changes as discussed with David.
867    
868    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
869    
870            * R/: Some changes as suggested by David. The rest will follow
871            within the next few days.
872    
873    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
874    
875            * man/: Finished documentation.
876    
877    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
878    
879            * man/: Wrote some documentation.
880    
881    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
882    
883            * R/: Further syntactic sugar in form of additional assignment and
884            accessor methods.
885    
886    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
887    
888            * R/: Syntactic sugar in form of "length", "show" and "summary"
889            operators.
890    
891    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
892    
893            * R/: Diverse updates. Mainly on default operators ("[" or "c")
894            and dissimilarities.
895    
896    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
897    
898            * R/: Added similarity functions.
899    
900            * data/: Added English stopwords.
901    
902    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
903    
904            * data/: Examples compiled for new features.
905    
906            * R/: Changes due to new structure.
907    
908            * NAMESPACE: Corrected namespace to reflect new structure.
909    
910            * R/termdocmatrix.R: Adapted for new naming scheme.
911    
912    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
913    
914            * R/textdoccol.R: Adapted code for new class structure. Wrote
915            several transform and filter functions operating on text document
916            collections (alias text document databases).
917    
918            * R/aobjects.R: Adapted class structure with inheritance,
919            repositories and additional meta data. Loading files on demand is
920            now possible.
921    
922    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
923    
924            * R/: Some cosmetic cleanups.
925    
926            * inst/: Removed vignette on clustering. That and much more is now
927            described in the JSS paper on text mining. An elaborate vignette
928            based on that article will be incorporated in the future.
929    
930    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
931    
932            * R/: Updated generic S4 methods to comply with signature changes
933            in newer versions of R (> 2.3).
934    
935    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
936    
937            * ext/R/importRIS.R: Automatic RIS import is now possible.
938    
939    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
940    
941            * R/textdoccol.R: Added RIS HTML input format.
942    
943    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
944    
945            * R/textdoccol.R: Removed bug that caused invalid text document
946            collections when handling many input files.
947    
948    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
949    
950            * R/textdoccol.R: Restructured and extended file import
951            mechanism.
952    
953            * inst/doc/clustering.Rnw: Adapted vignette for use with
954            ReutNews.rda
955    
956            * man/ReutNews.Rd: Documentation for ReutNews.rda
957    
958            * data/ReutNews.rda: A tiny Reuters21578 example data set.
959    
960    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
961    
962            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
963            clustering facilities of this package.
964    
965    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
966    
967            * R/aobjects.R: Changed package document structure to avoid class
968            dependency problems.
969    
970    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
971    
972            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
973            data set.
974    
975            * Finished documentation and reordered directory structure. Now "R
976            CMD check textmin" works without errors.
977    
