[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
pkg/ChangeLog revision 1021, Tue Nov 17 16:37:22 2009 UTC
1    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/corpus.R (setOldClass): Register S3 corpus classes to be
4            recognized by S4 methods.
5    
6            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
7            that CRAN Mac OS X builds do not fail any longer.
8    
9    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
10    
11            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
12            of RWeka::AlphabeticTokenizer() as default.
13    
14    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
17            caused words at the beginning or the end of a line not to be removed. Do
18            not delete whitespace anymore.
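
The boundary handling described in this entry can be illustrated with a minimal sketch. It is written in Python rather than the package's R, it is not the code in R/transform.R, and the function name `remove_words` is hypothetical. It assumes that a word-boundary regular expression is an acceptable way to match words at the start or end of a line, and it leaves all whitespace in place.

```python
import re


def remove_words(text, words):
    """Remove whole-word occurrences of `words` from `text`.

    Hypothetical helper for illustration only. Whitespace is never deleted,
    so the gaps around removed words remain in the result.
    """
    if not words:
        return text
    # \b also matches at the very start and end of the string, so words in
    # those positions are removed just like words in the middle.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.sub(pattern, "", text)
```

For example, `remove_words("the cat sat on the", ["the"])` returns `" cat sat on "`: both occurrences are removed, including the ones at the line boundaries, and the surrounding spaces are kept.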
19    
20    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
21    
22            * R/source.R (DirSource): Default to working directory if no path
23            is specified.
24    
25    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
26    
27            * R/source.R (DirSource): Stop on empty directories.
28    
29    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
32            named documents.
33    
34    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
35    
36            * R/transform.R (removeWords): Improve regular expressions.
37    
38    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/meta.R (DublinCore): Allow lower case tags.
41    
42    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
45            instead of x$children.
46    
47    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
50    
51    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
52    
53            * R/: Use S3 instead of S4 class system.
54    
55    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
56    
57            * R/reader.R (readMail): Moved to tm.plugin.mail package.
58    
59    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
60    
61            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
62            postings are basically e-mails with some extra headers.
63    
64    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
65    
66            * R/transform.R: Move convertMboxEml, removeCitation,
67            removeMultipart, and removeSignature to the tm.plugin.mail package
68            since they are mainly utility functions (for handling e-mails) and
69            not very framework specific.
70    
71    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
72    
73            * man/: Fix documentation.
74    
75    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
78            plain text document instead of an XML document for texts of the
79            Reuters-21578 dataset.
80    
81            * R/sparse.R: Removed since the slam package is now available on
82            CRAN.
83    
84            * DESCRIPTION (Depends): Add slam package.
85    
86    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/transform.R (stemDoc): Fix character(0) handling.
89    
90    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/doc.R (show): Pretty print.
93    
94    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
95    
96            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
97            gracefully.
98    
99    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
100    
101            * R/corpus.R: Make corpus virtual. Implement corpus with standard
102            and permanent storage semantics.
103    
104            * DESCRIPTION: New major release. A *lot* of improvements.
105    
106    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
107    
108            * NAMESPACE: Export some simple_triplet_matrix functions.
109    
110    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/weight.R: Adapt tf-idf to new matrix format.
113    
114    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/matrix.R: Create two distinct classes for term-document and
117            document-term matrices.
118    
119    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
120    
121            * R/termdocmatrix.R: No longer use Matrix package. This reduces
122            package start-up time significantly.
123    
124    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
125    
126            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
127    
128    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
129    
130            * R/transform.R (tmReduce): Combine multiple maps into one
131            transformation.
132    
133    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
134    
135            * R/weight.R: Remove weightLogical since it does not return a
136            dgCMatrix.
137    
138            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
139            or TermDocumentMatrix instead.
140    
141    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
142    
143            * inst/doc/extensions.Rnw: Finished vignette.
144    
145    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
148            DocumentTermMatrix representations.
149    
150    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/reader.R (readXML): New reader for arbitrary XML files.
153    
154    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
155    
156            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
157            (XMLSource): New XMLSource class for arbitrary XML files.
158            (Source): New slot Vectorized.
159    
160    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
161    
162            * R/reader.R (readTabular): Experimental reader for tabular data
163            structures which can be customized via user-defined mappings.
164    
165            * R/reader.R: Always use UTC time zone.
166    
167            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
168    
169    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
170    
171            * R/reader.R (readDOC): Options can be passed over to antiword.
172    
173            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
174            pdftotext.
175    
176    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/source.R (DirSource): Add pattern and ignore.case arguments
179            which are internally passed over to list.files().
180    
181    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
182    
183            * inst/doc/tm.Rnw: Suppress pointless loading message.
184    
185    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
186    
187            * DESCRIPTION: Speed up package loading (via moving packages not
188            strictly necessary for normal operation to Suggests instead of
189            Depends).
190    
191    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
192    
193            * R/reader.R (readNewsgroup): The date format is now configurable.
194    
195    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
196    
197            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
198    
199    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
200    
201            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
202    
203    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
204    
205            * R/source.R (DataframeSource): New source class for data frames.
206    
207            * R/source.R: Fixed non-standard call evaluation.
208    
209    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
210    
211            * R/source.R (URISource): New source class for a single document.
212    
213    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
214    
215            * R/source.R: Refactoring.
216    
217    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
218    
219            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
220            Rmpi installations more gracefully.
221    
222    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
223    
224            * R/source.R (Source): Add Length slot.
225    
226    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
227    
228            * R/AAA.R: Unify duplicated .onLoad function.
229    
230    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
231    
232            * DESCRIPTION (Suggests): Added Rmpi.
233    
234    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
235    
236            * R/source.R (getElem): Fix 'no visible binding' warning.
237    
238            * man/WeightFunction.Rd: Fix signature.
239    
240    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
241    
242            * R/weight.R: Introduce name abbreviations for weighting functions.
243    
244    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
245    
246            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
247    
248            * R/cluster.R: Provide convenience functions for using an MPI
249            cluster.
250    
251            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
252            available.
253    
254            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
255            available.
256    
257    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
258    
259            * R/textdoccol.R (lapply): Removed debug print out.
260    
261    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
262    
263            * R/reader.R (readRCV1): Improved meta data extraction from
264            Reuters Corpus Volume 1 documents.
265    
266    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
267    
268            * R/transform.R: Ensure that all mappings preserve multiline
269            structures.
270    
271    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
272    
273            * R/filter.R: Every filter now has an attribute indicating whether
274            it should be applied at the document level (doclevel).
275    
276            * R/textdoccol.R (tmFilter): Set searchFullText as new default
277            filter.
278    
279    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
280    
281            * R/transform.R (replacePatterns): Replaced removeWords by
282            replacePatterns. Suggested by Christian Buchta.
283    
284            * R/textdoccol.R (inspect): Improved formatting.
285    
286    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
287    
288            * inst/CITATION: Updated JSS article information.
289    
290            * R/textdoccol.R (setAs): Added coerce method from list to
291            corpus.
292    
293            * R/meta.R (meta): Improved meta data handling.
294    
295    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
298            Christian Buchta.
299    
300            * inst/CITATION: Added template to include JSS article reference.
301    
302    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
303    
304            * R/textdoccol.R (tmMap): Introduced lazy mapping.
305    
306            * R/source.R: Added VectorSource.
307    
308    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
309    
310            * man/: Language codes should be in ISO 639-1 format.
311    
312            * R/textdoccol.R (asPlain): Preserve local meta data.
313    
314    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/textdoccol.R (writeCorpus): Function for writing a corpus
317            containing plain text documents to disk.
318    
319    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
322            always set correctly.
323    
324            * R/textdoccol.R: Set load = TRUE as default for load on demand
325            since in most cases this is the wanted behaviour.
326    
327    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
328    
329            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
330    
331            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
332    
333    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * R/meta.R (meta): New function for consistent access to meta data
336            of document collections, repositories, and texts.
337    
338    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * R/: Better support for encodings.
341    
342    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
345            selection when no reader argument is given.
346    
347    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
348    
349            * R/source.R (CSVSource): Now uses read.csv instead of scan
350            internally.
351    
352    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
353    
354            * R/reader.R (getReaders): Returns available reader functions.
355    
356            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
357            as default.
358    
359    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * R/stopwords.R (stopwords): Shortened code, removed codetools
362            variable warnings.
363    
364            * man/: Documentation for showMeta, added an example for tmMap.
365    
366            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
367            some minor typos fixed.
368    
369    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * R/aobjects.R (showMeta): Added method for pretty printing a
372            text document's meta data.
373    
374    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * R/textdoccol.R (TextDocCol): Better handling of empty
377            arguments.
378    
379            * NAMESPACE: Exported readDOC.
380    
381            * man/completeStems.Rd: Added an example.
382    
383    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * R/stopwords.R (stopwords): Look up .dat files at every
386            call. Allows users to modify stopword .dat files interactively.
387    
388    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
389    
390            * R/termdocmatrix.R (termFreq): Correct processing of empty
391            documents.
392    
393    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * man/: Updated documentation.
396    
397    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
398    
399            * R/complete.R (completeStems): Completes (heuristically) word
400            stems.
401    
402            * R/termdocmatrix.R (TermDocMatrix2): New modular
403            constructor.
404    
405            * NAMESPACE: Exported termFreq.
406    
407    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/reader.R (readDOC): Added MS Word reader (using antiword).
410    
411    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/weight.R: Weighting functions for TermDocMatrix.
414    
415    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
416    
417            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
418            functions for accessing dimension, column, and row names.
419    
420            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
421    
422    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
425    
426    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
427    
428            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
429    
430    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * R/reader.R (readPDF): Removed manual checks for pdftotext and
433            pdfinfo. The system call gives a warning anyway.
434    
435    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * R/textdoccol.R (asPlain): Conversion from
438            StructuredTextDocuments to PlainTextDocuments.
439    
440    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
441    
442            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
443            for accessing term-document matrices.
444    
445            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
446            are installed.
447    
448    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
449    
450            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
451            Christian Buchta.
452    
453    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
454    
455            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
456    
457    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
458    
459            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
460    
461            * R/reader.R (readPDF): Added PDF reader.
462    
463    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
466    
467            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
468    
469            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
470    
471            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
472    
473    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
474    
475            * R/distmeasure.R (dissimilarity): Replaced dists call from
476            package cba by new dist call from package proxy.
477    
478    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
479    
480            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
481    
482    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
483    
484            * R/termdocmatrix.R: require() uses the quietly option to suppress
485            loading messages.
486    
487    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * R/dictionary.R: Added dictionary support.
490    
491    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
492    
493            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
494            documents. This simplifies some functions, e.g., asPlain.
495    
496    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
497    
498            * inst/doc/tm.Rnw: Fixed some typos in vignette.
499    
500    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
501    
502            * R/textdoccol.R (replaceWords): Added method to replace a set of
503            words by a single word. Useful for synonyms.
504    
505    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
506    
507            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
508    
509    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
510    
511            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
512            vectors. Thanks to Ariel Maguyon for his error report.
513            (removeSparseTerms): New function to remove columns from a
514            term-document matrix exceeding a sparse factor.
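
The idea behind removing sparse terms can be sketched as follows. This is an illustration in Python, not the package's R implementation; the function name `remove_sparse_terms`, the list-of-rows matrix layout, and the exact threshold comparison are assumptions. Following the wording of the entry, a term column is dropped when its share of zero entries exceeds the sparse factor.

```python
def remove_sparse_terms(matrix, terms, sparse):
    """Drop term columns whose share of zero entries exceeds `sparse`.

    Hypothetical helper for illustration only.
    matrix: list of rows, one row per document, one column per term.
    terms:  column labels, aligned with the columns of `matrix`.
    sparse: maximal allowed sparsity, a number between 0 and 1.
    """
    n_docs = len(matrix)
    if n_docs == 0:
        return [], list(terms)
    keep = []
    for j in range(len(terms)):
        zeros = sum(1 for row in matrix if row[j] == 0)
        # Keep the column only if its sparsity does not exceed the factor.
        if zeros / n_docs <= sparse:
            keep.append(j)
    reduced = [[row[j] for j in keep] for row in matrix]
    return reduced, [terms[j] for j in keep]
```

With four documents and a sparse factor of 0.5, a term that occurs in only one document has a sparsity of 0.75 and is removed, while a term that occurs in every document is kept.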
515    
516    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
517    
518            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
519    
520    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
521    
522            * man/sFilter.Rd: Corrected documentation on statement format (use
523            '==' instead of '=').
524    
525    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
526    
527            * R/aobjects.R (StructuredTextDocument): Inherits from
528            TextDocument.
529    
530    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
531    
532            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
533            on sparse matrices as proposed by Martin Maechler.
534    
535    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
538            \pkg{filehash} version deprecates them.
539    
540    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * R/termdocmatrix.R (textvector): Stemming is now performed before
543            erasing stopwords.
544            (weightMatrix): Adapted to handle sparse matrices.
545            (TermDocMatrix): Sparse matrix is now efficiently built by
546            direct stepwise insertion of row values into it.
547    
548    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
549    
550            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
551            due to ongoing problems. For our purposes the latter is as useful
552            as the replaced package.
553    
554    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
555    
556            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
557    
558            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
559    
560    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
561    
562            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
563            languages with available stopwords.
564    
565    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
566    
567            * inst/doc/tm.Rnw: Minor corrections in the vignette.
568    
569    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
570    
571            * DESCRIPTION: Update to version 0.2, since a lot of new features
572            have been integrated.
573    
574            * inst/stopwords: Updated existing stopwords and added stopwords
575            for various other languages.
576    
577    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * man/: Updated documentation.
580    
581            * Work/testDb.R: Script to test database stuff.
582    
583            * R/: Fixed various database-related bugs. Seems to be rather
584            usable now, i.e., consider it alpha status for now.
585    
586    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/: Fixed some bugs related to database support.
589    
590    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
591    
592            * man/: Added a lot of examples to the manuals.
593    
594    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
595    
596            * man/: Updated parts of the documentation.
597    
598            * R/textdoccol.R (asPlain): Added conversion from newsgroup
599            documents to plain text documents.
600    
601    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
602    
603            * R/textdoccol.R: Finished experimental database support. Not yet
604            intensively tested.
605    
606            * R/source.R: Now each source has a default reader.
607    
608            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
609            class anymore.
610    
611            * R/plaintextdoc.R: Custom show method for plain text documents.
612    
613            * R/aobjects.R: Added a class for structured text documents.
614    
615            * R/reader.R: Replaced remaining \code{parser} occurrences with
616            \code{reader}.
617    
618            * R/textdoccol.R (summary): Indent tags.
619    
620            * R/textdoccol.R (removePunctuation): Transform method to remove
621            punctuation marks.
622    
623    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
624    
625            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
626            using prescindMeta().
627    
628    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * R/textdoccol.R: Improved database support.
631    
632    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
633    
634            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
635    
636            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
637            language code.
638    
639            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
640            into parserControl argument.
641    
642            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
643    
644    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
645    
646            * Work/tmDataSetup.R: The datasets acq and crude can now be
647            created on the fly.
648    
649            * R/stopwords.R: Introduced a function returning the stopwords for
650            a given language (English, German and French at the moment).
651    
652            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
653            otherwise falls back to Snowball package.
654    
655    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
656    
657            * man/dissimilarity-methods.Rd: Make clear that any method offered
658            by "dists" from package "cba" can be used.
659    
660    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
661    
662            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
663            to Kurt's latex suggestion. Removed points and underscores in
664            variable names for consistent naming.
665    
666            * DESCRIPTION: Update to version 0.1-2.
667    
668            * man/TextRepository.Rd: Fixed bug in documentation.
669    
670    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
671    
672            * DESCRIPTION: Update to version 0.1-1.
673    
674    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
675    
676            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
677            wordStem.
678    
679    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
680    
681            * R/: Changes due to Kurt's review.
682    
683    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * R/: Implemented improvements based upon comments by David
686            Meyer.
687    
688    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
689    
690            * inst/doc/: Rewrote vignette.
691    
692            * man/: Improved documentation.
693    
694    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
695    
696            * man/: Updated documentation.
697    
698            * DESCRIPTION: Changed package name to "tm". Updated version to
699            0.1 for first CRAN release.
700    
701            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
702            list archive example.
703    
704            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
705            archive example.
706    
707            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
708            from (several mails per box) mbox format to (single mail per file)
709            eml format.
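
The conversion described in this entry, from one mbox holding several mails to one eml per mail, comes down to splitting on the mbox separator lines. The following is a minimal sketch in Python, not the package's R code; the function name `split_mbox` is hypothetical. It assumes every line starting with "From " is a separator and ignores the quoting of such lines inside message bodies that some mbox variants use.

```python
def split_mbox(mbox_text):
    """Split the text of an mbox file into a list of single-message strings.

    Hypothetical helper for illustration only. Each returned string could be
    written to its own .eml file. The "From " separator lines are dropped.
    """
    messages = []
    current = None  # lines of the message being collected, if any
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From "):
            # A separator line closes the previous message and opens a new one.
            if current is not None:
                messages.append("".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages
```

Text that appears before the first separator line is ignored, and an empty input yields an empty list.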
710    
711    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
712    
713            * data/crude.rda: Rebuilt.
714    
715            * data/acq.rda: Rebuilt.
716    
717            * R/reader.R: Factored out reader and parser methods from
718            textdoccol.R.
719    
720            * R/source.R: Factored out Source methods from aobjects.R and
721            textdoccol.R.
722            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
723            feeds.
724    
725            * R/textdoccol.R (DirSource): Added support for recursive
726            traversal of directories.
727    
728    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
729    
730            * R/textdoccol.R ([[): Loads the document corpus automatically
731            into memory upon access.
732            (tm_transform, tm_filter): Removed several checks whether the
733            document is already loaded ([[ ensures this now).
734            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
735            mailing list archive.
736    
737    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
738    
739            * R/aobjects.R (TextDocument): Is now a virtual class.
740            (Source): Is now a virtual class.
741    
742    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * R/textdoccol.R (c): Support for an arbitrary number of document
745            collections.
746    
747    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
748    
749            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
750            append_meta and remove_meta.
751    
752            * R/textdoccol.R: Removed modify_metadata method.
753    
754            * R/textrepo.R: Removed modify_metadata method.
755    
756            * R/textdoccol.R (remove_meta): Supports removal of document
757            collection metadata and document (= in data frame) metadata.
758    
759    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
760    
761            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
762    
763            * data/crude.rda: Rebuilt.
764    
765            * data/acq.rda: Rebuilt.
766    
767            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
768    
769            * R/textdoccol.R ([): Bug fix for subsetting a document
770            collection's data frame.
771    
772    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
773    
774            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
775            to s_filter.
776    
777            * R/textdoccol.R: Local text documents' metadata can now be copied
778            to a document collection's data frame with prescind_meta.
779    
780    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
781    
782            * R/: Text documents' slot metadata is now accessible in s_filter.
783    
784            * R/: Rewrote s_filter function (still has some restrictions).
785    
786    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
787    
788            * R/: Various fixes in handling metadata.
789    
790            * R/: Added update mechanism for text document collections.
791    
792    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
793    
794            * R/: Merging of document collections now creates a binary tree
795            for reconstructing merged document collections.
796    
797            * R/: Redesign of metadata for document collections.
798    
799    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * R/: Messages now use \code{ngettext}.
802    
803    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/: Added functions for modifying and removing metadata.
806    
807    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
808    
809            * man/: Updated some documentation.
810    
811            * R/: Corrected some connection issues.
812    
813            * inst/doc: Worked on the vignette.
814    
815    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
816    
817            * inst/: Added texts and started vignette.
818    
819            * R/: Final changes based upon David's comments.
820    
821    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
822    
823            * NAMESPACE: Corrected exports (generic methods need exportMethods
824            directives!).
825    
826    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/: Modified the TextDocCol constructor and various parsers. It
829            is now modular and supports various file formats via plugins (see
830            the new "Source" class).
831    
832    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
833    
834            * man/: Revised documentation after previous code changes.
835    
836    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
837    
838            * R/: Remaining changes as discussed with David.
839    
840    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
841    
842            * R/: Some changes as suggested by David. The rest will follow
843            within the next few days.
844    
845    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
846    
847            * man/: Finished documentation.
848    
849    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * man/: Wrote some documentation.
852    
853    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
854    
855            * R/: Further syntactic sugar in the form of additional assignment and
856            accessor methods.
857    
858    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
859    
860            * R/: Syntactic sugar in the form of "length", "show" and "summary"
861            operators.
862    
863    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
864    
865            * R/: Various updates, mainly to default operators ("[" or "c")
866            and dissimilarities.
867    
868    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
869    
870            * R/: Added similarity functions.
871    
872            * data/: Added English stopwords.
873    
874    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * data/: Examples compiled for new features.
877    
878            * R/: Changes due to new structure.
879    
880            * NAMESPACE: Corrected namespace to reflect new structure.
881    
882            * R/termdocmatrix.R: Adapted for new naming scheme.
883    
884    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
885    
886            * R/textdoccol.R: Adapted code for new class structure. Wrote
887            several transform and filter functions operating on text document
888            collections (alias text document databases).
889    
890            * R/aobjects.R: Adapted class structure with inheritance,
891            repositories and additional metadata. Loading files on demand is
892            now possible.
893    
894    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
895    
896            * R/: Some cosmetic cleanups.
897    
898            * inst/: Removed vignette on clustering. That and much more is now
899            described in the JSS paper on text mining. Based upon that
900            article, a more elaborate vignette will be incorporated in the future.
901    
902    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
903    
904            * R/: Updated generic S4 methods to comply with signature changes
905            in newer versions of R (> 2.3).
906    
907    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
908    
909            * ext/R/importRIS.R: Automatic RIS import is now possible.
910    
911    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
912    
913            * R/textdoccol.R: Added RIS HTML input format.
914    
915    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
916    
917            * R/textdoccol.R: Removed bug that caused invalid text document
918            collections when handling many input files.
919    
920    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
921    
922            * R/textdoccol.R: Restructured and extended file import
