[tm] Diff of /pkg/ChangeLog

Old: trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
New: pkg/ChangeLog revision 1011, Mon Oct 19 12:20:43 2009 UTC
1    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/meta.R (DublinCore): Allow lower case tags.
4    
5    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
8            instead of x$children.
9    
10    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
13    
14    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/: Use S3 instead of S4 class system.
17    
18    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
19    
20            * R/reader.R (readMail): Moved to tm.plugin.mail package.
21    
22    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
23    
24            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
25            postings are basically e-mails with some extra headers.
26    
27    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
28    
29            * R/transform.R: Move convertMboxEml, removeCitation,
30            removeMultipart, and removeSignature to the tm.plugin.mail package
31            since they are mainly utility functions (for handling e-mails) and
32            not very framework specific.
33    
34    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
35    
36            * man/: Fix documentation.
37    
38    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
41            plain text document instead of an XML document for texts of the
42            Reuters-21578 dataset.
43    
44            * R/sparse.R: Removed since the slam package is now available on
45            CRAN.
46    
47            * DESCRIPTION (Depends): Add slam package.
48    
49    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
50    
51            * R/transform.R (stemDoc): Fix character(0) handling.
52    
53    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
54    
55            * R/doc.R (show): Pretty print.
56    
57    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
58    
59            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
60            gracefully.
61    
62    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/corpus.R: Make corpus virtual. Implement corpus with standard
65            and permanent storage semantics.
66    
67            * DESCRIPTION: New major release. A *lot* of improvements.
68    
69    2009-05-04   Ingo Feinerer <feinerer@logic.at>
70    
71            * NAMESPACE: Export some simple_triplet_matrix functions.
72    
73    2009-04-28   Ingo Feinerer <feinerer@logic.at>
74    
75            * R/weight.R: Adapt tf-idf to new matrix format.
76    
77    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
78    
79            * R/matrix.R: Create two distinct classes for term-document and
80            document-term matrices.
81    
82    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
83    
84            * R/termdocmatrix.R: No longer use Matrix package. This reduces
85            package start-up time significantly.
86    
87    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
88    
89            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
90    
91    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/transform.R (tmReduce): Combine multiple maps into one
94            transformation.
95    
96    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
97    
98            * R/weight.R: Remove weightLogical since it does not return a
99            dgCMatrix.
100    
101            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
102            or TermDocumentMatrix instead.
103    
104    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
105    
106            * inst/doc/extensions.Rnw: Finished vignette.
107    
108    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
109    
110            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
111            DocumentTermMatrix representations.
112    
113    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
114    
115            * R/reader.R (readXML): New reader for arbitrary XML files.
116    
117    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
118    
119            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
120            (XMLSource): New XMLSource class for arbitrary XML files.
121            (Source): New slot Vectorized.
122    
123    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
124    
125            * R/reader.R (readTabular): Experimental reader for tabular data
126            structures which can be customized via user-defined mappings.
127    
128            * R/reader.R: Always use UTC time zone.
129    
130            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
131    
132    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
133    
134            * R/reader.R (readDOC): Options can be passed over to antiword.
135    
136            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
137            pdftotext.
138    
139    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
140    
141            * R/source.R (DirSource): Add pattern and ignore.case arguments
142            which are internally passed over to list.files().
143    
144    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
145    
146            * inst/doc/tm.Rnw: Suppress pointless loading message.
147    
148    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
149    
150            * DESCRIPTION: Speed up package loading (via moving packages not
151            strictly necessary for normal operation to Suggests instead of
152            Depends).
153    
154    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
155    
156            * R/reader.R (readNewsgroup): The date format is now configurable.
157    
158    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
159    
160            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
161    
162    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
163    
164            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
165    
166    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/source.R (DataframeSource): New source class for data frames.
169    
170            * R/source.R: Fixed non-standard call evaluation.
171    
172    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
173    
174            * R/source.R (URISource): New source class for a single document.
175    
176    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/source.R: Refactoring.
179    
180    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
181    
182            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
183            Rmpi installations more gracefully.
184    
185    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
186    
187            * R/source.R (Source): Add Length slot.
188    
189    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/AAA.R: Unify duplicated .onLoad function.
192    
193    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
194    
195            * DESCRIPTION (Suggests): Added Rmpi.
196    
197    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
198    
199            * R/source.R (getElem): Fix 'no visible binding' warning.
200    
201            * man/WeightFunction.Rd: Fix signature.
202    
203    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
204    
205            * R/weight.R: Introduce name abbreviations for weighting functions.
206    
207    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
210    
211            * R/cluster.R: Provide convenience functions for using a MPI
212            cluster.
213    
214            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
215            available.
216    
217            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
218            available.
219    
220    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
221    
222            * R/textdoccol.R (lapply): Removed debug print out.
223    
224    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
225    
226            * R/reader.R (readRCV1): Improved meta data extraction from
227            Reuters Corpus Volume 1 documents.
228    
229    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
230    
231            * R/transform.R: Ensure that all mappings preserve multiline
232            structures.
233    
234    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
235    
236            * R/filter.R: Every filter now has an attribute indicating whether
237            it should be applied at the document level (doclevel).
238    
239            * R/textdoccol.R (tmFilter): Set searchFullText as new default
240            filter.
241    
242    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
243    
244            * R/transform.R (replacePatterns): Replaced removeWords by
245            replacePatterns. Suggested by Christian Buchta.
246    
247            * R/textdoccol.R (inspect): Improved formatting.
248    
249    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
250    
251            * inst/CITATION: Updated JSS article information.
252    
253            * R/textdoccol.R (setAs): Added coerce method from list to
254            corpus.
255    
256            * R/meta.R (meta): Improved meta data handling.
257    
258    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
261            Christian Buchta.
262    
263            * inst/CITATION: Added template to include JSS article reference.
264    
265    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
266    
267            * R/textdoccol.R (tmMap): Introduced lazy mapping.
268    
269            * R/source.R: Added VectorSource.
270    
271    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
272    
273            * man/: Language codes should be in ISO 639-1 format.
274    
275            * R/textdoccol.R (asPlain): Preserve local meta data.
276    
277    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
278    
279            * R/textdoccol.R (writeCorpus): Function for writing a corpus
280            containing plain text documents to disk.
281    
282    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
283    
284            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
285            always set correctly.
286    
287            * R/textdoccol.R: Set load = TRUE as default for load on demand
288            since in most cases this is the wanted behaviour.
289    
290    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
291    
292            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
293    
294            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
295    
296    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
297    
298            * R/meta.R (meta): New function for consistent access to meta data
299            of document collections, repositories, and texts.
300    
301    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
302    
303            * R/: Better support for encodings.
304    
305    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
306    
307            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
308            selection when no reader argument is given.
309    
310    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * R/source.R (CSVSource): Now uses read.csv instead of scan
313            internally.
314    
315    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
316    
317            * R/reader.R (getReaders): Returns available reader functions.
318    
319            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
320            as default.
321    
322    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/stopwords.R (stopwords): Shortened code, removed codetools
325            variable warnings.
326    
327            * man/: Documentation for showMeta, added an example for tmMap.
328    
329            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
330            some minor typos fixed.
331    
332    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
333    
334            * R/aobjects.R (showMeta): Added method for pretty printing a
335            text document's meta data.
336    
337    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * R/textdoccol.R (TextDocCol): Better handling of empty
340            arguments.
341    
342            * NAMESPACE: Exported readDOC.
343    
344            * man/completeStems.Rd: Added an example.
345    
346    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/stopwords.R (stopwords): Look up .dat files at every
349            call. Allows users to modify stopword .dat files interactively.
350    
351    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/termdocmatrix.R (termFreq): Correct processing of empty
354            documents.
355    
356    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
357    
358            * man/: Updated documentation.
359    
360    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * R/complete.R (completeStems): Completes (heuristically) word
363            stems.
364    
365            * R/termdocmatrix.R (TermDocMatrix2): New modular
366            constructor.
367    
368            * NAMESPACE: Exported termFreq.
369    
370    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/reader.R (readDOC): Added MS Word reader (using antiword).
373    
374    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * R/weight.R: Weighting functions for TermDocMatrix.
377    
378    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
381            functions for accessing dimension, column, and row names.
382    
383            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
384    
385    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
386    
387            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
388    
389    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
392    
393    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * R/reader.R (readPDF): Removed manual checks for pdftotext and
396            pdfinfo. The system call gives a warning anyway.
397    
398    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/textdoccol.R (asPlain): Conversion from
401            StructuredTextDocuments to PlainTextDocuments.
402    
403    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
406            for accessing term-document matrices.
407    
408            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
409            are installed.
410    
411    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
414            Christian Buchta.
415    
416    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
419    
420    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
423    
424            * R/reader.R (readPDF): Added PDF reader.
425    
426    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
427    
428            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
429    
430            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
431    
432            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
433    
434            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
435    
436    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
437    
438            * R/distmeasure.R (dissimilarity): Replaced dists call from
439            package cba by new dist call from package proxy.
440    
441    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
444    
445    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * R/termdocmatrix.R: require() uses the quietly option to suppress
448            loading messages.
449    
450    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * R/dictionary.R: Added dictionary support.
453    
454    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
457            documents. This simplifies some functions, e.g., asPlain.
458    
459    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * inst/doc/tm.Rnw: Fixed some typos in vignette.
462    
463    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/textdoccol.R (replaceWords): Added method to replace a set of
466            words by a single word. Useful for synonyms.
467    
468    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
471    
472    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
475            vectors. Thanks to Ariel Maguyon for his error report.
476            (removeSparseTerms): New function to remove columns from a
477            term-document matrix exceeding a sparse factor.
478    
479    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
480    
481            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
482    
483    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
484    
485            * man/sFilter.Rd: Corrected documentation on statement format (use
486            '==' instead of '=').
487    
488    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
489    
490            * R/aobjects.R (StructuredTextDocument): Inherits from
491            TextDocument.
492    
493    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
494    
495            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
496            on sparse matrices as proposed by Martin Maechler.
497    
498    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
499    
500            * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
501            \pkg{filehash} version makes them deprecated.
502    
503    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
504    
505            * R/termdocmatrix.R (textvector): Stemming is now performed before
506            erasing stopwords.
507            (weightMatrix): Adapted to handle sparse matrices.
508            (TermDocMatrix): Sparse matrix is now efficiently built by
509            direct stepwise insertion of row values into it.
510    
511    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
514            due to ongoing problems. For our purposes the latter is as useful
515            as the replaced package.
516    
517    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
518    
519            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
520    
521            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
522    
523    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
524    
525            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
526            languages with available stopwords.
527    
528    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
529    
530            * inst/doc/tm.Rnw: Minor corrections in the vignette.
531    
532    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
533    
534            * DESCRIPTION: Update to version 0.2, since a lot of new features
535            have been integrated.
536    
537            * inst/stopwords: Updated existing stopwords and added stopwords
538            for various other languages.
539    
540    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * man/: Updated documentation.
543    
544            * Work/testDb.R: Script to test database stuff.
545    
546            * R/: Fixed various database related bugs. Seems to be rather
547            useable now, i.e., consider as alpha status for now.
548    
549    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
550    
551            * R/: Fixed some bugs related to database support.
552    
553    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
554    
555            * man/: Added a lot of examples to the manuals.
556    
557    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
558    
559            * man/: Updated parts of the documentation.
560    
561            * R/textdoccol.R (asPlain): Added conversion from newsgroup
562            documents to plain text documents.
563    
564    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
565    
566            * R/textdoccol.R: Finished experimental database support. Not yet
567            intensively tested.
568    
569            * R/source.R: Now each source has a default reader.
570    
571            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
572            class anymore.
573    
574            * R/plaintextdoc.R: Custom show method for plain text documents.
575    
576            * R/aobjects.R: Added a class for structured text documents.
577    
578            * R/reader.R: Replaced remaining \code{parser} occurrences with
579            \code{reader}.
580    
581            * R/textdoccol.R (summary): Indent tags.
582    
583            * R/textdoccol.R (removePunctuation): Transform method to remove
584            punctuation marks.
585    
586    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
589            using prescindMeta().
590    
591    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
592    
593            * R/textdoccol.R: Improved database support.
594    
595    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
596    
597            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
598    
599            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
600            language code.
601    
602            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
603            into parserControl argument.
604    
605            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
606    
607    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
608    
609            * Work/tmDataSetup.R: The datasets acq and crude can now be
610            created on the fly.
611    
612            * R/stopwords.R: Introduced a function returning the stopwords for
613            a given language (English, German and French at the moment).
614    
615            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
616            otherwise falls back to Snowball package.
617    
618    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
619    
620            * man/dissimilarity-methods.Rd: Make clear that any method offered
621            by "dists" from package "cba" can be used.
622    
623    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
624    
625            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
626            to Kurt's latex suggestion. Removed points and underscores in
627            variable names for consistent naming.
628    
629            * DESCRIPTION: Update to version 0.1-2.
630    
631            * man/TextRepository.Rd: Fixed bug in documentation.
632    
633    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
634    
635            * DESCRIPTION: Update to version 0.1-1.
636    
637    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
638    
639            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
640            wordStem.
641    
642    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
643    
644            * R/: Changes due to Kurt's review.
645    
646    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
647    
648            * R/: Implemented improvements based upon comments by David
649            Meyer.
650    
651    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
652    
653            * inst/doc/: Rewrote vignette.
654    
655            * man/: Improved documentation.
656    
657    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * man/: Updated documentation.
660    
661            * DESCRIPTION: Changed package name to "tm". Updated version to
662            0.1 for first CRAN release.
663    
664            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
665            list archive example.
666    
667            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
668            archive example.
669    
670            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
671            from (several mails per box) mbox format to (single mail per file)
672            eml format.
673    
674    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
675    
676            * data/crude.rda: Rebuilt.
677    
678            * data/acq.rda: Rebuilt.
679    
680            * R/reader.R: Factored out reader and parser methods from
681            textdoccol.R.
682    
683            * R/source.R: Factored out Source methods from aobjects.R and
684            textdoccol.R.
685            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
686            feeds.
687    
688            * R/textdoccol.R (DirSource): Added support for recursive
689            traversal of directories.
690    
691    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * R/textdoccol.R ([[): Loads the document corpus automatically
694            into memory upon access.
695            (tm_transform, tm_filter): Removed several checks whether the
696            document is already loaded ([[ ensures this now).
697            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
698            mailing list archive.
699    
700    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * R/aobjects.R (TextDocument): Is now a virtual class.
703            (Source): Is now a virtual class.
704    
705    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
706    
707            * R/textdoccol.R (c): Support for an arbitrary number of document
708            collections.
709    
710    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
711    
712            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
713            append_meta and remove_meta.
714    
715            * R/textdoccol.R: Removed modify_metadata method.
716    
717            * R/textrepo.R: Removed modify_metadata method.
718    
719            * R/textdoccol.R (remove_meta): Supports removal of document
720            collection metadata and document (= in data frame) metadata.
721    
722    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
723    
724            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
725    
726            * data/crude.rda: Rebuilt.
727    
728            * data/acq.rda: Rebuilt.
729    
730            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
731    
732            * R/textdoccol.R ([): Bug fix for subsetting a document
733            collection's data frame.
734    
735    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
736    
737            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
738            to s_filter.
739    
740            * R/textdoccol.R: Local text documents' metadata can now be copied
741            to a document collection's data frame with prescind_meta.
742    
743    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
744    
745            * R/: Text documents' slot metadata is now accessible in s_filter.
746    
747            * R/: Rewrote s_filter function (has still some restrictions).
748    
749    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * R/: Various fixes in handling metadata.
752    
753            * R/: Added update mechanism for text document collections.
754    
755    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
756    
757            * R/: Merging of document collections now creates a binary tree
758            for reconstructing merged document collections.
759    
760            * R/: Redesign of metadata for document collections.
761    
762    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
763    
764            * R/: Messages now use \code{ngettext}.
765    
766    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
767    
768            * R/: Added functions for modifying and removing metadata.
769    
770    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
771    
772            * man/: Updated some documentation.
773    
774            * R/: Corrected some connection issues.
775    
776            * inst/doc: Worked on the vignette.
777    
778    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
779    
780            * inst/: Added texts and started vignette.
781    
782            * R/: Final changes based upon David's comments.
783    
784    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
785    
786            * NAMESPACE: Corrected exports (generic methods need exportMethods
787            directives!).
788    
789    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
790    
791            * R/: Modified the TextDocCol constructor and various parsers. It
792            is now modular and supports various file formats via plugins (see
793            the new "Source" class).
794    
795    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
796    
797            * man/: Revised documentation after previous code changes.
798    
799    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * R/: Remaining changes as discussed with David.
802    
803    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/: Some changes as suggested by David. The rest will follow
806            within the next days.
807    
808    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * man/: Finished documentation.
811    
812    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * man/: Wrote some documentation.
815    
816    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
817    
818            * R/: Further syntactic sugar in form of additional assignment and
819            accessor methods.
820    
821    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
822    
823            * R/: Syntactic sugar in form of "length", "show" and "summary"
824            operators.
825    
826    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/: Diverse updates. Mainly on default operators ("[" or "c")
829            and dissimilarities.
830    
831    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/: Added similarity functions.
834    
835            * data/: Added English stopwords.
836    
837    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
838    
839            * data/: Examples compiled for new features.
840    
841            * R/: Changes due to new structure.
842    
843            * NAMESPACE: Corrected namespace to reflect new structure.
844    
845            * R/termdocmatrix.R: Adapted for new naming scheme.
846    
847    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
848    
849            * R/textdoccol.R: Adapted code for new class structure. Wrote
850            several transform and filter functions operating on text document
851            collections (alias text document databases).
852    
853            * R/aobjects.R: Adapted class structure with inheritance,
854            repositories and additional meta data. Loading files on demand is
855            now possible.
856    
857    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
858    
859            * R/: Some cosmetic cleanups.
860    
861            * inst/: Removed vignette on clustering. That and much more is now
862            described in the JSS paper on text mining. Based upon that
863            article an elaborated vignette will be incorporated in the future.
864    
865    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
866    
867            * R/: Updated generic S4 methods to comply with signature changes
868            in newer versions of R (> 2.3).
869    
870    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            * ext/R/importRIS.R: Automatic RIS import is now possible.
873    
874    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * R/textdoccol.R: Added RIS HTML input format.
877    
878    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
879    
880            * R/textdoccol.R: Removed bug that caused invalid text document
881            collections when handling many input files.
882    
883    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * R/textdoccol.R: Restructured and extended file import
886            mechanism.
887    
888            * inst/doc/clustering.Rnw: Adapted vignette for use with
889            ReutNews.rda
890    
891            * man/ReutNews.Rd: Documentation for ReutNews.rda
892    
893            * data/ReutNews.rda: A tiny Reuters21578 example data set.
894    
895    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
896    
897            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
898            clustering facilities of this package.
899    
900    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
901    
902            * R/aobjects.R: Changed package document structure to avoid class
903            dependency problems.
904    
905    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
906    
907            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
908            data set.
909    
910            * Finished documentation and reordered directory structure. Now "R
911            CMD check textmin" works without errors.
912    
