[tm] Diff of /pkg/ChangeLog

trunk/R/textmin/ChangeLog revision 47, Mon Jul 10 12:22:35 2006 UTC
pkg/ChangeLog revision 1007, Tue Sep 15 18:02:44 2009 UTC
# Line 1  Line 1 
1    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
4    
5    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/: Use S3 instead of S4 class system.
8    
9    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
10    
11            * R/reader.R (readMail): Moved to tm.plugin.mail package.
12    
13    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
14    
15            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
16            postings are basically e-mails with some extra headers.
17    
18    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
19    
20            * R/transform.R: Move convertMboxEml, removeCitation,
21            removeMultipart, and removeSignature to the tm.plugin.mail package
22            since they are mainly utility functions (for handling e-mails) and
23            not very framework specific.
24    
25    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
26    
27            * man/: Fix documentation.
28    
29    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
32            plain text document instead of an XML document for texts of the
33            Reuters-21578 dataset.
34    
35            * R/sparse.R: Removed since the slam package is now available on
36            CRAN.
37    
38            * DESCRIPTION (Depends): Add slam package.
39    
40    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
41    
42            * R/transform.R (stemDoc): Fix character(0) handling.
43    
44    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
45    
46            * R/doc.R (show): Pretty print.
47    
48    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
49    
50            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
51            gracefully.
52    
53    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
54    
55            * R/corpus.R: Make corpus virtual. Implement corpus with standard
56            and permanent storage semantics.
57    
58            * DESCRIPTION: New major release. A *lot* of improvements.
59    
60    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
61    
62            * NAMESPACE: Export some simple_triplet_matrix functions.
63    
64    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
65    
66            * R/weight.R: Adapt tf-idf to new matrix format.
67    
68    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
69    
70            * R/matrix.R: Create two distinct classes for term-document and
71            document-term matrices.
72    
73    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
74    
75            * R/termdocmatrix.R: No longer use Matrix package. This reduces
76            package start-up time significantly.
77    
78    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
79    
80            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
81    
82    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
83    
84            * R/transform.R (tmReduce): Combine multiple maps into one
85            transformation.
86    
87    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
88    
89            * R/weight.R: Remove weightLogical since it does not return a
90            dgCMatrix.
91    
92            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
93            or TermDocumentMatrix instead.
94    
95    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
96    
97            * inst/doc/extensions.Rnw: Finished vignette.
98    
99    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
100    
101            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
102            DocumentTermMatrix representations.
103    
104    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
105    
106            * R/reader.R (readXML): New reader for arbitrary XML files.
107    
108    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
109    
110            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
111            (XMLSource): New XMLSource class for arbitrary XML files.
112            (Source): New slot Vectorized.
113    
114    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/reader.R (readTabular): Experimental reader for tabular data
117            structures which can be customized via user-defined mappings.
118    
119            * R/reader.R: Always use UTC time zone.
120    
121            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
122    
123    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
124    
125            * R/reader.R (readDOC): Options can be passed over to antiword.
126    
127            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
128            pdftotext.
129    
130    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
131    
132            * R/source.R (DirSource): Add pattern and ignore.case arguments
133            which are internally passed over to list.files().
134    
135    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
136    
137            * inst/doc/tm.Rnw: Suppress pointless loading message.
138    
139    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
140    
141            * DESCRIPTION: Speed up package loading (via moving packages not
142            strictly necessary for normal operation to Suggests instead of
143            Depends).
144    
145    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/reader.R (readNewsgroup): The date format is now configurable.
148    
149    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
150    
151            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
152    
153    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
154    
155            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
156    
157    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
158    
159            * R/source.R (DataframeSource): New source class for data frames.
160    
161            * R/source.R: Fixed non-standard call evaluation.
162    
163    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
164    
165            * R/source.R (URISource): New source class for a single document.
166    
167    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
168    
169            * R/source.R: Refactoring.
170    
171    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
172    
173            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
174            Rmpi installations more gracefully.
175    
176    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/source.R (Source): Add Length slot.
179    
180    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
181    
182            * R/AAA.R: Unify duplicated .onLoad function.
183    
184    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
185    
186            * DESCRIPTION (Suggests): Added Rmpi.
187    
188    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
189    
190            * R/source.R (getElem): Fix 'no visible binding' warning.
191    
192            * man/WeightFunction.Rd: Fix signature.
193    
194    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/weight.R: Introduce name abbreviations for weighting functions.
197    
198    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
199    
200            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
201    
202            * R/cluster.R: Provide convenience functions for using a MPI
203            cluster.
204    
205            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
206            available.
207    
208            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
209            available.
210    
211    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/textdoccol.R (lapply): Removed debug print out.
214    
215    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
216    
217            * R/reader.R (readRCV1): Improved meta data extraction from
218            Reuters Corpus Volume 1 documents.
219    
220    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
221    
222            * R/transform.R: Ensure that all mappings preserve multiline
223            structures.
224    
225    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
226    
227            * R/filter.R: Every filter now has an attribute indicating whether
228            it should be applied at the document level (doclevel).
229    
230            * R/textdoccol.R (tmFilter): Set searchFullText as new default
231            filter.
232    
233    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
234    
235            * R/transform.R (replacePatterns): Replaced removeWords by
236            replacePatterns. Suggested by Christian Buchta.
237    
238            * R/textdoccol.R (inspect): Improved formatting.
239    
240    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
241    
242            * inst/CITATION: Updated JSS article information.
243    
244            * R/textdoccol.R (setAs): Added coerce method from list to
245            corpus.
246    
247            * R/meta.R (meta): Improved meta data handling.
248    
249    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
250    
251            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
252            Christian Buchta.
253    
254            * inst/CITATION: Added template to include JSS article reference.
255    
256    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/textdoccol.R (tmMap): Introduced lazy mapping.
259    
260            * R/source.R: Added VectorSource.
261    
262    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
263    
264            * man/: Language codes should be in ISO 639-1 format.
265    
266            * R/textdoccol.R (asPlain): Preserve local meta data.
267    
268    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
269    
270            * R/textdoccol.R (writeCorpus): Function for writing a corpus
271            containing plain text documents to disk.
272    
273    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
274    
275            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
276            always set correctly.
277    
278            * R/textdoccol.R: Set load = TRUE as default for load on demand
279            since in most cases this is the wanted behaviour.
280    
281    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
282    
283            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
284    
285            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
286    
287    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
288    
289            * R/meta.R (meta): New function for consistent access to meta data
290            of document collections, repositories, and texts.
291    
292    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
293    
294            * R/: Better support for encodings.
295    
296    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
297    
298            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
299            selection when no reader argument is given.
300    
301    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
302    
303            * R/source.R (CSVSource): Now uses read.csv instead of scan
304            internally.
305    
306    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
307    
308            * R/reader.R (getReaders): Returns available reader functions.
309    
310            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
311            as default.
312    
313    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
314    
315            * R/stopwords.R (stopwords): Shortened code, removed codetools
316            variable warnings.
317    
318            * man/: Documentation for showMeta, added an example for tmMap.
319    
320            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
321            some minor typos fixed.
322    
323    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
324    
325            * R/aobjects.R (showMeta): Added method for pretty printing a
326            text document's meta data.
327    
328    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
329    
330            * R/textdoccol.R (TextDocCol): Better handling of empty
331            arguments.
332    
333            * NAMESPACE: Exported readDOC.
334    
335            * man/completeStems.Rd: Added an example.
336    
337    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * R/stopwords.R (stopwords): Look up .dat files at every
340            call. Allows users to modify stopword .dat files interactively.
341    
342    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * R/termdocmatrix.R (termFreq): Correct processing of empty
345            documents.
346    
347    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
348    
349            * man/: Updated documentation.
350    
351    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/complete.R (completeStems): Completes (heuristically) word
354            stems.
355    
356            * R/termdocmatrix.R (TermDocMatrix2): New modular
357            constructor.
358    
359            * NAMESPACE: Exported termFreq.
360    
361    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
362    
363            * R/reader.R (readDOC): Added MS Word reader (using antiword).
364    
365    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
366    
367            * R/weight.R: Weighting functions for TermDocMatrix.
368    
369    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
372            functions for accessing dimension, column, and row names.
373    
374            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
375    
376    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
379    
380    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
383    
384    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
385    
386            * R/reader.R (readPDF): Removed manual checks for pdftotext and
387            pdfinfo. The system call gives a warning anyway.
388    
389    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/textdoccol.R (asPlain): Conversion from
392            StructuredTextDocuments to PlainTextDocuments.
393    
394    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
397            for accessing term-document matrices.
398    
399            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
400            are installed.
401    
402    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
405            Christian Buchta.
406    
407    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
410    
411    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
414    
415            * R/reader.R (readPDF): Added PDF reader.
416    
417    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
418    
419            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
420    
421            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
422    
423            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
424    
425            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
426    
427    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
428    
429            * R/distmeasure.R (dissimilarity): Replaced dists call from
430            package cba by new dist call from package proxy.
431    
432    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
435    
436    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
437    
438            * R/termdocmatrix.R: require() uses the quietly option to suppress
439            loading messages.
440    
441    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/dictionary.R: Added dictionary support.
444    
445    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
448            documents. This simplifies some functions, e.g., asPlain.
449    
450    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * inst/doc/tm.Rnw: Fixed some typos in vignette.
453    
454    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/textdoccol.R (replaceWords): Added method to replace a set of
457            words by a single word. Useful for synonyms.
458    
459    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
462    
463    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
466            vectors. Thanks to Ariel Maguyon for his error report.
467            (removeSparseTerms): New function to remove columns from a
468            term-document matrix exceeding a sparse factor.
469    
470    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
471    
472            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
473    
474    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
475    
476            * man/sFilter.Rd: Corrected documentation on statement format (use
477            '==' instead of '=').
478    
479    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
480    
481            * R/aobjects.R (StructuredTextDocument): Inherits from
482            TextDocument.
483    
484    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
485    
486            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
487            on sparse matrices as proposed by Martin Maechler.
488    
489    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
490    
491            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
492            \pkg{filehash} version deprecates them.
493    
494    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
495    
496            * R/termdocmatrix.R (textvector): Stemming is now performed before
497            erasing stopwords.
498            (weightMatrix): Adapted to handle sparse matrices.
499            (TermDocMatrix): Sparse matrix is now efficiently built by
500            direct stepwise insertion of row values into it.
501    
502    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
505            due to ongoing problems. For our purposes the latter is as useful
506            as the replaced package.
507    
508    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
509    
510            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
511    
512            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
513    
514    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
515    
516            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
517            languages with available stopwords.
518    
519    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * inst/doc/tm.Rnw: Minor corrections in the vignette.
522    
523    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
524    
525            * DESCRIPTION: Update to version 0.2, since a lot of new features
526            have been integrated.
527    
528            * inst/stopwords: Updated existing stopwords and added stopwords
529            for various other languages.
530    
531    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
532    
533            * man/: Updated documentation.
534    
535            * Work/testDb.R: Script to test database stuff.
536    
537            * R/: Fixed various database-related bugs. Seems to be rather
538            usable now, i.e., consider it alpha status for now.
539    
540    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * R/: Fixed some bugs related to database support.
543    
544    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
545    
546            * man/: Added a lot of examples to the manuals.
547    
548    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
549    
550            * man/: Updated parts of the documentation.
551    
552            * R/textdoccol.R (asPlain): Added conversion from newsgroup
553            documents to plain text documents.
554    
555    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
556    
557            * R/textdoccol.R: Finished experimental database support. Not yet
558            intensively tested.
559    
560            * R/source.R: Now each source has a default reader.
561    
562            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
563            class anymore.
564    
565            * R/plaintextdoc.R: Custom show method for plain text documents.
566    
567            * R/aobjects.R: Added a class for structured text documents.
568    
569            * R/reader.R: Replaced remaining \code{parser} occurrences with
570            \code{reader}.
571    
572            * R/textdoccol.R (summary): Indent tags.
573    
574            * R/textdoccol.R (removePunctuation): Transform method to remove
575            punctuation marks.
576    
577    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
580            using prescindMeta().
581    
582    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
583    
584            * R/textdoccol.R: Improved database support.
585    
586    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
589    
590            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
591            language code.
592    
593            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
594            into parserControl argument.
595    
596            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
597    
598    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
599    
600            * Work/tmDataSetup.R: The datasets acq and crude can now be
601            created on the fly.
602    
603            * R/stopwords.R: Introduced a function returning the stopwords for
604            a given language (English, German and French at the moment).
605    
606            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
607            otherwise falls back to Snowball package.
608    
609    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
610    
611            * man/dissimilarity-methods.Rd: Make clear that any method offered
612            by "dists" from package "cba" can be used.
613    
614    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
615    
616            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
617            to Kurt's latex suggestion. Removed points and underscores in
618            variable names for consistent naming.
619    
620            * DESCRIPTION: Update to version 0.1-2.
621    
622            * man/TextRepository.Rd: Fixed bug in documentation.
623    
624    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
625    
626            * DESCRIPTION: Update to version 0.1-1.
627    
628    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
631            wordStem.
632    
633    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
634    
635            * R/: Changes due to Kurt's review.
636    
637    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
638    
639            * R/: Implemented improvements based upon comments by David
640            Meyer.
641    
642    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
643    
644            * inst/doc/: Rewrote vignette.
645    
646            * man/: Improved documentation.
647    
648    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
649    
650            * man/: Updated documentation.
651    
652            * DESCRIPTION: Changed package name to "tm". Updated version to
653            0.1 for first CRAN release.
654    
655            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
656            list archive example.
657    
658            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
659            archive example.
660    
661            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
662            from (several mails per box) mbox format to (single mail per file)
663            eml format.
664    
665    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
666    
667            * data/crude.rda: Rebuilt.
668    
669            * data/acq.rda: Rebuilt.
670    
671            * R/reader.R: Factored out reader and parser methods from
672            textdoccol.R.
673    
674            * R/source.R: Factored out Source methods from aobjects.R and
675            textdoccol.R.
676            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
677            feeds.
678    
679            * R/textdoccol.R (DirSource): Added support for recursive
680            traversal of directories.
681    
682    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
683    
684            * R/textdoccol.R ([[): Loads the document corpus automatically
685            into memory upon access.
686            (tm_transform, tm_filter): Removed several checks whether the
687            document is already loaded ([[ ensures this now).
688            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
689            mailing list archive.
690    
691    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * R/aobjects.R (TextDocument): Is now a virtual class.
694            (Source): Is now a virtual class.
695    
696    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
697    
698            * R/textdoccol.R (c): Support for an arbitrary number of document
699            collections.
700    
701    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
702    
703            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
704            append_meta and remove_meta.
705    
706            * R/textdoccol.R: Removed modify_metadata method.
707    
708            * R/textrepo.R: Removed modify_metadata method.
709    
710            * R/textdoccol.R (remove_meta): Supports removal of document
711            collection metadata and document (= in data frame) metadata.
712    
713    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
714    
715            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
716    
717            * data/crude.rda: Rebuilt.
718    
719            * data/acq.rda: Rebuilt.
720    
721            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
722    
723            * R/textdoccol.R ([): Bug fix for subsetting a document
724            collection's data frame.
725    
726    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
727    
728            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
729            to s_filter.
730    
731            * R/textdoccol.R: Local text documents' metadata can now be copied
732            to a document collection's data frame with prescind_meta.
733    
734    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
735    
736            * R/: Text documents' slot metadata is now accessible in s_filter.
737    
738            * R/: Rewrote s_filter function (still has some restrictions).
739    
740    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
741    
742            * R/: Various fixes in handling metadata.
743    
744            * R/: Added update mechanism for text document collections.
745    
746    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
747    
748            * R/: Merging of document collections now creates a binary tree
749            for reconstructing merged document collections.
750    
751            * R/: Redesign of metadata for document collections.
752    
753    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
754    
755            * R/: Messages now use \code{ngettext}.
756    
757    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
758    
759            * R/: Added functions for modifying and removing metadata.
760    
761    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
762    
763            * man/: Updated some documentation.
764    
765            * R/: Corrected some connection issues.
766    
767            * inst/doc: Worked on the vignette.
768    
769    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
770    
771            * inst/: Added texts and started vignette.
772    
773            * R/: Final changes based upon David's comments.
774    
775    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
776    
777            * NAMESPACE: Corrected exports (generic methods need exportMethods
778            directives!).
779    
780    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
781    
782            * R/: Modified the TextDocCol constructor and various parsers. It
783            is now modular and supports various file formats via plugins (see
784            the new "Source" class).
785    
786    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
787    
788            * man/: Revised documentation after previous code changes.
789    
790    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
791    
792            * R/: Remaining changes as discussed with David.
793    
794    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
795    
796            * R/: Some changes as suggested by David. The rest will follow
797            within the next few days.
798    
799    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * man/: Finished documentation.
802    
803    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * man/: Wrote some documentation.
806    
807    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
808    
809            * R/: Further syntactic sugar in the form of additional assignment
810            and accessor methods.
811    
812    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * R/: Syntactic sugar in the form of "length", "show" and "summary"
815            operators.
816    
817    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
818    
819            * R/: Various updates, mainly to default operators ("[" or "c")
820            and dissimilarities.
821    
822    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
823    
824            * R/: Added similarity functions.
825    
826            * data/: Added English stopwords.
827    
828    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
829    
830            * data/: Examples compiled for new features.
831    
832            * R/: Changes due to new structure.
833    
834            * NAMESPACE: Corrected namespace to reflect new structure.
835    
836            * R/termdocmatrix.R: Adapted for new naming scheme.
837    
838    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
839    
840            * R/textdoccol.R: Adapted code for new class structure. Wrote
841            several transform and filter functions operating on text document
842            collections (alias text document databases).
843    
844            * R/aobjects.R: Adapted class structure with inheritance,
845            repositories and additional meta data. Loading files on demand is
846            now possible.
847    
848    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
849    
850            * R/: Some cosmetic cleanups.
851    
852            * inst/: Removed vignette on clustering. That and much more is now
853            described in the JSS paper on text mining. Based upon that
854            article, a more elaborate vignette will be incorporated in the future.
855    
856    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
857    
858            * R/: Updated generic S4 methods to comply with signature changes
