[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
pkg/ChangeLog revision 993, Sun Sep 6 17:51:08 2009 UTC
# Line 1  Line 1 
1    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/: Use S3 instead of S4 class system.
4    
5    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/reader.R (readMail): Moved to tm.plugin.mail package.
8    
9    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
10    
11            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
12            postings are basically e-mails with some extra headers.
13    
14    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/transform.R: Move convertMboxEml, removeCitation,
17            removeMultipart, and removeSignature to the tm.plugin.mail package
18            since they are mainly utility functions (for handling e-mails) and
19            not very framework specific.
20    
21    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
22    
23            * man/: Fix documentation.
24    
25    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
26    
27            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
28            plain text document instead of an XML document for texts of the
29            Reuters-21578 dataset.
30    
31            * R/sparse.R: Removed since the slam package is now available on
32            CRAN.
33    
34            * DESCRIPTION (Depends): Add slam package.
35    
36    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/transform.R (stemDoc): Fix character(0) handling.
39    
40    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
41    
42            * R/doc.R (show): Pretty print.
43    
44    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
45    
46            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
47            gracefully.
48    
49    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
50    
51            * R/corpus.R: Make corpus virtual. Implement corpus with standard
52            and permanent storage semantics.
53    
54            * DESCRIPTION: New major release. A *lot* of improvements.
55    
56    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
57    
58            * NAMESPACE: Export some simple_triplet_matrix functions.
59    
60    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
61    
62            * R/weight.R: Adapt tf-idf to new matrix format.
63    
64    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
65    
66            * R/matrix.R: Create two distinct classes for term-document and
67            document-term matrices.
68    
69    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
70    
71            * R/termdocmatrix.R: No longer use Matrix package. This reduces
72            package start-up time significantly.
73    
74    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
75    
76            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
77    
78    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
79    
80            * R/transform.R (tmReduce): Combine multiple maps into one
81            transformation.
82    
83    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
84    
85            * R/weight.R: Remove weightLogical since it does not return a
86            dgCMatrix.
87    
88            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
89            or TermDocumentMatrix instead.
90    
91    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
92    
93            * inst/doc/extensions.Rnw: Finished vignette.
94    
95    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
98            DocumentTermMatrix representations.
99    
100    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/reader.R (readXML): New reader for arbitrary XML files.
103    
104    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
105    
106            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
107            (XMLSource): New XMLSource class for arbitrary XML files.
108            (Source): New slot Vectorized.
109    
110    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/reader.R (readTabular): Experimental reader for tabular data
113            structures which can be customized via user-defined mappings.
114    
115            * R/reader.R: Always use UTC time zone.
116    
117            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
118    
119    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
120    
121            * R/reader.R (readDOC): Options can be passed over to antiword.
122    
123            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
124            pdftotext.
125    
126    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
127    
128            * R/source.R (DirSource): Add pattern and ignore.case arguments
129            which are internally passed over to list.files().
130    
131    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
132    
133            * inst/doc/tm.Rnw: Suppress pointless loading message.
134    
135    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
136    
137            * DESCRIPTION: Speed up package loading (via moving packages not
138            strictly necessary for normal operation to Suggests instead of
139            Depends).
140    
141    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/reader.R (readNewsgroup): The date format is now configurable.
144    
145    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
148    
149    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
150    
151            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
152    
153    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
154    
155            * R/source.R (DataframeSource): New source class for data frames.
156    
157            * R/source.R: Fixed non-standard call evaluation.
158    
159    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
160    
161            * R/source.R (URISource): New source class for a single document.
162    
163    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
164    
165            * R/source.R: Refactoring.
166    
167    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
168    
169            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
170            Rmpi installations more gracefully.
171    
172    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
173    
174            * R/source.R (Source): Add Length slot.
175    
176    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/AAA.R: Unify duplicated .onLoad function.
179    
180    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
181    
182            * DESCRIPTION (Suggests): Added Rmpi.
183    
184    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/source.R (getElem): Fix 'no visible binding' warning.
187    
188            * man/WeightFunction.Rd: Fix signature.
189    
190    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
191    
192            * R/weight.R: Introduce name abbreviations for weighting functions.
193    
194    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
197    
198            * R/cluster.R: Provide convenience functions for using a MPI
199            cluster.
200    
201            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
202            available.
203    
204            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
205            available.
206    
207    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/textdoccol.R (lapply): Removed debug print out.
210    
211    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
212    
213            * R/reader.R (readRCV1): Improved meta data extraction from
214            Reuters Corpus Volume 1 documents.
215    
216    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
217    
218            * R/transform.R: Ensure that all mappings preserve multiline
219            structures.
220    
221    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
222    
223            * R/filter.R: Every filter now has an attribute indicating whether
224            it should be applied at document level (doclevel).
225    
226            * R/textdoccol.R (tmFilter): Set searchFullText as new default
227            filter.
228    
229    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
230    
231            * R/transform.R (replacePatterns): Replaced removeWords by
232            replacePatterns. Suggested by Christian Buchta.
233    
234            * R/textdoccol.R (inspect): Improved formatting.
235    
236    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
237    
238            * inst/CITATION: Updated JSS article information.
239    
240            * R/textdoccol.R (setAs): Added coerce method from list to
241            corpus.
242    
243            * R/meta.R (meta): Improved meta data handling.
244    
245    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
246    
247            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
248            Christian Buchta.
249    
250            * inst/CITATION: Added template to include JSS article reference.
251    
252    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
253    
254            * R/textdoccol.R (tmMap): Introduced lazy mapping.
255    
256            * R/source.R: Added VectorSource.
257    
258    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * man/: Language codes should be in ISO 639-1 format.
261    
262            * R/textdoccol.R (asPlain): Preserve local meta data.
263    
264    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
265    
266            * R/textdoccol.R (writeCorpus): Function for writing a corpus
267            containing plain text documents to disk.
268    
269    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
270    
271            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
272            always set correctly.
273    
274            * R/textdoccol.R: Set load = TRUE as default for load on demand
275            since in most cases this is the wanted behaviour.
276    
277    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
278    
279            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
280    
281            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
282    
283    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
284    
285            * R/meta.R (meta): New function for consistent access to meta data
286            of document collections, repositories, and texts.
287    
288    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
289    
290            * R/: Better support for encodings.
291    
292    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
293    
294            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
295            selection when no reader argument is given.
296    
297    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
298    
299            * R/source.R (CSVSource): Now uses read.csv instead of scan
300            internally.
301    
302    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
303    
304            * R/reader.R (getReaders): Returns available reader functions.
305    
306            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
307            as default.
308    
309    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
310    
311            * R/stopwords.R (stopwords): Shortened code, removed codetools
312            variable warnings.
313    
314            * man/: Documentation for showMeta, added an example for tmMap.
315    
316            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
317            some minor typos fixed.
318    
319    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/aobjects.R (showMeta): Added method for pretty printing a
322            text document's meta data.
323    
324    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/textdoccol.R (TextDocCol): Better handling of empty
327            arguments.
328    
329            * NAMESPACE: Exported readDOC.
330    
331            * man/completeStems.Rd: Added an example.
332    
333    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * R/stopwords.R (stopwords): Look up .dat files at every
336            call. Allows users to modify stopword .dat files interactively.
337    
338    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * R/termdocmatrix.R (termFreq): Correct processing of empty
341            documents.
342    
343    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
344    
345            * man/: Updated documentation.
346    
347    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
348    
349            * R/complete.R (completeStems): Completes (heuristically) word
350            stems.
351    
352            * R/termdocmatrix.R (TermDocMatrix2): New modular
353            constructor.
354    
355            * NAMESPACE: Exported termFreq.
356    
357    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
358    
359            * R/reader.R (readDOC): Added MS Word reader (using antiword).
360    
361    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
362    
363            * R/weight.R: Weighting functions for TermDocMatrix.
364    
365    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
366    
367            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
368            functions for accessing dimension, column, and row names.
369    
370            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
371    
372    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
375    
376    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
379    
380    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * R/reader.R (readPDF): Removed manual checks for pdftotext and
383            pdfinfo. The system call gives a warning anyway.
384    
385    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
386    
387            * R/textdoccol.R (asPlain): Conversion from
388            StructuredTextDocuments to PlainTextDocuments.
389    
390    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
391    
392            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
393            for accessing term-document matrices.
394    
395            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
396            are installed.
397    
398    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
401            Christian Buchta.
402    
403    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
406    
407    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
410    
411            * R/reader.R (readPDF): Added PDF reader.
412    
413    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
414    
415            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
416    
417            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
418    
419            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
420    
421            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
422    
423    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
424    
425            * R/distmeasure.R (dissimilarity): Replaced dists call from
426            package cba by new dist call from package proxy.
427    
428    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
429    
430            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
431    
432    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * R/termdocmatrix.R: require() uses the quietly option to suppress
435            loading messages.
436    
437    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
438    
439            * R/dictionary.R: Added dictionary support.
440    
441    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
444            documents. This simplifies some functions, e.g., asPlain.
445    
446    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * inst/doc/tm.Rnw: Fixed some typos in vignette.
449    
450    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * R/textdoccol.R (replaceWords): Added method to replace a set of
453            words by a single word. Useful for synonyms.
454    
455    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
456    
457            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
458    
459    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
462            vectors. Thanks to Ariel Maguyon for his error report.
463            (removeSparseTerms): New function to remove columns from a
464            term-document matrix exceeding a sparse factor.
465    
466    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
467    
468            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
469    
470    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
471    
472            * man/sFilter.Rd: Corrected documentation on statement format (use
473            '==' instead of '=').
474    
475    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
476    
477            * R/aobjects.R (StructuredTextDocument): Inherits from
478            TextDocument.
479    
480    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
481    
482            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
483            on sparse matrices as proposed by Martin Maechler.
484    
485    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
486    
487            * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
488            \pkg{filehash} version makes them deprecated.
489    
490    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
491    
492            * R/termdocmatrix.R (textvector): Stemming is now performed before
493            erasing stopwords.
494            (weightMatrix): Adapted to handle sparse matrices.
495            (TermDocMatrix): Sparse matrix is now efficiently built by
496            direct stepwise insertion of row values into it.
497    
498    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
499    
500            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
501            due to ongoing problems. For our purposes the latter is as useful
502            as the replaced package.
503    
504    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
505    
506            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
507    
508            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
509    
510    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
511    
512            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
513            languages with available stopwords.
514    
515    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
516    
517            * inst/doc/tm.Rnw: Minor corrections in the vignette.
518    
519    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * DESCRIPTION: Update to version 0.2, since a lot of new features
522            have been integrated.
523    
524            * inst/stopwords: Updated existing stopwords and added stopwords
525            for various other languages.
526    
527    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
528    
529            * man/: Updated documentation.
530    
531            * Work/testDb.R: Script to test database stuff.
532    
533            * R/: Fixed various database related bugs. Seems to be rather
534            useable now, i.e., consider as alpha status for now.
535    
536    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
537    
538            * R/: Fixed some bugs related to database support.
539    
540    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * man/: Added a lot of examples to the manuals.
543    
544    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
545    
546            * man/: Updated parts of the documentation.
547    
548            * R/textdoccol.R (asPlain): Added conversion from newsgroup
549            documents to plain text documents.
550    
551    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
552    
553            * R/textdoccol.R: Finished experimental database support. Not yet
554            intensively tested.
555    
556            * R/source.R: Now each source has a default reader.
557    
558            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
559            class anymore.
560    
561            * R/plaintextdoc.R: Custom show method for plain text documents.
562    
563            * R/aobjects.R: Added a class for structured text documents.
564    
565            * R/reader.R: Replaced remaining \code{parser} occurrences with
566            \code{reader}.
567    
568            * R/textdoccol.R (summary): Indent tags.
569    
570            * R/textdoccol.R (removePunctuation): Transform method to remove
571            punctuation marks.
572    
573    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
574    
575            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
576            using prescindMeta().
577    
578    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
579    
580            * R/textdoccol.R: Improved database support.
581    
582    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
583    
584            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
585    
586            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
587            language code.
588    
589            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
590            into parserControl argument.
591    
592            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
593    
594    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
595    
596            * Work/tmDataSetup.R: The datasets acq and crude can now be
597            created on the fly.
598    
599            * R/stopwords.R: Introduced a function returning the stopwords for
600            a given language (English, German and French at the moment).
601    
602            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
603            otherwise falls back to Snowball package.
604    
605    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
606    
607            * man/dissimilarity-methods.Rd: Make clear that any method offered
608            by "dists" from package "cba" can be used.
609    
610    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
611    
612            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
613            to Kurt's latex suggestion. Removed points and underscores in
614            variable names for consistent naming.
615    
616            * DESCRIPTION: Update to version 0.1-2.
617    
618            * man/TextRepository.Rd: Fixed bug in documentation.
619    
620    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * DESCRIPTION: Update to version 0.1-1.
623    
624    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
625    
626            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
627            wordStem.
628    
629    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
630    
631            * R/: Changes due to Kurt's review.
632    
633    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
634    
635            * R/: Implemented improvements based upon comments by David
636            Meyer.
637    
638    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
639    
640            * inst/doc/: Rewrote vignette.
641    
642            * man/: Improved documentation.
643    
644    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
645    
646            * man/: Updated documentation.
647    
648            * DESCRIPTION: Changed package name to "tm". Updated version to
649            0.1 for first CRAN release.
650    
651            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
652            list archive example.
653    
654            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
655            archive example.
656    
657            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
658            from (several mails per box) mbox format to (single mail per file)
659            eml format.
660    
661    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
662    
663            * data/crude.rda: Rebuilt.
664    
665            * data/acq.rda: Rebuilt.
666    
667            * R/reader.R: Factored out reader and parser methods from
668            textdoccol.R.
669    
670            * R/source.R: Factored out Source methods from aobjects.R and
671            textdoccol.R.
672            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
673            feeds.
674    
675            * R/textdoccol.R (DirSource): Added support for recursive
676            traversal of directories.
677    
678    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
679    
680            * R/textdoccol.R ([[): Loads the document corpus automatically
681            into memory upon access.
682            (tm_transform, tm_filter): Removed several checks whether the
683            document is already loaded ([[ ensures this now).
684            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
685            mailing list archive.
686    
687    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
688    
689            * R/aobjects.R (TextDocument): Is now a virtual class.
690            (Source): Is now a virtual class.
691    
692    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
693    
694            * R/textdoccol.R (c): Support for an arbitrary number of document
695            collections.
696    
697    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
698    
699            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
700            append_meta and remove_meta.
701    
702            * R/textdoccol.R: Removed modify_metadata method.
703    
704            * R/textrepo.R: Removed modify_metadata method.
705    
706            * R/textdoccol.R (remove_meta): Supports removal of document
707            collection metadata and document (= in data frame) metadata.
708    
709    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
710    
711            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
712    
713            * data/crude.rda: Rebuilt.
714    
715            * data/acq.rda: Rebuilt.
716    
717            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
718    
719            * R/textdoccol.R ([): Bug fix for subsetting a document
720            collection's data frame.
721    
722    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
723    
724            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
725            to s_filter.
726    
727            * R/textdoccol.R: Local text documents' metadata can now be copied
728            to a document collection's data frame with prescind_meta.
729    
730    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
731    
732            * R/: Text documents' slot metadata is now accessible in s_filter.
733    
734            * R/: Rewrote s_filter function (has still some restrictions).
735    
736    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
737    
738            * R/: Various fixes in handling metadata.
739    
740            * R/: Added update mechanism for text document collections.
741    
742    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * R/: Merging of document collections now creates a binary tree
745            for reconstructing merged document collections.
746    
747            * R/: Redesign of metadata for document collections.
748    
749    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * R/: Messages now use \code{ngettext}.
752    
753    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
754    
755            * R/: Added functions for modifying and removing metadata.
756    
757    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
758    
759            * man/: Updated some documentation.
760    
761            * R/: Corrected some connection issues.
762    
763            * inst/doc: Worked on the vignette.
764    
765    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
766    
767            * inst/: Added texts and started vignette.
768    
769            * R/: Final changes based upon David's comments.
770    
771    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
772    
773            * NAMESPACE: Corrected exports (generic methods need exportMethods
774            directives!).
775    
776    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
777    
778            * R/: Modified the TextDocCol constructor and various parsers. It
779            is now modular and supports various file formats via plugins (see
780            the new "Source" class).
781    
782    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
783    
784            * man/: Revised documentation after previous code changes.
785    
786    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
787    
788            * R/: Remaining changes as discussed with David.
789    
790    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
791    
792            * R/: Some changes as suggested by David. The rest will follow
793            within the next days.
794    
795    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
796    
797            * man/: Finished documentation.
798    
799    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * man/: Wrote some documentation.
802    
803    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/: Further syntactic sugar in form of additional assignment and
806            accessor methods.
807    
808    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * R/: Syntactic sugar in form of "length", "show" and "summary"
811            operators.
812    
813    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
814    
815            * R/: Diverse updates. Mainly on default operators ("[" or "c")
816            and dissimilarities.
817    
818    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
819    
820            * R/: Added similarity functions.
821    
822            * data/: Added English stopwords.
823    
824    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
825    
826            * data/: Examples compiled for new features
827    
828            * R/: Changes due to new structure.
829    
830            * NAMESPACE: Corrected namespace to reflect new structure.
831    
832            * R/termdocmatrix.R: Adapted for new naming scheme.
833    
834    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
835    
836            * R/textdoccol.R: Adapted code for new class structure. Wrote
837            several transform and filter functions operating on text document
838            collections (alias text document databases).
839    
840            * R/aobjects.R: Adapted class structure with inheritance,
841            repositories and additional meta data. Loading files on demand is
842            now possible.
843    
844    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
845    
846            * R/: Some cosmetic cleanups.
847    
848            * inst/: Removed vignette on clustering. That and much more is now
849            described in the JSS paper on text mining. Based upon that
850            article an elaborated vignette will be incorporated in the future.
851    
852    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
853    
854            * R/: Updated generic S4 methods to comply with signature changes
855            in newer versions of R (> 2.3)
856    
857    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
858    
859            * ext/R/importRIS.R: Automatic RIS import is now possible.
860    
861    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * R/textdoccol.R: Added RIS HTML input format.
864    
865    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
866    
867            * R/textdoccol.R: Removed bug that caused invalid text document
868            collections when handling many input files.
869    
870    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            * R/textdoccol.R: Restructured and extended file import
873            mechanism.
874    
875            * inst/doc/clustering.Rnw: Adapted vignette for use with
876            ReutNews.rda
877    
878            * man/ReutNews.Rd: Documentation for ReutNews.rda
879    
880            * data/ReutNews.rda: A tiny Reuters21578 example data set.
881    
882    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
883    
884            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
885            clustering facilities of this package.
886    
887    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/aobjects.R: Changed package document structure to avoid class
890            dependency problems.
891    
892  2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
895            data set.
896    
897          * Finished documentation and reordered directory structure. Now "R
898          CMD check textmin" works without errors.
899    
