[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
pkg/ChangeLog revision 973, Sat Jul 4 08:10:25 2009 UTC
1    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
4            postings are basically e-mails with some extra headers.
5    
6    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/transform.R: Move removeCitation, removeMultipart, and
9            removeSignature to the tau package since they are mainly utility
10            functions (for handling e-mails) and not very framework specific.
11    
12    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
13    
14            * man/: Fix documentation.
15    
16    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
17    
18            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
19            plain text document instead of an XML document for texts of the
20            Reuters-21578 dataset.
21    
22            * R/sparse.R: Removed since the slam package is now available on
23            CRAN.
24    
25            * DESCRIPTION (Depends): Add slam package.
26    
27    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
28    
29            * R/transform.R (stemDoc): Fix character(0) handling.
30    
31    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
32    
33            * R/doc.R (show): Pretty print.
34    
35    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
36    
37            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
38            gracefully.
39    
40    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
41    
42            * R/corpus.R: Make corpus virtual. Implement corpus with standard
43            and permanent storage semantics.
44    
45            * DESCRIPTION: New major release. A *lot* of improvements.
46    
47    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
48    
49            * NAMESPACE: Export some simple_triplet_matrix functions.
50    
51    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
52    
53            * R/weight.R: Adapt tf-idf to new matrix format.
54    
55    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
56    
57            * R/matrix.R: Create two distinct classes for term-document and
58            document-term matrices.
59    
60    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
61    
62            * R/termdocmatrix.R: No longer use Matrix package. This reduces
63            package start-up time significantly.
64    
65    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
66    
67            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
68    
69    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
70    
71            * R/transform.R (tmReduce): Combine multiple maps into one
72            transformation.
73    
74    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
75    
76            * R/weight.R: Remove weightLogical since it does not return a
77            dgCMatrix.
78    
79            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
80            or TermDocumentMatrix instead.
81    
82    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
83    
84            * inst/doc/extensions.Rnw: Finished vignette.
85    
86    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
89            DocumentTermMatrix representations.
90    
91    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/reader.R (readXML): New reader for arbitrary XML files.
94    
95    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
98            (XMLSource): New XMLSource class for arbitrary XML files.
99            (Source): New slot Vectorized.
100    
101    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
102    
103            * R/reader.R (readTabular): Experimental reader for tabular data
104            structures which can be customized via user-defined mappings.
105    
106            * R/reader.R: Always use UTC time zone.
107    
108            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
109    
110    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/reader.R (readDOC): Options can be passed over to antiword.
113    
114            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
115            pdftotext.
116    
117    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
118    
119            * R/source.R (DirSource): Add pattern and ignore.case arguments
120            which are internally passed over to list.files().
121    
122    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
123    
124            * inst/doc/tm.Rnw: Suppress pointless loading message.
125    
126    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
127    
128            * DESCRIPTION: Speed up package loading (via moving packages not
129            strictly necessary for normal operation to Suggests instead of
130            Depends).
131    
132    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
133    
134            * R/reader.R (readNewsgroup): The date format is now configurable.
135    
136    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
137    
138            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
139    
140    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
141    
142            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
143    
144    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
145    
146            * R/source.R (DataframeSource): New source class for data frames.
147    
148            * R/source.R: Fixed non-standard call evaluation.
149    
150    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/source.R (URISource): New source class for a single document.
153    
154    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
155    
156            * R/source.R: Refactoring.
157    
158    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
159    
160            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
161            Rmpi installations more gracefully.
162    
163    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
164    
165            * R/source.R (Source): Add Length slot.
166    
167    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
168    
169            * R/AAA.R: Unify duplicated .onLoad function.
170    
171    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
172    
173            * DESCRIPTION (Suggests): Added Rmpi.
174    
175    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
176    
177            * R/source.R (getElem): Fix 'no visible binding' warning.
178    
179            * man/WeightFunction.Rd: Fix signature.
180    
181    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
182    
183            * R/weight.R: Introduce name abbreviations for weighting functions.
184    
185    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
186    
187            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
188    
189            * R/cluster.R: Provide convenience functions for using a MPI
190            cluster.
191    
192            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
193            available.
194    
195            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
196            available.
197    
198    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
199    
200            * R/textdoccol.R (lapply): Removed debug print out.
201    
202    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
203    
204            * R/reader.R (readRCV1): Improved meta data extraction from
205            Reuters Corpus Volume 1 documents.
206    
207    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
208    
209            * R/transform.R: Ensure that all mappings preserve multiline
210            structures.
211    
212    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
213    
214            * R/filter.R: Every filter now has an attribute indicating whether
215            it should be applied at the document level (doclevel).
216    
217            * R/textdoccol.R (tmFilter): Set searchFullText as new default
218            filter.
219    
220    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
221    
222            * R/transform.R (replacePatterns): Replaced removeWords by
223            replacePatterns. Suggested by Christian Buchta.
224    
225            * R/textdoccol.R (inspect): Improved formatting.
226    
227    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
228    
229            * inst/CITATION: Updated JSS article information.
230    
231            * R/textdoccol.R (setAs): Added coerce method from list to
232            corpus.
233    
234            * R/meta.R (meta): Improved meta data handling.
235    
236    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
237    
238            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
239            Christian Buchta.
240    
241            * inst/CITATION: Added template to include JSS article reference.
242    
243    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
244    
245            * R/textdoccol.R (tmMap): Introduced lazy mapping.
246    
247            * R/source.R: Added VectorSource.
248    
249    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
250    
251            * man/: Language codes should be in ISO 639-1 format.
252    
253            * R/textdoccol.R (asPlain): Preserve local meta data.
254    
255    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
256    
257            * R/textdoccol.R (writeCorpus): Function for writing a corpus
258            containing plain text documents to disk.
259    
260    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
263            always set correctly.
264    
265            * R/textdoccol.R: Set load = TRUE as default for load on demand
266            since in most cases this is the wanted behaviour.
267    
268    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
269    
270            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
271    
272            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
273    
274    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
275    
276            * R/meta.R (meta): New function for consistent access to meta data
277            of document collections, repositories, and texts.
278    
279    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
280    
281            * R/: Better support for encodings.
282    
283    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
284    
285            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
286            selection when no reader argument is given.
287    
288    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
289    
290            * R/source.R (CSVSource): Now uses read.csv instead of scan
291            internally.
292    
293    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
294    
295            * R/reader.R (getReaders): Returns available reader functions.
296    
297            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
298            as default.
299    
300    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
301    
302            * R/stopwords.R (stopwords): Shortened code, removed codetools
303            variable warnings.
304    
305            * man/: Documentation for showMeta, added an example for tmMap.
306    
307            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
308            some minor typos fixed.
309    
310    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * R/aobjects.R (showMeta): Added method for pretty printing a
313            text document's meta data.
314    
315    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
316    
317            * R/textdoccol.R (TextDocCol): Better handling of empty
318            arguments.
319    
320            * NAMESPACE: Exported readDOC.
321    
322            * man/completeStems.Rd: Added an example.
323    
324    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/stopwords.R (stopwords): Look up .dat files at every
327            call. Allows users to modify stopword .dat files interactively.
328    
329    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * R/termdocmatrix.R (termFreq): Correct processing of empty
332            documents.
333    
334    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
335    
336            * man/: Updated documentation.
337    
338    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * R/complete.R (completeStems): Completes (heuristically) word
341            stems.
342    
343            * R/termdocmatrix.R (TermDocMatrix2): New modular
344            constructor.
345    
346            * NAMESPACE: Exported termFreq.
347    
348    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
349    
350            * R/reader.R (readDOC): Added MS Word reader (using antiword).
351    
352    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
353    
354            * R/weight.R: Weighting functions for TermDocMatrix.
355    
356    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
357    
358            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
359            functions for accessing dimension, column, and row names.
360    
361            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
362    
363    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
366    
367    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
368    
369            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
370    
371    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
372    
373            * R/reader.R (readPDF): Removed manual checks for pdftotext and
374            pdfinfo. The system call gives a warning anyway.
375    
376    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/textdoccol.R (asPlain): Conversion from
379            StructuredTextDocuments to PlainTextDocuments.
380    
381    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
382    
383            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
384            for accessing term-document matrices.
385    
386            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
387            are installed.
388    
389    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
392            Christian Buchta.
393    
394    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
397    
398    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
401    
402            * R/reader.R (readPDF): Added PDF reader.
403    
404    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
405    
406            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
407    
408            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
409    
410            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
411    
412            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
413    
414    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
415    
416            * R/distmeasure.R (dissimilarity): Replaced dists call from
417            package cba by new dist call from package proxy.
418    
419    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
420    
421            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
422    
423    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
424    
425            * R/termdocmatrix.R: require() uses the quietly option to suppress
426            loading messages.
427    
428    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
429    
430            * R/dictionary.R: Added dictionary support.
431    
432    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
435            documents. This simplifies some functions, e.g., asPlain.
436    
437    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
438    
439            * inst/doc/tm.Rnw: Fixed some typos in vignette.
440    
441    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/textdoccol.R (replaceWords): Added method to replace a set of
444            words by a single word. Useful for synonyms.
445    
446    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
449    
450    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
453            vectors. Thanks to Ariel Maguyon for his error report.
454            (removeSparseTerms): New function to remove columns from a
455            term-document matrix exceeding a sparse factor.
456    
457    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
458    
459            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
460    
461    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
462    
463            * man/sFilter.Rd: Corrected documentation on statement format (use
464            '==' instead of '=').
465    
466    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
467    
468            * R/aobjects.R (StructuredTextDocument): Inherits from
469            TextDocument.
470    
471    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
472    
473            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
474            on sparse matrices as proposed by Martin Maechler.
475    
476    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
477    
478            * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
479            \pkg{filehash} version makes them deprecated.
480    
481    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * R/termdocmatrix.R (textvector): Stemming is now performed before
484            erasing stopwords.
485            (weightMatrix): Adapted to handle sparse matrices.
486            (TermDocMatrix): Sparse matrix is now efficiently built by
487            direct stepwise insertion of row values into it.
488    
489    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
490    
491            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
492            due to ongoing problems. For our purposes the latter is as useful
493            as the replaced package.
494    
495    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
496    
497            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
498    
499            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
500    
501    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
504            languages with available stopwords.
505    
506    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * inst/doc/tm.Rnw: Minor corrections in the vignette.
509    
510    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
511    
512            * DESCRIPTION: Update to version 0.2, since a lot of new features
513            have been integrated.
514    
515            * inst/stopwords: Updated existing stopwords and added stopwords
516            for various other languages.
517    
518    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
519    
520            * man/: Updated documentation.
521    
522            * Work/testDb.R: Script to test database stuff.
523    
524            * R/: Fixed various database related bugs. Seems to be rather
525            useable now, i.e., consider as alpha status for now.
526    
527    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
528    
529            * R/: Fixed some bugs related to database support.
530    
531    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
532    
533            * man/: Added a lot of examples to the manuals.
534    
535    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * man/: Updated parts of the documentation.
538    
539            * R/textdoccol.R (asPlain): Added conversion from newsgroup
540            documents to plain text documents.
541    
542    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
543    
544            * R/textdoccol.R: Finished experimental database support. Not yet
545            intensively tested.
546    
547            * R/source.R: Now each source has a default reader.
548    
549            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
550            class anymore.
551    
552            * R/plaintextdoc.R: Custom show method for plain text documents.
553    
554            * R/aobjects.R: Added a class for structured text documents.
555    
556            * R/reader.R: Replaced remaining \code{parser} occurrences with
557            \code{reader}.
558    
559            * R/textdoccol.R (summary): Indent tags.
560    
561            * R/textdoccol.R (removePunctuation): Transform method to remove
562            punctuation marks.
563    
564    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
565    
566            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
567            using prescindMeta().
568    
569    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
570    
571            * R/textdoccol.R: Improved database support.
572    
573    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
574    
575            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
576    
577            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
578            language code.
579    
580            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
581            into parserControl argument.
582    
583            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
584    
585    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
586    
587            * Work/tmDataSetup.R: The datasets acq and crude can now be
588            created on the fly.
589    
590            * R/stopwords.R: Introduced a function returning the stopwords for
591            a given language (English, German and French at the moment).
592    
593            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
594            otherwise falls back to Snowball package.
595    
596    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
597    
598            * man/dissimilarity-methods.Rd: Make clear that any method offered
599            by "dists" from package "cba" can be used.
600    
601    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
602    
603            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
604            to Kurt's latex suggestion. Removed points and underscores in
605            variable names for consistent naming.
606    
607            * DESCRIPTION: Update to version 0.1-2.
608    
609            * man/TextRepository.Rd: Fixed bug in documentation.
610    
611    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
612    
613            * DESCRIPTION: Update to version 0.1-1.
614    
615    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
616    
617            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
618            wordStem.
619    
620    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * R/: Changes due to Kurt's review.
623    
624    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
625    
626            * R/: Implemented improvements based upon comments by David
627            Meyer.
628    
629    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
630    
631            * inst/doc/: Rewrote vignette.
632    
633            * man/: Improved documentation.
634    
635    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
636    
637            * man/: Updated documentation.
638    
639            * DESCRIPTION: Changed package name to "tm". Updated version to
640            0.1 for first CRAN release.
641    
642            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
643            list archive example.
644    
645            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
646            archive example.
647    
648            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
649            from (several mails per box) mbox format to (single mail per file)
650            eml format.
651    
652    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * data/crude.rda: Rebuilt.
655    
656            * data/acq.rda: Rebuilt.
657    
658            * R/reader.R: Factored out reader and parser methods from
659            textdoccol.R.
660    
661            * R/source.R: Factored out Source methods from aobjects.R and
662            textdoccol.R.
663            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
664            feeds.
665    
666            * R/textdoccol.R (DirSource): Added support for recursive
667            traversal of directories.
668    
669    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
670    
671            * R/textdoccol.R ([[): Loads the document corpus automatically
672            into memory upon access.
673            (tm_transform, tm_filter): Removed several checks whether the
674            document is already loaded ([[ ensures this now).
675            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
676            mailing list archive.
677    
678    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
679    
680            * R/aobjects.R (TextDocument): Is now a virtual class.
681            (Source): Is now a virtual class.
682    
683    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * R/textdoccol.R (c): Support for an arbitrary number of document
686            collections.
687    
688    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
689    
690            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
691            append_meta and remove_meta.
692    
693            * R/textdoccol.R: Removed modify_metadata method.
694    
695            * R/textrepo.R: Removed modify_metadata method.
696    
697            * R/textdoccol.R (remove_meta): Supports removal of document
698            collection metadata and document (= in data frame) metadata.
699    
700    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
703    
704            * data/crude.rda: Rebuilt.
705    
706            * data/acq.rda: Rebuilt.
707    
708            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
709    
710            * R/textdoccol.R ([): Bug fix for subsetting a document
711            collection's data frame.
712    
713    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
714    
715            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
716            to s_filter.
717    
718            * R/textdoccol.R: Local text documents' metadata can now be copied
719            to a document collection's data frame with prescind_meta.
720    
721    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
722    
723            * R/: Text documents' slot metadata is now accessible in s_filter.
724    
725            * R/: Rewrote s_filter function (has still some restrictions).
726    
727    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
728    
729            * R/: Various fixes in handling metadata.
730    
731            * R/: Added update mechanism for text document collections.
732    
733    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
734    
735            * R/: Merging of document collections now creates a binary tree
736            for reconstructing merged document collections.
737    
738            * R/: Redesign of metadata for document collections.
739    
740    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
741    
742            * R/: Messages now use \code{ngettext}.
743    
744    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * R/: Added functions for modifying and removing metadata.
747    
748    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
749    
750            * man/: Updated some documentation.
751    
752            * R/: Corrected some connection issues.
753    
754            * inst/doc: Worked on the vignette.
755    
756    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
757    
758            * inst/: Added texts and started vignette.
759    
760            * R/: Final changes based upon David's comments.
761    
762    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
763    
764            * NAMESPACE: Corrected exports (generic methods need exportMethods
765            directives!).
766    
767    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
768    
769            * R/: Modified the TextDocCol constructor and various parsers. It
770            is now modular and supports various file formats via plugins (see
771            the new "Source" class).
772    
773    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
774    
775            * man/: Revised documentation after previous code changes.
776    
777    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
778    
779            * R/: Remaining changes as discussed with David.
780    
781    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
782    
783            * R/: Some changes as suggested by David. The rest will follow
784            within the next days.
785    
786    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
787    
788            * man/: Finished documentation.
789    
790    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
791    
792            * man/: Wrote some documentation.
793    
794    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
795    
796            * R/: Further syntactic sugar in form of additional assignment and
797            accessor methods.
798    
799    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * R/: Syntactic sugar in form of "length", "show" and "summary"
802            operators.
803    
804    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
805    
806            * R/: Diverse updates. Mainly on default operators ("[" or "c")
807            and dissimilarities.
808    
809    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
810    
811            * R/: Added similarity functions.
812    
813            * data/: Added English stopwords.
814    
815    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
816    
817            * data/: Examples compiled for new features.
818    
819            * R/: Changes due to new structure.
820    
821            * NAMESPACE: Corrected namespace to reflect new structure.
822    
823            * R/termdocmatrix.R: Adapted for new naming scheme.
824    
825    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
826    
827            * R/textdoccol.R: Adapted code for new class structure. Wrote
828            several transform and filter functions operating on text document
829            collections (alias text document databases).
830    
831            * R/aobjects.R: Adapted class structure with inheritance,
832            repositories and additional meta data. Loading files on demand is
833            now possible.
834    
835    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * R/: Some cosmetic cleanups.
838    
839            * inst/: Removed vignette on clustering. That and much more is now
840            described in the JSS paper on text mining. Based upon that
841            article an elaborated vignette will be incorporated in the future.
842    
843    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
844    
845            * R/: Updated generic S4 methods to comply with signature changes
846            in newer versions of R (> 2.3).
847    
848    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
849    
850            * ext/R/importRIS.R: Automatic RIS import is now possible.
851    
852    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
853    
854            * R/textdoccol.R: Added RIS HTML input format.
855    
856    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
857    
858            * R/textdoccol.R: Removed bug that caused invalid text document
859            collections when handling many input files.
860    
861    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * R/textdoccol.R: Restructured and extended file import
864            mechanism.
865    
866            * inst/doc/clustering.Rnw: Adapted vignette for use with
867            ReutNews.rda
868    
869            * man/ReutNews.Rd: Documentation for ReutNews.rda
870    
871            * data/ReutNews.rda: A tiny Reuters21578 example data set.
872    
873    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
874    
875            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
876            clustering facilities of this package.
877    
878    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
879    
880            * R/aobjects.R: Changed package document structure to avoid class
881            dependency problems.
882    
883    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
886            data set.
887    
888            * Finished documentation and reordered directory structure. Now "R
889            CMD check textmin" works without errors.
890    
