[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
pkg/ChangeLog revision 1018, Sun Nov 15 15:53:49 2009 UTC
1    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
4            caused words at the beginning or the end of a line not to be removed. Do
5            not delete whitespace anymore.
6    
7    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
8    
9            * R/source.R (DirSource): Default to working directory if no path
10            is specified.
11    
12    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
13    
14            * R/source.R (DirSource): Stop on empty directories.
15    
16    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
17    
18            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
19            named documents.
20    
21    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
22    
23            * R/transform.R (removeWords): Improve regular expressions.
24    
25    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
26    
27            * R/meta.R (DublinCore): Allow lower case tags.
28    
29    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
32            instead of x$children.
33    
34    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
35    
36            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
37    
38    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/: Use S3 instead of S4 class system.
41    
42    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/reader.R (readMail): Moved to tm.plugin.mail package.
45    
46    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
47    
48            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
49            postings are basically e-mails with some extra headers.
50    
51    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
52    
53            * R/transform.R: Move convertMboxEml, removeCitation,
54            removeMultipart, and removeSignature to the tm.plugin.mail package
55            since they are mainly utility functions (for handling e-mails) and
56            not very framework specific.
57    
58    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
59    
60            * man/: Fix documentation.
61    
62    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
65            plain text document instead of an XML document for texts of the
66            Reuters-21578 dataset.
67    
68            * R/sparse.R: Removed since the slam package is now available on
69            CRAN.
70    
71            * DESCRIPTION (Depends): Add slam package.
72    
73    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
74    
75            * R/transform.R (stemDoc): Fix character(0) handling.
76    
77    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
78    
79            * R/doc.R (show): Pretty print.
80    
81    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
84            gracefully.
85    
86    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/corpus.R: Make corpus virtual. Implement corpus with standard
89            and permanent storage semantics.
90    
91            * DESCRIPTION: New major release. A *lot* of improvements.
92    
93    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
94    
95            * NAMESPACE: Export some simple_triplet_matrix functions.
96    
97    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
98    
99            * R/weight.R: Adapt tf-idf to new matrix format.
100    
101    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
102    
103            * R/matrix.R: Create two distinct classes for term-document and
104            document-term matrices.
105    
106    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
107    
108            * R/termdocmatrix.R: No longer use Matrix package. This reduces
109            package start-up time significantly.
110    
111    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
112    
113            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
114    
115    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
116    
117            * R/transform.R (tmReduce): Combine multiple maps into one
118            transformation.
119    
120    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
121    
122            * R/weight.R: Remove weightLogical since it does not return a
123            dgCMatrix.
124    
125            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
126            or TermDocumentMatrix instead.
127    
128    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
129    
130            * inst/doc/extensions.Rnw: Finished vignette.
131    
132    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
133    
134            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
135            DocumentTermMatrix representations.
136    
137    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
138    
139            * R/reader.R (readXML): New reader for arbitrary XML files.
140    
141    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
144            (XMLSource): New XMLSource class for arbitrary XML files.
145            (Source): New slot Vectorized.
146    
147    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
148    
149            * R/reader.R (readTabular): Experimental reader for tabular data
150            structures which can be customized via user-defined mappings.
151    
152            * R/reader.R: Always use UTC time zone.
153    
154            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
155    
156    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
157    
158            * R/reader.R (readDOC): Options can be passed over to antiword.
159    
160            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
161            pdftotext.
162    
163    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
164    
165            * R/source.R (DirSource): Add pattern and ignore.case arguments
166            which are internally passed over to list.files().
167    
168    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
169    
170            * inst/doc/tm.Rnw: Suppress pointless loading message.
171    
172    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
173    
174            * DESCRIPTION: Speed up package loading (via moving packages not
175            strictly necessary for normal operation to Suggests instead of
176            Depends).
177    
178    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
179    
180            * R/reader.R (readNewsgroup): The date format is now configurable.
181    
182    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
185    
186    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
187    
188            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
189    
190    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
191    
192            * R/source.R (DataframeSource): New source class for data frames.
193    
194            * R/source.R: Fixed non-standard call evaluation.
195    
196    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
197    
198            * R/source.R (URISource): New source class for a single document.
199    
200    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
201    
202            * R/source.R: Refactoring.
203    
204    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
205    
206            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
207            Rmpi installations more gracefully.
208    
209    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
210    
211            * R/source.R (Source): Add Length slot.
212    
213    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
214    
215            * R/AAA.R: Unify duplicated .onLoad function.
216    
217    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
218    
219            * DESCRIPTION (Suggests): Added Rmpi.
220    
221    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
222    
223            * R/source.R (getElem): Fix 'no visible binding' warning.
224    
225            * man/WeightFunction.Rd: Fix signature.
226    
227    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
228    
229            * R/weight.R: Introduce name abbreviations for weighting functions.
230    
231    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
232    
233            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
234    
235            * R/cluster.R: Provide convenience functions for using an MPI
236            cluster.
237    
238            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
239            available.
240    
241            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
242            available.
243    
244    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
245    
246            * R/textdoccol.R (lapply): Removed debug print out.
247    
248    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
249    
250            * R/reader.R (readRCV1): Improved meta data extraction from
251            Reuters Corpus Volume 1 documents.
252    
253    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
254    
255            * R/transform.R: Ensure that all mappings preserve multiline
256            structures.
257    
258    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * R/filter.R: Every filter now has an attribute indicating whether
261            it should be applied at document level (doclevel).
262    
263            * R/textdoccol.R (tmFilter): Set searchFullText as new default
264            filter.
265    
266    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
267    
268            * R/transform.R (replacePatterns): Replaced removeWords by
269            replacePatterns. Suggested by Christian Buchta.
270    
271            * R/textdoccol.R (inspect): Improved formatting.
272    
273    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
274    
275            * inst/CITATION: Updated JSS article information.
276    
277            * R/textdoccol.R (setAs): Added coerce method from list to
278            corpus.
279    
280            * R/meta.R (meta): Improved meta data handling.
281    
282    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
283    
284            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
285            Christian Buchta.
286    
287            * inst/CITATION: Added template to include JSS article reference.
288    
289    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
290    
291            * R/textdoccol.R (tmMap): Introduced lazy mapping.
292    
293            * R/source.R: Added VectorSource.
294    
295    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * man/: Language codes should be in ISO 639-1 format.
298    
299            * R/textdoccol.R (asPlain): Preserve local meta data.
300    
301    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
302    
303            * R/textdoccol.R (writeCorpus): Function for writing a corpus
304            containing plain text documents to disk.
305    
306    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
307    
308            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
309            always set correctly.
310    
311            * R/textdoccol.R: Set load = TRUE as default for load on demand
312            since in most cases this is the wanted behaviour.
313    
314    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
317    
318            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
319    
320    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
321    
322            * R/meta.R (meta): New function for consistent access to meta data
323            of document collections, repositories, and texts.
324    
325    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
326    
327            * R/: Better support for encodings.
328    
329    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
332            selection when no reader argument is given.
333    
334    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
335    
336            * R/source.R (CSVSource): Now uses read.csv instead of scan
337            internally.
338    
339    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
340    
341            * R/reader.R (getReaders): Returns available reader functions.
342    
343            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
344            as default.
345    
346    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/stopwords.R (stopwords): Shortened code, removed codetools
349            variable warnings.
350    
351            * man/: Documentation for showMeta, added an example for tmMap.
352    
353            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
354            some minor typos fixed.
355    
356    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
357    
358            * R/aobjects.R (showMeta): Added method for pretty printing a
359            text document's meta data.
360    
361    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
362    
363            * R/textdoccol.R (TextDocCol): Better handling of empty
364            arguments.
365    
366            * NAMESPACE: Exported readDOC.
367    
368            * man/completeStems.Rd: Added an example.
369    
370    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/stopwords.R (stopwords): Look up .dat files at every
373            call. Allows users to modify stopword .dat files interactively.
374    
375    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
376    
377            * R/termdocmatrix.R (termFreq): Correct processing of empty
378            documents.
379    
380    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * man/: Updated documentation.
383    
384    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
385    
386            * R/complete.R (completeStems): Completes (heuristically) word
387            stems.
388    
389            * R/termdocmatrix.R (TermDocMatrix2): New modular
390            constructor.
391    
392            * NAMESPACE: Exported termFreq.
393    
394    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/reader.R (readDOC): Added MS Word reader (using antiword).
397    
398    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/weight.R: Weighting functions for TermDocMatrix.
401    
402    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
405            functions for accessing dimension, column, and row names.
406    
407            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
408    
409    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
410    
411            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
412    
413    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
414    
415            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
416    
417    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
418    
419            * R/reader.R (readPDF): Removed manual checks for pdftotext and
420            pdfinfo. The system call gives a warning anyway.
421    
422    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * R/textdoccol.R (asPlain): Conversion from
425            StructuredTextDocuments to PlainTextDocuments.
426    
427    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
428    
429            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
430            for accessing term-document matrices.
431    
432            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
433            are installed.
434    
435    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
438            Christian Buchta.
439    
440    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
441    
442            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
443    
444    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
447    
448            * R/reader.R (readPDF): Added PDF reader.
449    
450    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
453    
454            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
455    
456            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
457    
458            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
459    
460    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * R/distmeasure.R (dissimilarity): Replaced dists call from
463            package cba by new dist call from package proxy.
464    
465    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
466    
467            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
468    
469    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
470    
471            * R/termdocmatrix.R: require() uses the quietly option to suppress
472            loading messages.
473    
474    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
475    
476            * R/dictionary.R: Added dictionary support.
477    
478    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
479    
480            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
481            documents. This simplifies some functions, e.g., asPlain.
482    
483    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
484    
485            * inst/doc/tm.Rnw: Fixed some typos in vignette.
486    
487    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * R/textdoccol.R (replaceWords): Added method to replace a set of
490            words by a single word. Useful for synonyms.
491    
492    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
493    
494            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
495    
496    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
497    
498            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
499            vectors. Thanks to Ariel Maguyon for his error report.
500            (removeSparseTerms): New function to remove columns from a
501            term-document matrix exceeding a sparse factor.
502    
503    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
504    
505            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
506    
507    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
508    
509            * man/sFilter.Rd: Corrected documentation on statement format (use
510            '==' instead of '=').
511    
512    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
513    
514            * R/aobjects.R (StructuredTextDocument): Inherits from
515            TextDocument.
516    
517    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
518    
519            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
520            on sparse matrices as proposed by Martin Maechler.
521    
522    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
523    
524            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
525            \pkg{filehash} version makes them deprecated.
526    
527    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
528    
529            * R/termdocmatrix.R (textvector): Stemming is now performed before
530            erasing stopwords.
531            (weightMatrix): Adapted to handle sparse matrices.
532            (TermDocMatrix): Sparse matrix is now efficiently built by
533            direct stepwise insertion of row values into it.
534    
535    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
538            due to ongoing problems. For our purposes the latter is as useful
539            as the replaced package.
540    
541    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
542    
543            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
544    
545            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
546    
547    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
548    
549            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
550            languages with available stopwords.
551    
552    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
553    
554            * inst/doc/tm.Rnw: Minor corrections in the vignette.
555    
556    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
557    
558            * DESCRIPTION: Update to version 0.2, since a lot of new features
559            have been integrated.
560    
561            * inst/stopwords: Updated existing stopwords and added stopwords
562            for various other languages.
563    
564    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
565    
566            * man/: Updated documentation.
567    
568            * Work/testDb.R: Script to test database stuff.
569    
570            * R/: Fixed various database related bugs. Seems to be rather
571            usable now, i.e., consider it alpha status for now.
572    
573    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
574    
575            * R/: Fixed some bugs related to database support.
576    
577    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * man/: Added a lot of examples to the manuals.
580    
581    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * man/: Updated parts of the documentation.
584    
585            * R/textdoccol.R (asPlain): Added conversion from newsgroup
586            documents to plain text documents.
587    
588    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
589    
590            * R/textdoccol.R: Finished experimental database support. Not yet
591            intensively tested.
592    
593            * R/source.R: Now each source has a default reader.
594    
595            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
596            class anymore.
597    
598            * R/plaintextdoc.R: Custom show method for plain text documents.
599    
600            * R/aobjects.R: Added a class for structured text documents.
601    
602            * R/reader.R: Replaced remaining \code{parser} occurrences with
603            \code{reader}.
604    
605            * R/textdoccol.R (summary): Indent tags.
606    
607            * R/textdoccol.R (removePunctuation): Transform method to remove
608            punctuation marks.
609    
610    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
611    
612            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
613            using prescindMeta().
614    
615    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
616    
617            * R/textdoccol.R: Improved database support.
618    
619    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
620    
621            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
622    
623            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
624            language code.
625    
626            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
627            into parserControl argument.
628    
629            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
630    
631    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
632    
633            * Work/tmDataSetup.R: The datasets acq and crude can now be
634            created on the fly.
635    
636            * R/stopwords.R: Introduced a function returning the stopwords for
637            a given language (English, German and French at the moment).
638    
639            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
640            otherwise falls back to Snowball package.
641    
642    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
643    
644            * man/dissimilarity-methods.Rd: Make clear that any method offered
645            by "dists" from package "cba" can be used.
646    
647    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
648    
649            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
650            to Kurt's latex suggestion. Removed points and underscores in
651            variable names for consistent naming.
652    
653            * DESCRIPTION: Update to version 0.1-2.
654    
655            * man/TextRepository.Rd: Fixed bug in documentation.
656    
657    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * DESCRIPTION: Update to version 0.1-1.
660    
661    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
662    
663            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
664            wordStem.
665    
666    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
667    
668            * R/: Changes due to Kurt's review.
669    
670    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
671    
672            * R/: Implemented improvements based upon comments by David
673            Meyer.
674    
675    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
676    
677            * inst/doc/: Rewrote vignette.
678    
679            * man/: Improved documentation.
680    
681    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
682    
683            * man/: Updated documentation.
684    
685            * DESCRIPTION: Changed package name to "tm". Updated version to
686            0.1 for first CRAN release.
687    
688            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
689            list archive example.
690    
691            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
692            archive example.
693    
694            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
695            from (several mails per box) mbox format to (single mail per file)
696            eml format.
697    
698    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
699    
700            * data/crude.rda: Rebuilt.
701    
702            * data/acq.rda: Rebuilt.
703    
704            * R/reader.R: Factored out reader and parser methods from
705            textdoccol.R.
706    
707            * R/source.R: Factored out Source methods from aobjects.R and
708            textdoccol.R.
709            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
710            feeds.
711    
712            * R/textdoccol.R (DirSource): Added support for recursive
713            traversal of directories.
714    
715    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
716    
717            * R/textdoccol.R ([[): Loads the document corpus automatically
718            into memory upon access.
719            (tm_transform, tm_filter): Removed several checks whether the
720            document is already loaded ([[ ensures this now).
721            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
722            mailing list archive.
723    
724    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
725    
726            * R/aobjects.R (TextDocument): Is now a virtual class.
727            (Source): Is now a virtual class.
728    
729    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
730    
731            * R/textdoccol.R (c): Support for an arbitrary number of document
732            collections.
733    
734    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
735    
736            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
737            append_meta and remove_meta.
738    
739            * R/textdoccol.R: Removed modify_metadata method.
740    
741            * R/textrepo.R: Removed modify_metadata method.
742    
743            * R/textdoccol.R (remove_meta): Supports removal of document
744            collection metadata and document (= in data frame) metadata.
745    
746    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
747    
748            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
749    
750            * data/crude.rda: Rebuilt.
751    
752            * data/acq.rda: Rebuilt.
753    
754            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
755    
756            * R/textdoccol.R ([): Bug fix for subsetting a document
757            collection's data frame.
758    
759    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
760    
761            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
762            to s_filter.
763    
764            * R/textdoccol.R: Local text documents' metadata can now be copied
765            to a document collection's data frame with prescind_meta.
766    
767    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
768    
769            * R/: Text documents' slot metadata is now accessible in s_filter.
770    
771            * R/: Rewrote s_filter function (still has some restrictions).
772    
773    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
774    
775            * R/: Various fixes in handling metadata.
776    
777            * R/: Added update mechanism for text document collections.
778    
779    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
780    
781            * R/: Merging of document collections now creates a binary tree
782            for reconstructing merged document collections.
783    
784            * R/: Redesign of metadata for document collections.
785    
786    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
787    
788            * R/: Messages now use \code{ngettext}.
789    
790    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
791    
792            * R/: Added functions for modifying and removing metadata.
793    
794    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
795    
796            * man/: Updated some documentation.
797    
798            * R/: Corrected some connection issues.
799    
800            * inst/doc: Worked on the vignette.
801    
802    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
803    
804            * inst/: Added texts and started vignette.
805    
806            * R/: Final changes based upon David's comments.
807    
808    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * NAMESPACE: Corrected exports (generic methods need exportMethods
811            directives!).
812    
813    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
814    
815            * R/: Modified the TextDocCol constructor and various parsers. It
816            is now modular and supports various file formats via plugins (see
817            the new "Source" class).
818    
819    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
820    
821            * man/: Revised documentation after previous code changes.
822    
823    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * R/: Remaining changes as discussed with David.
826    
827    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
828    
829            * R/: Some changes as suggested by David. The rest will follow
830            within the next few days.
831    
832    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
833    
834            * man/: Finished documentation.
835    
836    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
837    
838            * man/: Wrote some documentation.
839    
840    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
841    
842            * R/: Further syntactic sugar in form of additional assignment and
843            accessor methods.
844    
845    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
846    
847            * R/: Syntactic sugar in form of "length", "show" and "summary"
848            operators.
849    
850    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
851    
852            * R/: Various updates, mainly to default operators ("[" or "c")
853            and dissimilarities.
854    
855    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * R/: Added similarity functions.
858    
859            * data/: Added English stopwords.
860    
861    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * data/: Examples compiled for new features.
864    
865            * R/: Changes due to new structure.
866    
867            * NAMESPACE: Corrected namespace to reflect new structure.
868    
869            * R/termdocmatrix.R: Adapted for new naming scheme.
870    
871    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
872    
873            * R/textdoccol.R: Adapted code for new class structure. Wrote
874            several transform and filter functions operating on text document
875            collections (alias text document databases).
876    
877            * R/aobjects.R: Adapted class structure with inheritance,
878            repositories and additional meta data. Loading files on demand is
879            now possible.
880    
881    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
882    
883            * R/: Some cosmetic cleanups.
884    
885            * inst/: Removed vignette on clustering. That and much more is now
886            described in the JSS paper on text mining. Based on that
887            article, a more elaborate vignette will be added in the future.
888    
889    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
890    
891            * R/: Updated generic S4 methods to comply with signature changes
892            in newer versions of R (> 2.3).
893    
894    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
895    
896            * ext/R/importRIS.R: Automatic RIS import is now possible.
897    
898    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
899    
900            * R/textdoccol.R: Added RIS HTML input format.
901    
902    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
903    
904            * R/textdoccol.R: Removed bug that caused invalid text document
905            collections when handling many input files.
906    
907    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
908    
909            * R/textdoccol.R: Restructured and extended file import
910            mechanism.
911    
912            * inst/doc/clustering.Rnw: Adapted vignette for use with
913            ReutNews.rda.
914    
915            * man/ReutNews.Rd: Documentation for ReutNews.rda.
916    
917            * data/ReutNews.rda: A tiny Reuters21578 example data set.
918    
919    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
920    
921            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
922            clustering facilities of this package.
923    
924    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * R/aobjects.R: Changed package document structure to avoid class
927            dependency problems.
928    
929    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
930    
931            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
932            data set.
933    
934            * Finished documentation and reordered directory structure. Now "R
935            CMD check textmin" works without errors.
936    