Diff of /pkg/ChangeLog

Old: trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
New: pkg/ChangeLog revision 1041, Thu Feb 18 06:15:15 2010 UTC
1    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/complete.R (stemCompletion): New completion heuristics.
4    
5    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
8    
9    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
10    
11            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
12            setOldClass(c(..., "list")) works.
13    
14    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/transform.R (stemDocument.character): In case input is a
17            simple character just delegate to the default Snowball stemmer.
18    
19    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
22            data.
23    
24    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
25    
26            * R/doc.R (`Content<-`): Be careful with names attribute.
27    
28    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
29    
30            * R/source.R (DirSource): Improved implementation especially when
31            handling many (>1M) files.
32    
33    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
34    
35            * R/source.R (getElem.URISource): Use encoding argument.
36    
37    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
38    
39            * R/doc.R (setOldClass): Register S3 document classes to be
40            recognized by S4 methods.
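
The registration described in the entry above can be illustrated with a minimal sketch, assuming a current R installation with the methods package. The class name MyDoc and the generic nChars are hypothetical and are not part of tm; only setOldClass(), setGeneric(), and setMethod() are real functions.

```r
library(methods)

# "MyDoc" is a hypothetical S3 class used only for illustration.
doc <- structure(list(text = "hello world"), class = c("MyDoc", "list"))

# Register the S3 class hierarchy so that S4 method dispatch can see it.
setOldClass(c("MyDoc", "list"))

# A hypothetical S4 generic with a method defined for the S3 class.
setGeneric("nChars", function(x) standardGeneric("nChars"))
setMethod("nChars", "MyDoc", function(x) nchar(x$text))

result <- nChars(doc)
print(result)
```

Without the setOldClass() call, setMethod() would have no class definition for "MyDoc" to attach the method to, which is presumably why the S3 document classes needed registering.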
41    
42    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/matrix.R (termFreq): Add option to remove punctuation
45            characters.
46    
47    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
50            merging multiple term-document matrices.
51    
52    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/corpus.R (setOldClass): Register S3 corpus classes to be
55            recognized by S4 methods.
56    
57            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
58            that CRAN Mac OS X builds do not fail any longer.
59    
60    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
61    
62            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
63            of RWeka::AlphabeticTokenizer() as default.
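
A whitespace tokenizer built on scan() might look like the following minimal sketch. The helper name tokenize_ws and the quote = "" setting are assumptions made for illustration; the changelog only states that scan(..., what = "character") became the default.

```r
# Hypothetical helper, not tm's actual tokenize() implementation.
tokenize_ws <- function(x) {
  # scan() splits on whitespace by default; quote = "" keeps apostrophes
  # and quotation marks from being treated as string delimiters.
  scan(text = x, what = "character", quote = "", quiet = TRUE)
}

tokens <- tokenize_ws("Crude oil prices rose sharply today")
print(tokens)
```

A base R tokenizer of this kind avoids loading RWeka (and with it a Java runtime) for the default case.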
64    
65    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
66    
67            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
68            caused words at the beginning or the end of a line not to be removed. Do
69            not delete whitespace anymore.
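
The kind of fix described above can be sketched with word-boundary anchors, which also match at the start and end of a string. The helper remove_words is hypothetical and assumes the words contain no regular expression metacharacters; it is not tm's removeWords() code.

```r
# Hypothetical helper, not tm's removeWords() implementation.
remove_words <- function(x, words) {
  # \\b matches at word boundaries, including the start and end of a string.
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  # Only the words are deleted; surrounding whitespace is left untouched.
  gsub(pattern, "", x, perl = TRUE)
}

cleaned <- remove_words("the price of oil rose and", c("the", "and", "of"))
print(cleaned)
```

A pattern that required a space on both sides of a word would miss "the" at the start and "and" at the end of this input.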
70    
71    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
72    
73            * R/source.R (DirSource): Default to working directory if no path
74            is specified.
75    
76    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
77    
78            * R/source.R (DirSource): Stop on empty directories.
79    
80    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
81    
82            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
83            named documents.
84    
85    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
86    
87            * R/transform.R (removeWords): Improve regular expressions.
88    
89    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
90    
91            * R/meta.R (DublinCore): Allow lower case tags.
92    
93    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
94    
95            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
96            instead of x$children.
97    
98    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
99    
100            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
101    
102    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
103    
104            * R/: Use S3 instead of S4 class system.
105    
106    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
107    
108            * R/reader.R (readMail): Moved to tm.plugin.mail package.
109    
110    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
113            postings are basically e-mails with some extra headers.
114    
115    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
116    
117            * R/transform.R: Move convertMboxEml, removeCitation,
118            removeMultipart, and removeSignature to the tm.plugin.mail package
119            since they are mainly utility functions (for handling e-mails) and
120            not very framework specific.
121    
122    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
123    
124            * man/: Fix documentation.
125    
126    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
127    
128            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
129            plain text document instead of an XML document for texts of the
130            Reuters-21578 dataset.
131    
132            * R/sparse.R: Removed since the slam package is now available on
133            CRAN.
134    
135            * DESCRIPTION (Depends): Add slam package.
136    
137    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
138    
139            * R/transform.R (stemDoc): Fix character(0) handling.
140    
141    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/doc.R (show): Pretty print.
144    
145    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
148            gracefully.
149    
150    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/corpus.R: Make corpus virtual. Implement corpus with standard
153            and permanent storage semantics.
154    
155            * DESCRIPTION: New major release. A *lot* of improvements.
156    
157    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
158    
159            * NAMESPACE: Export some simple_triplet_matrix functions.
160    
161    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
162    
163            * R/weight.R: Adapt tf-idf to new matrix format.
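
For orientation, tf-idf weighting of a small term-document matrix can be sketched in base R as follows. The idf variant log2(N / df) and the dense matrix are assumptions made for illustration; the changelog does not state which formula or storage format R/weight.R uses.

```r
# Toy term-document matrix: rows are terms, columns are documents.
tdm <- matrix(c(2, 0, 1,
                1, 1, 1,
                0, 3, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("oil", "price", "opec"), c("d1", "d2", "d3")))

n_docs <- ncol(tdm)
doc_freq <- rowSums(tdm > 0)     # number of documents containing each term
idf <- log2(n_docs / doc_freq)   # assumed idf variant
tfidf <- tdm * idf               # each row is scaled by its term's idf
print(round(tfidf, 3))
```

The term "price" occurs in every document, so its idf is zero and its whole row is weighted down to zero.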
164    
165    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
166    
167            * R/matrix.R: Create two distinct classes for term-document and
168            document-term matrices.
169    
170    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
171    
172            * R/termdocmatrix.R: No longer use Matrix package. This reduces
173            package start-up time significantly.
174    
175    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
176    
177            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
178    
179    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
180    
181            * R/transform.R (tmReduce): Combine multiple maps into one
182            transformation.
183    
184    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/weight.R: Remove weightLogical since it does not return a
187            dgCMatrix.
188    
189            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
190            or TermDocumentMatrix instead.
191    
192    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
193    
194            * inst/doc/extensions.Rnw: Finished vignette.
195    
196    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
197    
198            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
199            DocumentTermMatrix representations.
200    
201    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
202    
203            * R/reader.R (readXML): New reader for arbitrary XML files.
204    
205    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
206    
207            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
208            (XMLSource): New XMLSource class for arbitrary XML files.
209            (Source): New slot Vectorized.
210    
211    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/reader.R (readTabular): Experimental reader for tabular data
214            structures which can be customized via user-defined mappings.
215    
216            * R/reader.R: Always use UTC time zone.
217    
218            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
219    
220    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
221    
222            * R/reader.R (readDOC): Options can be passed over to antiword.
223    
224            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
225            pdftotext.
226    
227    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
228    
229            * R/source.R (DirSource): Add pattern and ignore.case arguments
230            which are internally passed over to list.files().
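
The two arguments behave as they do in list.files() itself, which the following minimal sketch shows directly. The directory and file names are made up for the example, and DirSource is not called here.

```r
# Create a scratch directory with a few files to list.
dir <- file.path(tempdir(), "dirsource_demo")
dir.create(dir, showWarnings = FALSE)
invisible(file.create(file.path(dir, c("a.txt", "B.TXT", "notes.md"))))

# pattern is a regular expression; ignore.case = TRUE also matches "B.TXT".
txt_files <- list.files(dir, pattern = "\\.txt$", ignore.case = TRUE)
print(txt_files)

# Remove the scratch directory again.
unlink(dir, recursive = TRUE)
```

Note that pattern is a regular expression rather than a shell glob, so the dot is escaped and the match is anchored at the end of the file name.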
231    
232    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
233    
234            * inst/doc/tm.Rnw: Suppress pointless loading message.
235    
236    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
237    
238            * DESCRIPTION: Speed up package loading (via moving packages not
239            strictly necessary for normal operation to Suggests instead of
240            Depends).
241    
242    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
243    
244            * R/reader.R (readNewsgroup): The date format is now configurable.
245    
246    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
247    
248            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
249    
250    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
251    
252            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
253    
254    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
255    
256            * R/source.R (DataframeSource): New source class for data frames.
257    
258            * R/source.R: Fixed non-standard call evaluation.
259    
260    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
261    
262            * R/source.R (URISource): New source class for a single document.
263    
264    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
265    
266            * R/source.R: Refactoring.
267    
268    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
269    
270            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
271            Rmpi installations more gracefully.
272    
273    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
274    
275            * R/source.R (Source): Add Length slot.
276    
277    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
278    
279            * R/AAA.R: Unify duplicated .onLoad function.
280    
281    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
282    
283            * DESCRIPTION (Suggests): Added Rmpi.
284    
285    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
286    
287            * R/source.R (getElem): Fix 'no visible binding' warning.
288    
289            * man/WeightFunction.Rd: Fix signature.
290    
291    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
292    
293            * R/weight.R: Introduce name abbreviations for weighting functions.
294    
295    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
296    
297            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
298    
299            * R/cluster.R: Provide convenience functions for using an MPI
300            cluster.
301    
302            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
303            available.
304    
305            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
306            available.
307    
308    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
309    
310            * R/textdoccol.R (lapply): Removed debug print out.
311    
312    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
313    
314            * R/reader.R (readRCV1): Improved meta data extraction from
315            Reuters Corpus Volume 1 documents.
316    
317    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
318    
319            * R/transform.R: Ensure that all mappings preserve multiline
320            structures.
321    
322    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/filter.R: Every filter now has an attribute indicating whether
325            it should be applied at document level (doclevel).
326    
327            * R/textdoccol.R (tmFilter): Set searchFullText as new default
328            filter.
329    
330    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * R/transform.R (replacePatterns): Replaced removeWords by
333            replacePatterns. Suggested by Christian Buchta.
334    
335            * R/textdoccol.R (inspect): Improved formatting.
336    
337    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * inst/CITATION: Updated JSS article information.
340    
341            * R/textdoccol.R (setAs): Added coerce method from list to
342            corpus.
343    
344            * R/meta.R (meta): Improved meta data handling.
345    
346    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
349            Christian Buchta.
350    
351            * inst/CITATION: Added template to include JSS article reference.
352    
353    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
354    
355            * R/textdoccol.R (tmMap): Introduced lazy mapping.
356    
357            * R/source.R: Added VectorSource.
358    
359    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * man/: Language codes should be in ISO 639-1 format.
362    
363            * R/textdoccol.R (asPlain): Preserve local meta data.
364    
365    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
366    
367            * R/textdoccol.R (writeCorpus): Function for writing a corpus
368            containing plain text documents to disk.
369    
370    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
373            always set correctly.
374    
375            * R/textdoccol.R: Set load = TRUE as default for load on demand
376            since in most cases this is the wanted behaviour.
377    
378    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
381    
382            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
383    
384    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
385    
386            * R/meta.R (meta): New function for consistent access to meta data
387            of document collections, repositories, and texts.
388    
389    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/: Better support for encodings.
392    
393    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
396            selection when no reader argument is given.
397    
398    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/source.R (CSVSource): Now uses read.csv instead of scan
401            internally.
402    
403    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/reader.R (getReaders): Returns available reader functions.
406    
407            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
408            as default.
409    
410    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
411    
412            * R/stopwords.R (stopwords): Shortened code, removed codetools
413            variable warnings.
414    
415            * man/: Documentation for showMeta, added an example for tmMap.
416    
417            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
418            some minor typos fixed.
419    
420    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * R/aobjects.R (showMeta): Added method for pretty printing a
423            text document's meta data.
424    
425    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/textdoccol.R (TextDocCol): Better handling of empty
428            arguments.
429    
430            * NAMESPACE: Exported readDOC.
431    
432            * man/completeStems.Rd: Added an example.
433    
434    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
435    
436            * R/stopwords.R (stopwords): Look up .dat files at every
437            call. Allows users to modify stopword .dat files interactively.
438    
439    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/termdocmatrix.R (termFreq): Correct processing of empty
442            documents.
443    
444    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * man/: Updated documentation.
447    
448    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
449    
450            * R/complete.R (completeStems): Completes (heuristically) word
451            stems.
452    
453            * R/termdocmatrix.R (TermDocMatrix2): New modular
454            constructor.
455    
456            * NAMESPACE: Exported termFreq.
457    
458    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
459    
460            * R/reader.R (readDOC): Added MS Word reader (using antiword).
461    
462    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
463    
464            * R/weight.R: Weighting functions for TermDocMatrix.
465    
466    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
467    
468            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
469            functions for accessing dimension, column, and row names.
470    
471            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
472    
473    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
474    
475            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
476    
477    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
480    
481    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * R/reader.R (readPDF): Removed manual checks for pdftotext and
484            pdfinfo. The system call gives a warning anyway.
485    
486    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
487    
488            * R/textdoccol.R (asPlain): Conversion from
489            StructuredTextDocuments to PlainTextDocuments.
490    
491    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
492    
493            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
494            for accessing term-document matrices.
495    
496            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
497            are installed.
498    
499    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
500    
501            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
502            Christian Buchta.
503    
504    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
505    
506            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
507    
508    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
509    
510            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
511    
512            * R/reader.R (readPDF): Added PDF reader.
513    
514    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
515    
516            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
517    
518            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
519    
520            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
521    
522            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
523    
524    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
525    
526            * R/distmeasure.R (dissimilarity): Replaced dists call from
527            package cba by new dist call from package proxy.
528    
529    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
530    
531            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
532    
533    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
534    
535            * R/termdocmatrix.R: require() uses the quietly option to suppress
536            loading messages.
537    
538    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
539    
540            * R/dictionary.R: Added dictionary support.
541    
542    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
543    
544            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
545            documents. This simplifies some functions, e.g., asPlain.
546    
547    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
548    
549            * inst/doc/tm.Rnw: Fixed some typos in vignette.
550    
551    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
552    
553            * R/textdoccol.R (replaceWords): Added method to replace a set of
554            words by a single word. Useful for synonyms.
555    
556    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
557    
558            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
559    
560    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
561    
562            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
563            vectors. Thanks to Ariel Maguyon for his error report.
564            (removeSparseTerms): New function to remove columns from a
565            term-document matrix exceeding a sparse factor.
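
The idea behind removing sparse terms can be sketched on a dense matrix as follows. The helper drop_sparse_terms, the layout (documents as rows, terms as columns), and the choice to keep terms whose share of zero entries is at most the threshold are all assumptions for illustration; the entry above does not specify how removeSparseTerms treats the boundary.

```r
# Hypothetical helper; documents are rows and terms are columns here.
drop_sparse_terms <- function(m, sparse) {
  stopifnot(sparse > 0, sparse < 1)
  zero_share <- colMeans(m == 0)            # fraction of documents lacking the term
  m[, zero_share <= sparse, drop = FALSE]   # keep terms at or below the threshold
}

m <- matrix(c(1, 0, 2,
              1, 0, 0,
              3, 0, 1,
              2, 4, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), c("oil", "opec", "price")))

kept <- drop_sparse_terms(m, sparse = 0.6)
print(colnames(kept))
```

Here "opec" is absent from three of four documents (a zero share of 0.75), so it exceeds the threshold of 0.6 and is dropped.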
566    
567    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
568    
569            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
570    
571    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
572    
573            * man/sFilter.Rd: Corrected documentation on statement format (use
574            '==' instead of '=').
575    
576    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
577    
578            * R/aobjects.R (StructuredTextDocument): Inherits from
579            TextDocument.
580    
581    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
584            on sparse matrices as proposed by Martin Maechler.
585    
586    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
589            \pkg{filehash} version makes them deprecated.
590    
591    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
592    
593            * R/termdocmatrix.R (textvector): Stemming is now performed before
594            erasing stopwords.
595            (weightMatrix): Adapted to handle sparse matrices.
596            (TermDocMatrix): Sparse matrix is now efficiently built by
597            direct stepwise insertion of row values into it.
598    
599    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
600    
601            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
602            due to ongoing problems. For our purposes the latter is as useful
603            as the replaced package.
604    
605    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
606    
607            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
608    
609            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
610    
611    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
612    
613            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
614            languages with available stopwords.
615    
616    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
617    
618            * inst/doc/tm.Rnw: Minor corrections in the vignette.
619    
620    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * DESCRIPTION: Update to version 0.2, since a lot of new features
623            have been integrated.
624    
625            * inst/stopwords: Updated existing stopwords and added stopwords
626            for various other languages.
627    
628    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * man/: Updated documentation.
631    
632            * Work/testDb.R: Script to test database stuff.
633    
634            * R/: Fixed various database related bugs. Seems to be rather
635            useable now, i.e., consider as alpha status for now.
636    
637    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
638    
639            * R/: Fixed some bugs related to database support.
640    
641    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
642    
643            * man/: Added a lot of examples to the manuals.
644    
645    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
646    
647            * man/: Updated parts of the documentation.
648    
649            * R/textdoccol.R (asPlain): Added conversion from newsgroup
650            documents to plain text documents.
651    
652    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * R/textdoccol.R: Finished experimental database support. Not yet
655            intensively tested.
656    
657            * R/source.R: Now each source has a default reader.
658    
659            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
660            class anymore.
661    
662            * R/plaintextdoc.R: Custom show method for plain text documents.
663    
664            * R/aobjects.R: Added a class for structured text documents.
665    
666            * R/reader.R: Replaced remaining \code{parser} occurrences with
667            \code{reader}.
668    
669            * R/textdoccol.R (summary): Indent tags.
670    
671            * R/textdoccol.R (removePunctuation): Transform method to remove
672            punctuation marks.
673    
674    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
675    
676            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
677            using prescindMeta().
678    
679    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
680    
681            * R/textdoccol.R: Improved database support.
682    
683    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
686    
687            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
688            language code.
689    
690            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
691            into parserControl argument.
692    
693            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
694    
695    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
696    
697            * Work/tmDataSetup.R: The datasets acq and crude can now be
698            created on the fly.
699    
700            * R/stopwords.R: Introduced a function returning the stopwords for
701            a given language (English, German and French at the moment).
702    
703            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
704            otherwise falls back to Snowball package.
705    
706    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
707    
708            * man/dissimilarity-methods.Rd: Make clear that any method offered
709            by "dists" from package "cba" can be used.
710    
711    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
712    
713            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
714            to Kurt's latex suggestion. Removed points and underscores in
715            variable names for consistent naming.
716    
717            * DESCRIPTION: Update to version 0.1-2.
718    
719            * man/TextRepository.Rd: Fixed bug in documentation.
720    
721    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
722    
723            * DESCRIPTION: Update to version 0.1-1.
724    
725    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
726    
727            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
728            wordStem.
729    
730    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
731    
732            * R/: Changes due to Kurt's review.
733    
734    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
735    
736            * R/: Implemented improvements based upon comments by David
737            Meyer.
738    
739    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
740    
741            * inst/doc/: Rewrote vignette.
742    
743            * man/: Improved documentation.
744    
745    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
746    
747            * man/: Updated documentation.
748    
749            * DESCRIPTION: Changed package name to "tm". Updated version to
750            0.1 for first CRAN release.
751    
752            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
753            list archive example.
754    
755            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
756            archive example.
757    
758            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
759            from (several mails per box) mbox format to (single mail per file)
760            eml format.
761    
762    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
763    
764            * data/crude.rda: Rebuilt.
765    
766            * data/acq.rda: Rebuilt.
767    
768            * R/reader.R: Factored out reader and parser methods from
769            textdoccol.R.
770    
771            * R/source.R: Factored out Source methods from aobjects.R and
772            textdoccol.R.
773            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
774            feeds.
775    
776            * R/textdoccol.R (DirSource): Added support for recursive
777            traversal of directories.
778    
779    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
780    
781            * R/textdoccol.R ([[): Loads the document corpus automatically
782            into memory upon access.
783            (tm_transform, tm_filter): Removed several checks whether the
784            document is already loaded ([[ ensures this now).
785            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
786            mailing list archive.
787    
788    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
789    
790            * R/aobjects.R (TextDocument): Is now a virtual class.
791            (Source): Is now a virtual class.
792    
793    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
794    
795            * R/textdoccol.R (c): Support for an arbitrary number of document
796            collections.
797    
798    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
799    
800            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
801            append_meta and remove_meta.
802    
803            * R/textdoccol.R: Removed modify_metadata method.
804    
805            * R/textrepo.R: Removed modify_metadata method.
806    
807            * R/textdoccol.R (remove_meta): Supports removal of document
808            collection metadata and document (= in data frame) metadata.
809    
810    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
811    
812            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
813    
814            * data/crude.rda: Rebuilt.
815    
816            * data/acq.rda: Rebuilt.
817    
818            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
819    
820            * R/textdoccol.R ([): Bug fix for subsetting a document
821            collection's data frame.
822    
823    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
826            to s_filter.
827    
828            * R/textdoccol.R: Local text documents' metadata can now be copied
829            to a document collection's data frame with prescind_meta.
830    
831    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/: Text documents' slot metadata is now accessible in s_filter.
834    
835            * R/: Rewrote s_filter function (still has some restrictions).
836    
837    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
838    
839            * R/: Various fixes in handling metadata.
840    
841            * R/: Added update mechanism for text document collections.
842    
843    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
844    
845            * R/: Merging of document collections now creates a binary tree
846            for reconstructing merged document collections.
847    
848            * R/: Redesign of metadata for document collections.
849    
850    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
851    
852            * R/: Messages now use \code{ngettext}.
853    
854    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
855    
856            * R/: Added functions for modifying and removing metadata.
857    
858    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
859    
860            * man/: Updated some documentation.
861    
862            * R/: Corrected some connection issues.
863    
864            * inst/doc: Worked on the vignette.
865    
866    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
867    
868            * inst/: Added texts and started vignette.
869    
870            * R/: Final changes based upon David's comments.
871    
872    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
873    
874            * NAMESPACE: Corrected exports (generic methods need exportMethods
875            directives!).
876    
877    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
878    
879            * R/: Modified the TextDocCol constructor and various parsers. It
880            is now modular and supports various file formats via plugins (see
881            the new "Source" class).
882    
883    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * man/: Revised documentation after previous code changes.
886    
887    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/: Remaining changes as discussed with David.
890    
891    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
892    
893            * R/: Some changes as suggested by David. The rest will follow
894            within the next few days.
895    
896    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
897    
898            * man/: Finished documentation.
899    
900    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
901    
902            * man/: Wrote some documentation.
903    
904    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
905    
906            * R/: Further syntactic sugar in form of additional assignment and
907            accessor methods.
908    
909    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
910    
911            * R/: Syntactic sugar in form of "length", "show" and "summary"
912            operators.
913    
914    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
915    
916            * R/: Various updates, mainly to default operators ("[" or "c")
917            and dissimilarities.
918    
919    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
920    
921            * R/: Added similarity functions.
922    
923            * data/: Added English stopwords.
924    
925    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
926    
927            * data/: Examples compiled for new features.
928    
929            * R/: Changes due to new structure.
930    
931            * NAMESPACE: Corrected namespace to reflect new structure.
932    
933            * R/termdocmatrix.R: Adapted for new naming scheme.
934    
935    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
936    
937            * R/textdoccol.R: Adapted code for new class structure. Wrote
938            several transform and filter functions operating on text document
939            collections (alias text document databases).
940    
941            * R/aobjects.R: Adapted class structure with inheritance,
942            repositories and additional meta data. Loading files on demand is
943            now possible.
944    
945    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
946    
947            * R/: Some cosmetic cleanups.
948    
949            * inst/: Removed vignette on clustering. That and much more is now
950            described in the JSS paper on text mining. Based upon that
951            article, a more elaborate vignette will be incorporated in the future.
952    
953    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
954    
955            * R/: Updated generic S4 methods to comply with signature changes
956            in newer versions of R (> 2.3).
957    
958    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
959    
960            * ext/R/importRIS.R: Automatic RIS import is now possible.
961    
962    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
963    
964            * R/textdoccol.R: Added RIS HTML input format.
965    
966    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
967    
968            * R/textdoccol.R: Removed bug that caused invalid text document
969            collections when handling many input files.
970    
971    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
972    
973            * R/textdoccol.R: Restructured and extended file import

