[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
pkg/ChangeLog revision 1046, Fri Feb 26 12:45:38 2010 UTC
1    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/score.R (tm_tag_score): Compute a score from the number of
4            tags matching in a document.
5    
6    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/complete.R (stemCompletion): New completion heuristics.
9    
10    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
13    
14    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
15    
16            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
17            setOldClass(c(..., "list")) works.
18    
19    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/transform.R (stemDocument.character): In case input is a
22            simple character just delegate to the default Snowball stemmer.
23    
24    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
25    
26            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
27            data.
28    
29    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/doc.R (`Content<-`): Be careful with names attribute.
32    
33    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
34    
35            * R/source.R (DirSource): Improved implementation especially when
36            handling many (> 1M) files.
37    
38    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/source.R (getElem.URISource): Use encoding argument.
41    
42    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/doc.R (setOldClass): Register S3 document classes to be
45            recognized by S4 methods.
46    
47    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/matrix.R (termFreq): Add option to remove punctuation
50            characters.
51    
52    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
55            merging multiple term-document matrices.
56    
57    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
58    
59            * R/corpus.R (setOldClass): Register S3 corpus classes to be
60            recognized by S4 methods.
61    
62            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
63            that CRAN Mac OS X builds do not fail any longer.
64    
65    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
66    
67            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
68            of RWeka::AlphabeticTokenizer() as default.
69    
70    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
71    
72            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
73            caused words at the beginning or the end of a line not to be removed. Do
74            not delete whitespace anymore.
75    
76    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
77    
78            * R/source.R (DirSource): Default to working directory if no path
79            is specified.
80    
81    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/source.R (DirSource): Stop on empty directories.
84    
85    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
86    
87            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
88            named documents.
89    
90    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/transform.R (removeWords): Improve regular expressions.
93    
94    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
95    
96            * R/meta.R (DublinCore): Allow lower case tags.
97    
98    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
99    
100            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
101            instead of x$children.
102    
103    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
104    
105            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
106    
107    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
108    
109            * R/: Use S3 instead of S4 class system.
110    
111    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
112    
113            * R/reader.R (readMail): Moved to tm.plugin.mail package.
114    
115    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
116    
117            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
118            postings are basically e-mails with some extra headers.
119    
120    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
121    
122            * R/transform.R: Move convertMboxEml, removeCitation,
123            removeMultipart, and removeSignature to the tm.plugin.mail package
124            since they are mainly utility functions (for handling e-mails) and
125            not very framework specific.
126    
127    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
128    
129            * man/: Fix documentation.
130    
131    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
132    
133            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
134            plain text document instead of an XML document for texts of the
135            Reuters-21578 dataset.
136    
137            * R/sparse.R: Removed since the slam package is now available on
138            CRAN.
139    
140            * DESCRIPTION (Depends): Add slam package.
141    
142    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
143    
144            * R/transform.R (stemDoc): Fix character(0) handling.
145    
146    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
147    
148            * R/doc.R (show): Pretty print.
149    
150    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
153            gracefully.
154    
155    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
156    
157            * R/corpus.R: Make corpus virtual. Implement corpus with standard
158            and permanent storage semantics.
159    
160            * DESCRIPTION: New major release. A *lot* of improvements.
161    
162    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
163    
164            * NAMESPACE: Export some simple_triplet_matrix functions.
165    
166    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/weight.R: Adapt tf-idf to new matrix format.
169    
170    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
171    
172            * R/matrix.R: Create two distinct classes for term-document and
173            document-term matrices.
174    
175    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
176    
177            * R/termdocmatrix.R: No longer use Matrix package. This reduces
178            package start-up time significantly.
179    
180    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
181    
182            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
183    
184    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/transform.R (tmReduce): Combine multiple maps into one
187            transformation.
188    
189    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/weight.R: Remove weightLogical since it does not return a
192            dgCMatrix.
193    
194            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
195            or TermDocumentMatrix instead.
196    
197    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
198    
199            * inst/doc/extensions.Rnw: Finished vignette.
200    
201    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
202    
203            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
204            DocumentTermMatrix representations.
205    
206    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
207    
208            * R/reader.R (readXML): New reader for arbitrary XML files.
209    
210    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
211    
212            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
213            (XMLSource): New XMLSource class for arbitrary XML files.
214            (Source): New slot Vectorized.
215    
216    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
217    
218            * R/reader.R (readTabular): Experimental reader for tabular data
219            structures which can be customized via user-defined mappings.
220    
221            * R/reader.R: Always use UTC time zone.
222    
223            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
224    
225    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
226    
227            * R/reader.R (readDOC): Options can be passed over to antiword.
228    
229            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
230            pdftotext.
231    
232    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
233    
234            * R/source.R (DirSource): Add pattern and ignore.case arguments
235            which are internally passed over to list.files().
236    
237    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
238    
239            * inst/doc/tm.Rnw: Suppress pointless loading message.
240    
241    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
242    
243            * DESCRIPTION: Speed up package loading (via moving packages not
244            strictly necessary for normal operation to Suggests instead of
245            Depends).
246    
247    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
248    
249            * R/reader.R (readNewsgroup): The date format is now configurable.
250    
251    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
252    
253            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
254    
255    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
256    
257            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
258    
259    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
260    
261            * R/source.R (DataframeSource): New source class for data frames.
262    
263            * R/source.R: Fixed non-standard call evaluation.
264    
265    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
266    
267            * R/source.R (URISource): New source class for a single document.
268    
269    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
270    
271            * R/source.R: Refactoring.
272    
273    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
274    
275            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
276            Rmpi installations more gracefully.
277    
278    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
279    
280            * R/source.R (Source): Add Length slot.
281    
282    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
283    
284            * R/AAA.R: Unify duplicated .onLoad function.
285    
286    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
287    
288            * DESCRIPTION (Suggests): Added Rmpi.
289    
290    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
291    
292            * R/source.R (getElem): Fix 'no visible binding' warning.
293    
294            * man/WeightFunction.Rd: Fix signature.
295    
296    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
297    
298            * R/weight.R: Introduce name abbreviations for weighting functions.
299    
300    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
301    
302            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
303    
304            * R/cluster.R: Provide convenience functions for using an MPI
305            cluster.
306    
307            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
308            available.
309    
310            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
311            available.
312    
313    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
314    
315            * R/textdoccol.R (lapply): Removed debug print out.
316    
317    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
318    
319            * R/reader.R (readRCV1): Improved meta data extraction from
320            Reuters Corpus Volume 1 documents.
321    
322    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/transform.R: Ensure that all mappings preserve multiline
325            structures.
326    
327    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
328    
329            * R/filter.R: Every filter now has an attribute indicating whether
330            it should be applied at the document level (doclevel).
331    
332            * R/textdoccol.R (tmFilter): Set searchFullText as new default
333            filter.
334    
335    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * R/transform.R (replacePatterns): Replaced removeWords by
338            replacePatterns. Suggested by Christian Buchta.
339    
340            * R/textdoccol.R (inspect): Improved formatting.
341    
342    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * inst/CITATION: Updated JSS article information.
345    
346            * R/textdoccol.R (setAs): Added coerce method from list to
347            corpus.
348    
349            * R/meta.R (meta): Improved meta data handling.
350    
351    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
354            Christian Buchta.
355    
356            * inst/CITATION: Added template to include JSS article reference.
357    
358    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
359    
360            * R/textdoccol.R (tmMap): Introduced lazy mapping.
361    
362            * R/source.R: Added VectorSource.
363    
364    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * man/: Language codes should be in ISO 639-1 format.
367    
368            * R/textdoccol.R (asPlain): Preserve local meta data.
369    
370    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/textdoccol.R (writeCorpus): Function for writing a corpus
373            containing plain text documents to disk.
374    
375    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
376    
377            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
378            always set correctly.
379    
380            * R/textdoccol.R: Set load = TRUE as default for load on demand
381            since in most cases this is the wanted behaviour.
382    
383    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
386    
387            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
388    
389    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/meta.R (meta): New function for consistent access to meta data
392            of document collections, repositories, and texts.
393    
394    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/: Better support for encodings.
397    
398    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
401            selection when no reader argument is given.
402    
403    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/source.R (CSVSource): Now uses read.csv instead of scan
406            internally.
407    
408    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
409    
410            * R/reader.R (getReaders): Returns available reader functions.
411    
412            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
413            as default.
414    
415    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
416    
417            * R/stopwords.R (stopwords): Shortened code, removed codetools
418            variable warnings.
419    
420            * man/: Documentation for showMeta, added an example for tmMap.
421    
422            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
423            some minor typos fixed.
424    
425    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/aobjects.R (showMeta): Added method for pretty printing a
428            text document's meta data.
429    
430    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * R/textdoccol.R (TextDocCol): Better handling of empty
433            arguments.
434    
435            * NAMESPACE: Exported readDOC.
436    
437            * man/completeStems.Rd: Added an example.
438    
439    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/stopwords.R (stopwords): Look up .dat files at every
442            call. Allows users to modify stopword .dat files interactively.
443    
444    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * R/termdocmatrix.R (termFreq): Correct processing of empty
447            documents.
448    
449    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
450    
451            * man/: Updated documentation.
452    
453    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
454    
455            * R/complete.R (completeStems): Completes (heuristically) word
456            stems.
457    
458            * R/termdocmatrix.R (TermDocMatrix2): New modular
459            constructor.
460    
461            * NAMESPACE: Exported termFreq.
462    
463    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/reader.R (readDOC): Added MS Word reader (using antiword).
466    
467    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
468    
469            * R/weight.R: Weighting functions for TermDocMatrix.
470    
471    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
472    
473            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
474            functions for accessing dimension, column, and row names.
475    
476            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
477    
478    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
479    
480            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
481    
482    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
483    
484            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
485    
486    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
487    
488            * R/reader.R (readPDF): Removed manual checks for pdftotext and
489            pdfinfo. The system call gives a warning anyway.
490    
491    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
492    
493            * R/textdoccol.R (asPlain): Conversion from
494            StructuredTextDocuments to PlainTextDocuments.
495    
496    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
497    
498            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
499            for accessing term-document matrices.
500    
501            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
502            are installed.
503    
504    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
505    
506            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
507            Christian Buchta.
508    
509    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
510    
511            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
512    
513    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
514    
515            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
516    
517            * R/reader.R (readPDF): Added PDF reader.
518    
519    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
522    
523            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
524    
525            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
526    
527            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
528    
529    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
530    
531            * R/distmeasure.R (dissimilarity): Replaced dists call from
532            package cba by new dist call from package proxy.
533    
534    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
535    
536            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
537    
538    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
539    
540            * R/termdocmatrix.R: require() uses the quietly option to suppress
541            loading messages.
542    
543    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
544    
545            * R/dictionary.R: Added dictionary support.
546    
547    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
548    
549            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
550            documents. This simplifies some functions, e.g., asPlain.
551    
552    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
553    
554            * inst/doc/tm.Rnw: Fixed some typos in vignette.
555    
556    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
557    
558            * R/textdoccol.R (replaceWords): Added method to replace a set of
559            words by a single word. Useful for synonyms.
560    
561    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
562    
563            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
564    
565    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
566    
567            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
568            vectors. Thanks to Ariel Maguyon for his error report.
569            (removeSparseTerms): New function to remove columns from a
570            term-document matrix exceeding a sparse factor.
571    
572    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
573    
574            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
575    
576    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
577    
578            * man/sFilter.Rd: Corrected documentation on statement format (use
579            '==' instead of '=').
580    
581    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * R/aobjects.R (StructuredTextDocument): Inherits from
584            TextDocument.
585    
586    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
589            on sparse matrices as proposed by Martin Maechler.
590    
591    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
592    
593            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
594            \pkg{filehash} version deprecates them.
595    
596    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
597    
598            * R/termdocmatrix.R (textvector): Stemming is now performed before
599            erasing stopwords.
600            (weightMatrix): Adapted to handle sparse matrices.
601            (TermDocMatrix): Sparse matrix is now efficiently built by
602            direct stepwise insertion of row values into it.
603    
604    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
605    
606            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
607            due to ongoing problems. For our purposes the latter is as useful
608            as the replaced package.
609    
610    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
611    
612            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
613    
614            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
615    
616    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
617    
618            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
619            languages with available stopwords.
620    
621    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
622    
623            * inst/doc/tm.Rnw: Minor corrections in the vignette.
624    
625    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * DESCRIPTION: Update to version 0.2, since a lot of new features
628            have been integrated.
629    
630            * inst/stopwords: Updated existing stopwords and added stopwords
631            for various other languages.
632    
633    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
634    
635            * man/: Updated documentation.
636    
637            * Work/testDb.R: Script to test database stuff.
638    
639            * R/: Fixed various database related bugs. Seems to be rather
640            usable now, i.e., consider it alpha status for now.
641    
642    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
643    
644            * R/: Fixed some bugs related to database support.
645    
646    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
647    
648            * man/: Added a lot of examples to the manuals.
649    
650    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
651    
652            * man/: Updated parts of the documentation.
653    
654            * R/textdoccol.R (asPlain): Added conversion from newsgroup
655            documents to plain text documents.
656    
657    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * R/textdoccol.R: Finished experimental database support. Not yet
660            intensively tested.
661    
662            * R/source.R: Now each source has a default reader.
663    
664            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
665            class anymore.
666    
667            * R/plaintextdoc.R: Custom show method for plain text documents.
668    
669            * R/aobjects.R: Added a class for structured text documents.
670    
671            * R/reader.R: Replaced remaining \code{parser} occurrences with
672            \code{reader}.
673    
674            * R/textdoccol.R (summary): Indent tags.
675    
676            * R/textdoccol.R (removePunctuation): Transform method to remove
677            punctuation marks.
678    
679    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
680    
681            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
682            using prescindMeta().
683    
684    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * R/textdoccol.R: Improved database support.
687    
688    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
689    
690            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
691    
692            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
693            language code.
694    
695            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
696            into parserControl argument.
697    
698            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
699    
700    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * Work/tmDataSetup.R: The datasets acq and crude can now be
703            created on the fly.
704    
705            * R/stopwords.R: Introduced a function returning the stopwords for
706            a given language (English, German and French at the moment).
707    
708            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
709            otherwise falls back to Snowball package.
710    
711    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
712    
713            * man/dissimilarity-methods.Rd: Make clear that any method offered
714            by "dists" from package "cba" can be used.
715    
716    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
717    
718            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
719            to Kurt's LaTeX suggestion. Removed points and underscores in
720            variable names for consistent naming.
721    
722            * DESCRIPTION: Update to version 0.1-2.
723    
724            * man/TextRepository.Rd: Fixed bug in documentation.
725    
726    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
727    
728            * DESCRIPTION: Update to version 0.1-1.
729    
730    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
731    
732            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
733            wordStem.
734    
735    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
736    
737            * R/: Changes due to Kurt's review.
738    
739    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
740    
741            * R/: Implemented improvements based upon comments by David
742            Meyer.
743    
744    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * inst/doc/: Rewrote vignette.
747    
748            * man/: Improved documentation.
749    
750    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
751    
752            * man/: Updated documentation.
753    
754            * DESCRIPTION: Changed package name to "tm". Updated version to
755            0.1 for first CRAN release.
756    
757            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
758            list archive example.
759    
760            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
761            archive example.
762    
763            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
764            from (several mails per box) mbox format to (single mail per file)
765            eml format.
766    
767    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
768    
769            * data/crude.rda: Rebuilt.
770    
771            * data/acq.rda: Rebuilt.
772    
773            * R/reader.R: Factored out reader and parser methods from
774            textdoccol.R.
775    
776            * R/source.R: Factored out Source methods from aobjects.R and
777            textdoccol.R.
778            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
779            feeds.
780    
781            * R/textdoccol.R (DirSource): Added support for recursive
782            traversal of directories.
783    
784    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
785    
786            * R/textdoccol.R ([[): Loads the document corpus automatically
787            into memory upon access.
788            (tm_transform, tm_filter): Removed several checks whether the
789            document is already loaded ([[ ensures this now).
790            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
791            mailing list archive.
792    
793    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
794    
795            * R/aobjects.R (TextDocument): Is now a virtual class.
796            (Source): Is now a virtual class.
797    
798    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
799    
800            * R/textdoccol.R (c): Support for an arbitrary number of document
801            collections.
802    
803    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
806            append_meta and remove_meta.
807    
808            * R/textdoccol.R: Removed modify_metadata method.
809    
810            * R/textrepo.R: Removed modify_metadata method.
811    
812            * R/textdoccol.R (remove_meta): Supports removal of document
813            collection metadata and document (= in data frame) metadata.
814    
815    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
816    
817            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
818    
819            * data/crude.rda: Rebuilt.
820    
821            * data/acq.rda: Rebuilt.
822    
823            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
824    
825            * R/textdoccol.R ([): Bug fix for subsetting a document
826            collection's data frame.
827    
828    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
829    
830            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
831            to s_filter.
832    
833            * R/textdoccol.R: Local text documents' metadata can now be copied
834            to a document collection's data frame with prescind_meta.
835    
836    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
837    
838            * R/: Text documents' slot metadata is now accessible in s_filter.
839    
840            * R/: Rewrote s_filter function (still has some restrictions).
841    
842    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
843    
844            * R/: Various fixes in handling metadata.
845    
846            * R/: Added update mechanism for text document collections.
847    
848    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
849    
850            * R/: Merging of document collections now creates a binary tree
851            for reconstructing merged document collections.
852    
853            * R/: Redesign of metadata for document collections.
854    
855    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * R/: Messages now use \code{ngettext}.
858    
859    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
860    
861            * R/: Added functions for modifying and removing metadata.
862    
863    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
864    
865            * man/: Updated some documentation.
866    
867            * R/: Corrected some connection issues.
868    
869            * inst/doc: Worked on the vignette.
870    
871    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
872    
873            * inst/: Added texts and started vignette.
874    
875            * R/: Final changes based upon David's comments.
876    
877    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
878    
879            * NAMESPACE: Corrected exports (generic methods need exportMethods
880            directives!).
881    
882    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
883    
884            * R/: Modified the TextDocCol constructor and various parsers. It
885            is now modular and supports various file formats via plugins (see
886            the new "Source" class).
887    
888    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
889    
890            * man/: Revised documentation after previous code changes.
891    
892    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * R/: Remaining changes as discussed with David.
895    
896    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
897    
898            * R/: Some changes as suggested by David. The rest will follow
899            within the next days.
900    
901    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
902    
903            * man/: Finished documentation.
904    
905    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
906    
907            * man/: Wrote some documentation.
908    
909    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
910    
911            * R/: Further syntactic sugar in form of additional assignment and
912            accessor methods.
913    
914    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
915    
916            * R/: Syntactic sugar in form of "length", "show" and "summary"
917            operators.
918    
919    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
920    
921            * R/: Diverse updates. Mainly on default operators ("[" or "c")
922            and dissimilarities.
923    
924    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * R/: Added similarity functions.
927    
928            * data/: Added English stopwords.
929    
930    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
931    
932            * data/: Examples compiled for new features.
933    
934            * R/: Changes due to new structure.
935    
936            * NAMESPACE: Corrected namespace to reflect new structure.
937    
938            * R/termdocmatrix.R: Adapted for new naming scheme.
939    
940    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
941    
942            * R/textdoccol.R: Adapted code for new class structure. Wrote
943            several transform and filter functions operating on text document
944            collections (alias text document databases).
945    
946            * R/aobjects.R: Adapted class structure with inheritance,
947            repositories and additional meta data. Loading files on demand is
948            now possible.
949    
950    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
951    
952            * R/: Some cosmetic cleanups.
953    
954            * inst/: Removed vignette on clustering. That and much more is now
955            described in the JSS paper on text mining. Based upon that
956            article an elaborated vignette will be incorporated in the future.
957    
958    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
959    
960            * R/: Updated generic S4 methods to comply with signature changes
961            in newer versions of R (> 2.3).
962    
963    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
964    
965            * ext/R/importRIS.R: Automatic RIS import is now possible.
966    
967    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
968    
969            * R/textdoccol.R: Added RIS HTML input format.
970    
971    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
972    
973            * R/textdoccol.R: Removed bug that caused invalid text document
974            collections when handling many input files.
975    
976    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
977    
978            * R/textdoccol.R: Restructured and extended file import
979            mechanism.
980    
981            * inst/doc/clustering.Rnw: Adapted vignette for use with
982            ReutNews.rda
983    
984            * man/ReutNews.Rd: Documentation for ReutNews.rda
985    
986            * data/ReutNews.rda: A tiny Reuters21578 example data set.
987    
988    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
989    
990            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
991            clustering facilities of this package.
992    
993    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
994    
995            * R/aobjects.R: Changed package document structure to avoid class
996            dependency problems.
997    
998    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
999    
1000            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
1001            data set.
1002    
1003            * Finished documentation and reordered directory structure. Now "R
1004            CMD check textmin" works without errors.
1005    
