[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 34, Thu Dec 22 15:18:10 2005 UTC
pkg/ChangeLog revision 1062, Wed Apr 7 17:25:20 2010 UTC
1    2010-04-07  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/meta.R (`content_or_meta`): Utility function.
4    
5    2010-03-19  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/reader.R (readReut21578XML, readReut21578XMLasPlain): Extract
8            TOPICS, LEWISSPLIT, CGISPLIT, and OLDID meta tags.
9    
10    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/weight.R (weightTfIdf): Added normalization option.
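
To illustrate what a normalization option for tf-idf weighting can mean, here is a minimal base-R sketch. It is not the weightTfIdf implementation: the function name `tfidf_sketch`, the `normalize` argument, the terms-in-rows layout, and the choice of dividing term counts by document length are all assumptions made for this example.

```r
# Hypothetical sketch: tf-idf on a plain matrix (terms in rows, documents in columns).
tfidf_sketch <- function(m, normalize = TRUE) {
  tf <- m
  if (normalize) {
    # Assumed normalization: divide each count by the length of its document.
    len <- colSums(m)
    len[len == 0] <- 1
    tf <- sweep(m, 2, len, "/")
  }
  df <- rowSums(m > 0)            # number of documents containing each term
  idf <- log2(ncol(m) / df)       # inverse document frequency
  idf[!is.finite(idf)] <- 0       # terms that occur nowhere get weight 0
  tf * idf                        # each row is scaled by its idf
}

m <- matrix(c(2, 0, 1,
              1, 1, 0),
            nrow = 3,
            dimnames = list(c("oil", "price", "trade"), c("d1", "d2")))
w <- tfidf_sketch(m)
print(w)
```

A term that occurs in every document ("oil" here) receives weight 0, and with normalization switched on, long documents no longer dominate simply because they contain more tokens.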
13    
14            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
15            analysis.
16    
17    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
18    
19            * R/score.R (tm_tag_score): Compute a score from the number of
20            tags matching in a document.
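
The idea of a score based on the number of matching tags can be sketched in a few lines of base R. The helper `tag_score_sketch` and the tiny word list are hypothetical and only illustrate the counting step, not the actual tm_tag_score code.

```r
# Hypothetical sketch: count how many tokens of a document appear in a tag list.
tag_score_sketch <- function(tokens, tags) {
  sum(tokens %in% tags)
}

tokens <- c("good", "service", "but", "bad", "food", "good")
positive <- c("good", "great")   # assumed toy tag list
score <- tag_score_sketch(tokens, positive)
print(score)
```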
21    
22    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
23    
24            * R/complete.R (stemCompletion): New completion heuristics.
25    
26    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
27    
28            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
29    
30    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
31    
32            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
33            setOldClass(c(..., "list")) works.
34    
35    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
36    
37            * R/transform.R (stemDocument.character): In case input is a
38            simple character just delegate to the default Snowball stemmer.
39    
40    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
41    
42            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
43            data.
44    
45    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
46    
47            * R/doc.R (`Content<-`): Be careful with names attribute.
48    
49    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
50    
51            * R/source.R (DirSource): Improved implementation especially when
52            handling many (> 1M) files.
53    
54    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
55    
56            * R/source.R (getElem.URISource): Use encoding argument.
57    
58    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
59    
60            * R/doc.R (setOldClass): Register S3 document classes to be
61            recognized by S4 methods.
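
As a rough illustration of why S3 classes are registered with setOldClass(), the following sketch defines an S4 method that dispatches on an S3 object. The class name `MyPlainDoc` and the generic `describe` are made up for this example and do not exist in the package.

```r
library(methods)

# An S3 object: a character vector with a class attribute (hypothetical class name).
doc <- structure("some text", class = c("MyPlainDoc", "character"))

# Register the S3 class so that S4 generics can dispatch on it.
setOldClass(c("MyPlainDoc", "character"))

# Hypothetical S4 generic with a method for the registered S3 class.
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "MyPlainDoc", function(x) {
  paste("document with", nchar(unclass(x)), "characters")
})

result <- describe(doc)
print(result)
```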
62    
63    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
64    
65            * R/matrix.R (termFreq): Add option to remove punctuation
66            characters.
67    
68    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
69    
70            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
71            merging multiple term-document matrices.
72    
73    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
74    
75            * R/corpus.R (setOldClass): Register S3 corpus classes to be
76            recognized by S4 methods.
77    
78            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
79            that CRAN Mac OS X builds do not fail any longer.
80    
81    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
84            of RWeka::AlphabeticTokenizer() as default.
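
A whitespace tokenizer based on scan() might look like the sketch below. The wrapper name `tokenize_sketch` is hypothetical, and the `quote = ""` and `quiet = TRUE` settings are assumptions chosen so that quotation marks are kept as ordinary characters and no progress message is printed.

```r
# Hypothetical sketch: split text on whitespace with base R's scan().
tokenize_sketch <- function(x) {
  scan(text = x, what = "character", quote = "", quiet = TRUE)
}

tokens <- tokenize_sketch("The quick  brown fox\njumps over")
print(tokens)
```

Unlike an alphabetic tokenizer, this approach keeps punctuation attached to the neighbouring word, so punctuation has to be handled in a separate step if it is unwanted.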
85    
86    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
89            caused words at the beginning or the end of a line not to be removed. Do
90            not delete whitespace anymore.
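
The behaviour described above (words are also removed at the start and end of a line, and whitespace is left alone) can be sketched with word-boundary anchors. This is an illustrative pattern, not the package's actual regular expression; `remove_words_sketch` is a hypothetical name.

```r
# Hypothetical sketch: remove whole words using \b word boundaries.
remove_words_sketch <- function(x, words) {
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  gsub(pattern, "", x, perl = TRUE)   # surrounding whitespace is left untouched
}

x <- c("the cat sat on", "the mat with the")
cleaned <- remove_words_sketch(x, c("the", "on"))
print(cleaned)
```

Because \b matches at line starts and ends as well as next to spaces, no special cases are needed for words in those positions, and words that merely contain a stopword (such as "then") are not touched.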
91    
92    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
93    
94            * R/source.R (DirSource): Default to working directory if no path
95            is specified.
96    
97    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
98    
99            * R/source.R (DirSource): Stop on empty directories.
100    
101    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
102    
103            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
104            named documents.
105    
106    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
107    
108            * R/transform.R (removeWords): Improve regular expressions.
109    
110    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/meta.R (DublinCore): Allow lower case tags.
113    
114    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
117            instead of x$children.
118    
119    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
120    
121            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
122    
123    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
124    
125            * R/: Use S3 instead of S4 class system.
126    
127    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
128    
129            * R/reader.R (readMail): Moved to tm.plugin.mail package.
130    
131    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
132    
133            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
134            postings are basically e-mails with some extra headers.
135    
136    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
137    
138            * R/transform.R: Move convertMboxEml, removeCitation,
139            removeMultipart, and removeSignature to the tm.plugin.mail package
140            since they are mainly utility functions (for handling e-mails) and
141            not very framework specific.
142    
143    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
144    
145            * man/: Fix documentation.
146    
147    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
148    
149            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
150            plain text document instead of an XML document for texts of the
151            Reuters-21578 dataset.
152    
153            * R/sparse.R: Removed since the slam package is now available on
154            CRAN.
155    
156            * DESCRIPTION (Depends): Add slam package.
157    
158    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
159    
160            * R/transform.R (stemDoc): Fix character(0) handling.
161    
162    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
163    
164            * R/doc.R (show): Pretty print.
165    
166    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
169            gracefully.
170    
171    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
172    
173            * R/corpus.R: Make corpus virtual. Implement corpus with standard
174            and permanent storage semantics.
175    
176            * DESCRIPTION: New major release. A *lot* of improvements.
177    
178    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
179    
180            * NAMESPACE: Export some simple_triplet_matrix functions.
181    
182    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/weight.R: Adapt tf-idf to new matrix format.
185    
186    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
187    
188            * R/matrix.R: Create two distinct classes for term-document and
189            document-term matrices.
190    
191    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
192    
193            * R/termdocmatrix.R: No longer use Matrix package. This reduces
194            package start-up time significantly.
195    
196    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
197    
198            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
199    
200    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
201    
202            * R/transform.R (tmReduce): Combine multiple maps into one
203            transformation.
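
Combining several maps into one transformation is essentially function composition. The sketch below uses Reduce() for this; the name `compose_maps_sketch` is hypothetical, and the left-to-right application order is an assumption of this example, not a statement about tmReduce.

```r
# Hypothetical sketch: build one function that applies several maps in sequence.
compose_maps_sketch <- function(...) {
  maps <- list(...)
  function(x) Reduce(function(acc, f) f(acc), maps, x)  # x is the initial value
}

clean <- compose_maps_sketch(
  tolower,
  function(s) gsub("[[:punct:]]+", "", s),
  trimws
)
out <- clean("  Hello, World!  ")
print(out)
```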
204    
205    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
206    
207            * R/weight.R: Remove weightLogical since it does not return a
208            dgCMatrix.
209    
210            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
211            or TermDocumentMatrix instead.
212    
213    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
214    
215            * inst/doc/extensions.Rnw: Finished vignette.
216    
217    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
218    
219            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
220            DocumentTermMatrix representations.
221    
222    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
223    
224            * R/reader.R (readXML): New reader for arbitrary XML files.
225    
226    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
227    
228            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
229            (XMLSource): New XMLSource class for arbitrary XML files.
230            (Source): New slot Vectorized.
231    
232    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
233    
234            * R/reader.R (readTabular): Experimental reader for tabular data
235            structures which can be customized via user-defined mappings.
236    
237            * R/reader.R: Always use UTC time zone.
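
Pinning the time zone when parsing dates avoids results that depend on the machine's locale settings. A minimal sketch, assuming a hypothetical helper `parse_utc_sketch` and an example timestamp format:

```r
# Hypothetical sketch: parse a timestamp and fix its time zone to UTC.
parse_utc_sketch <- function(x, format = "%Y-%m-%d %H:%M:%S") {
  as.POSIXct(x, format = format, tz = "UTC")
}

stamp <- parse_utc_sketch("2009-03-21 12:30:00")
print(format(stamp, "%Y-%m-%d %H:%M:%S %Z"))
```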
238    
239            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
240    
241    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
242    
243            * R/reader.R (readDOC): Options can be passed over to antiword.
244    
245            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
246            pdftotext.
247    
248    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
249    
250            * R/source.R (DirSource): Add pattern and ignore.case arguments
251            which are internally passed over to list.files().
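
Passing pattern and ignore.case through to list.files() could look roughly like this. The function `list_documents_sketch` is a hypothetical stand-in, not DirSource itself; the default to the working directory and the error on an empty result mirror behaviour mentioned elsewhere in this ChangeLog.

```r
# Hypothetical sketch: list document files, forwarding filters to list.files().
list_documents_sketch <- function(directory = ".", pattern = NULL,
                                  ignore.case = FALSE) {
  files <- list.files(directory, pattern = pattern,
                      ignore.case = ignore.case, full.names = TRUE)
  if (length(files) == 0L)
    stop("empty directory or no file matches the pattern")
  files
}

# Usage on a temporary directory with three files.
d <- tempfile("corpus")
dir.create(d)
writeLines("first", file.path(d, "a.txt"))
writeLines("second", file.path(d, "B.TXT"))
writeLines("third", file.path(d, "notes.md"))
found <- basename(list_documents_sketch(d, pattern = "\\.txt$", ignore.case = TRUE))
print(found)
```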
252    
253    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
254    
255            * inst/doc/tm.Rnw: Suppress pointless loading message.
256    
257    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
258    
259            * DESCRIPTION: Speed up package loading (via moving packages not
260            strictly necessary for normal operation to Suggests instead of
261            Depends).
262    
263    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
264    
265            * R/reader.R (readNewsgroup): The date format is now configurable.
266    
267    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
268    
269            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
270    
271    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
272    
273            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
274    
275    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
276    
277            * R/source.R (DataframeSource): New source class for data frames.
278    
279            * R/source.R: Fixed non-standard call evaluation.
280    
281    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
282    
283            * R/source.R (URISource): New source class for a single document.
284    
285    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
286    
287            * R/source.R: Refactoring.
288    
289    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
290    
291            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
292            Rmpi installations more gracefully.
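
Wrapping an optional initialisation step in tryCatch() lets package loading continue when the optional component is broken or missing. The sketch below shows the pattern with a hypothetical helper `try_load_sketch`; it only attempts to load a namespace and does not start any cluster.

```r
# Hypothetical sketch: attempt to load an optional package without failing hard.
try_load_sketch <- function(pkg = "Rmpi") {
  tryCatch({
    loadNamespace(pkg)
    TRUE
  }, error = function(e) {
    message("optional package '", pkg, "' not usable: ", conditionMessage(e))
    FALSE
  })
}

# A package name that is assumed not to be installed.
ok <- try_load_sketch("surelyNotInstalledPkg123")
print(ok)
```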
293    
294    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
295    
296            * R/source.R (Source): Add Length slot.
297    
298    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
299    
300            * R/AAA.R: Unify duplicated .onLoad function.
301    
302    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
303    
304            * DESCRIPTION (Suggests): Added Rmpi.
305    
306    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
307    
308            * R/source.R (getElem): Fix 'no visible binding' warning.
309    
310            * man/WeightFunction.Rd: Fix signature.
311    
312    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
313    
314            * R/weight.R: Introduce name abbreviations for weighting functions.
315    
316    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
317    
318            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
319    
320            * R/cluster.R: Provide convenience functions for using an MPI
321            cluster.
322    
323            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
324            available.
325    
326            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
327            available.
328    
329    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
330    
331            * R/textdoccol.R (lapply): Removed debug printout.
332    
333    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * R/reader.R (readRCV1): Improved meta data extraction from
336            Reuters Corpus Volume 1 documents.
337    
338    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * R/transform.R: Ensure that all mappings preserve multiline
341            structures.
342    
343    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
344    
345            * R/filter.R: Every filter now has an attribute indicating whether
346            it should be applied at document level (doclevel).
347    
348            * R/textdoccol.R (tmFilter): Set searchFullText as new default
349            filter.
350    
351    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/transform.R (replacePatterns): Replaced removeWords by
354            replacePatterns. Suggested by Christian Buchta.
355    
356            * R/textdoccol.R (inspect): Improved formatting.
357    
358    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
359    
360            * inst/CITATION: Updated JSS article information.
361    
362            * R/textdoccol.R (setAs): Added coerce method from list to
363            corpus.
364    
365            * R/meta.R (meta): Improved meta data handling.
366    
367    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
368    
369            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
370            Christian Buchta.
371    
372            * inst/CITATION: Added template to include JSS article reference.
373    
374    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * R/textdoccol.R (tmMap): Introduced lazy mapping.
377    
378            * R/source.R: Added VectorSource.
379    
380    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * man/: Language codes should be in ISO 639-1 format.
383    
384            * R/textdoccol.R (asPlain): Preserve local meta data.
385    
386    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
387    
388            * R/textdoccol.R (writeCorpus): Function for writing a corpus
389            containing plain text documents to disk.
390    
391    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
394            always set correctly.
395    
396            * R/textdoccol.R: Set load = TRUE as default for load on demand
397            since in most cases this is the wanted behaviour.
398    
399    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
400    
401            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
402    
403            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
404    
405    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
406    
407            * R/meta.R (meta): New function for consistent access to meta data
408            of document collections, repositories, and texts.
409    
410    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
411    
412            * R/: Better support for encodings.
413    
414    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
415    
416            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
417            selection when no reader argument is given.
418    
419    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
420    
421            * R/source.R (CSVSource): Now uses read.csv instead of scan
422            internally.
423    
424    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
425    
426            * R/reader.R (getReaders): Returns available reader functions.
427    
428            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
429            as default.
430    
431    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * R/stopwords.R (stopwords): Shortened code, removed codetools
434            variable warnings.
435    
436            * man/: Documentation for showMeta, added an example for tmMap.
437    
438            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
439            some minor typos fixed.
440    
441    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/aobjects.R (showMeta): Added method for pretty printing a
444            text document's meta data.
445    
446    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * R/textdoccol.R (TextDocCol): Better handling of empty
449            arguments.
450    
451            * NAMESPACE: Exported readDOC.
452    
453            * man/completeStems.Rd: Added an example.
454    
455    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
456    
457            * R/stopwords.R (stopwords): Look up .dat files at every
458            call. Allows users to modify stopword .dat files interactively.
459    
460    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * R/termdocmatrix.R (termFreq): Correct processing of empty
463            documents.
464    
465    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
466    
467            * man/: Updated documentation.
468    
469    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
470    
471            * R/complete.R (completeStems): Completes (heuristically) word
472            stems.
473    
474            * R/termdocmatrix.R (TermDocMatrix2): New modular
475            constructor.
476    
477            * NAMESPACE: Exported termFreq.
478    
479    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
480    
481            * R/reader.R (readDOC): Added MS Word reader (using antiword).
482    
483    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
484    
485            * R/weight.R: Weighting functions for TermDocMatrix.
486    
487    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
490            functions for accessing dimension, column, and row names.
491    
492            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
493    
494    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
495    
496            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
497    
498    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
499    
500            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
501    
502    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * R/reader.R (readPDF): Removed manual checks for pdftotext and
505            pdfinfo. The system call gives a warning anyway.
506    
507    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
508    
509            * R/textdoccol.R (asPlain): Conversion from
510            StructuredTextDocuments to PlainTextDocuments.
511    
512    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
513    
514            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
515            for accessing term-document matrices.
516    
517            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
518            are installed.
519    
520    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
521    
522            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
523            Christian Buchta.
524    
525    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
526    
527            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
528    
529    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
530    
531            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
532    
533            * R/reader.R (readPDF): Added PDF reader.
534    
535    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
538    
539            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
540    
541            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
542    
543            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
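
One way to accept both plain and gzipped mailboxes in base R is to read through gzfile(), which also opens uncompressed files for reading. The helper name `read_mbox_lines_sketch` is hypothetical and the example is not taken from convertMboxEml.

```r
# Hypothetical sketch: read lines from a file that may or may not be gzipped.
read_mbox_lines_sketch <- function(path) {
  con <- gzfile(path, open = "r")
  on.exit(close(con))
  readLines(con)
}

content <- c("From alice@example.org Mon Dec 11 10:00:00 2006",
             "Subject: hello", "", "Body")

plain <- tempfile(fileext = ".mbox")
writeLines(content, plain)

gz <- tempfile(fileext = ".mbox.gz")
con <- gzfile(gz, open = "w")
writeLines(content, con)
close(con)

from_plain <- read_mbox_lines_sketch(plain)
from_gz <- read_mbox_lines_sketch(gz)
print(identical(from_plain, from_gz))
```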
544    
545    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
546    
547            * R/distmeasure.R (dissimilarity): Replaced dists call from
548            package cba by new dist call from package proxy.
549    
550    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
551    
552            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
553    
554    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
555    
556            * R/termdocmatrix.R: require() uses the quietly option to suppress
557            loading messages.
558    
559    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
560    
561            * R/dictionary.R: Added dictionary support.
562    
563    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
564    
565            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
566            documents. This simplifies some functions, e.g., asPlain.
567    
568    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
569    
570            * inst/doc/tm.Rnw: Fixed some typos in vignette.
571    
572    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
573    
574            * R/textdoccol.R (replaceWords): Added method to replace a set of
575            words by a single word. Useful for synonyms.
576    
577    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
580    
581    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
584            vectors. Thanks to Ariel Maguyon for his error report.
585            (removeSparseTerms): New function to remove columns from a
586            term-document matrix exceeding a sparse factor.
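
The effect of a sparse factor can be shown on an ordinary matrix. In this sketch, sparsity is taken to be the share of zero entries in a column, and columns are kept only if that share is below the threshold; both the helper name `remove_sparse_columns_sketch` and the strict comparison are assumptions of the example.

```r
# Hypothetical sketch: drop columns whose share of zero entries is too high.
remove_sparse_columns_sketch <- function(m, sparse) {
  stopifnot(sparse > 0, sparse < 1)
  sparsity <- colMeans(m == 0)            # fraction of zeros per column
  m[, sparsity < sparse, drop = FALSE]
}

m <- matrix(c(1, 0, 0, 0,
              2, 1, 0, 3,
              0, 0, 0, 1),
            nrow = 4,
            dimnames = list(paste0("doc", 1:4), c("rare", "common", "single")))
dense <- remove_sparse_columns_sketch(m, sparse = 0.5)
print(dense)
```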
587    
588    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
589    
590            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
591    
592    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
593    
594            * man/sFilter.Rd: Corrected documentation on statement format (use
595            '==' instead of '=').
596    
597    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
598    
599            * R/aobjects.R (StructuredTextDocument): Inherits from
600            TextDocument.
601    
602    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
603    
604            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
605            on sparse matrices as proposed by Martin Maechler.
606    
607    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
608    
609            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
610            \pkg{filehash} version deprecates them.
611    
612    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
613    
614            * R/termdocmatrix.R (textvector): Stemming is now performed before
615            erasing stopwords.
616            (weightMatrix): Adapted to handle sparse matrices.
617            (TermDocMatrix): Sparse matrix is now efficiently built by
618            direct stepwise insertion of row values into it.
619    
620    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
623            due to ongoing problems. For our purposes the latter is as useful
624            as the replaced package.
625    
626    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
627    
628            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
629    
630            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
631    
632    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
633    
634            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
635            languages with available stopwords.
636    
637    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
638    
639            * inst/doc/tm.Rnw: Minor corrections in the vignette.
640    
641    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
642    
643            * DESCRIPTION: Update to version 0.2, since a lot of new features
644            have been integrated.
645    
646            * inst/stopwords: Updated existing stopwords and added stopwords
647            for various other languages.
648    
649    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
650    
651            * man/: Updated documentation.
652    
653            * Work/testDb.R: Script to test database stuff.
654    
655            * R/: Fixed various database related bugs. Seems to be rather
656            usable now; consider it alpha status for the time being.
657    
658    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
659    
660            * R/: Fixed some bugs related to database support.
661    
662    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
663    
664            * man/: Added a lot of examples to the manuals.
665    
666    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
667    
668            * man/: Updated parts of the documentation.
669    
670            * R/textdoccol.R (asPlain): Added conversion from newsgroup
671            documents to plain text documents.
672    
673    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
674    
675            * R/textdoccol.R: Finished experimental database support. Not yet
676            intensively tested.
677    
678            * R/source.R: Now each source has a default reader.
679    
680            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
681            class anymore.
682    
683            * R/plaintextdoc.R: Custom show method for plain text documents.
684    
685            * R/aobjects.R: Added a class for structured text documents.
686    
687            * R/reader.R: Replaced remaining \code{parser} occurrences with
688            \code{reader}.
689    
690            * R/textdoccol.R (summary): Indent tags.
691    
692            * R/textdoccol.R (removePunctuation): Transform method to remove
693            punctuation marks.
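
Removing punctuation marks from text is commonly done with a POSIX character class. A minimal sketch with a hypothetical helper name, not the package's own transform:

```r
# Hypothetical sketch: strip punctuation characters from character vectors.
remove_punctuation_sketch <- function(x) {
  gsub("[[:punct:]]+", "", x)
}

stripped <- remove_punctuation_sketch("Hello, world! (tm)")
print(stripped)
```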
694    
695    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
696    
697            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
698            using prescindMeta().
699    
700    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * R/textdoccol.R: Improved database support.
703    
704    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
705    
706            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
707    
708            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
709            language code.
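
How a language might be derived from an ISO-style code is sketched below. The lookup table, the handling of region suffixes such as "_US", and the function name `resolve_language_sketch` are all assumptions for illustration; the entry above does not specify these details.

```r
# Hypothetical sketch: map an ISO 639-1 style code to a language name.
resolve_language_sketch <- function(code) {
  known <- c(en = "english", de = "german", fr = "french")  # assumed table
  key <- tolower(sub("[_-].*$", "", code))                  # drop a region suffix
  if (!key %in% names(known)) stop("unknown language code: ", code)
  unname(known[key])
}

lang <- resolve_language_sketch("en_US")
print(lang)
```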
710    
711            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
712            into parserControl argument.
713    
714            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
715    
716    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
717    
718            * Work/tmDataSetup.R: The datasets acq and crude can now be
719            created on the fly.
720    
721            * R/stopwords.R: Introduced a function returning the stopwords for
722            a given language (English, German and French at the moment).
723    
724            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
725            otherwise falls back to Snowball package.
726    
727    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
728    
729            * man/dissimilarity-methods.Rd: Make clear that any method offered
730            by "dists" from package "cba" can be used.
731    
732    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
733    
734            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
735            to Kurt's LaTeX suggestion. Removed points and underscores in
736            variable names for consistent naming.
737    
738            * DESCRIPTION: Update to version 0.1-2.
739    
740            * man/TextRepository.Rd: Fixed bug in documentation.
741    
742    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * DESCRIPTION: Update to version 0.1-1.
745    
746    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
747    
748            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
749            wordStem.
750    
751    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
752    
753            * R/: Changes due to Kurt's review.
754    
755    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
756    
757            * R/: Implemented improvements based upon comments by David
758            Meyer.
759    
760    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
761    
762            * inst/doc/: Rewrote vignette.
763    
764            * man/: Improved documentation.
765    
766    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
767    
768            * man/: Updated documentation.
769    
770            * DESCRIPTION: Changed package name to "tm". Updated version to
771            0.1 for first CRAN release.
772    
773            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
774            list archive example.
775    
776            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
777            archive example.
778    
779            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
780            from (several mails per box) mbox format to (single mail per file)
781            eml format.
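
Splitting an mbox (several mails per file) into single mails can be sketched by cutting at the "From " separator lines. This is a simplified illustration with a hypothetical helper `split_mbox_sketch`; real mbox files may contain escaped "From " lines in message bodies, which the sketch ignores.

```r
# Hypothetical sketch: split mbox lines into one character vector per mail.
split_mbox_sketch <- function(lines) {
  id <- cumsum(grepl("^From ", lines))   # running mail index
  keep <- id > 0                         # ignore anything before the first mail
  unname(split(lines[keep], id[keep]))
}

mbox <- c("From alice@example.org Mon Dec 11 10:00:00 2006",
          "Subject: first", "", "Hello",
          "From bob@example.org Tue Dec 12 11:00:00 2006",
          "Subject: second", "", "Hi")
mails <- split_mbox_sketch(mbox)

# Write one eml file per mail into a temporary directory.
out <- tempfile("eml")
dir.create(out)
for (i in seq_along(mails)) {
  writeLines(mails[[i]], file.path(out, sprintf("%d.eml", i)))
}
print(list.files(out))
```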
782    
783    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
784    
785            * data/crude.rda: Rebuilt.
786    
787            * data/acq.rda: Rebuilt.
788    
789            * R/reader.R: Factored out reader and parser methods from
790            textdoccol.R.
791    
792            * R/source.R: Factored out Source methods from aobjects.R and
793            textdoccol.R.
794            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
795            feeds.
796    
797            * R/textdoccol.R (DirSource): Added support for recursive
798            traversal of directories.
799    
800    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
801    
802            * R/textdoccol.R ([[): Loads the document corpus automatically
803            into memory upon access.
804            (tm_transform, tm_filter): Removed several checks whether the
805            document is already loaded ([[ ensures this now).
806            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
807            mailing list archive.
808    
809    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
810    
811            * R/aobjects.R (TextDocument): Is now a virtual class.
812            (Source): Is now a virtual class.
813    
814    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
815    
816            * R/textdoccol.R (c): Support for an arbitrary number of document
817            collections.
818    
819    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
820    
821            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
822            append_meta and remove_meta.
823    
824            * R/textdoccol.R: Removed modify_metadata method.
825    
826            * R/textrepo.R: Removed modify_metadata method.
827    
828            * R/textdoccol.R (remove_meta): Supports removal of document
829            collection metadata and document (= in data frame) metadata.
830    
831    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
834    
835            * data/crude.rda: Rebuilt.
836    
837            * data/acq.rda: Rebuilt.
838    
839            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
840    
841            * R/textdoccol.R ([): Bug fix for subsetting a document
842            collection's data frame.
843    
844    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
845    
846            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
847            to s_filter.
848    
849            * R/textdoccol.R: Local text documents' metadata can now be copied
850            to a document collection's data frame with prescind_meta.
851    
852    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
853    
854            * R/: Text documents' slot metadata is now accessible in s_filter.
855    
856            * R/: Rewrote s_filter function (still has some restrictions).
857    
858    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
859    
860            * R/: Various fixes in handling metadata.
861    
862            * R/: Added update mechanism for text document collections.
863    
864    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
865    
866            * R/: Merging of document collections now creates a binary tree
867            for reconstructing merged document collections.
868    
869            * R/: Redesign of metadata for document collections.
870    
871    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
872    
873            * R/: Messages now use \code{ngettext}.
874    
875    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
876    
877            * R/: Added functions for modifying and removing metadata.
878    
879    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
880    
881            * man/: Updated some documentation.
882    
883            * R/: Corrected some connection issues.
884    
885            * inst/doc: Worked on the vignette.
886    
887    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * inst/: Added texts and started vignette.
890    
891            * R/: Final changes based upon David's comments.
892    
893    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
894    
895            * NAMESPACE: Corrected exports (generic methods need exportMethods
896            directives!).
897    
898    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
899    
900            * R/: Modified the TextDocCol constructor and various parsers. It
901            is now modular and supports various file formats via plugins (see
902            the new "Source" class).
903    
904    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
905    
906            * man/: Revised documentation after previous code changes.
907    
908    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
909    
910            * R/: Remaining changes as discussed with David.
911    
912    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
913    
914            * R/: Some changes as suggested by David. The rest will follow
915            within the next few days.
916    
917    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
918    
919            * man/: Finished documentation.
920    
921    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
922    
923            * man/: Wrote some documentation.
924    
925    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
926    
927            * R/: Further syntactic sugar in form of additional assignment and
928            accessor methods.
929    
930    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
931    
932            * R/: Syntactic sugar in form of "length", "show" and "summary"
933            operators.
934    
935    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
936    
937            * R/: Various updates, mainly to default operators ("[" or "c")
938            and dissimilarities.
939    
940    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
941    
942            * R/: Added similarity functions.
943    
944            * data/: Added English stopwords.
945    
946    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
947    
948            * data/: Examples compiled for new features.
949    
950            * R/: Changes due to new structure.
951    
952            * NAMESPACE: Corrected namespace to reflect new structure.
953    
954            * R/termdocmatrix.R: Adapted for new naming scheme.
955    
956    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
957    
958            * R/textdoccol.R: Adapted code for new class structure. Wrote
959            several transform and filter functions operating on text document
960            collections (alias text document databases).
961    
962            * R/aobjects.R: Adapted class structure with inheritance,
963            repositories and additional meta data. Loading files on demand is
964            now possible.
965    
966    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
967    
968            * R/: Some cosmetic cleanups.
969    
970            * inst/: Removed vignette on clustering. That and much more is now
971            described in the JSS paper on text mining. Based upon that
972            article an elaborated vignette will be incorporated in the future.
973    
974    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
975    
976            * R/: Updated generic S4 methods to comply with signature changes
977            in newer versions of R (> 2.3).
978    
979    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
980    
981            * ext/R/importRIS.R: Automatic RIS import is now possible.
982    
983    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
984    
985            * R/textdoccol.R: Added RIS HTML input format.
986    
987    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
988    
989            * R/textdoccol.R: Removed bug that caused invalid text document
990            collections when handling many input files.
991    
992    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
993    
994            * R/textdoccol.R: Restructured and extended file import
995            mechanism.
996    
997            * inst/doc/clustering.Rnw: Adapted vignette for use with
998            ReutNews.rda
999    
1000            * man/ReutNews.Rd: Documentation for ReutNews.rda
1001    
1002            * data/ReutNews.rda: A tiny Reuters21578 example data set.
1003    
1004  2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1005    
1006          * inst/doc/clustering.Rnw: Wrote a small vignette to present the
