Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 17, Sat Nov 5 14:47:12 2005 UTC | pkg/ChangeLog revision 1292, Tue Jan 28 16:31:18 2014 UTC
1    2014-01-28  Ingo Feinerer <feinerer@logic.at>
2    
3            * R/utils.R (map_IETF_Snowball): Process three letter codes.
4    
5    2014-01-07  Ingo Feinerer <feinerer@logic.at>
6    
7            * DESCRIPTION (Version): Prepare for CRAN New Year release.
8    
9    2014-01-05  Ingo Feinerer <feinerer@logic.at>
10    
11            * R/matrix.R (findAssocs): Allow multiple and non-existing terms.
12            Suggested by Christian Buchta.
13    
14            * R/source.R (is.Source): New check for valid source.
15    
16    2013-12-28  Ingo Feinerer <feinerer@logic.at>
17    
18            * R/matrix.R (findAssocs): Make corlimit inclusive.
19    
20    2013-09-27  Ingo Feinerer <feinerer@logic.at>
21    
22            * R/source.R: Allow multiple URIs for URISource.
23    
24    2013-09-19  Ingo Feinerer <feinerer@logic.at>
25    
26            * R/source.R (Source): New Source constructor.
27    
28    2013-08-26  Ingo Feinerer <feinerer@logic.at>
29    
30            * R/source.R (DirSource): Report non-existent or non-readable files.
31            Suggested by Ajinkya Kale and Milan Bouchet-Valat.
32    
33    2013-08-19  Ingo Feinerer <feinerer@logic.at>
34    
35            * R/corpus.R (setOldClass): Do not register VCorpus as S4 class
36            anymore.
37    
38            * R/doc.R (setOldClass): Do not register PlainTextDocument as S4 class
39            anymore.
40    
41    2013-08-09  Ingo Feinerer <feinerer@logic.at>
42    
43            * DESCRIPTION (License): Changed to GPL-3.
44    
45    2013-07-25  Ingo Feinerer <feinerer@logic.at>
46    
47            * R/complete.R (stemCompletion): Report NA instead of error when no
48            completion can be found by the prevalent heuristic. Suggested by Hugh
49            Devlin.
50    
51    2013-07-10  Ingo Feinerer <feinerer@logic.at>
52    
53            * R/reader.R (readPDF): Use tm:::pdfinfo() (which needs the pdfinfo
54            command line tool) instead of tools:::pdf_info().
55    
56    2013-04-11  Ingo Feinerer <feinerer@logic.at>
57    
58            * R/transform.R (removeWords): Use PCRE UCP to use Unicode properties
59            to determine character types.
60    
61    2012-12-14  Ingo Feinerer <feinerer@logic.at>
62    
63            * R/matrix.R (TermDocumentMatrix): Ensure dimnames of type character
64            when generating a simple_triplet_matrix. Reported by Arho Suominen.
65    
66    2012-12-10  Ingo Feinerer <feinerer@logic.at>
67    
68            * man/tm_reduce.Rd: Document right to left folding order. Adapt
69            example as well. Suggested by Mark Rosenstein.
70    
71    2012-12-04  Ingo Feinerer <feinerer@logic.at>
72    
73            * R/filter.R (sFilter): Avoid attach() and simplify.
74    
75    2012-11-02  Ingo Feinerer <feinerer@logic.at>
76    
77            * R/doc.R (.TextDocument): Use casts to ensure data types and to avoid
78            removal of attributes.
79    
80    2012-10-03  Ingo Feinerer  <feinerer@logic.at>
81    
82            * R/weight.R (weightTfIdf, weightSMART): Gracefully handle empty
83            columns and rows (avoids blow-up due to NaN values). Suggested by Jaap
84            Frölich.
85    
86    2012-07-27  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/transform.R (removeWords): Allow longer stopword lists.
89    
90    2012-01-31  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/reader.R (readXML): Readers can now set the document language
93            themselves.
94    
95    2012-01-14  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/source.R (XMLSource, getElem.XMLSource): Simplifications as
98            proposed by Milan Bouchet-Valat.
99    
100    2012-01-11  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/matrix.R (termFreq): Fix processing of user provided
103            stopwords. Reported by Bettina Grün.
104    
105    2011-12-23  Ingo Feinerer  <feinerer@logic.at>
106    
107            * R/matrix.R (termFreq): Fix invalid handling of
108            control$wordLengths[1]. Reported by Steven C. Bagley.
109    
110    2011-12-17  Ingo Feinerer  <feinerer@logic.at>
111    
112            * DESCRIPTION (Version): Prepare for CRAN Christmas release.
113    
114    2011-12-12  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/utils.R (map_IETF_Snowball): Map empty input to "porter".
117    
118    2011-12-07  Ingo Feinerer  <feinerer@logic.at>
119    
120            * R/transform.R (removePunctuation): Add option to preserve
121            intra-word dashes.
122    
123    2011-12-06  Ingo Feinerer  <feinerer@logic.at>
124    
125            * R/matrix.R (termFreq): Allow reordering of control option
126            processing.
127    
128    2011-11-17  Ingo Feinerer  <feinerer@logic.at>
129    
130            * R/reader.R (readPDF): Use tools:::pdf_info() instead of external
131            pdfinfo tool.
132    
133            * inst/stopwords/SMART.dat: Add SMART information retrieval system
134            stopwords (which are also used by the MC toolkit).
135    
136            * R/matrix.R (termFreq): Allow local option \code{bounds$local} to
137            restrict how often a term may appear in each document (generalizes
138            \code{minDocFreq}). Similarly, allow the local option \code{wordLengths}
139            for word length bounds (generalizes \code{minWordLength}).
140    
141            * R/matrix.R (TermDocumentMatrix.VCorpus): New global option
142            \code{bounds$global} for restricting how often a term is allowed
143            to appear in different documents.
144    
145            * R/matrix.R (TermDocumentMatrix.VCorpus): Distinguish between
146            local options delegated internally to termFreq() and global
147            options which are processed by the term-document matrix
148            constructor itself.
149    
150    2011-11-15  Ingo Feinerer  <feinerer@logic.at>
151    
152            * man/getTokenizers.Rd: Document getTokenizers().
153    
154            * man/tokenizer.Rd: Document MC_tokenizer() and scan_tokenizer().
155    
156    2011-11-04  Ingo Feinerer  <feinerer@logic.at>
157    
158            * man/matrix.Rd: Document as.TermDocumentMatrix.term_frequency.
159    
160            * man/combine.Rd: Document c.term_frequency().
161    
162    2011-10-11  Ingo Feinerer  <feinerer@logic.at>
163    
164            * R/meta.R (`meta<-.Corpus`): Assume that the replacement value
165            can be accessed via '[' and not '[['.
166    
167    2011-08-24  Ingo Feinerer  <feinerer@logic.at>
168    
169            * R/stopwords.R (stopwords): Raise an error if no stopwords are
170            available for requested language. Suggested by Derek M Jones.
171    
172    2011-05-27  Ingo Feinerer  <feinerer@logic.at>
173    
174            * R/weight.R (weightSMART): Implement Cosine and pivoted unique
175            normalization.
176    
177    2011-02-17  Ingo Feinerer  <feinerer@logic.at>
178    
179            * R/transform.R (stemDocument.PlainTextDocument): Use language
180            argument.
181    
182    2011-02-04  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/source.R: Store strings and connections instead of unevaluated
185            calls.
186    
187    2010-11-26  Ingo Feinerer  <feinerer@logic.at>
188    
189            * R/corpus.R (Corpus): Allow init and exit hooks for readers.
190    
191    2010-10-22  Ingo Feinerer  <feinerer@logic.at>
192    
193            * R/matrix.R (.TermDocumentMatrix): Make Weighting an attribute
194            (instead of a list element).
195    
196    2010-10-16  Ingo Feinerer  <feinerer@logic.at>
197    
198            * R/corpus.R (`[[.VCorpus`, `[[.PCorpus`): Access individual
199            documents by names (fallback to IDs if names are not set).
200    
201    2010-08-25  Ingo Feinerer  <feinerer@logic.at>
202    
203            * R/corpus.R (c.Corpus): When concatenating corpora, the argument
204            \code{recursive} now determines whether existing corpus meta data
205            is used.
206    
207    2010-08-06  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/transform.R: Removed convert_UTF_8(). Use enc2utf8() instead.
210    
211    2010-06-17  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/matrix.R (TermDocumentMatrix): If a dictionary is given do not
214            remove terms not occurring in the corpus anymore.
215    
216    2010-06-02  Ingo Feinerer  <feinerer@logic.at>
217    
218            * R/plot.R (Zipf_plot, Heaps_plot): Plotting functions for Zipf's
219            and Heaps' law.
220    
221    2010-05-18  Ingo Feinerer  <feinerer@logic.at>
222    
223            * R/corpus.R (Corpus, PCorpus): Use element names as IDs if
224            provided by a source.
225    
226    2010-04-09  Ingo Feinerer  <feinerer@logic.at>
227    
228            * R/source.R (.Source): Provide document names.
229    
230    2010-04-07  Ingo Feinerer  <feinerer@logic.at>
231    
232            * R/meta.R (`content_or_meta`): Utility function.
233    
234    2010-03-19  Ingo Feinerer  <feinerer@logic.at>
235    
236            * R/reader.R (readReut21578XML, readReut21578XMLasPlain): Extract
237            TOPICS, LEWISSPLIT, CGISPLIT, and OLDID meta tags.
238    
239    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
240    
241            * R/weight.R (weightTfIdf): Added normalization option.
242    
243            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
244            analysis.
245    
246    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
247    
248            * R/score.R (tm_tag_score): Compute a score from the number of
249            tags matching in a document.
250    
251    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
252    
253            * R/complete.R (stemCompletion): New completion heuristics.
254    
255    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
256    
257            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
258    
259    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
260    
261            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
262            setOldClass(c(..., "list")) works.
263    
264    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
265    
266            * R/transform.R (stemDocument.character): In case input is a
267            simple character just delegate to the default Snowball stemmer.
268    
269    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
270    
271            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
272            data.
273    
274    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
275    
276            * R/doc.R (`Content<-`): Be careful with names attribute.
277    
278    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
279    
280            * R/source.R (DirSource): Improved implementation especially when
281            handling many (> 1M) files.
282    
283    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
284    
285            * R/source.R (getElem.URISource): Use encoding argument.
286    
287    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
288    
289            * R/doc.R (setOldClass): Register S3 document classes to be
290            recognized by S4 methods.
291    
292    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
293    
294            * R/matrix.R (termFreq): Add option to remove punctuation
295            characters.
296    
297    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
298    
299            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
300            merging multiple term-document matrices.
301    
302    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
303    
304            * R/corpus.R (setOldClass): Register S3 corpus classes to be
305            recognized by S4 methods.
306    
307            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
308            that CRAN Mac OS X builds do not fail any longer.
309    
310    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
311    
312            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
313            of RWeka::AlphabeticTokenizer() as default.
314    
315    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
316    
317            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
318            caused words at the beginning or the end of a line not to be removed. Do
319            not delete whitespace anymore.
320    
321    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
322    
323            * R/source.R (DirSource): Default to working directory if no path
324            is specified.
325    
326    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
327    
328            * R/source.R (DirSource): Stop on empty directories.
329    
330    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
331    
332            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
333            named documents.
334    
335    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
336    
337            * R/transform.R (removeWords): Improve regular expressions.
338    
339    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
340    
341            * R/meta.R (DublinCore): Allow lower case tags.
342    
343    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
344    
345            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
346            instead of x$children.
347    
348    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
349    
350            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
351    
352    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
353    
354            * R/: Use S3 instead of S4 class system.
355    
356    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
357    
358            * R/reader.R (readMail): Moved to tm.plugin.mail package.
359    
360    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
361    
362            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
363            postings are basically e-mails with some extra headers.
364    
365    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
366    
367            * R/transform.R: Move convertMboxEml, removeCitation,
368            removeMultipart, and removeSignature to the tm.plugin.mail package
369            since they are mainly utility functions (for handling e-mails) and
370            not very framework specific.
371    
372    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
373    
374            * man/: Fix documentation.
375    
376    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
377    
378            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
379            plain text document instead of an XML document for texts of the
380            Reuters-21578 dataset.
381    
382            * R/sparse.R: Removed since the slam package is now available on
383            CRAN.
384    
385            * DESCRIPTION (Depends): Add slam package.
386    
387    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
388    
389            * R/transform.R (stemDoc): Fix character(0) handling.
390    
391    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
392    
393            * R/doc.R (show): Pretty print.
394    
395    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
396    
397            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
398            gracefully.
399    
400    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
401    
402            * R/corpus.R: Make corpus virtual. Implement corpus with standard
403            and permanent storage semantics.
404    
405            * DESCRIPTION: New major release. A *lot* of improvements.
406    
407    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
408    
409            * NAMESPACE: Export some simple_triplet_matrix functions.
410    
411    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
412    
413            * R/weight.R: Adapt tf-idf to new matrix format.
414    
415    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
416    
417            * R/matrix.R: Create two distinct classes for term-document and
418            document-term matrices.
419    
420    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
421    
422            * R/termdocmatrix.R: No longer use Matrix package. This reduces
423            package start-up time significantly.
424    
425    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
426    
427            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
428    
429    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
430    
431            * R/transform.R (tmReduce): Combine multiple maps into one
432            transformation.
433    
434    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
435    
436            * R/weight.R: Remove weightLogical since it does not return a
437            dgCMatrix.
438    
439            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
440            or TermDocumentMatrix instead.
441    
442    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
443    
444            * inst/doc/extensions.Rnw: Finished vignette.
445    
446    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
447    
448            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
449            DocumentTermMatrix representations.
450    
451    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
452    
453            * R/reader.R (readXML): New reader for arbitrary XML files.
454    
455    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
456    
457            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
458            (XMLSource): New XMLSource class for arbitrary XML files.
459            (Source): New slot Vectorized.
460    
461    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
462    
463            * R/reader.R (readTabular): Experimental reader for tabular data
464            structures which can be customized via user-defined mappings.
465    
466            * R/reader.R: Always use UTC time zone.
467    
468            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
469    
470    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
471    
472            * R/reader.R (readDOC): Options can be passed over to antiword.
473    
474            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
475            pdftotext.
476    
477    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
478    
479            * R/source.R (DirSource): Add pattern and ignore.case arguments
480            which are internally passed over to list.files().
481    
482    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
483    
484            * inst/doc/tm.Rnw: Suppress pointless loading message.
485    
486    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
487    
488            * DESCRIPTION: Speed up package loading (via moving packages not
489            strictly necessary for normal operation to Suggests instead of
490            Depends).
491    
492    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
493    
494            * R/reader.R (readNewsgroup): The date format is now configurable.
495    
496    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
497    
498            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
499    
500    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
501    
502            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
503    
504    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
505    
506            * R/source.R (DataframeSource): New source class for data frames.
507    
508            * R/source.R: Fixed non-standard call evaluation.
509    
510    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
511    
512            * R/source.R (URISource): New source class for a single document.
513    
514    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
515    
516            * R/source.R: Refactoring.
517    
518    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
519    
520            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
521            Rmpi installations more gracefully.
522    
523    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
524    
525            * R/source.R (Source): Add Length slot.
526    
527    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
528    
529            * R/AAA.R: Unify duplicated .onLoad function.
530    
531    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
532    
533            * DESCRIPTION (Suggests): Added Rmpi.
534    
535    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
536    
537            * R/source.R (getElem): Fix 'no visible binding' warning.
538    
539            * man/WeightFunction.Rd: Fix signature.
540    
541    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
542    
543            * R/weight.R: Introduce name abbreviations for weighting functions.
544    
545    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
546    
547            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
548    
549            * R/cluster.R: Provide convenience functions for using a MPI
550            cluster.
551    
552            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
553            available.
554    
555            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
556            available.
557    
558    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
559    
560            * R/textdoccol.R (lapply): Removed debug print out.
561    
562    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
563    
564            * R/reader.R (readRCV1): Improved meta data extraction from
565            Reuters Corpus Volume 1 documents.
566    
567    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
568    
569            * R/transform.R: Ensure that all mappings preserve multiline
570            structures.
571    
572    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
573    
574            * R/filter.R: Every filter now has an attribute (doclevel)
575            indicating whether it should be applied at the document level.
576    
577            * R/textdoccol.R (tmFilter): Set searchFullText as new default
578            filter.
579    
580    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
581    
582            * R/transform.R (replacePatterns): Replaced removeWords by
583            replacePatterns. Suggested by Christian Buchta.
584    
585            * R/textdoccol.R (inspect): Improved formatting.
586    
587    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
588    
589            * inst/CITATION: Updated JSS article information.
590    
591            * R/textdoccol.R (setAs): Added coerce method from list to
592            corpus.
593    
594            * R/meta.R (meta): Improved meta data handling.
595    
596    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
597    
598            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
599            Christian Buchta.
600    
601            * inst/CITATION: Added template to include JSS article reference.
602    
603    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
604    
605            * R/textdoccol.R (tmMap): Introduced lazy mapping.
606    
607            * R/source.R: Added VectorSource.
608    
609    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
610    
611            * man/: Language codes should be in ISO 639-1 format.
612    
613            * R/textdoccol.R (asPlain): Preserve local meta data.
614    
615    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
616    
617            * R/textdoccol.R (writeCorpus): Function for writing a corpus
618            containing plain text documents to disk.
619    
620    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
623            always set correctly.
624    
625            * R/textdoccol.R: Set load = TRUE as default for load on demand
626            since in most cases this is the wanted behaviour.
627    
628    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
631    
632            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
633    
634    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
635    
636            * R/meta.R (meta): New function for consistent access to meta data
637            of document collections, repositories, and texts.
638    
639    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
640    
641            * R/: Better support for encodings.
642    
643    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
644    
645            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
646            selection when no reader argument is given.
647    
648    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
649    
650            * R/source.R (CSVSource): Now uses read.csv instead of scan
651            internally.
652    
653    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
654    
655            * R/reader.R (getReaders): Returns available reader functions.
656    
657            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
658            as default.
659    
660    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
661    
662            * R/stopwords.R (stopwords): Shortened code, removed codetools
663            variable warnings.
664    
665            * man/: Documentation for showMeta, added an example for tmMap.
666    
667            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
668            some minor typos fixed.
669    
670    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
671    
672            * R/aobjects.R (showMeta): Added method for pretty printing a
673            text document's meta data.
674    
675    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
676    
677            * R/textdoccol.R (TextDocCol): Better handling of empty
678            arguments.
679    
680            * NAMESPACE: Exported readDOC.
681    
682            * man/completeStems.Rd: Added an example.
683    
684    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * R/stopwords.R (stopwords): Look up .dat files at every
687            call. Allows users to modify stopword .dat files interactively.
688    
689    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
690    
691            * R/termdocmatrix.R (termFreq): Correct processing of empty
692            documents.
693    
694    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
695    
696            * man/: Updated documentation.
697    
698    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
699    
700            * R/complete.R (completeStems): Completes (heuristically) word
701            stems.
702    
703            * R/termdocmatrix.R (TermDocMatrix2): New modular
704            constructor.
705    
706            * NAMESPACE: Exported termFreq.
707    
708    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
709    
710            * R/reader.R (readDOC): Added MS Word reader (using antiword).
711    
712    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
713    
714            * R/weight.R: Weighting functions for TermDocMatrix.
715    
716    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
717    
718            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
719            functions for accessing dimension, column, and row names.
720    
721            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
722    
723    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
724    
725            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
726    
727    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
728    
729            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
730    
731    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
732    
733            * R/reader.R (readPDF): Removed manual checks for pdftotext and
734            pdfinfo. The system call gives a warning anyway.
735    
736    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
737    
738            * R/textdoccol.R (asPlain): Conversion from
739            StructuredTextDocuments to PlainTextDocuments.
740    
741    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
742    
743            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
744            for accessing term-document matrices.
745    
746            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
747            are installed.
748    
749    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
752            Christian Buchta.
753    
754    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
755    
756            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
757    
758    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
759    
760            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
761    
762            * R/reader.R (readPDF): Added PDF reader.
763    
764    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
765    
766            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
767    
768            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
769    
770            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
771    
772            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
773    
774    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * R/distmeasure.R (dissimilarity): Replaced dists call from
777            package cba by new dist call from package proxy.
778    
779    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
780    
781            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
782    
783    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
784    
785            * R/termdocmatrix.R: require() uses the quietly option to suppress
786            loading messages.
787    
788    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
789    
790            * R/dictionary.R: Added dictionary support.
791    
792    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
793    
794            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
795            documents. This simplifies some functions, e.g., asPlain.
796    
797    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
798    
799            * inst/doc/tm.Rnw: Fixed some typos in vignette.
800    
801    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
802    
803            * R/textdoccol.R (replaceWords): Added method to replace a set of
804            words by a single word. Useful for synonyms.
805    
806    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
807    
808            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
809    
810    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
811    
812            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
813            vectors. Thanks to Ariel Maguyon for his error report.
814            (removeSparseTerms): New function to remove columns from a
815            term-document matrix exceeding a sparse factor.
816    
817    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
818    
819            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
820    
821    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
822    
823            * man/sFilter.Rd: Corrected documentation on statement format (use
824            '==' instead of '=').
825    
826    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/aobjects.R (StructuredTextDocument): Inherits from
829            TextDocument.
830    
831    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
834            on sparse matrices as proposed by Martin Maechler.
835    
836    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
837    
838            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
839            \pkg{filehash} version deprecates them.
840    
841    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
842    
843            * R/termdocmatrix.R (textvector): Stemming is now performed before
844            erasing stopwords.
845            (weightMatrix): Adapted to handle sparse matrices.
846            (TermDocMatrix): Sparse matrix is now efficiently built by
847            direct stepwise insertion of row values into it.
848    
849    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
852            due to ongoing problems. For our purposes the latter is as useful
853            as the replaced package.
854    
855    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
858    
859            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
860    
861    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
864            languages with available stopwords.
865    
866    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
867    
868            * inst/doc/tm.Rnw: Minor corrections in the vignette.
869    
870    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            * DESCRIPTION: Update to version 0.2, since a lot of new features
873            have been integrated.
874    
875            * inst/stopwords: Updated existing stopwords and added stopwords
876            for various other languages.
877    
878    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
879    
880            * man/: Updated documentation.
881    
882            * Work/testDb.R: Script to test database stuff.
883    
884            * R/: Fixed various database-related bugs. Seems to be rather
885            usable now, but should be considered alpha status.
886    
887    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/: Fixed some bugs related to database support.
890    
891    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
892    
893            * man/: Added a lot of examples to the manuals.
894    
895    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
896    
897            * man/: Updated parts of the documentation.
898    
899            * R/textdoccol.R (asPlain): Added conversion from newsgroup
900            documents to plain text documents.
901    
902    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
903    
904            * R/textdoccol.R: Finished experimental database support. Not yet
905            intensively tested.
906    
907            * R/source.R: Now each source has a default reader.
908    
909            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
910            class anymore.
911    
912            * R/plaintextdoc.R: Custom show method for plain text documents.
913    
914            * R/aobjects.R: Added a class for structured text documents.
915    
916            * R/reader.R: Replaced remaining \code{parser} occurrences with
917            \code{reader}.
918    
919            * R/textdoccol.R (summary): Indent tags.
920    
921            * R/textdoccol.R (removePunctuation): Transform method to remove
922            punctuation marks.
923    
924    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
927            using prescindMeta().
928    
929    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
930    
931            * R/textdoccol.R: Improved database support.
932    
933    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
934    
935            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
936    
937            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
938            language code.
939    
940            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
941            into parserControl argument.
942    
943            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
944    
945    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
946    
947            * Work/tmDataSetup.R: The datasets acq and crude can now be
948            created on the fly.
949    
950            * R/stopwords.R: Introduced a function returning the stopwords for
951            a given language (English, German and French at the moment).
952    
953            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
954            otherwise falls back to Snowball package.
955    
956    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
957    
958            * man/dissimilarity-methods.Rd: Make clear that any method offered
959            by "dists" from package "cba" can be used.
960    
961    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
962    
963            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
964            to Kurt's LaTeX suggestion. Removed points and underscores in
965            variable names for consistent naming.
966    
967            * DESCRIPTION: Update to version 0.1-2.
968    
969            * man/TextRepository.Rd: Fixed bug in documentation.
970    
971    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
972    
973            * DESCRIPTION: Update to version 0.1-1.
974    
975    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
976    
977            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
978            wordStem.
979    
980    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
981    
982            * R/: Changes due to Kurt's review.
983    
984    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
985    
986            * R/: Implemented improvements based upon comments by David
987            Meyer.
988    
989    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
990    
991            * inst/doc/: Rewrote vignette.
992    
993            * man/: Improved documentation.
994    
995    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
996    
997            * man/: Updated documentation.
998    
999            * DESCRIPTION: Changed package name to "tm". Updated version to
1000            0.1 for first CRAN release.
1001    
1002            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
1003            list archive example.
1004    
1005            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
1006            archive example.
1007    
1008            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
1009            from (several mails per box) mbox format to (single mail per file)
1010            eml format.
1011    
1012    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1013    
1014            * data/crude.rda: Rebuilt.
1015    
1016            * data/acq.rda: Rebuilt.
1017    
1018            * R/reader.R: Factored out reader and parser methods from
1019            textdoccol.R.
1020    
1021            * R/source.R: Factored out Source methods from aobjects.R and
1022            textdoccol.R.
1023            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
1024            feeds.
1025    
1026            * R/textdoccol.R (DirSource): Added support for recursive
1027            traversal of directories.
1028    
1029    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1030    
1031            * R/textdoccol.R ([[): Loads the document corpus automatically
1032            into memory upon access.
1033            (tm_transform, tm_filter): Removed several checks whether the
1034            document is already loaded ([[ ensures this now).
1035            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
1036            mailing list archive.
1037    
1038    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1039    
1040            * R/aobjects.R (TextDocument): Is now a virtual class.
1041            (Source): Is now a virtual class.
1042    
1043    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1044    
1045            * R/textdoccol.R (c): Support for an arbitrary number of document
1046            collections.
1047    
1048    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1049    
1050            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
1051            append_meta and remove_meta.
1052    
1053            * R/textdoccol.R: Removed modify_metadata method.
1054    
1055            * R/textrepo.R: Removed modify_metadata method.
1056    
1057            * R/textdoccol.R (remove_meta): Supports removal of document
1058            collection metadata and document (= in data frame) metadata.
1059    
1060    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1061    
1062            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
1063    
1064            * data/crude.rda: Rebuilt.
1065    
1066            * data/acq.rda: Rebuilt.
1067    
1068            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
1069    
1070            * R/textdoccol.R ([): Bug fix for subsetting a document
1071            collection's data frame.
1072    
1073    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1074    
1075            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
1076            to s_filter.
1077    
1078            * R/textdoccol.R: Local text documents' metadata can now be copied
1079            to a document collection's data frame with prescind_meta.
1080    
1081    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1082    
1083            * R/: Text documents' slot metadata is now accessible in s_filter.
1084    
1085            * R/: Rewrote s_filter function (still has some restrictions).
1086    
1087    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1088    
1089            * R/: Various fixes in handling metadata.
1090    
1091            * R/: Added update mechanism for text document collections.
1092    
1093    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1094    
1095            * R/: Merging of document collections now creates a binary tree
1096            for reconstructing merged document collections.
1097    
1098            * R/: Redesign of metadata for document collections.
1099    
1100    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1101    
1102            * R/: Messages now use \code{ngettext}.
1103    
1104    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1105    
1106            * R/: Added functions for modifying and removing metadata.
1107    
1108    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1109    
1110            * man/: Updated some documentation.
1111    
1112            * R/: Corrected some connection issues.
1113    
1114            * inst/doc: Worked on the vignette.
1115    
1116    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1117    
1118            * inst/: Added texts and started vignette.
1119    
1120            * R/: Final changes based upon David's comments.
1121    
1122    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1123    
1124            * NAMESPACE: Corrected exports (generic methods need exportMethods
1125            directives!).
1126    
1127    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1128    
1129            * R/: Modified the TextDocCol constructor and various parsers. It
1130            is now modular and supports various file formats via plugins (see
1131            the new "Source" class).
1132    
1133    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1134    
1135            * man/: Revised documentation after previous code changes.
1136    
1137    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1138    
1139            * R/: Remaining changes as discussed with David.
1140    
1141    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1142    
1143            * R/: Some changes as suggested by David. The rest will follow
1144            within the next few days.
1145    
1146    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1147    
1148            * man/: Finished documentation.
1149    
1150    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1151    
1152            * man/: Wrote some documentation.
1153    
1154    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1155    
1156            * R/: Further syntactic sugar in form of additional assignment and
1157            accessor methods.
1158    
1159    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1160    
1161            * R/: Syntactic sugar in form of "length", "show" and "summary"
1162            operators.
1163    
1164    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1165    
1166            * R/: Diverse updates. Mainly on default operators ("[" or "c")
1167            and dissimilarities.
1168    
1169    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1170    
1171            * R/: Added similarity functions.
1172    
1173            * data/: Added English stopwords.
1174    
1175    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1176    
1177            * data/: Examples compiled for new features.
1178    
1179            * R/: Changes due to new structure.
1180    
1181            * NAMESPACE: Corrected namespace to reflect new structure.
1182    
1183            * R/termdocmatrix.R: Adapted for new naming scheme.
1184    
1185    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1186    
1187            * R/textdoccol.R: Adapted code for new class structure. Wrote
1188            several transform and filter functions operating on text document
1189            collections (alias text document databases).
1190    
1191            * R/aobjects.R: Adapted class structure with inheritance,
1192            repositories and additional meta data. Loading files on demand is
1193            now possible.
1194    
1195    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1196    
1197            * R/: Some cosmetic cleanups.
1198    
1199            * inst/: Removed vignette on clustering. That and much more is now
1200            described in the JSS paper on text mining. Based upon that
1201            article an elaborated vignette will be incorporated in the future.
1202    
1203    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1204    
1205            * R/: Updated generic S4 methods to comply with signature changes
1206            in newer versions of R (> 2.3).
1207    
1208    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1209    
1210            * ext/R/importRIS.R: Automatic RIS import is now possible.
1211    
1212    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1213    
1214            * R/textdoccol.R: Added RIS HTML input format.
1215    
1216    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1217    
1218            * R/textdoccol.R: Removed bug that caused invalid text document
1219            collections when handling many input files.
1220    
1221    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1222    
1223            * R/textdoccol.R: Restructured and extended file import
1224            mechanism.
1225    
1226            * inst/doc/clustering.Rnw: Adapted vignette for use with
1227            ReutNews.rda
1228    
1229            * man/ReutNews.Rd: Documentation for ReutNews.rda
1230    
1231            * data/ReutNews.rda: A tiny Reuters21578 example data set.
1232    
1233    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1234    
1235            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
1236            clustering facilities of this package.
1237    
1238    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1239    
1240            * R/aobjects.R: Changed package document structure to avoid class
1241            dependency problems.
1242    
1243    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1244    
1245            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
1246            data set.
1247    
1248            *  Finished documentation and reordered directory structure. Now "R
1249            CMD check textmin" works without errors.
1250    
1251    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1252    
1253            * src/: Various splits can now be easily created for the
1254            Reuters21578 data set.
1255    
1256    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1257    
1258            *  Updated documentation.
1259    
1260    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1261    
1262            *  Wrote R documentation for some classes and methods.
1263    
1264    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1265    
1266            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
1267            files. See the questionnaire data/Umfrage.csv for such an example.
1268            We are now able to import files in Reuters-21578 XML format.
1269    
1270            *  Changed class interfaces in various files. Weighting of the text
1271            matrix is now possible.
1272    
1273    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1274    
1275            * R/textdoccol.R: One can build term-document matrices if
1276            necessary (with buildTDM(...)) and fill the field tdm from a text
1277            document collection with it.
1278    
1279            * R/textmatrix.R: Wrote S4 class for term-document matrices.
1280    
1281    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1282    
1283            * R/textdoccol.R: We now can read in a whole XML file with several
1284            news items.
1285    
1286    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1287    
1288            * R/textdoccol.R: Set up an S4 class for a collection of text

