[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 17, Sat Nov 5 14:47:12 2005 UTC
pkg/ChangeLog revision 1242, Mon Aug 19 05:33:57 2013 UTC
1    2013-08-19  Ingo Feinerer <feinerer@logic.at>
2    
3            * R/corpus.R (setOldClass): Do not register VCorpus as S4 class
4            anymore.
5    
6            * R/doc.R (setOldClass): Do not register PlainTextDocument as S4 class
7            anymore.
8    
9    2013-08-09  Ingo Feinerer <feinerer@logic.at>
10    
11            * DESCRIPTION (License): Changed to GPL-3.
12    
13    2013-07-25  Ingo Feinerer <feinerer@logic.at>
14    
15            * R/complete.R (stemCompletion): Report NA instead of error when no
16            completion can be found by the prevalent heuristic. Suggested by Hugh
17            Devlin.
18    
19    2013-07-10  Ingo Feinerer <feinerer@logic.at>
20    
21            * R/reader.R (readPDF): Use tm:::pdfinfo() (which needs the pdfinfo
22            command line tool) instead of tools:::pdf_info().
23    
24    2013-04-11  Ingo Feinerer <feinerer@logic.at>
25    
26            * R/transform.R (removeWords): Use PCRE UCP to use Unicode properties
27            to determine character types.
28    
29    2012-12-14  Ingo Feinerer <feinerer@logic.at>
30    
31            * R/matrix.R (TermDocumentMatrix): Ensure dimnames of type character
32            when generating a simple_triplet_matrix. Reported by Arho Suominen.
33    
34    2012-12-10  Ingo Feinerer <feinerer@logic.at>
35    
36            * man/tm_reduce.Rd: Document right to left folding order. Adapt
37            example as well. Suggested by Mark Rosenstein.
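As an illustration of what a right to left folding order means, here is a minimal base R sketch. It does not call tm; `fold_right`, `first_three`, and `add_prefix` are hypothetical names chosen for this example, and the assumption is that the folding behaves like `Reduce(..., right = TRUE)`.

```r
# Hypothetical helper (not part of tm): fold a list of functions over x,
# starting with the LAST function in the list.
fold_right <- function(x, funs) {
  Reduce(function(f, acc) f(acc), funs, x, right = TRUE)
}

first_three <- function(s) substr(s, 1, 3)   # keep the first three characters
add_prefix  <- function(s) paste0("x", s)    # prepend an "x"

# add_prefix runs first, then first_three.
result <- fold_right("abcdef", list(first_three, add_prefix))
print(result)
```

With a left to right order the same list would truncate first and prefix afterwards, so the order of the list matters whenever the functions do not commute.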
38    
39    2012-12-04  Ingo Feinerer <feinerer@logic.at>
40    
41            * R/filter.R (sFilter): Avoid attach() and simplify.
42    
43    2012-11-02  Ingo Feinerer <feinerer@logic.at>
44    
45            * R/doc.R (.TextDocument): Use casts to ensure data types and to avoid
46            removal of attributes.
47    
48    2012-10-03  Ingo Feinerer  <feinerer@logic.at>
49    
50            * R/weight.R (weightTfIdf, weightSMART): Gracefully handle empty
51            columns and rows (avoids blow-up due to NaN values). Suggested by Jaap
52            Frölich.
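The NaN blow-up mentioned above can be sketched in base R. This is not the code in R/weight.R; `tfidf_safe` is a hypothetical function, and the guards shown (treating an empty document as having length 1 and giving a term that occurs nowhere an inverse document frequency of 0) are assumptions about one reasonable way to handle empty columns and rows.

```r
# Hypothetical tf-idf on a dense term-document matrix
# (terms in rows, documents in columns). Not tm's implementation.
tfidf_safe <- function(m) {
  doc_len  <- colSums(m)        # number of term occurrences per document
  doc_freq <- rowSums(m > 0)    # number of documents containing each term

  doc_len[doc_len == 0] <- 1    # empty document: avoid 0 / 0
  tf <- sweep(m, 2, doc_len, "/")

  idf <- log2(ncol(m) / doc_freq)
  idf[!is.finite(idf)] <- 0     # term in no document: avoid 0 * Inf

  tf * idf                      # idf is recycled down the rows
}

# Document 2 is empty and term 3 never occurs.
m <- matrix(c(1, 0, 0,
              0, 0, 0,
              2, 1, 0), nrow = 3)
w <- tfidf_safe(m)
print(w)
```

Without the two guards, the empty column would produce 0 / 0 and the empty row would produce 0 * Inf, both of which evaluate to NaN in R.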
53    
54    2012-07-27  Ingo Feinerer  <feinerer@logic.at>
55    
56            * R/transform.R (removeWords): Allow longer stopword lists.
57    
58    2012-01-31  Ingo Feinerer  <feinerer@logic.at>
59    
60            * R/reader.R (readXML): Readers can now set the document language
61            themselves.
62    
63    2012-01-14  Ingo Feinerer  <feinerer@logic.at>
64    
65            * R/source.R (XMLSource, getElem.XMLSource): Simplifications as
66            proposed by Milan Bouchet-Valat.
67    
68    2012-01-11  Ingo Feinerer  <feinerer@logic.at>
69    
70            * R/matrix.R (termFreq): Fix processing of user provided
71            stopwords. Reported by Bettina Grün.
72    
73    2011-12-23  Ingo Feinerer  <feinerer@logic.at>
74    
75            * R/matrix.R (termFreq): Fix invalid handling of
76            control$wordLengths[1]. Reported by Steven C. Bagley.
77    
78    2011-12-17  Ingo Feinerer  <feinerer@logic.at>
79    
80            * DESCRIPTION (Version): Prepare for CRAN Christmas release.
81    
82    2011-12-12  Ingo Feinerer  <feinerer@logic.at>
83    
84            * R/utils.R (map_IETF_Snowball): Map empty input to "porter".
85    
86    2011-12-07  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/transform.R (removePunctuation): Add option to preserve
89            intra-word dashes.
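One way to preserve intra-word dashes while stripping other punctuation is to protect them with a placeholder first. The following base R sketch is only an illustration; `strip_punct` and its `preserve_dashes` argument are hypothetical names, and the placeholder technique is an assumption, not necessarily how R/transform.R does it.

```r
# Hypothetical helper (not tm's removePunctuation): remove punctuation,
# optionally keeping dashes that sit between two word characters.
strip_punct <- function(x, preserve_dashes = FALSE) {
  if (!preserve_dashes) {
    return(gsub("[[:punct:]]+", "", x))
  }
  # Protect intra-word dashes with a control character placeholder.
  x <- gsub("(?<=\\w)-(?=\\w)", "\001", x, perl = TRUE)
  # Remove all remaining punctuation, including free-standing dashes.
  x <- gsub("[[:punct:]]+", "", x)
  # Restore the protected dashes.
  gsub("\001", "-", x, fixed = TRUE)
}

kept    <- strip_punct("a-b-c, d - e!", preserve_dashes = TRUE)
dropped <- strip_punct("a-b-c, d - e!")
print(kept)
print(dropped)
```

The sketch assumes the input never contains the placeholder character itself.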
90    
91    2011-12-06  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/matrix.R (termFreq): Allow reordering of control option
94            processing.
95    
96    2011-11-17  Ingo Feinerer  <feinerer@logic.at>
97    
98            * R/reader.R (readPDF): Use tools:::pdf_info() instead of external
99            pdfinfo tool.
100    
101            * inst/stopwords/SMART.dat: Add SMART information retrieval system
102            stopwords (which are also used by the MC toolkit).
103    
104            * R/matrix.R (termFreq): Allow local option \code{bounds$local} to
105            restrict how often a term may appear in each document (generalizes
106            \code{minDocFreq}). Similarly the local option \code{wordLengths}
107            for word length bounds (generalizes \code{minWordLength}).
108    
109            * R/matrix.R (TermDocumentMatrix.VCorpus): New global option
110            \code{bounds$global} for restricting how often a term is allowed
111            to appear in different documents.
112    
113            * R/matrix.R (TermDocumentMatrix.VCorpus): Distinguish between
114            local options delegated internally to termFreq() and global
115            options which are processed by the term-document matrix
116            constructor itself.
117    
118    2011-11-15  Ingo Feinerer  <feinerer@logic.at>
119    
120            * man/getTokenizers.Rd: Document getTokenizers().
121    
122            * man/tokenizer.Rd: Document MC_tokenizer() and scan_tokenizer().
123    
124    2011-11-04  Ingo Feinerer  <feinerer@logic.at>
125    
126            * man/matrix.Rd: Document as.TermDocumentMatrix.term_frequency.
127    
128            * man/combine.Rd: Document c.term_frequency().
129    
130    2011-10-11  Ingo Feinerer  <feinerer@logic.at>
131    
132            * R/meta.R (`meta<-.Corpus`): Assume that the replacement value
133            can be accessed via '[' and not '[['.
134    
135    2011-08-24  Ingo Feinerer  <feinerer@logic.at>
136    
137            * R/stopwords.R (stopwords): Raise an error if no stopwords are
138            available for requested language. Suggested by Derek M Jones.
139    
140    2011-05-27  Ingo Feinerer  <feinerer@logic.at>
141    
142            * R/weight.R (weightSMART): Implement Cosine and pivoted unique
143            normalization.
144    
145    2011-02-17  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/transform.R (stemDocument.PlainTextDocument): Use language
148            argument.
149    
150    2011-02-04  Ingo Feinerer  <feinerer@logic.at>
151    
152            * R/source.R: Store strings and connections instead of unevaluated
153            calls.
154    
155    2010-11-26  Ingo Feinerer  <feinerer@logic.at>
156    
157            * R/corpus.R (Corpus): Allow init and exit hooks for readers.
158    
159    2010-10-22  Ingo Feinerer  <feinerer@logic.at>
160    
161            * R/matrix.R (.TermDocumentMatrix): Make Weighting an attribute
162            (instead of a list element).
163    
164    2010-10-16  Ingo Feinerer  <feinerer@logic.at>
165    
166            * R/corpus.R (`[[.VCorpus`, `[[.PCorpus`): Access individual
167            documents by names (fallback to IDs if names are not set).
168    
169    2010-08-25  Ingo Feinerer  <feinerer@logic.at>
170    
171            * R/corpus.R (c.Corpus): When concatenating corpora, the argument
172            \code{recursive} now determines whether existing corpus meta data
173            is used.
174    
175    2010-08-06  Ingo Feinerer  <feinerer@logic.at>
176    
177            * R/transform.R: Removed convert_UTF_8(). Use enc2utf8() instead.
178    
179    2010-06-17  Ingo Feinerer  <feinerer@logic.at>
180    
181            * R/matrix.R (TermDocumentMatrix): If a dictionary is given do not
182            remove terms not occurring in the corpus anymore.
183    
184    2010-06-02  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/plot.R (Zipf_plot, Heaps_plot): Plotting functions for Zipf's
187            and Heaps' law.
188    
189    2010-05-18  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/corpus.R (Corpus, PCorpus): Use element names as IDs if
192            provided by a source.
193    
194    2010-04-09  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/source.R (.Source): Provide document names.
197    
198    2010-04-07  Ingo Feinerer  <feinerer@logic.at>
199    
200            * R/meta.R (`content_or_meta`): Utility function.
201    
202    2010-03-19  Ingo Feinerer  <feinerer@logic.at>
203    
204            * R/reader.R (readReut21578XML, readReut21578XMLasPlain): Extract
205            TOPICS, LEWISSPLIT, CGISPLIT, and OLDID meta tags.
206    
207    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/weight.R (weightTfIdf): Added normalization option.
210    
211            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
212            analysis.
213    
214    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
215    
216            * R/score.R (tm_tag_score): Compute a score from the number of
217            tags matching in a document.
218    
219    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
220    
221            * R/complete.R (stemCompletion): New completion heuristics.
222    
223    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
224    
225            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
226    
227    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
228    
229            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
230            setOldClass(c(..., "list")) works.
231    
232    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
233    
234            * R/transform.R (stemDocument.character): In case input is a
235            simple character just delegate to the default Snowball stemmer.
236    
237    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
238    
239            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
240            data.
241    
242    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
243    
244            * R/doc.R (`Content<-`): Be careful with names attribute.
245    
246    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
247    
248            * R/source.R (DirSource): Improved implementation especially when
249            handling many (> 1M) files.
250    
251    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
252    
253            * R/source.R (getElem.URISource): Use encoding argument.
254    
255    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
256    
257            * R/doc.R (setOldClass): Register S3 document classes to be
258            recognized by S4 methods.
259    
260    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
261    
262            * R/matrix.R (termFreq): Add option to remove punctuation
263            characters.
264    
265    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
266    
267            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
268            merging multiple term-document matrices.
269    
270    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
271    
272            * R/corpus.R (setOldClass): Register S3 corpus classes to be
273            recognized by S4 methods.
274    
275            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
276            that CRAN Mac OS X builds do not fail any longer.
277    
278    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
279    
280            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
281            of RWeka::AlphabeticTokenizer() as default.
282    
283    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
284    
285            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
286            caused words at the beginning or the end of a line not to be removed. Do
287            not delete whitespace anymore.
288    
289    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
290    
291            * R/source.R (DirSource): Default to working directory if no path
292            is specified.
293    
294    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
295    
296            * R/source.R (DirSource): Stop on empty directories.
297    
298    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
299    
300            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
301            named documents.
302    
303    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
304    
305            * R/transform.R (removeWords): Improve regular expressions.
306    
307    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
308    
309            * R/meta.R (DublinCore): Allow lower case tags.
310    
311    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
312    
313            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
314            instead of x$children.
315    
316    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
317    
318            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
319    
320    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
321    
322            * R/: Use S3 instead of S4 class system.
323    
324    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
325    
326            * R/reader.R (readMail): Moved to tm.plugin.mail package.
327    
328    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
329    
330            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
331            postings are basically e-mails with some extra headers.
332    
333    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
334    
335            * R/transform.R: Move convertMboxEml, removeCitation,
336            removeMultipart, and removeSignature to the tm.plugin.mail package
337            since they are mainly utility functions (for handling e-mails) and
338            not very framework specific.
339    
340    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
341    
342            * man/: Fix documentation.
343    
344    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
345    
346            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
347            plain text document instead of an XML document for texts of the
348            Reuters-21578 dataset.
349    
350            * R/sparse.R: Removed since the slam package is now available on
351            CRAN.
352    
353            * DESCRIPTION (Depends): Add slam package.
354    
355    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
356    
357            * R/transform.R (stemDoc): Fix character(0) handling.
358    
359    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
360    
361            * R/doc.R (show): Pretty print.
362    
363    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
364    
365            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
366            gracefully.
367    
368    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
369    
370            * R/corpus.R: Make corpus virtual. Implement corpus with standard
371            and permanent storage semantics.
372    
373            * DESCRIPTION: New major release. A *lot* of improvements.
374    
375    2009-05-04   Ingo Feinerer <feinerer@logic.at>
376    
377            * NAMESPACE: Export some simple_triplet_matrix functions.
378    
379    2009-04-28   Ingo Feinerer <feinerer@logic.at>
380    
381            * R/weight.R: Adapt tf-idf to new matrix format.
382    
383    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
384    
385            * R/matrix.R: Create two distinct classes for term-document and
386            document-term matrices.
387    
388    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
389    
390            * R/termdocmatrix.R: No longer use Matrix package. This reduces
391            package start-up time significantly.
392    
393    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
394    
395            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
396    
397    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
398    
399            * R/transform.R (tmReduce): Combine multiple maps into one
400            transformation.
401    
402    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
403    
404            * R/weight.R: Remove weightLogical since it does not return a
405            dgCMatrix.
406    
407            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
408            or TermDocumentMatrix instead.
409    
410    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
411    
412            * inst/doc/extensions.Rnw: Finished vignette.
413    
414    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
415    
416            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
417            DocumentTermMatrix representations.
418    
419    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
420    
421            * R/reader.R (readXML): New reader for arbitrary XML files.
422    
423    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
424    
425            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
426            (XMLSource): New XMLSource class for arbitrary XML files.
427            (Source): New slot Vectorized.
428    
429    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
430    
431            * R/reader.R (readTabular): Experimental reader for tabular data
432            structures which can be customized via user-defined mappings.
433    
434            * R/reader.R: Always use UTC time zone.
435    
436            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
437    
438    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
439    
440            * R/reader.R (readDOC): Options can be passed over to antiword.
441    
442            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
443            pdftotext.
444    
445    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
446    
447            * R/source.R (DirSource): Add pattern and ignore.case arguments
448            which are internally passed over to list.files().
449    
450    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
451    
452            * inst/doc/tm.Rnw: Suppress pointless loading message.
453    
454    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
455    
456            * DESCRIPTION: Speed up package loading (via moving packages not
457            strictly necessary for normal operation to Suggests instead of
458            Depends).
459    
460    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
461    
462            * R/reader.R (readNewsgroup): The date format is now configurable.
463    
464    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
465    
466            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
467    
468    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
469    
470            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
471    
472    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
473    
474            * R/source.R (DataframeSource): New source class for data frames.
475    
476            * R/source.R: Fixed non-standard call evaluation.
477    
478    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
479    
480            * R/source.R (URISource): New source class for a single document.
481    
482    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
483    
484            * R/source.R: Refactoring.
485    
486    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
487    
488            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
489            Rmpi installations more gracefully.
490    
491    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
492    
493            * R/source.R (Source): Add Length slot.
494    
495    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
496    
497            * R/AAA.R: Unify duplicated .onLoad function.
498    
499    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
500    
501            * DESCRIPTION (Suggests): Added Rmpi.
502    
503    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
504    
505            * R/source.R (getElem): Fix 'no visible binding' warning.
506    
507            * man/WeightFunction.Rd: Fix signature.
508    
509    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
510    
511            * R/weight.R: Introduce name abbreviations for weighting functions.
512    
513    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
514    
515            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
516    
517            * R/cluster.R: Provide convenience functions for using an MPI
518            cluster.
519    
520            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
521            available.
522    
523            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
524            available.
525    
526    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
527    
528            * R/textdoccol.R (lapply): Removed debug print out.
529    
530    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
531    
532            * R/reader.R (readRCV1): Improved meta data extraction from
533            Reuters Corpus Volume 1 documents.
534    
535    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * R/transform.R: Ensure that all mappings preserve multiline
538            structures.
539    
540    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * R/filter.R: Every filter now has an attribute indicating whether
543            it should be applied at the document level (doclevel).
544    
545            * R/textdoccol.R (tmFilter): Set searchFullText as new default
546            filter.
547    
548    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
549    
550            * R/transform.R (replacePatterns): Replaced removeWords by
551            replacePatterns. Suggested by Christian Buchta.
552    
553            * R/textdoccol.R (inspect): Improved formatting.
554    
555    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
556    
557            * inst/CITATION: Updated JSS article information.
558    
559            * R/textdoccol.R (setAs): Added coerce method from list to
560            corpus.
561    
562            * R/meta.R (meta): Improved meta data handling.
563    
564    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
565    
566            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
567            Christian Buchta.
568    
569            * inst/CITATION: Added template to include JSS article reference.
570    
571    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
572    
573            * R/textdoccol.R (tmMap): Introduced lazy mapping.
574    
575            * R/source.R: Added VectorSource.
576    
577    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * man/: Language codes should be in ISO 639-1 format.
580    
581            * R/textdoccol.R (asPlain): Preserve local meta data.
582    
583    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
584    
585            * R/textdoccol.R (writeCorpus): Function for writing a corpus
586            containing plain text documents to disk.
587    
588    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
589    
590            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
591            always set correctly.
592    
593            * R/textdoccol.R: Set load = TRUE as default for load on demand
594            since in most cases this is the wanted behaviour.
595    
596    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
597    
598            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
599    
600            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
601    
602    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
603    
604            * R/meta.R (meta): New function for consistent access to meta data
605            of document collections, repositories, and texts.
606    
607    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
608    
609            * R/: Better support for encodings.
610    
611    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
612    
613            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
614            selection when no reader argument is given.
615    
616    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
617    
618            * R/source.R (CSVSource): Now uses read.csv instead of scan
619            internally.
620    
621    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
622    
623            * R/reader.R (getReaders): Returns available reader functions.
624    
625            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
626            as default.
627    
628    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * R/stopwords.R (stopwords): Shortened code, removed codetools
631            variable warnings.
632    
633            * man/: Documentation for showMeta, added an example for tmMap.
634    
635            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
636            some minor typos fixed.
637    
638    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
639    
640            * R/aobjects.R (showMeta): Added method for pretty printing a
641            text document's meta data.
642    
643    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
644    
645            * R/textdoccol.R (TextDocCol): Better handling of empty
646            arguments.
647    
648            * NAMESPACE: Exported readDOC.
649    
650            * man/completeStems.Rd: Added an example.
651    
652    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * R/stopwords.R (stopwords): Look up .dat files at every
655            call. Allows users to modify stopword .dat files interactively.
656    
657    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * R/termdocmatrix.R (termFreq): Correct processing of empty
660            documents.
661    
662    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
663    
664            * man/: Updated documentation.
665    
666    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
667    
668            * R/complete.R (completeStems): Completes (heuristically) word
669            stems.
670    
671            * R/termdocmatrix.R (TermDocMatrix2): New modular
672            constructor.
673    
674            * NAMESPACE: Exported termFreq.
675    
676    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
677    
678            * R/reader.R (readDOC): Added MS Word reader (using antiword).
679    
680    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
681    
682            * R/weight.R: Weighting functions for TermDocMatrix.
683    
684    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
687            functions for accessing dimension, column, and row names.
688    
689            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
690    
691    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
694    
695    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
696    
697            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
698    
699    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
700    
701            * R/reader.R (readPDF): Removed manual checks for pdftotext and
702            pdfinfo. The system call gives a warning anyway.
703    
704    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
705    
706            * R/textdoccol.R (asPlain): Conversion from
707            StructuredTextDocuments to PlainTextDocuments.
708    
709    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
710    
711            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
712            for accessing term-document matrices.
713    
714            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
715            are installed.
716    
717    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
718    
719            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
720            Christian Buchta.
721    
722    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
723    
724            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
725    
726    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
727    
728            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
729    
730            * R/reader.R (readPDF): Added PDF reader.
731    
732    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
733    
734            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
735    
736            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
737    
738            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
739    
740            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
741    
742    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * R/distmeasure.R (dissimilarity): Replaced dists call from
745            package cba by new dist call from package proxy.
746    
747    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
748    
749            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
750    
751    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
752    
753            * R/termdocmatrix.R: require() uses the quietly option to suppress
754            loading messages.
755    
756    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
757    
758            * R/dictionary.R: Added dictionary support.
759    
760    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
761    
762            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
763            documents. This simplifies some functions, e.g., asPlain.
764    
765    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
766    
767            * inst/doc/tm.Rnw: Fixed some typos in vignette.
768    
769    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
770    
771            * R/textdoccol.R (replaceWords): Added method to replace a set of
772            words by a single word. Useful for synonyms.
773    
774    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
777    
778    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
779    
780            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
781            vectors. Thanks to Ariel Maguyon for his error report.
782            (removeSparseTerms): New function to remove columns from a
783            term-document matrix exceeding a sparse factor.
784    
785    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
786    
787            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
788    
789    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
790    
791            * man/sFilter.Rd: Corrected documentation on statement format (use
792            '==' instead of '=').
793    
794    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
795    
796            * R/aobjects.R (StructuredTextDocument): Inherits from
797            TextDocument.
798    
799    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
802            on sparse matrices as proposed by Martin Maechler.
803    
804    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
805    
806            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
807            \pkg{filehash} version deprecates them.
808    
809    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
810    
811            * R/termdocmatrix.R (textvector): Stemming is now performed before
812            erasing stopwords.
813            (weightMatrix): Adapted to handle sparse matrices.
814            (TermDocMatrix): Sparse matrix is now efficiently built by
815            direct stepwise insertion of row values into it.
816    
817    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
818    
819            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
820            due to ongoing problems. For our purposes the latter is as useful
821            as the replaced package.
822    
823    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
826    
827            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
828    
829    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
830    
831            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
832            languages with available stopwords.
833    
834    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
835    
836            * inst/doc/tm.Rnw: Minor corrections in the vignette.
837    
838    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
839    
840            * DESCRIPTION: Update to version 0.2, since a lot of new features
841            have been integrated.
842    
843            * inst/stopwords: Updated existing stopwords and added stopwords
844            for various other languages.
845    
846    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
847    
848            * man/: Updated documentation.
849    
850            * Work/testDb.R: Script to test database stuff.
851    
852            * R/: Fixed various database related bugs. Seems to be rather
853            usable now, but should be considered alpha status for now.
854    
855    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * R/: Fixed some bugs related to database support.
858    
859    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
860    
861            * man/: Added a lot of examples to the manuals.
862    
863    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
864    
865            * man/: Updated parts of the documentation.
866    
867            * R/textdoccol.R (asPlain): Added conversion from newsgroup
868            documents to plain text documents.
869    
870    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            * R/textdoccol.R: Finished experimental database support. Not yet
873            intensively tested.
874    
875            * R/source.R: Now each source has a default reader.
876    
877            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
878            class anymore.
879    
880            * R/plaintextdoc.R: Custom show method for plain text documents.
881    
882            * R/aobjects.R: Added a class for structured text documents.
883    
884            * R/reader.R: Replaced remaining \code{parser} occurrences with
885            \code{reader}.
886    
887            * R/textdoccol.R (summary): Indent tags.
888    
889            * R/textdoccol.R (removePunctuation): Transform method to remove
890            punctuation marks.
891    
892    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
895            using prescindMeta().
896    
897    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
898    
899            * R/textdoccol.R: Improved database support.
900    
901    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
902    
903            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
904    
905            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
906            language code.
907    
908            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
909            into parserControl argument.
910    
911            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
912    
913    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
914    
915            * Work/tmDataSetup.R: The datasets acq and crude can now be
916            created on the fly.
917    
918            * R/stopwords.R: Introduced a function returning the stopwords for
919            a given language (English, German and French at the moment).
920    
921            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
922            otherwise falls back to Snowball package.
923    
924    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * man/dissimilarity-methods.Rd: Make clear that any method offered
927            by "dists" from package "cba" can be used.
928    
929    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
930    
931            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
932            to Kurt's latex suggestion. Removed points and underscores in
933            variable names for consistent naming.
934    
935            * DESCRIPTION: Update to version 0.1-2.
936    
937            * man/TextRepository.Rd: Fixed bug in documentation.
938    
939    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
940    
941            * DESCRIPTION: Update to version 0.1-1.
942    
943    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
944    
945            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
946            wordStem.
947    
948    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
949    
950            * R/: Changes due to Kurt's review.
951    
952    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
953    
954            * R/: Implemented improvements based upon comments by David
955            Meyer.
956    
957    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
958    
959            * inst/doc/: Rewrote vignette.
960    
961            * man/: Improved documentation.
962    
963    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
964    
965            * man/: Updated documentation.
966    
967            * DESCRIPTION: Changed package name to "tm". Updated version to
968            0.1 for first CRAN release.
969    
970            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
971            list archive example.
972    
973            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
974            archive example.
975    
976            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
977            from (several mails per box) mbox format to (single mail per file)
978            eml format.
979    
980    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
981    
982            * data/crude.rda: Rebuilt.
983    
984            * data/acq.rda: Rebuilt.
985    
986            * R/reader.R: Factored out reader and parser methods from
987            textdoccol.R.
988    
989            * R/source.R: Factored out Source methods from aobjects.R and
990            textdoccol.R.
991            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
992            feeds.
993    
994            * R/textdoccol.R (DirSource): Added support for recursive
995            traversal of directories.
996    
997    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
998    
999            * R/textdoccol.R ([[): Loads the document corpus automatically
1000            into memory upon access.
1001            (tm_transform, tm_filter): Removed several checks whether the
1002            document is already loaded ([[ ensures this now).
1003            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
1004            mailing list archive.
1005    
1006    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1007    
1008            * R/aobjects.R (TextDocument): Is now a virtual class.
1009            (Source): Is now a virtual class.
1010    
1011    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1012    
1013            * R/textdoccol.R (c): Support for an arbitrary number of document
1014            collections.
1015    
1016    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1017    
1018            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
1019            append_meta and remove_meta.
1020    
1021            * R/textdoccol.R: Removed modify_metadata method.
1022    
1023            * R/textrepo.R: Removed modify_metadata method.
1024    
1025            * R/textdoccol.R (remove_meta): Supports removal of document
1026            collection metadata and document (= in data frame) metadata.
1027    
1028    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1029    
1030            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
1031    
1032            * data/crude.rda: Rebuilt.
1033    
1034            * data/acq.rda: Rebuilt.
1035    
1036            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
1037    
1038            * R/textdoccol.R ([): Bug fix for subsetting a document
1039            collection's data frame.
1040    
1041    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1042    
1043            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
1044            to s_filter.
1045    
1046            * R/textdoccol.R: Local text documents' metadata can now be copied
1047            to a document collection's data frame with prescind_meta.
1048    
1049    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1050    
1051            * R/: Text documents' slot metadata is now accessible in s_filter.
1052    
1053            * R/: Rewrote s_filter function (has still some restrictions).
1054    
1055    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1056    
1057            * R/: Various fixes in handling metadata.
1058    
1059            * R/: Added update mechanism for text document collections.
1060    
1061    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1062    
1063            * R/: Merging of document collections now creates a binary tree
1064            for reconstructing merged document collections.
1065    
1066            * R/: Redesign of metadata for document collections.
1067    
1068    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1069    
1070            * R/: Messages now use \code{ngettext}.
1071    
1072    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1073    
1074            * R/: Added functions for modifying and removing metadata.
1075    
1076    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1077    
1078            * man/: Updated some documentation.
1079    
1080            * R/: Corrected some connection issues.
1081    
1082            * inst/doc: Worked on the vignette.
1083    
1084    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1085    
1086            * inst/: Added texts and started vignette.
1087    
1088            * R/: Final changes based upon David's comments.
1089    
1090    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1091    
1092            * NAMESPACE: Corrected exports (generic methods need exportMethods
1093            directives!).
1094    
1095    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1096    
1097            * R/: Modified the TextDocCol constructor and various parsers. It
1098            is now modular and supports various file formats via plugins (see
1099            the new "Source" class).
1100    
1101    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1102    
1103            * man/: Revised documentation after previous code changes.
1104    
1105    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1106    
1107            * R/: Remaining changes as discussed with David.
1108    
1109    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1110    
1111            * R/: Some changes as suggested by David. The rest will follow
1112            within the next days.
1113    
1114    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1115    
1116            * man/: Finished documentation.
1117    
1118    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1119    
1120            * man/: Wrote some documentation.
1121    
1122    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1123    
1124            * R/: Further syntactic sugar in form of additional assignment and
1125            accessor methods.
1126    
1127    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1128    
1129            * R/: Syntactic sugar in form of "length", "show" and "summary"
1130            operators.
1131    
1132    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1133    
1134            * R/: Diverse updates. Mainly on default operators ("[" or "c")
1135            and dissimilarities.
1136    
1137    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1138    
1139            * R/: Added similarity functions.
1140    
1141            * data/: Added english stopwords.
1142    
1143    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1144    
1145            * data/: Examples compiled for new features.
1146    
1147            * R/: Changes due to new structure.
1148    
1149            * NAMESPACE: Corrected namespace to reflect new structure.
1150    
1151            * R/termdocmatrix.R: Adapted for new naming scheme.
1152    
1153    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1154    
1155            * R/textdoccol.R: Adapted code for new class structure. Wrote
1156            several transform and filter functions operating on text document
1157            collections (alias text document databases).
1158    
1159            * R/aobjects.R: Adapted class structure with inheritance,
1160            repositories and additional meta data. Loading files on demand is
1161            now possible.
1162    
1163    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1164    
1165            * R/: Some cosmetic cleanups.
1166    
1167            * inst/: Removed vignette on clustering. That and much more is now
1168            described in the JSS paper on text mining. Based upon that
1169            article an elaborated vignette will be incorporated in the future.
1170    
1171    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1172    
1173            * R/: Updated generic S4 methods to comply with signature changes
1174            in newer versions of R (> 2.3).
1175    
1176    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1177    
1178            * ext/R/importRIS.R: Automatic RIS import is now possible.
1179    
1180    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1181    
1182            * R/textdoccol.R: Added RIS HTML input format.
1183    
1184    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1185    
1186            * R/textdoccol.R: Removed bug that caused invalid text document
1187            collections when handling many input files.
1188    
1189    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1190    
1191            * R/textdoccol.R: Restructured and extended file import
1192            mechanism.
1193    
1194            * inst/doc/clustering.Rnw: Adapted vignette for use with
1195            ReutNews.rda
1196    
1197            * man/ReutNews.Rd: Documentation for ReutNews.rda
1198    
1199            * data/ReutNews.rda: A tiny Reuters21578 example data set.
1200    
1201    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1202    
1203            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
1204            clustering facilities of this package.
1205    
1206    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1207    
1208            * R/aobjects.R: Changed package document structure to avoid class
1209            dependency problems.
1210    
1211    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1212    
1213            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
1214            data set.
1215    
1216            *  Finished documentation and reordered directory structure. Now "R
1217            CMD check textmin" works without errors.
1218    
1219    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1220    
1221            * src/: Various splits can now be easily created for the
1222            Reuters21578 data set.
1223    
1224    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1225    
1226            *  Updated documentation.
1227    
1228    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1229    
1230            *  Wrote R documentation for some classes and methods.
1231    
1232    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1233    
1234            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
1235            files. See the questionnaire data/Umfrage.csv for such an example.
1236            We are now able to import files in Reuters-21578 XML format.
1237    
1238            *  Changed class interfaces in various files. Weighting of the text
1239            matrix is now possible.
1240    
1241    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1242    
1243            * R/textdoccol.R: One can build term-document matrices if
1244            necessary (with buildTDM(...)) and fill the field tdm from a text
1245            document collection with it.
1246    
1247            * R/textmatrix.R: Wrote S4 class for term-document matrices.
1248    
1249    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1250    
1251            * R/textdoccol.R: We now can read in a whole XML file with several
1252            news items.
1253    
1254    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1255    
1256            * R/textdoccol.R: Set up an S4 class for a collection of text
