[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 41, Sun Mar 12 17:14:15 2006 UTC
pkg/ChangeLog revision 1150, Tue Nov 15 15:37:17 2011 UTC
# Line 1  Line 1 
1    2011-11-15  Ingo Feinerer  <feinerer@logic.at>
2    
3            * man/getTokenizers.Rd: Document getTokenizers().
4    
5            * man/tokenizer.Rd: Document MC_tokenizer() and scan_tokenizer().
6    
7    2011-11-04  Ingo Feinerer  <feinerer@logic.at>
8    
9            * man/matrix.Rd: Document as.TermDocumentMatrix.term_frequency.
10    
11            * man/combine.Rd: Document c.term_frequency().
12    
13    2011-10-11  Ingo Feinerer  <feinerer@logic.at>
14    
15            * R/meta.R (`meta<-.Corpus`): Assume that the replacement value
16            can be accessed via '[' and not '[['.
17    
18    2011-08-24  Ingo Feinerer  <feinerer@logic.at>
19    
20            * R/stopwords.R (stopwords): Raise an error if no stopwords are
21            available for requested language. Suggested by Derek M Jones.
22    
23    2011-05-27  Ingo Feinerer  <feinerer@logic.at>
24    
25            * R/weight.R (weightSMART): Implement Cosine and pivoted unique
26            normalization.
27    
28    2011-02-17  Ingo Feinerer  <feinerer@logic.at>
29    
30            * R/transform.R (stemDocument.PlainTextDocument): Use language
31            argument.
32    
33    2011-02-04  Ingo Feinerer  <feinerer@logic.at>
34    
35            * R/source.R: Store strings and connections instead of unevaluated
36            calls.
37    
38    2010-11-26  Ingo Feinerer  <feinerer@logic.at>
39    
40            * R/corpus.R (Corpus): Allow init and exit hooks for readers.
41    
42    2010-10-22  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/matrix.R (.TermDocumentMatrix): Make Weighting an attribute
45            (instead of a list element).
46    
47    2010-10-16  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/corpus.R (`[[.VCorpus`, `[[.PCorpus`): Access individual
50            documents by names (fall back to IDs if names are not set).
51    
52    2010-08-25  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/corpus.R (c.Corpus): When concatenating corpora, the argument
55            \code{recursive} now determines whether existing corpus meta data
56            is used.
57    
58    2010-08-06  Ingo Feinerer  <feinerer@logic.at>
59    
60            * R/transform.R: Removed convert_UTF_8(). Use enc2utf8() instead.
61    
62    2010-06-17  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/matrix.R (TermDocumentMatrix): If a dictionary is given do not
65            remove terms not occurring in the corpus anymore.
66    
67    2010-06-02  Ingo Feinerer  <feinerer@logic.at>
68    
69            * R/plot.R (Zipf_plot, Heaps_plot): Plotting functions for Zipf's
70            and Heaps' law.
71    
72    2010-05-18  Ingo Feinerer  <feinerer@logic.at>
73    
74            * R/corpus.R (Corpus, PCorpus): Use element names as IDs if
75            provided by a source.
76    
77    2010-04-09  Ingo Feinerer  <feinerer@logic.at>
78    
79            * R/source.R (.Source): Provide document names.
80    
81    2010-04-07  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/meta.R (`content_or_meta`): Utility function.
84    
85    2010-03-19  Ingo Feinerer  <feinerer@logic.at>
86    
87            * R/reader.R (readReut21578XML, readReut21578XMLasPlain): Extract
88            TOPICS, LEWISSPLIT, CGISPLIT, and OLDID meta tags.
89    
90    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/weight.R (weightTfIdf): Added normalization option.
93    
94            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
95            analysis.
96    
97    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
98    
99            * R/score.R (tm_tag_score): Compute a score from the number of
100            tags matching in a document.
101    
102    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
103    
104            * R/complete.R (stemCompletion): New completion heuristics.
105    
106    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
107    
108            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
109    
110    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
111    
112            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
113            setOldClass(c(..., "list")) works.
114    
115    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
116    
117            * R/transform.R (stemDocument.character): In case input is a
118            simple character just delegate to the default Snowball stemmer.
119    
120    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
121    
122            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
123            data.
124    
125    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
126    
127            * R/doc.R (`Content<-`): Be careful with names attribute.
128    
129    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
130    
131            * R/source.R (DirSource): Improved implementation especially when
132            handling many (> 1M) files.
133    
134    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
135    
136            * R/source.R (getElem.URISource): Use encoding argument.
137    
138    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
139    
140            * R/doc.R (setOldClass): Register S3 document classes to be
141            recognized by S4 methods.
142    
143    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
144    
145            * R/matrix.R (termFreq): Add option to remove punctuation
146            characters.
147    
148    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
149    
150            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
151            merging multiple term-document matrices.
152    
153    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
154    
155            * R/corpus.R (setOldClass): Register S3 corpus classes to be
156            recognized by S4 methods.
157    
158            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
159            that CRAN Mac OS X builds do not fail any longer.
160    
161    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
162    
163            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
164            of RWeka::AlphabeticTokenizer() as default.
165    
166    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
169            caused words at the beginning or the end of a line not to be removed. Do
170            not delete whitespace anymore.
171    
172    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
173    
174            * R/source.R (DirSource): Default to working directory if no path
175            is specified.
176    
177    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
178    
179            * R/source.R (DirSource): Stop on empty directories.
180    
181    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
182    
183            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
184            named documents.
185    
186    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
187    
188            * R/transform.R (removeWords): Improve regular expressions.
189    
190    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
191    
192            * R/meta.R (DublinCore): Allow lower case tags.
193    
194    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
197            instead of x$children.
198    
199    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
200    
201            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
202    
203    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
204    
205            * R/: Use S3 instead of S4 class system.
206    
207    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/reader.R (readMail): Moved to tm.plugin.mail package.
210    
211    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
214            postings are basically e-mails with some extra headers.
215    
216    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
217    
218            * R/transform.R: Move convertMboxEml, removeCitation,
219            removeMultipart, and removeSignature to the tm.plugin.mail package
220            since they are mainly utility functions (for handling e-mails) and
221            not very framework specific.
222    
223    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
224    
225            * man/: Fix documentation.
226    
227    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
228    
229            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
230            plain text document instead of an XML document for texts of the
231            Reuters-21578 dataset.
232    
233            * R/sparse.R: Removed since the slam package is now available on
234            CRAN.
235    
236            * DESCRIPTION (Depends): Add slam package.
237    
238    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
239    
240            * R/transform.R (stemDoc): Fix character(0) handling.
241    
242    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
243    
244            * R/doc.R (show): Pretty print.
245    
246    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
247    
248            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
249            gracefully.
250    
251    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
252    
253            * R/corpus.R: Make corpus virtual. Implement corpus with standard
254            and permanent storage semantics.
255    
256            * DESCRIPTION: New major release. A *lot* of improvements.
257    
258    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
259    
260            * NAMESPACE: Export some simple_triplet_matrix functions.
261    
262    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
263    
264            * R/weight.R: Adapt tf-idf to new matrix format.
265    
266    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
267    
268            * R/matrix.R: Create two distinct classes for term-document and
269            document-term matrices.
270    
271    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
272    
273            * R/termdocmatrix.R: No longer use Matrix package. This reduces
274            package start-up time significantly.
275    
276    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
277    
278            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
279    
280    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
281    
282            * R/transform.R (tmReduce): Combine multiple maps into one
283            transformation.
284    
285    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
286    
287            * R/weight.R: Remove weightLogical since it does not return a
288            dgCMatrix.
289    
290            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
291            or TermDocumentMatrix instead.
292    
293    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
294    
295            * inst/doc/extensions.Rnw: Finished vignette.
296    
297    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
298    
299            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
300            DocumentTermMatrix representations.
301    
302    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
303    
304            * R/reader.R (readXML): New reader for arbitrary XML files.
305    
306    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
307    
308            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
309            (XMLSource): New XMLSource class for arbitrary XML files.
310            (Source): New slot Vectorized.
311    
312    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
313    
314            * R/reader.R (readTabular): Experimental reader for tabular data
315            structures which can be customized via user-defined mappings.
316    
317            * R/reader.R: Always use UTC time zone.
318    
319            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
320    
321    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
322    
323            * R/reader.R (readDOC): Options can be passed over to antiword.
324    
325            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
326            pdftotext.
327    
328    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
329    
330            * R/source.R (DirSource): Add pattern and ignore.case arguments
331            which are internally passed over to list.files().
332    
333    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
334    
335            * inst/doc/tm.Rnw: Suppress pointless loading message.
336    
337    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
338    
339            * DESCRIPTION: Speed up package loading (via moving packages not
340            strictly necessary for normal operation to Suggests instead of
341            Depends).
342    
343    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
344    
345            * R/reader.R (readNewsgroup): The date format is now configurable.
346    
347    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
348    
349            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
350    
351    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
352    
353            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
354    
355    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
356    
357            * R/source.R (DataframeSource): New source class for data frames.
358    
359            * R/source.R: Fixed non-standard call evaluation.
360    
361    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
362    
363            * R/source.R (URISource): New source class for a single document.
364    
365    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
366    
367            * R/source.R: Refactoring.
368    
369    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
370    
371            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
372            Rmpi installations more gracefully.
373    
374    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
375    
376            * R/source.R (Source): Add Length slot.
377    
378    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
379    
380            * R/AAA.R: Unify duplicated .onLoad function.
381    
382    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
383    
384            * DESCRIPTION (Suggests): Added Rmpi.
385    
386    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
387    
388            * R/source.R (getElem): Fix 'no visible binding' warning.
389    
390            * man/WeightFunction.Rd: Fix signature.
391    
392    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
393    
394            * R/weight.R: Introduce name abbreviations for weighting functions.
395    
396    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
397    
398            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
399    
400            * R/cluster.R: Provide convenience functions for using an MPI
401            cluster.
402    
403            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
404            available.
405    
406            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
407            available.
408    
409    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
410    
411            * R/textdoccol.R (lapply): Removed debug print out.
412    
413    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
414    
415            * R/reader.R (readRCV1): Improved meta data extraction from
416            Reuters Corpus Volume 1 documents.
417    
418    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
419    
420            * R/transform.R: Ensure that all mappings preserve multiline
421            structures.
422    
423    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
424    
425            * R/filter.R: Every filter now has an attribute indicating whether
426            it should be applied at the document level (doclevel).
427    
428            * R/textdoccol.R (tmFilter): Set searchFullText as new default
429            filter.
430    
431    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * R/transform.R (replacePatterns): Replaced removeWords by
434            replacePatterns. Suggested by Christian Buchta.
435    
436            * R/textdoccol.R (inspect): Improved formatting.
437    
438    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
439    
440            * inst/CITATION: Updated JSS article information.
441    
442            * R/textdoccol.R (setAs): Added coerce method from list to
443            corpus.
444    
445            * R/meta.R (meta): Improved meta data handling.
446    
447    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
448    
449            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
450            Christian Buchta.
451    
452            * inst/CITATION: Added template to include JSS article reference.
453    
454    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/textdoccol.R (tmMap): Introduced lazy mapping.
457    
458            * R/source.R: Added VectorSource.
459    
460    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * man/: Language codes should be in ISO 639-1 format.
463    
464            * R/textdoccol.R (asPlain): Preserve local meta data.
465    
466    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
467    
468            * R/textdoccol.R (writeCorpus): Function for writing a corpus
469            containing plain text documents to disk.
470    
471    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
472    
473            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
474            always set correctly.
475    
476            * R/textdoccol.R: Set load = TRUE as default for load on demand
477            since in most cases this is the wanted behaviour.
478    
479    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
480    
481            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
482    
483            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
484    
485    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
486    
487            * R/meta.R (meta): New function for consistent access to meta data
488            of document collections, repositories, and texts.
489    
490    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
491    
492            * R/: Better support for encodings.
493    
494    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
495    
496            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
497            selection when no reader argument is given.
498    
499    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
500    
501            * R/source.R (CSVSource): Now uses read.csv instead of scan
502            internally.
503    
504    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
505    
506            * R/reader.R (getReaders): Returns available reader functions.
507    
508            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
509            as default.
510    
511    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * R/stopwords.R (stopwords): Shortened code, removed codetools
514            variable warnings.
515    
516            * man/: Documentation for showMeta, added an example for tmMap.
517    
518            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
519            some minor typos fixed.
520    
521    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
522    
523            * R/aobjects.R (showMeta): Added method for pretty printing a
524            text document's meta data.
525    
526    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
527    
528            * R/textdoccol.R (TextDocCol): Better handling of empty
529            arguments.
530    
531            * NAMESPACE: Exported readDOC.
532    
533            * man/completeStems.Rd: Added an example.
534    
535    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * R/stopwords.R (stopwords): Look up .dat files at every
538            call. Allows users to modify stopword .dat files interactively.
539    
540    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
541    
542            * R/termdocmatrix.R (termFreq): Correct processing of empty
543            documents.
544    
545    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
546    
547            * man/: Updated documentation.
548    
549    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
550    
551            * R/complete.R (completeStems): Completes (heuristically) word
552            stems.
553    
554            * R/termdocmatrix.R (TermDocMatrix2): New modular
555            constructor.
556    
557            * NAMESPACE: Exported termFreq.
558    
559    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
560    
561            * R/reader.R (readDOC): Added MS Word reader (using antiword).
562    
563    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
564    
565            * R/weight.R: Weighting functions for TermDocMatrix.
566    
567    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
568    
569            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
570            functions for accessing dimension, column, and row names.
571    
572            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
573    
574    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
575    
576            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
577    
578    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
579    
580            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
581    
582    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
583    
584            * R/reader.R (readPDF): Removed manual checks for pdftotext and
585            pdfinfo. The system call gives a warning anyway.
586    
587    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
588    
589            * R/textdoccol.R (asPlain): Conversion from
590            StructuredTextDocuments to PlainTextDocuments.
591    
592    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
593    
594            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
595            for accessing term-document matrices.
596    
597            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
598            are installed.
599    
600    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
601    
602            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
603            Christian Buchta.
604    
605    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
606    
607            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
608    
609    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
610    
611            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
612    
613            * R/reader.R (readPDF): Added PDF reader.
614    
615    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
616    
617            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
618    
619            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
620    
621            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
622    
623            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
624    
625    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * R/distmeasure.R (dissimilarity): Replaced dists call from
628            package cba by new dist call from package proxy.
629    
630    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
631    
632            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
633    
634    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
635    
636            * R/termdocmatrix.R: require() uses the quietly option to suppress
637            loading messages.
638    
639    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
640    
641            * R/dictionary.R: Added dictionary support.
642    
643    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
644    
645            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
646            documents. This simplifies some functions, e.g., asPlain.
647    
648    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
649    
650            * inst/doc/tm.Rnw: Fixed some typos in vignette.
651    
652    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * R/textdoccol.R (replaceWords): Added method to replace a set of
655            words by a single word. Useful for synonyms.
656    
657    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
660    
661    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
662    
663            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
664            vectors. Thanks to Ariel Maguyon for his error report.
665            (removeSparseTerms): New function to remove columns from a
666            term-document matrix exceeding a sparse factor.
667    
668    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
669    
670            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
671    
672    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
673    
674            * man/sFilter.Rd: Corrected documentation on statement format (use
675            '==' instead of '=').
676    
677    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
678    
679            * R/aobjects.R (StructuredTextDocument): Inherits from
680            TextDocument.
681    
682    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
683    
684            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
685            on sparse matrices as proposed by Martin Maechler.
686    
687    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
688    
689            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
690            \pkg{filehash} version deprecates them.
691    
692    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
693    
694            * R/termdocmatrix.R (textvector): Stemming is now performed before
695            erasing stopwords.
696            (weightMatrix): Adapted to handle sparse matrices.
697            (TermDocMatrix): Sparse matrix is now efficiently built by
698            direct stepwise insertion of row values into it.
699    
700    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
703            due to ongoing problems. For our purposes the latter is as useful
704            as the replaced package.
705    
706    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
707    
708            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
709    
710            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
711    
712    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
713    
714            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
715            languages with available stopwords.
716    
717    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
718    
719            * inst/doc/tm.Rnw: Minor corrections in the vignette.
720    
721    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
722    
723            * DESCRIPTION: Update to version 0.2, since a lot of new features
724            have been integrated.
725    
726            * inst/stopwords: Updated existing stopwords and added stopwords
727            for various other languages.
728    
729    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
730    
731            * man/: Updated documentation.
732    
733            * Work/testDb.R: Script to test database stuff.
734    
735            * R/: Fixed various database related bugs. Seems to be rather
736            usable now, i.e., consider it alpha status for now.
737    
738    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
739    
740            * R/: Fixed some bugs related to database support.
741    
742    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * man/: Added a lot of examples to the manuals.
745    
746    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
747    
748            * man/: Updated parts of the documentation.
749    
750            * R/textdoccol.R (asPlain): Added conversion from newsgroup
751            documents to plain text documents.
752    
753    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
754    
755            * R/textdoccol.R: Finished experimental database support. Not yet
756            intensively tested.
757    
758            * R/source.R: Now each source has a default reader.
759    
760            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
761            class anymore.
762    
763            * R/plaintextdoc.R: Custom show method for plain text documents.
764    
765            * R/aobjects.R: Added a class for structured text documents.
766    
767            * R/reader.R: Replaced remaining \code{parser} occurrences with
768            \code{reader}.
769    
770            * R/textdoccol.R (summary): Indent tags.
771    
772            * R/textdoccol.R (removePunctuation): Transform method to remove
773            punctuation marks.
774    
775    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
776    
777            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
778            using prescindMeta().
779    
780    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
781    
782            * R/textdoccol.R: Improved database support.
783    
784    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
785    
786            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
787    
788            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
789            language code.
790    
791            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
792            into parserControl argument.
793    
794            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
795    
796    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
797    
798            * Work/tmDataSetup.R: The datasets acq and crude can now be
799            created on the fly.
800    
801            * R/stopwords.R: Introduced a function returning the stopwords for
802            a given language (English, German and French at the moment).
803    
804            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
805            otherwise falls back to Snowball package.
806    
807    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
808    
809            * man/dissimilarity-methods.Rd: Make clear that any method offered
810            by "dists" from package "cba" can be used.
811    
812    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
815            to Kurt's latex suggestion. Removed points and underscores in
816            variable names for consistent naming.
817    
818            * DESCRIPTION: Update to version 0.1-2.
819    
820            * man/TextRepository.Rd: Fixed bug in documentation.
821    
822    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
823    
824            * DESCRIPTION: Update to version 0.1-1.
825    
826    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
829            wordStem.
830    
831    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/: Changes due to Kurt's review.
834    
835    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * R/: Implemented improvements based upon comments by David
838            Meyer.
839    
840    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
841    
842            * inst/doc/: Rewrote vignette.
843    
844            * man/: Improved documentation.
845    
846    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
847    
848            * man/: Updated documentation.
849    
850            * DESCRIPTION: Changed package name to "tm". Updated version to
851            0.1 for first CRAN release.
852    
853            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
854            list archive example.
855    
856            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
857            archive example.
858    
859            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
860            from (several mails per box) mbox format to (single mail per file)
861            eml format.
862    
863    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
864    
865            * data/crude.rda: Rebuilt.
866    
867            * data/acq.rda: Rebuilt.
868    
869            * R/reader.R: Factored out reader and parser methods from
870            textdoccol.R.
871    
872            * R/source.R: Factored out Source methods from aobjects.R and
873            textdoccol.R.
874            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
875            feeds.
876    
877            * R/textdoccol.R (DirSource): Added support for recursive
878            traversal of directories.
879    
880    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
881    
882            * R/textdoccol.R ([[): Loads the document corpus automatically
883            into memory upon access.
884            (tm_transform, tm_filter): Removed several checks whether the
885            document is already loaded ([[ ensures this now).
886            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
887            mailing list archive.
888    
889    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
890    
891            * R/aobjects.R (TextDocument): Is now a virtual class.
892            (Source): Is now a virtual class.
893    
894    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
895    
896            * R/textdoccol.R (c): Support for an arbitrary number of document
897            collections.
898    
899    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
900    
901            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
902            append_meta and remove_meta.
903    
904            * R/textdoccol.R: Removed modify_metadata method.
905    
906            * R/textrepo.R: Removed modify_metadata method.
907    
908            * R/textdoccol.R (remove_meta): Supports removal of document
909            collection metadata and document (= in data frame) metadata.
910    
911    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
912    
913            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
914    
915            * data/crude.rda: Rebuilt.
916    
917            * data/acq.rda: Rebuilt.
918    
919            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
920    
921            * R/textdoccol.R ([): Bug fix for subsetting a document
922            collection's data frame.
923    
924    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
927            to s_filter.
928    
929            * R/textdoccol.R: Local text documents' metadata can now be copied
930            to a document collection's data frame with prescind_meta.
931    
932    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
933    
934            * R/: Text documents' slot metadata is now accessible in s_filter.
935    
936            * R/: Rewrote s_filter function (still has some restrictions).
937    
938    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
939    
940            * R/: Various fixes in handling metadata.
941    
942            * R/: Added update mechanism for text document collections.
943    
944    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
945    
946            * R/: Merging of document collections now creates a binary tree
947            for reconstructing merged document collections.
948    
949            * R/: Redesign of metadata for document collections.
950    
951    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
952    
953            * R/: Messages now use ngettext().
954    
955    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
956    
957            * R/: Added functions for modifying and removing metadata.
958    
959    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
960    
961            * man/: Updated some documentation.
962    
963            * R/: Corrected some connection issues.
964    
965            * inst/doc: Worked on the vignette.
966    
967    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
968    
969            * inst/: Added texts and started vignette.
970    
971            * R/: Final changes based upon David's comments.
972    
973    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
974    
975            * NAMESPACE: Corrected exports (generic methods need exportMethods
976            directives!).
977    
978    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
979    
980            * R/: Modified the TextDocCol constructor and various parsers. It
981            is now modular and supports various file formats via plugins (see
982            the new "Source" class).
983    
984    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
985    
986            * man/: Revised documentation after previous code changes.
987    
988    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
989    
990            * R/: Remaining changes as discussed with David.
991    
992    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
993    
994            * R/: Some changes as suggested by David. The rest will follow
995            within the next days.
996    
997    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
998    
999            * man/: Finished documentation.
1000    
1001    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1002    
1003            * man/: Wrote some documentation.
1004    
1005    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1006    
1007            * R/: Further syntactic sugar in form of additional assignment and
1008            accessor methods.
1009    
1010    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1011    
1012            * R/: Syntactic sugar in form of "length", "show" and "summary"
1013            operators.
1014    
1015    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1016    
1017            * R/: Various updates, mainly to default operators ("[" or "c")
1018            and dissimilarities.
1019    
1020    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1021    
1022            * R/: Added similarity functions.
1023    
1024            * data/: Added English stopwords.
1025    
1026    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1027    
1028            * data/: Examples compiled for new features.
1029    
1030            * R/: Changes due to new structure.
1031    
1032            * NAMESPACE: Corrected namespace to reflect new structure.
1033    
1034            * R/termdocmatrix.R: Adapted for new naming scheme.
1035    
1036    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1037    
1038            * R/textdoccol.R: Adapted code for new class structure. Wrote
1039            several transform and filter functions operating on text document
1040            collections (alias text document databases).
1041    
1042            * R/aobjects.R: Adapted class structure with inheritance,
1043            repositories and additional meta data. Loading files on demand is
1044            now possible.
1045    
1046    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1047    
1048            * R/: Some cosmetic cleanups.
1049    
1050            * inst/: Removed vignette on clustering. That and much more is now
1051            described in the JSS paper on text mining. Based upon that
1052            article an elaborated vignette will be incorporated in the future.
1053    
1054    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1055    
1056            * R/: Updated generic S4 methods to comply with signature changes
1057            in newer versions of R (> 2.3).
1058    
1059    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
1060    
1061            * ext/R/importRIS.R: Automatic RIS import is now possible.
