[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
pkg/ChangeLog revision 1051, Wed Mar 3 17:50:10 2010 UTC
1    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/weight.R (weightTfIdf): Added normalization option.
4    
5            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
6            analysis.
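
The normalization option noted above for weightTfIdf can be illustrated with a minimal base-R sketch. It assumes that normalization means dividing each term count by the total number of terms in its document and that the inverse document frequency is log2(number of documents / document frequency). The function name tfidf_sketch and the toy matrix are hypothetical; this is not the package's implementation.

```r
# Hypothetical helper: tf-idf weighting of a dense term-document matrix
# (terms in rows, documents in columns).
tfidf_sketch <- function(m, normalize = FALSE) {
  stopifnot(is.matrix(m))
  tf <- m
  if (normalize) {
    # Assumption: normalize by document length (column sums).
    doc_len <- colSums(m)
    doc_len[doc_len == 0] <- 1  # avoid division by zero for empty documents
    tf <- sweep(m, 2, doc_len, "/")
  }
  n_docs <- ncol(m)
  doc_freq <- rowSums(m > 0)  # number of documents containing each term
  idf <- log2(n_docs / doc_freq)
  tf * idf  # idf is recycled down the rows, i.e. applied per term
}

m <- matrix(c(2, 0,
              1, 1,
              0, 3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("oil", "price", "market"), c("d1", "d2")))
tfidf_sketch(m, normalize = TRUE)
```

The sketch assumes every term occurs in at least one document; a term with document frequency zero would need separate handling.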
7    
8    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
9    
10            * R/score.R (tm_tag_score): Compute a score from the number of
11            tags matching in a document.
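
Counting the tags that match in a document, as described for tm_tag_score above, might look like the following base-R sketch. The function name tag_score_sketch, the tokenization rule (split on non-letters, lower-case everything), and the example tag list are assumptions made for illustration only.

```r
# Hypothetical helper: score a text by the number of tokens found in a tag list.
tag_score_sketch <- function(text, tags) {
  # Assumption: tokens are maximal runs of letters, compared case-insensitively.
  tokens <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  sum(tokens %in% tolower(tags))
}

positive <- c("good", "great")  # illustrative tag list
tag_score_sketch("The results were good, great and not bad.", positive)
```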
12    
13    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
14    
15            * R/complete.R (stemCompletion): New completion heuristics.
16    
17    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
18    
19            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
20    
21    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
22    
23            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
24            setOldClass(c(..., "list")) works.
25    
26    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
27    
28            * R/transform.R (stemDocument.character): In case input is a
29            simple character just delegate to the default Snowball stemmer.
30    
31    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
32    
33            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
34            data.
35    
36    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/doc.R (`Content<-`): Be careful with names attribute.
39    
40    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
41    
42            * R/source.R (DirSource): Improved implementation especially when
43            handling many (> 1M) files.
44    
45    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
46    
47            * R/source.R (getElem.URISource): Use encoding argument.
48    
49    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
50    
51            * R/doc.R (setOldClass): Register S3 document classes to be
52            recognized by S4 methods.
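
Registering an S3 class so that S4 methods can dispatch on it, as in the entry above, can be sketched as follows. The class name SketchDocument and the generic n_chars are hypothetical and only serve the illustration; the package's actual class names are not shown here.

```r
library(methods)

# Register a (hypothetical) S3 class that is built on top of a list.
setOldClass(c("SketchDocument", "list"))

# An S4 generic with a method for the registered S3 class.
setGeneric("n_chars", function(x) standardGeneric("n_chars"))
setMethod("n_chars", "SketchDocument",
          function(x) sum(nchar(unlist(unclass(x)))))

# An ordinary S3 object; S4 dispatch finds the method via setOldClass().
doc <- structure(list("first line", "second"),
                 class = c("SketchDocument", "list"))
n_chars(doc)
```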
53    
54    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
55    
56            * R/matrix.R (termFreq): Add option to remove punctuation
57            characters.
58    
59    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
60    
61            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
62            merging multiple term-document matrices.
63    
64    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
65    
66            * R/corpus.R (setOldClass): Register S3 corpus classes to be
67            recognized by S4 methods.
68    
69            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
70            that CRAN Mac OS X builds do not fail any longer.
71    
72    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
73    
74            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
75            of RWeka::AlphabeticTokenizer() as default.
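
A whitespace tokenizer built on scan(..., what = "character"), as mentioned above, can be sketched in base R. The wrapper name tokenize_sketch is hypothetical, and disabling quote handling with quote = "" is an assumption made here so that apostrophes are treated as ordinary characters.

```r
# Hypothetical wrapper: split a character string into whitespace-separated tokens.
tokenize_sketch <- function(x) {
  scan(text = x, what = "character", quote = "", quiet = TRUE)
}

tokenize_sketch("Crude oil prices  rose\nagain today")
```

Runs of spaces and line breaks both act as separators, so no empty tokens are produced.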
76    
77    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
78    
79            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
80            caused words at the beginning or the end of a line not to be removed. Do
81            not delete whitespace anymore.
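
One way to remove words regardless of their position in a line, while leaving the surrounding whitespace untouched, is a word-boundary regular expression. The following is a sketch under that assumption, not the package's code; remove_words_sketch is a hypothetical name, and the words are assumed to contain no regular expression metacharacters.

```r
# Hypothetical helper: delete whole-word matches, keep all whitespace.
remove_words_sketch <- function(x, words) {
  # \b matches at word boundaries, including the start and end of the string.
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  gsub(pattern, "", x, perl = TRUE)
}

remove_words_sketch("the price of oil rose in the", c("the", "of", "in"))
```

Because only the words themselves are deleted, the blanks that surrounded them remain in the result.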
82    
83    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
84    
85            * R/source.R (DirSource): Default to working directory if no path
86            is specified.
87    
88    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
89    
90            * R/source.R (DirSource): Stop on empty directories.
91    
92    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
93    
94            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
95            named documents.
96    
97    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
98    
99            * R/transform.R (removeWords): Improve regular expressions.
100    
101    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
102    
103            * R/meta.R (DublinCore): Allow lower case tags.
104    
105    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
106    
107            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
108            instead of x$children.
109    
110    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
113    
114    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/: Use S3 instead of S4 class system.
117    
118    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
119    
120            * R/reader.R (readMail): Moved to tm.plugin.mail package.
121    
122    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
123    
124            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
125            postings are basically e-mails with some extra headers.
126    
127    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
128    
129            * R/transform.R: Move convertMboxEml, removeCitation,
130            removeMultipart, and removeSignature to the tm.plugin.mail package
131            since they are mainly utility functions (for handling e-mails) and
132            not very framework specific.
133    
134    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
135    
136            * man/: Fix documentation.
137    
138    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
139    
140            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
141            plain text document instead of an XML document for texts of the
142            Reuters-21578 dataset.
143    
144            * R/sparse.R: Removed since the slam package is now available on
145            CRAN.
146    
147            * DESCRIPTION (Depends): Add slam package.
148    
149    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
150    
151            * R/transform.R (stemDoc): Fix character(0) handling.
152    
153    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
154    
155            * R/doc.R (show): Pretty print.
156    
157    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
158    
159            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
160            gracefully.
161    
162    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
163    
164            * R/corpus.R: Make corpus virtual. Implement corpus with standard
165            and permanent storage semantics.
166    
167            * DESCRIPTION: New major release. A *lot* of improvements.
168    
169    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
170    
171            * NAMESPACE: Export some simple_triplet_matrix functions.
172    
173    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
174    
175            * R/weight.R: Adapt tf-idf to new matrix format.
176    
177    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
178    
179            * R/matrix.R: Create two distinct classes for term-document and
180            document-term matrices.
181    
182    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/termdocmatrix.R: No longer use Matrix package. This reduces
185            package start-up time significantly.
186    
187    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
188    
189            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
190    
191    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
192    
193            * R/transform.R (tmReduce): Combine multiple maps into one
194            transformation.
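
Combining several maps into a single transformation, as tmReduce is described above, can be sketched with Reduce() from base R. The name compose_maps is hypothetical, and the left-to-right application order is an assumption of this sketch; the package may apply its maps in a different order.

```r
# Hypothetical helper: fold a list of functions into one transformation.
compose_maps <- function(...) {
  maps <- list(...)
  # Assumption: maps are applied from left to right.
  function(x) Reduce(function(acc, f) f(acc), maps, init = x)
}

clean <- compose_maps(
  tolower,
  function(s) gsub("[[:punct:]]", "", s),
  trimws
)
clean("  Hello, World! ")
```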
195    
196    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
197    
198            * R/weight.R: Remove weightLogical since it does not return a
199            dgCMatrix.
200    
201            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
202            or TermDocumentMatrix instead.
203    
204    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
205    
206            * inst/doc/extensions.Rnw: Finished vignette.
207    
208    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
209    
210            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
211            DocumentTermMatrix representations.
212    
213    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
214    
215            * R/reader.R (readXML): New reader for arbitrary XML files.
216    
217    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
218    
219            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
220            (XMLSource): New XMLSource class for arbitrary XML files.
221            (Source): New slot Vectorized.
222    
223    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
224    
225            * R/reader.R (readTabular): Experimental reader for tabular data
226            structures which can be customized via user-defined mappings.
227    
228            * R/reader.R: Always use UTC time zone.
229    
230            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
231    
232    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
233    
234            * R/reader.R (readDOC): Options can be passed over to antiword.
235    
236            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
237            pdftotext.
238    
239    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
240    
241            * R/source.R (DirSource): Add pattern and ignore.case arguments
242            which are internally passed over to list.files().
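
The effect of the pattern and ignore.case arguments can be seen by calling list.files() directly, which is where the entry above says they are passed. The temporary directory and file names below are made up for the illustration.

```r
# Create a scratch directory with a few empty files (illustrative names).
dir <- tempfile("docs")
dir.create(dir)
invisible(file.create(file.path(dir, c("a.txt", "B.TXT", "notes.md"))))

# Select text files by regular expression, ignoring case.
files <- list.files(dir, pattern = "\\.txt$", ignore.case = TRUE,
                    full.names = TRUE)
basename(files)
```

Only the two text files are selected; the sort order of the result depends on the locale.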
243    
244    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
245    
246            * inst/doc/tm.Rnw: Suppress pointless loading message.
247    
248    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
249    
250            * DESCRIPTION: Speed up package loading (via moving packages not
251            strictly necessary for normal operation to Suggests instead of
252            Depends).
253    
254    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
255    
256            * R/reader.R (readNewsgroup): The date format is now configurable.
257    
258    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
259    
260            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
261    
262    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
263    
264            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
265    
266    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
267    
268            * R/source.R (DataframeSource): New source class for data frames.
269    
270            * R/source.R: Fixed non-standard call evaluation.
271    
272    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
273    
274            * R/source.R (URISource): New source class for a single document.
275    
276    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
277    
278            * R/source.R: Refactoring.
279    
280    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
281    
282            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
283            Rmpi installations more gracefully.
284    
285    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
286    
287            * R/source.R (Source): Add Length slot.
288    
289    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
290    
291            * R/AAA.R: Unify duplicated .onLoad function.
292    
293    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
294    
295            * DESCRIPTION (Suggests): Added Rmpi.
296    
297    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
298    
299            * R/source.R (getElem): Fix 'no visible binding' warning.
300    
301            * man/WeightFunction.Rd: Fix signature.
302    
303    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
304    
305            * R/weight.R: Introduce name abbreviations for weighting functions.
306    
307    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
308    
309            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
310    
311            * R/cluster.R: Provide convenience functions for using a MPI
312            cluster.
313    
314            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
315            available.
316    
317            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
318            available.
319    
320    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
321    
322            * R/textdoccol.R (lapply): Removed debug print out.
323    
324    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/reader.R (readRCV1): Improved meta data extraction from
327            Reuters Corpus Volume 1 documents.
328    
329    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * R/transform.R: Ensure that all mappings preserve multiline
332            structures.
333    
334    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
335    
336            * R/filter.R: Every filter now has an attribute indicating whether
337            it should be applied at document level (doclevel).
338    
339            * R/textdoccol.R (tmFilter): Set searchFullText as new default
340            filter.
341    
342    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * R/transform.R (replacePatterns): Replaced removeWords by
345            replacePatterns. Suggested by Christian Buchta.
346    
347            * R/textdoccol.R (inspect): Improved formatting.
348    
349    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
350    
351            * inst/CITATION: Updated JSS article information.
352    
353            * R/textdoccol.R (setAs): Added coerce method from list to
354            corpus.
355    
356            * R/meta.R (meta): Improved meta data handling.
357    
358    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
359    
360            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
361            Christian Buchta.
362    
363            * inst/CITATION: Added template to include JSS article reference.
364    
365    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
366    
367            * R/textdoccol.R (tmMap): Introduced lazy mapping.
368    
369            * R/source.R: Added VectorSource.
370    
371    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
372    
373            * man/: Language codes should be in ISO 639-1 format.
374    
375            * R/textdoccol.R (asPlain): Preserve local meta data.
376    
377    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
378    
379            * R/textdoccol.R (writeCorpus): Function for writing a corpus
380            containing plain text documents to disk.
381    
382    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
385            always set correctly.
386    
387            * R/textdoccol.R: Set load = TRUE as default for load on demand
388            since in most cases this is the wanted behaviour.
389    
390    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
391    
392            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
393    
394            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
395    
396    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
397    
398            * R/meta.R (meta): New function for consistent access to meta data
399            of document collections, repositories, and texts.
400    
401    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
402    
403            * R/: Better support for encodings.
404    
405    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
406    
407            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
408            selection when no reader argument is given.
409    
410    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
411    
412            * R/source.R (CSVSource): Now uses read.csv instead of scan
413            internally.
414    
415    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
416    
417            * R/reader.R (getReaders): Returns available reader functions.
418    
419            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
420            as default.
421    
422    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * R/stopwords.R (stopwords): Shortened code, removed codetools
425            variable warnings.
426    
427            * man/: Documentation for showMeta, added an example for tmMap.
428    
429            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
430            some minor typos fixed.
431    
432    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * R/aobjects.R (showMeta): Added method for pretty printing a
435            text document's meta data.
436    
437    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
438    
439            * R/textdoccol.R (TextDocCol): Better handling of empty
440            arguments.
441    
442            * NAMESPACE: Exported readDOC.
443    
444            * man/completeStems.Rd: Added an example.
445    
446    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * R/stopwords.R (stopwords): Look up .dat files at every
449            call. Allows users to modify stopword .dat files interactively.
450    
451    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
452    
453            * R/termdocmatrix.R (termFreq): Correct processing of empty
454            documents.
455    
456    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
457    
458            * man/: Updated documentation.
459    
460    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * R/complete.R (completeStems): Completes (heuristically) word
463            stems.
464    
465            * R/termdocmatrix.R (TermDocMatrix2): New modular
466            constructor.
467    
468            * NAMESPACE: Exported termFreq.
469    
470    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
471    
472            * R/reader.R (readDOC): Added MS Word reader (using antiword).
473    
474    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
475    
476            * R/weight.R: Weighting functions for TermDocMatrix.
477    
478    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
479    
480            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
481            functions for accessing dimension, column, and row names.
482    
483            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
484    
485    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
486    
487            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
488    
489    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
490    
491            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
492    
493    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
494    
495            * R/reader.R (readPDF): Removed manual checks for pdftotext and
496            pdfinfo. The system call gives a warning anyway.
497    
498    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
499    
500            * R/textdoccol.R (asPlain): Conversion from
501            StructuredTextDocuments to PlainTextDocuments.
502    
503    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
504    
505            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
506            for accessing term-document matrices.
507    
508            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
509            are installed.
510    
511    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
514            Christian Buchta.
515    
516    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
517    
518            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
519    
520    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
521    
522            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
523    
524            * R/reader.R (readPDF): Added PDF reader.
525    
526    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
527    
528            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
529    
530            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
531    
532            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
533    
534            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
535    
536    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
537    
538            * R/distmeasure.R (dissimilarity): Replaced dists call from
539            package cba by new dist call from package proxy.
540    
541    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
542    
543            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
544    
545    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
546    
547            * R/termdocmatrix.R: require() uses the quietly option to suppress
548            loading messages.
549    
550    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
551    
552            * R/dictionary.R: Added dictionary support.
553    
554    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
555    
556            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
557            documents. This simplifies some functions, e.g., asPlain.
558    
559    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
560    
561            * inst/doc/tm.Rnw: Fixed some typos in vignette.
562    
563    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
564    
565            * R/textdoccol.R (replaceWords): Added method to replace a set of
566            words by a single word. Useful for synonyms.
567    
568    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
569    
570            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
571    
572    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
573    
574            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
575            vectors. Thanks to Ariel Maguyon for his error report.
576            (removeSparseTerms): New function to remove columns from a
577            term-document matrix exceeding a sparse factor.
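
Removing terms whose sparsity exceeds a given factor, as removeSparseTerms is described above, can be sketched on a dense matrix with documents in rows and terms in columns. The name remove_sparse_sketch is hypothetical, and both the definition of sparsity (share of documents in which the term does not occur) and the non-strict threshold are assumptions of this sketch.

```r
# Hypothetical helper: drop term columns that are too sparse.
remove_sparse_sketch <- function(m, sparse) {
  stopifnot(is.matrix(m), sparse > 0, sparse < 1)
  # Assumption: sparsity is the fraction of documents with a zero count.
  sparsity <- colMeans(m == 0)
  m[, sparsity <= sparse, drop = FALSE]
}

m <- matrix(c(1, 0, 0,
              1, 0, 2,
              1, 0, 0,
              1, 3, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("d", 1:4), c("common", "rare", "other")))
remove_sparse_sketch(m, sparse = 0.5)
```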
578    
579    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
580    
581            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
582    
583    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
584    
585            * man/sFilter.Rd: Corrected documentation on statement format (use
586            '==' instead of '=').
587    
588    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
589    
590            * R/aobjects.R (StructuredTextDocument): Inherits from
591            TextDocument.
592    
593    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
594    
595            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
596            on sparse matrices as proposed by Martin Maechler.
597    
598    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
599    
600            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
601            latest \pkg{filehash} version deprecates them.
602    
603    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
604    
605            * R/termdocmatrix.R (textvector): Stemming is now performed before
606            erasing stopwords.
607            (weightMatrix): Adapted to handle sparse matrices.
608            (TermDocMatrix): Sparse matrix is now efficiently built by
609            direct stepwise insertion of row values into it.
610    
611    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
612    
613            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
614            due to ongoing problems. For our purposes the latter is as useful
615            as the replaced package.
616    
617    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
618    
619            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
620    
621            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
622    
623    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
624    
625            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
626            languages with available stopwords.
627    
628    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * inst/doc/tm.Rnw: Minor corrections in the vignette.
631    
632    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
633    
634            * DESCRIPTION: Update to version 0.2, since a lot of new features
635            have been integrated.
636    
637            * inst/stopwords: Updated existing stopwords and added stopwords
638            for various other languages.
639    
640    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
641    
642            * man/: Updated documentation.
643    
644            * Work/testDb.R: Script to test database stuff.
645    
646            * R/: Fixed various database related bugs. Seems to be rather
647            usable now, i.e., consider it alpha status for now.
648    
649    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
650    
651            * R/: Fixed some bugs related to database support.
652    
653    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
654    
655            * man/: Added a lot of examples to the manuals.
656    
657    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * man/: Updated parts of the documentation.
660    
661            * R/textdoccol.R (asPlain): Added conversion from newsgroup
662            documents to plain text documents.
663    
664    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
665    
666            * R/textdoccol.R: Finished experimental database support. Not yet
667            intensively tested.
668    
669            * R/source.R: Now each source has a default reader.
670    
671            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
672            class anymore.
673    
674            * R/plaintextdoc.R: Custom show method for plain text documents.
675    
676            * R/aobjects.R: Added a class for structured text documents.
677    
678            * R/reader.R: Replaced remaining \code{parser} occurrences with
679            \code{reader}.
680    
681            * R/textdoccol.R (summary): Indent tags.
682    
683            * R/textdoccol.R (removePunctuation): Transform method to remove
684            punctuation marks.
685    
686    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
687    
688            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
689            using prescindMeta().
690    
691    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * R/textdoccol.R: Improved database support.
694    
695    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
696    
697            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
698    
699            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
700            language code.
701    
702            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
703            into parserControl argument.
704    
705            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
706    
707    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
708    
709            * Work/tmDataSetup.R: The datasets acq and crude can now be
710            created on the fly.
711    
712            * R/stopwords.R: Introduced a function returning the stopwords for
713            a given language (English, German and French at the moment).
714    
715            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
716            otherwise falls back to Snowball package.
717    
718    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
719    
720            * man/dissimilarity-methods.Rd: Make clear that any method offered
721            by "dists" from package "cba" can be used.
722    
723    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
724    
725            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
726            to Kurt's latex suggestion. Removed points and underscores in
727            variable names for consistent naming.
728    
729            * DESCRIPTION: Update to version 0.1-2.
730    
731            * man/TextRepository.Rd: Fixed bug in documentation.
732    
733    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
734    
735            * DESCRIPTION: Update to version 0.1-1.
736    
737    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
738    
739            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
740            wordStem.
741    
742    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
743    
744            * R/: Changes due to Kurt's review.
745    
746    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
747    
748            * R/: Implemented improvements based upon comments by David
749            Meyer.
750    
751    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
752    
753            * inst/doc/: Rewrote vignette.
754    
755            * man/: Improved documentation.
756    
757    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
758    
759            * man/: Updated documentation.
760    
761            * DESCRIPTION: Changed package name to "tm". Updated version to
762            0.1 for first CRAN release.
763    
764            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
765            list archive example.
766    
767            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
768            archive example.
769    
770            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
771            from (several mails per box) mbox format to (single mail per file)
772            eml format.
773    
774    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * data/crude.rda: Rebuilt.
777    
778            * data/acq.rda: Rebuilt.
779    
780            * R/reader.R: Factored out reader and parser methods from
781            textdoccol.R.
782    
783            * R/source.R: Factored out Source methods from aobjects.R and
784            textdoccol.R.
785            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
786            feeds.
787    
788            * R/textdoccol.R (DirSource): Added support for recursive
789            traversal of directories.
790    
791    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
792    
793            * R/textdoccol.R ([[): Loads the document corpus automatically
794            into memory upon access.
795            (tm_transform, tm_filter): Removed several checks whether the
796            document is already loaded ([[ ensures this now).
797            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
798            mailing list archive.
799    
800    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
801    
802            * R/aobjects.R (TextDocument): Is now a virtual class.
803            (Source): Is now a virtual class.
804    
805    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
806    
807            * R/textdoccol.R (c): Support for an arbitrary number of document
808            collections.
809    
810    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
811    
812            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
813            append_meta and remove_meta.
814    
815            * R/textdoccol.R: Removed modify_metadata method.
816    
817            * R/textrepo.R: Removed modify_metadata method.
818    
819            * R/textdoccol.R (remove_meta): Supports removal of document
820            collection metadata and document (= in data frame) metadata.
821    
822    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
823    
824            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
825    
826            * data/crude.rda: Rebuilt.
827    
828            * data/acq.rda: Rebuilt.
829    
830            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
831    
832            * R/textdoccol.R ([): Bug fix for subsetting a document
833            collection's data frame.
834    
835    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
838            to s_filter.
839    
840            * R/textdoccol.R: Local text documents' metadata can now be copied
841            to a document collection's data frame with prescind_meta.
842    
843    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
844    
845            * R/: Text documents' slot metadata is now accessible in s_filter.
846    
847            * R/: Rewrote s_filter function (still has some restrictions).
848    
849    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * R/: Various fixes in handling metadata.
852    
853            * R/: Added update mechanism for text document collections.
854    
855    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * R/: Merging of document collections now creates a binary tree
858            for reconstructing merged document collections.
859    
860            * R/: Redesign of metadata for document collections.
861    
862    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
863    
864            * R/: Messages now use \code{ngettext}.
865    
866    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
867    
868            * R/: Added functions for modifying and removing metadata.
869    
870    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            * man/: Updated some documentation.
873    
874            * R/: Corrected some connection issues.
875    
876            * inst/doc: Worked on the vignette.
877    
878    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
879    
880            * inst/: Added texts and started vignette.
881    
882            * R/: Final changes based upon David's comments.
883    
884    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
885    
886            * NAMESPACE: Corrected exports (generic methods need exportMethods
887            directives!).
888    
889    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
890    
891            * R/: Modified the TextDocCol constructor and various parsers. It
892            is now modular and supports various file formats via plugins (see
893            the new "Source" class).
894    
895    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
896    
897            * man/: Revised documentation after previous code changes.
898    
899    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
900    
901            * R/: Remaining changes as discussed with David.
902    
903    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
904    
905            * R/: Some changes as suggested by David. The rest will follow
906            within the next few days.
907    
908    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
909    
910            * man/: Finished documentation.
911    
912    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
913    
914            * man/: Wrote some documentation.
915    
916    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
917    
918            * R/: Further syntactic sugar in form of additional assignment and
919            accessor methods.
920    
921    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
922    
923            * R/: Syntactic sugar in form of "length", "show" and "summary"
924            operators.
925    
926    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
927    
928            * R/: Various updates, mainly to default operators ("[" or "c")
929            and dissimilarities.
930    
931    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
932    
933            * R/: Added similarity functions.
934    
935            * data/: Added English stopwords.
936    
937    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
938    
939            * data/: Examples compiled for new features.
940    
941            * R/: Changes due to new structure.
942    
943            * NAMESPACE: Corrected namespace to reflect new structure.
944    
945            * R/termdocmatrix.R: Adapted for new naming scheme.
946    
947    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
948    
949            * R/textdoccol.R: Adapted code for new class structure. Wrote
950            several transform and filter functions operating on text document
951            collections (alias text document databases).
952    
953            * R/aobjects.R: Adapted class structure with inheritance,
954            repositories and additional meta data. Loading files on demand is
955            now possible.
956    
957    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
958    
959            * R/: Some cosmetic cleanups.
960    
961            * inst/: Removed vignette on clustering. That and much more is now
962            described in the JSS paper on text mining. Based upon that
963            article an elaborated vignette will be incorporated in the future.
964    
965    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
966    
967            * R/: Updated generic S4 methods to comply with signature changes
968            in newer versions of R (> 2.3).
969    
970    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
971    
972            * ext/R/importRIS.R: Automatic RIS import is now possible.
973    
974    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
975    
976            * R/textdoccol.R: Added RIS HTML input format.
977    
978    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
979    
980            * R/textdoccol.R: Removed bug that caused invalid text document
981            collections when handling many input files.
982    
983    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
984    
985            * R/textdoccol.R: Restructured and extended file import
