[tm] Diff of /pkg/ChangeLog

Old: trunk/R/trunk/ChangeLog revision 34, Thu Dec 22 15:18:10 2005 UTC
New: pkg/ChangeLog revision 1048, Wed Mar 3 06:14:10 2010 UTC
1    2010-03-03  Ingo Feinerer  <feinerer@logic.at>
2    
3            * man/tm_tag_score.Rd: Add General Inquirer example for sentiment
4            analysis.
5    
6    2010-02-25  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/score.R (tm_tag_score): Compute a score from the number of
9            tags matching in a document.
10    
11    2010-02-18  Ingo Feinerer  <feinerer@logic.at>
12    
13            * R/complete.R (stemCompletion): New completion heuristics.
14    
15    2010-02-17  Ingo Feinerer  <feinerer@logic.at>
16    
17            * R/plot.R (plot.TermDocumentMatrix): Memory improvements.
18    
19    2010-02-06  Ingo Feinerer  <feinerer@logic.at>
20    
21            * DESCRIPTION (Depends): Depend on R (>= 2.10.0) to ensure that
22            setOldClass(c(..., "list")) works.
23    
24    2010-01-22  Ingo Feinerer  <feinerer@logic.at>
25    
26            * R/transform.R (stemDocument.character): In case input is a
27            simple character just delegate to the default Snowball stemmer.
28    
29    2010-01-15  Ingo Feinerer  <feinerer@logic.at>
30    
31            * R/reader.R (readReut21578XML, readRCV1): Extract more meta
32            data.
33    
34    2010-01-12  Ingo Feinerer  <feinerer@logic.at>
35    
36            * R/doc.R (`Content<-`): Be careful with names attribute.
37    
38    2010-01-07  Stefan Theussl  <stefan.theussl@wu.ac.at>
39    
40            * R/source.R (DirSource): Improved implementation especially when
41            handling many (> 1M) files.
42    
43    2009-12-22  Ingo Feinerer  <feinerer@logic.at>
44    
45            * R/source.R (getElem.URISource): Use encoding argument.
46    
47    2009-12-11  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/doc.R (setOldClass): Register S3 document classes to be
50            recognized by S4 methods.
51    
52    2009-11-25  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/matrix.R (termFreq): Add option to remove punctuation
55            characters.
56    
57    2009-11-19  Ingo Feinerer  <feinerer@logic.at>
58    
59            * R/matrix.R (c.TermDocumentMatrix): Added combine method for
60            merging multiple term-document matrices.
61    
62    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/corpus.R (setOldClass): Register S3 corpus classes to be
65            recognized by S4 methods.
66    
67            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
68            that CRAN Mac OS X builds do not fail any longer.
69    
70    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
71    
72            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
73            of RWeka::AlphabeticTokenizer() as default.
74    
75    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
78            caused words at the beginning or the end of a line not to be removed. Do
79            not delete whitespace anymore.
80    
81    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/source.R (DirSource): Default to working directory if no path
84            is specified.
85    
86    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/source.R (DirSource): Stop on empty directories.
89    
90    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
93            named documents.
94    
95    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/transform.R (removeWords): Improve regular expressions.
98    
99    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
100    
101            * R/meta.R (DublinCore): Allow lower case tags.
102    
103    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
104    
105            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
106            instead of x$children.
107    
108    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
109    
110            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
111    
112    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
113    
114            * R/: Use S3 instead of S4 class system.
115    
116    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
117    
118            * R/reader.R (readMail): Moved to tm.plugin.mail package.
119    
120    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
121    
122            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
123            postings are basically e-mails with some extra headers.
124    
125    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
126    
127            * R/transform.R: Move convertMboxEml, removeCitation,
128            removeMultipart, and removeSignature to the tm.plugin.mail package
129            since they are mainly utility functions (for handling e-mails) and
130            not very framework specific.
131    
132    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
133    
134            * man/: Fix documentation.
135    
136    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
137    
138            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
139            plain text document instead of an XML document for texts of the
140            Reuters-21578 dataset.
141    
142            * R/sparse.R: Removed since the slam package is now available on
143            CRAN.
144    
145            * DESCRIPTION (Depends): Add slam package.
146    
147    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
148    
149            * R/transform.R (stemDoc): Fix character(0) handling.
150    
151    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
152    
153            * R/doc.R (show): Pretty print.
154    
155    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
156    
157            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
158            gracefully.
159    
160    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
161    
162            * R/corpus.R: Make corpus virtual. Implement corpus with standard
163            and permanent storage semantics.
164    
165            * DESCRIPTION: New major release. A *lot* of improvements.
166    
167    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
168    
169            * NAMESPACE: Export some simple_triplet_matrix functions.
170    
171    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
172    
173            * R/weight.R: Adapt tf-idf to new matrix format.
174    
175    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
176    
177            * R/matrix.R: Create two distinct classes for term-document and
178            document-term matrices.
179    
180    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
181    
182            * R/termdocmatrix.R: No longer use Matrix package. This reduces
183            package start-up time significantly.
184    
185    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
186    
187            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
188    
189    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/transform.R (tmReduce): Combine multiple maps into one
192            transformation.
193    
194    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/weight.R: Remove weightLogical since it does not return a
197            dgCMatrix.
198    
199            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
200            or TermDocumentMatrix instead.
201    
202    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
203    
204            * inst/doc/extensions.Rnw: Finished vignette.
205    
206    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
207    
208            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
209            DocumentTermMatrix representations.
210    
211    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/reader.R (readXML): New reader for arbitrary XML files.
214    
215    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
216    
217            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
218            (XMLSource): New XMLSource class for arbitrary XML files.
219            (Source): New slot Vectorized.
220    
221    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
222    
223            * R/reader.R (readTabular): Experimental reader for tabular data
224            structures which can be customized via user-defined mappings.
225    
226            * R/reader.R: Always use UTC time zone.
227    
228            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
229    
230    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
231    
232            * R/reader.R (readDOC): Options can be passed over to antiword.
233    
234            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
235            pdftotext.
236    
237    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
238    
239            * R/source.R (DirSource): Add pattern and ignore.case arguments
240            which are internally passed over to list.files().
241    
242    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
243    
244            * inst/doc/tm.Rnw: Suppress pointless loading message.
245    
246    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
247    
248            * DESCRIPTION: Speed up package loading (via moving packages not
249            strictly necessary for normal operation to Suggests instead of
250            Depends).
251    
252    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
253    
254            * R/reader.R (readNewsgroup): The date format is now configurable.
255    
256    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
257    
258            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
259    
260    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
261    
262            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
263    
264    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
265    
266            * R/source.R (DataframeSource): New source class for data frames.
267    
268            * R/source.R: Fixed non-standard call evaluation.
269    
270    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
271    
272            * R/source.R (URISource): New source class for a single document.
273    
274    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
275    
276            * R/source.R: Refactoring.
277    
278    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
279    
280            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
281            Rmpi installations more gracefully.
282    
283    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
284    
285            * R/source.R (Source): Add Length slot.
286    
287    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
288    
289            * R/AAA.R: Unify duplicated .onLoad function.
290    
291    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
292    
293            * DESCRIPTION (Suggests): Added Rmpi.
294    
295    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
296    
297            * R/source.R (getElem): Fix 'no visible binding' warning.
298    
299            * man/WeightFunction.Rd: Fix signature.
300    
301    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
302    
303            * R/weight.R: Introduce name abbreviations for weighting functions.
304    
305    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
306    
307            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
308    
309            * R/cluster.R: Provide convenience functions for using an MPI
310            cluster.
311    
312            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
313            available.
314    
315            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
316            available.
317    
318    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
319    
320            * R/textdoccol.R (lapply): Removed debug printout.
321    
322    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/reader.R (readRCV1): Improved meta data extraction from
325            Reuters Corpus Volume 1 documents.
326    
327    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
328    
329            * R/transform.R: Ensure that all mappings preserve multiline
330            structures.
331    
332    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
333    
334            * R/filter.R: Every filter now has an attribute indicating whether
335            it should be applied at document level (doclevel).
336    
337            * R/textdoccol.R (tmFilter): Set searchFullText as new default
338            filter.
339    
340    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
341    
342            * R/transform.R (replacePatterns): Replaced removeWords by
343            replacePatterns. Suggested by Christian Buchta.
344    
345            * R/textdoccol.R (inspect): Improved formatting.
346    
347    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
348    
349            * inst/CITATION: Updated JSS article information.
350    
351            * R/textdoccol.R (setAs): Added coerce method from list to
352            corpus.
353    
354            * R/meta.R (meta): Improved meta data handling.
355    
356    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
357    
358            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
359            Christian Buchta.
360    
361            * inst/CITATION: Added template to include JSS article reference.
362    
363    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * R/textdoccol.R (tmMap): Introduced lazy mapping.
366    
367            * R/source.R: Added VectorSource.
368    
369    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * man/: Language codes should be in ISO 639-1 format.
372    
373            * R/textdoccol.R (asPlain): Preserve local meta data.
374    
375    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
376    
377            * R/textdoccol.R (writeCorpus): Function for writing a corpus
378            containing plain text documents to disk.
379    
380    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
383            always set correctly.
384    
385            * R/textdoccol.R: Set load = TRUE as default for load on demand
386            since in most cases this is the wanted behaviour.
387    
388    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
389    
390            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
391    
392            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
393    
394    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/meta.R (meta): New function for consistent access to meta data
397            of document collections, repositories, and texts.
398    
399    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
400    
401            * R/: Better support for encodings.
402    
403    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
406            selection when no reader argument is given.
407    
408    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
409    
410            * R/source.R (CSVSource): Now uses read.csv instead of scan
411            internally.
412    
413    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
414    
415            * R/reader.R (getReaders): Returns available reader functions.
416    
417            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
418            as default.
419    
420    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * R/stopwords.R (stopwords): Shortened code, removed codetools
423            variable warnings.
424    
425            * man/: Documentation for showMeta, added an example for tmMap.
426    
427            * inst/doc/tm.Rnw: Updated vignette, comments on MS Word reader,
428            some minor typos fixed.
429    
430    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * R/aobjects.R (showMeta): Added method for pretty printing a
433            text document's meta data.
434    
435    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * R/textdoccol.R (TextDocCol): Better handling of empty
438            arguments.
439    
440            * NAMESPACE: Exported readDOC.
441    
442            * man/completeStems.Rd: Added an example.
443    
444    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * R/stopwords.R (stopwords): Look up .dat files at every
447            call. Allows users to modify stopword .dat files interactively.
448    
449    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
450    
451            * R/termdocmatrix.R (termFreq): Correct processing of empty
452            documents.
453    
454    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * man/: Updated documentation.
457    
458    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
459    
460            * R/complete.R (completeStems): Completes (heuristically) word
461            stems.
462    
463            * R/termdocmatrix.R (TermDocMatrix2): New modular
464            constructor.
465    
466            * NAMESPACE: Exported termFreq.
467    
468    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * R/reader.R (readDOC): Added MS Word reader (using antiword).
471    
472    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/weight.R: Weighting functions for TermDocMatrix.
475    
476    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
477    
478            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
479            functions for accessing dimension, column, and row names.
480    
481            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
482    
483    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
484    
485            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
486    
487    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
490    
491    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
492    
493            * R/reader.R (readPDF): Removed manual checks for pdftotext and
494            pdfinfo. The system call gives a warning anyway.
495    
496    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
497    
498            * R/textdoccol.R (asPlain): Conversion from
499            StructuredTextDocuments to PlainTextDocuments.
500    
501    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
504            for accessing term-document matrices.
505    
506            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
507            are installed.
508    
509    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
510    
511            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
512            Christian Buchta.
513    
514    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
515    
516            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
517    
518    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
519    
520            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
521    
522            * R/reader.R (readPDF): Added PDF reader.
523    
524    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
525    
526            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
527    
528            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
529    
530            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
531    
532            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
533    
534    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
535    
536            * R/distmeasure.R (dissimilarity): Replaced dists call from
537            package cba by new dist call from package proxy.
538    
539    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
540    
541            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
542    
543    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
544    
545            * R/termdocmatrix.R: require() uses the quietly option to suppress
546            loading messages.
547    
548    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
549    
550            * R/dictionary.R: Added dictionary support.
551    
552    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
553    
554            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
555            documents. This simplifies some functions, e.g., asPlain.
556    
557    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
558    
559            * inst/doc/tm.Rnw: Fixed some typos in vignette.
560    
561    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
562    
563            * R/textdoccol.R (replaceWords): Added method to replace a set of
564            words by a single word. Useful for synonyms.
565    
566    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
567    
568            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
569    
570    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
571    
572            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
573            vectors. Thanks to Ariel Maguyon for his error report.
574            (removeSparseTerms): New function to remove columns from a
575            term-document matrix exceeding a sparse factor.
576    
577    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
578    
579            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
580    
581    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * man/sFilter.Rd: Corrected documentation on statement format (use
584            '==' instead of '=').
585    
586    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * R/aobjects.R (StructuredTextDocument): Inherits from
589            TextDocument.
590    
591    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
592    
593            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
594            on sparse matrices as proposed by Martin Maechler.
595    
596    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
597    
598            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
599            \pkg{filehash} version deprecates them.
600    
601    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
602    
603            * R/termdocmatrix.R (textvector): Stemming is now performed before
604            erasing stopwords.
605            (weightMatrix): Adapted to handle sparse matrices.
606            (TermDocMatrix): Sparse matrix is now efficiently built by
607            direct stepwise insertion of row values into it.
608    
609    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
610    
611            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
612            due to ongoing problems. For our purposes the latter is as useful
613            as the replaced package.
614    
615    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
616    
617            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
618    
619            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
620    
621    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
622    
623            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
624            languages with available stopwords.
625    
626    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
627    
628            * inst/doc/tm.Rnw: Minor corrections in the vignette.
629    
630    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
631    
632            * DESCRIPTION: Update to version 0.2, since a lot of new features
633            have been integrated.
634    
635            * inst/stopwords: Updated existing stopwords and added stopwords
636            for various other languages.
637    
638    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
639    
640            * man/: Updated documentation.
641    
642            * Work/testDb.R: Script to test database stuff.
643    
644            * R/: Fixed various database related bugs. Seems to be rather
645            usable now, i.e., consider it alpha status for now.
646    
647    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
648    
649            * R/: Fixed some bugs related to database support.
650    
651    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
652    
653            * man/: Added a lot of examples to the manuals.
654    
655    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
656    
657            * man/: Updated parts of the documentation.
658    
659            * R/textdoccol.R (asPlain): Added conversion from newsgroup
660            documents to plain text documents.
661    
662    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
663    
664            * R/textdoccol.R: Finished experimental database support. Not yet
665            intensively tested.
666    
667            * R/source.R: Now each source has a default reader.
668    
669            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
670            class anymore.
671    
672            * R/plaintextdoc.R: Custom show method for plain text documents.
673    
674            * R/aobjects.R: Added a class for structured text documents.
675    
676            * R/reader.R: Replaced remaining \code{parser} occurrences with
677            \code{reader}.
678    
679            * R/textdoccol.R (summary): Indent tags.
680    
681            * R/textdoccol.R (removePunctuation): Transform method to remove
682            punctuation marks.
683    
684    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
687            using prescindMeta().
688    
689    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
690    
691            * R/textdoccol.R: Improved database support.
692    
693    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
694    
695            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
696    
697            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
698            language code.
699    
700            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
701            into parserControl argument.
702    
703            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
704    
705    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
706    
707            * Work/tmDataSetup.R: The datasets acq and crude can now be
708            created on the fly.
709    
710            * R/stopwords.R: Introduced a function returning the stopwords for
711            a given language (English, German and French at the moment).
712    
713            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
714            otherwise falls back to Snowball package.
715    
716    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
717    
718            * man/dissimilarity-methods.Rd: Make clear that any method offered
719            by "dists" from package "cba" can be used.
720    
721    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
722    
723            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
724            to Kurt's LaTeX suggestion. Removed points and underscores in
725            variable names for consistent naming.
726    
727            * DESCRIPTION: Update to version 0.1-2.
728    
729            * man/TextRepository.Rd: Fixed bug in documentation.
730    
731    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
732    
733            * DESCRIPTION: Update to version 0.1-1.
734    
735    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
736    
737            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
738            wordStem.
739    
740    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
741    
742            * R/: Changes due to Kurt's review.
743    
744    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * R/: Implemented improvements based upon comments by David
747            Meyer.
748    
749    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * inst/doc/: Rewrote vignette.
752    
753            * man/: Improved documentation.
754    
755    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
756    
757            * man/: Updated documentation.
758    
759            * DESCRIPTION: Changed package name to "tm". Updated version to
760            0.1 for first CRAN release.
761    
762            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
763            list archive example.
764    
765            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
766            archive example.
767    
768            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
769            from (several mails per box) mbox format to (single mail per file)
770            eml format.
771    
772    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
773    
774            * data/crude.rda: Rebuilt.
775    
776            * data/acq.rda: Rebuilt.
777    
778            * R/reader.R: Factored out reader and parser methods from
779            textdoccol.R.
780    
781            * R/source.R: Factored out Source methods from aobjects.R and
782            textdoccol.R.
783            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
784            feeds.
785    
786            * R/textdoccol.R (DirSource): Added support for recursive
787            traversal of directories.
788    
789    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
790    
791            * R/textdoccol.R ([[): Loads the document corpus automatically
792            into memory upon access.
793            (tm_transform, tm_filter): Removed several checks whether the
794            document is already loaded ([[ ensures this now).
795            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
796            mailing list archive.
797    
798    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
799    
800            * R/aobjects.R (TextDocument): Is now a virtual class.
801            (Source): Is now a virtual class.
802    
803    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/textdoccol.R (c): Support for an arbitrary number of document
806            collections.
807    
808    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
811            append_meta and remove_meta.
812    
813            * R/textdoccol.R: Removed modify_metadata method.
814    
815            * R/textrepo.R: Removed modify_metadata method.
816    
817            * R/textdoccol.R (remove_meta): Supports removal of document
818            collection metadata and document (= in data frame) metadata.
819    
820    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
821    
822            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
823    
824            * data/crude.rda: Rebuilt.
825    
826            * data/acq.rda: Rebuilt.
827    
828            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
829    
830            * R/textdoccol.R ([): Bug fix for subsetting a document
831            collection's data frame.
832    
833    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
834    
835            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
836            to s_filter.
837    
838            * R/textdoccol.R: Local text documents' metadata can now be copied
839            to a document collection's data frame with prescind_meta.
840    
841    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
842    
843            * R/: Text documents' slot metadata is now accessible in s_filter.
844    
845            * R/: Rewrote s_filter function (still has some restrictions).
846    
847    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
848    
849            * R/: Various fixes in handling metadata.
850    
851            * R/: Added update mechanism for text document collections.
852    
853    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
854    
855            * R/: Merging of document collections now creates a binary tree
856            for reconstructing merged document collections.
857    
858            * R/: Redesign of metadata for document collections.
859    
860    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
861    
862            * R/: Messages now use \code{ngettext}.
863    
864    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
865    
866            * R/: Added functions for modifying and removing metadata.
867    
868    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
869    
870            * man/: Updated some documentation.
871    
872            * R/: Corrected some connection issues.
873    
874            * inst/doc: Worked on the vignette.
875    
876    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
877    
878            * inst/: Added texts and started vignette.
879    
880            * R/: Final changes based upon David's comments.
881    
882    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
883    
884            * NAMESPACE: Corrected exports (generic methods need exportMethods
885            directives!).
886    
887    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/: Modified the TextDocCol constructor and various parsers. It
890            is now modular and supports various file formats via plugins (see
891            the new "Source" class).
892    
893    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
894    
895            * man/: Revised documentation after previous code changes.
896    
897    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
898    
899            * R/: Remaining changes as discussed with David.
900    
901    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
902    
903            * R/: Some changes as suggested by David. The rest will follow
904            within the next few days.
905    
906    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
907    
908            * man/: Finished documentation.
909    
910    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
911    
912            * man/: Wrote some documentation.
913    
914    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
915    
916            * R/: Further syntactic sugar in form of additional assignment and
917            accessor methods.
918    
919    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
920    
921            * R/: Syntactic sugar in form of "length", "show" and "summary"
922            operators.
923    
924    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
925    
926            * R/: Various updates, mainly to default operators ("[" or "c")
927            and dissimilarities.
928    
929    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
930    
931            * R/: Added similarity functions.
932    
933            * data/: Added English stopwords.
934    
935    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
936    
937            * data/: Examples compiled for new features.
938    
939            * R/: Changes due to new structure.
940    
941            * NAMESPACE: Corrected namespace to reflect new structure.
942    
943            * R/termdocmatrix.R: Adapted for new naming scheme.
944    
945    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
946    
947            * R/textdoccol.R: Adapted code for new class structure. Wrote
948            several transform and filter functions operating on text document
949            collections (alias text document databases).
950    
951            * R/aobjects.R: Adapted class structure with inheritance,
952            repositories and additional meta data. Loading files on demand is
953            now possible.
954    
955    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
956    
957            * R/: Some cosmetic cleanups.
958    
959            * inst/: Removed vignette on clustering. That and much more is now
960            described in the JSS paper on text mining. Based on that
961            article, a more elaborate vignette will be added in the future.
962    
963    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
964    
965            * R/: Updated generic S4 methods to comply with signature changes
966            in newer versions of R (> 2.3).
967    
968    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
969    
970            * ext/R/importRIS.R: Automatic RIS import is now possible.
971    
972    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
973    
974            * R/textdoccol.R: Added RIS HTML input format.
975    
976    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
977    
978            * R/textdoccol.R: Removed bug that caused invalid text document
979            collections when handling many input files.
980    
981    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
982    
983            * R/textdoccol.R: Restructured and extended file import
984            mechanism.
985    
986            * inst/doc/clustering.Rnw: Adapted vignette for use with
987            ReutNews.rda.
988    
989            * man/ReutNews.Rd: Documentation for ReutNews.rda.
990    
991            * data/ReutNews.rda: A tiny Reuters21578 example data set.
992    
993    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
994    
995            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
