[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 34, Thu Dec 22 15:18:10 2005 UTC
trunk/tm/ChangeLog revision 785, Sat Oct 13 10:46:28 2007 UTC
# Line 1  Line 1 
1    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
4            functions for accessing dimension, column, and row names.
5    
6            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
7    
8    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
9    
10            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
11    
12    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
13    
14            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
15    
16    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
17    
18            * R/reader.R (readPDF): Removed manual checks for pdftotext and
19            pdfinfo. The system call gives a warning anyway.
20    
21    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
22    
23            * R/textdoccol.R (asPlain): Conversion from
24            StructuredTextDocuments to PlainTextDocuments.
25    
26    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
27    
28            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
29            for accessing term-document matrices.
30    
31            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
32            are installed.
33    
34    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
35    
36            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
37            Christian Buchta.
38    
39    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
40    
41            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
42    
43    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
44    
45            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
46    
47            * R/reader.R (readPDF): Added PDF reader.
48    
49    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
50    
51            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
52    
53            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
54    
55            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
56    
57            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
58    
59    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
60    
61            * R/distmeasure.R (dissimilarity): Replaced dists call from
62            package cba by new dist call from package proxy.
63    
64    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
65    
66            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
67    
68    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
69    
70            * R/termdocmatrix.R: require() uses the quietly option to suppress
71            loading messages.
72    
73    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
74    
75            * R/dictionary.R: Added dictionary support.
76    
77    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
78    
79            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
80            documents. This simplifies some functions, e.g., asPlain.
81    
82    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
83    
84            * inst/doc/tm.Rnw: Fixed some typos in vignette.
85    
86    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
87    
88            * R/textdoccol.R (replaceWords): Added method to replace a set of
89            words by a single word. Useful for synonyms.
90    
91    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
92    
93            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
94    
95    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
96    
97            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
98            vectors. Thanks to Ariel Maguyon for his error report.
99            (removeSparseTerms): New function to remove columns from a
100            term-document matrix exceeding a sparse factor.
101    
102    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
103    
104            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
105    
106    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
107    
108            * man/sFilter.Rd: Corrected documentation on statement format (use
109            '==' instead of '=').
110    
111    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
112    
113            * R/aobjects.R (StructuredTextDocument): Inherits from
114            TextDocument.
115    
116    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
117    
118            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
119            on sparse matrices as proposed by Martin Maechler.
120    
121    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
122    
123            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
124            \pkg{filehash} version deprecates them.
125    
126    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
127    
128            * R/termdocmatrix.R (textvector): Stemming is now performed before
129            erasing stopwords.
130            (weightMatrix): Adapted to handle sparse matrices.
131            (TermDocMatrix): Sparse matrix is now efficiently built by
132            direct stepwise insertion of row values into it.
133    
134    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
135    
136            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
137            due to ongoing problems. For our purposes the latter is as useful
138            as the replaced package.
139    
140    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
141    
142            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
143    
144            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
145    
146    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
147    
148            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
149            languages with available stopwords.
150    
151    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
152    
153            * inst/doc/tm.Rnw: Minor corrections in the vignette.
154    
155    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
156    
157            * DESCRIPTION: Update to version 0.2, since a lot of new features
158            have been integrated.
159    
160            * inst/stopwords: Updated existing stopwords and added stopwords
161            for various other languages.
162    
163    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
164    
165            * man/: Updated documentation.
166    
167            * Work/testDb.R: Script to test database stuff.
168    
169            * R/: Fixed various database-related bugs. Seems to be rather
170            usable now, i.e., consider it alpha status for now.
171    
172    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
173    
174            * R/: Fixed some bugs related to database support.
175    
176    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
177    
178            * man/: Added a lot of examples to the manuals.
179    
180    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
181    
182            * man/: Updated parts of the documentation.
183    
184            * R/textdoccol.R (asPlain): Added conversion from newsgroup
185            documents to plain text documents.
186    
187    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
188    
189            * R/textdoccol.R: Finished experimental database support. Not yet
190            intensively tested.
191    
192            * R/source.R: Now each source has a default reader.
193    
194            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
195            class anymore.
196    
197            * R/plaintextdoc.R: Custom show method for plain text documents.
198    
199            * R/aobjects.R: Added a class for structured text documents.
200    
201            * R/reader.R: Replaced remaining \code{parser} occurrences with
202            \code{reader}.
203    
204            * R/textdoccol.R (summary): Indent tags.
205    
206            * R/textdoccol.R (removePunctuation): Transform method to remove
207            punctuation marks.
208    
209    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
210    
211            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
212            using prescindMeta().
213    
214    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
215    
216            * R/textdoccol.R: Improved database support.
217    
218    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
219    
220            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
221    
222            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
223            language code.
224    
225            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
226            into parserControl argument.
227    
228            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
229    
230    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
231    
232            * Work/tmDataSetup.R: The datasets acq and crude can now be
233            created on the fly.
234    
235            * R/stopwords.R: Introduced a function returning the stopwords for
236            a given language (English, German and French at the moment).
237    
238            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
239            otherwise falls back to Snowball package.
240    
241    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
242    
243            * man/dissimilarity-methods.Rd: Make clear that any method offered
244            by "dists" from package "cba" can be used.
245    
246    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
247    
248            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
249            to Kurt's latex suggestion. Removed points and underscores in
250            variable names for consistent naming.
251    
252            * DESCRIPTION: Update to version 0.1-2.
253    
254            * man/TextRepository.Rd: Fixed bug in documentation.
255    
256    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * DESCRIPTION: Update to version 0.1-1.
259    
260    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
263            wordStem.
264    
265    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
266    
267            * R/: Changes due to Kurt's review.
268    
269    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
270    
271            * R/: Implemented improvements based upon comments by David
272            Meyer.
273    
274    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
275    
276            * inst/doc/: Rewrote vignette.
277    
278            * man/: Improved documentation.
279    
280    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
281    
282            * man/: Updated documentation.
283    
284            * DESCRIPTION: Changed package name to "tm". Updated version to
285            0.1 for first CRAN release.
286    
287            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
288            list archive example.
289    
290            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
291            archive example.
292    
293            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
294            from (several mails per box) mbox format to (single mail per file)
295            eml format.
296    
297    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
298    
299            * data/crude.rda: Rebuilt.
300    
301            * data/acq.rda: Rebuilt.
302    
303            * R/reader.R: Factored out reader and parser methods from
304            textdoccol.R.
305    
306            * R/source.R: Factored out Source methods from aobjects.R and
307            textdoccol.R.
308            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
309            feeds.
310    
311            * R/textdoccol.R (DirSource): Added support for recursive
312            traversal of directories.
313    
314    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/textdoccol.R ([[): Loads the document corpus automatically
317            into memory upon access.
318            (tm_transform, tm_filter): Removed several checks whether the
319            document is already loaded ([[ ensures this now).
320            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
321            mailing list archive.
322    
323    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
324    
325            * R/aobjects.R (TextDocument): Is now a virtual class.
326            (Source): Is now a virtual class.
327    
328    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
329    
330            * R/textdoccol.R (c): Support for an arbitrary number of document
331            collections.
332    
333    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
336            append_meta and remove_meta.
337    
338            * R/textdoccol.R: Removed modify_metadata method.
339    
340            * R/textrepo.R: Removed modify_metadata method.
341    
342            * R/textdoccol.R (remove_meta): Supports removal of document
343            collection metadata and document (= in data frame) metadata.
344    
345    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
346    
347            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
348    
349            * data/crude.rda: Rebuilt.
350    
351            * data/acq.rda: Rebuilt.
352    
353            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
354    
355            * R/textdoccol.R ([): Bug fix for subsetting a document
356            collection's data frame.
357    
358    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
359    
360            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
361            to s_filter.
362    
363            * R/textdoccol.R: Local text documents' metadata can now be copied
364            to a document collection's data frame with prescind_meta.
365    
366    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
367    
368            * R/: Text documents' slot metadata is now accessible in s_filter.
369    
370            * R/: Rewrote s_filter function (still has some restrictions).
371    
372    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * R/: Various fixes in handling metadata.
375    
376            * R/: Added update mechanism for text document collections.
377    
378    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/: Merging of document collections now creates a binary tree
381            for reconstructing merged document collections.
382    
383            * R/: Redesign of metadata for document collections.
384    
385    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
386    
387            * R/: Messages now use \code{ngettext}.
388    
389    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * R/: Added functions for modifying and removing metadata.
392    
393    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * man/: Updated some documentation.
396    
397            * R/: Corrected some connection issues.
398    
399            * inst/doc: Worked on the vignette.
400    
401    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
402    
403            * inst/: Added texts and started vignette.
404    
405            * R/: Final changes based upon David's comments.
406    
407    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * NAMESPACE: Corrected exports (generic methods need exportMethods
410            directives!).
411    
412    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/: Modified the TextDocCol constructor and various parsers. It
415            is now modular and supports various file formats via plugins (see
416            the new "Source" class).
417    
418    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
419    
420            * man/: Revised documentation after previous code changes.
421    
422    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * R/: Remaining changes as discussed with David.
425    
426    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
427    
428            * R/: Some changes as suggested by David. The rest will follow
429            within the next few days.
430    
431    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * man/: Finished documentation.
434    
435    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * man/: Wrote some documentation.
438    
439    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/: Further syntactic sugar in form of additional assignment and
442            accessor methods.
443    
444    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
445    
446            * R/: Syntactic sugar in form of "length", "show" and "summary"
447            operators.
448    
449    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
450    
451            * R/: Diverse updates. Mainly on default operators ("[" or "c")
452            and dissimilarities.
453    
454    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/: Added similarity functions.
457    
458            * data/: Added English stopwords.
459    
460    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * data/: Examples compiled for new features.
463    
464            * R/: Changes due to new structure.
465    
466            * NAMESPACE: Corrected namespace to reflect new structure.
467    
468            * R/termdocmatrix.R: Adapted for new naming scheme.
469    
470    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
471    
472            * R/textdoccol.R: Adapted code for new class structure. Wrote
473            several transform and filter functions operating on text document
474            collections (alias text document databases).
475    
476            * R/aobjects.R: Adapted class structure with inheritance,
477            repositories and additional meta data. Loading files on demand is
478            now possible.
479    
480    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
481    
482            * R/: Some cosmetic cleanups.
483    
484            * inst/: Removed vignette on clustering. That and much more is now
485            described in the JSS paper on text mining. Based upon that
486            article an elaborated vignette will be incorporated in the future.
487    
488    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
489    
490            * R/: Updated generic S4 methods to comply with signature changes
491            in newer versions of R (> 2.3).
492    
493    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
494    
495            * ext/R/importRIS.R: Automatic RIS import is now possible.
496    
497    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/textdoccol.R: Added RIS HTML input format.
500    
501    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * R/textdoccol.R: Removed bug that caused invalid text document
504            collections when handling many input files.
505    
506    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * R/textdoccol.R: Restructured and extended file import
509            mechanism.
510    
511            * inst/doc/clustering.Rnw: Adapted vignette for use with
512            ReutNews.rda
513    
514            * man/ReutNews.Rd: Documentation for ReutNews.rda
515    
516            * data/ReutNews.rda: A tiny Reuters21578 example data set.
517    
518  2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
519    
520          * inst/doc/clustering.Rnw: Wrote a small vignette to present the
