Diff of /trunk/tm/ChangeLog

trunk/R/trunk/ChangeLog revision 40, Tue Feb 14 15:02:45 2006 UTC
trunk/tm/ChangeLog revision 765, Fri Jul 13 15:53:45 2007 UTC
# Line 1  Line 1 
1    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
4    
5            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
6    
7            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
8    
9            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
10    
11    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
12    
13            * R/distmeasure.R (dissimilarity): Replaced dists call from
14            package cba by new dist call from package proxy.
15    
16    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
17    
18            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
19    
20    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
21    
22            * R/termdocmatrix.R: require() uses the quietly option to suppress
23            loading messages.
24    
25    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
26    
27            * R/dictionary.R: Added dictionary support.
28    
29    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
30    
31            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
32            documents. This simplifies some functions, e.g., asPlain.
33    
34    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
35    
36            * inst/doc/tm.Rnw: Fixed some typos in vignette.
37    
38    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
39    
40            * R/textdoccol.R (replaceWords): Added method to replace a set of
41            words by a single word. Useful for synonyms.
42    
43    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
44    
45            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
46    
47    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
48    
49            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
50            vectors. Thanks to Ariel Maguyon for his error report.
51            (removeSparseTerms): New function to remove columns from a
52            term-document matrix exceeding a sparse factor.
53    
54    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
55    
56            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
57    
58    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
59    
60            * man/sFilter.Rd: Corrected documentation on statement format (use
61            '==' instead of '=').
62    
63    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
64    
65            * R/aobjects.R (StructuredTextDocument): Inherits from
66            TextDocument.
67    
68    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
69    
70            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
71            on sparse matrices as proposed by Martin Maechler.
72    
73    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
74    
75            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
76            latest \pkg{filehash} version deprecates them.
77    
78    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
79    
80            * R/termdocmatrix.R (textvector): Stemming is now performed before
81            erasing stopwords.
82            (weightMatrix): Adapted to handle sparse matrices.
83            (TermDocMatrix): Sparse matrix is now efficiently built by
84            direct stepwise insertion of row values into it.
85    
86    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
87    
88            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
89            due to ongoing problems. For our purposes the latter is as useful
90            as the replaced package.
91    
92    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
93    
94            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
95    
96            * man/TermDocMatrix.Rd: Removed deprecated \code{language} argument.
97    
98    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
99    
100            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
101            languages with available stopwords.
102    
103    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
104    
105            * inst/doc/tm.Rnw: Minor corrections in the vignette.
106    
107    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
108    
109            * DESCRIPTION: Update to version 0.2, since a lot of new features
110            have been integrated.
111    
112            * inst/stopwords: Updated existing stopwords and added stopwords
113            for various other languages.
114    
115    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
116    
117            * man/: Updated documentation.
118    
119            * Work/testDb.R: Script to test database stuff.
120    
121            * R/: Fixed various database-related bugs. Seems rather usable
122            now, i.e., consider it alpha status for now.
123    
124    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
125    
126            * R/: Fixed some bugs related to database support.
127    
128    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
129    
130            * man/: Added a lot of examples to the manuals.
131    
132    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
133    
134            * man/: Updated parts of the documentation.
135    
136            * R/textdoccol.R (asPlain): Added conversion from newsgroup
137            documents to plain text documents.
138    
139    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
140    
141            * R/textdoccol.R: Finished experimental database support. Not yet
142            intensively tested.
143    
144            * R/source.R: Now each source has a default reader.
145    
146            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
147            class anymore.
148    
149            * R/plaintextdoc.R: Custom show method for plain text documents.
150    
151            * R/aobjects.R: Added a class for structured text documents.
152    
153            * R/reader.R: Replaced remaining \code{parser} occurrences with
154            \code{reader}.
155    
156            * R/textdoccol.R (summary): Indent tags.
157    
158            * R/textdoccol.R (removePunctuation): Transform method to remove
159            punctuation marks.
160    
161    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
162    
163            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
164            using prescindMeta().
165    
166    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
167    
168            * R/textdoccol.R: Improved database support.
169    
170    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
171    
172            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
173    
174            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
175            language code.
176    
177            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
178            into parserControl argument.
179    
180            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
181    
182    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
183    
184            * Work/tmDataSetup.R: The datasets acq and crude can now be
185            created on the fly.
186    
187            * R/stopwords.R: Introduced a function returning the stopwords for
188            a given language (English, German and French at the moment).
189    
190            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
191            otherwise falls back to Snowball package.
192    
193    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * man/dissimilarity-methods.Rd: Make clear that any method offered
196            by "dists" from package "cba" can be used.
197    
198    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
199    
200            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
201            to Kurt's latex suggestion. Removed points and underscores in
202            variable names for consistent naming.
203    
204            * DESCRIPTION: Update to version 0.1-2.
205    
206            * man/TextRepository.Rd: Fixed bug in documentation.
207    
208    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
209    
210            * DESCRIPTION: Update to version 0.1-1.
211    
212    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
213    
214            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
215            wordStem.
216    
217    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
218    
219            * R/: Changes due to Kurt's review.
220    
221    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
222    
223            * R/: Implemented improvements based upon comments by David
224            Meyer.
225    
226    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
227    
228            * inst/doc/: Rewrote vignette.
229    
230            * man/: Improved documentation.
231    
232    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
233    
234            * man/: Updated documentation.
235    
236            * DESCRIPTION: Changed package name to "tm". Updated version to
237            0.1 for first CRAN release.
238    
239            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
240            list archive example.
241    
242            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
243            archive example.
244    
245            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
246            from (several mails per box) mbox format to (single mail per file)
247            eml format.
248    
249    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
250    
251            * data/crude.rda: Rebuilt.
252    
253            * data/acq.rda: Rebuilt.
254    
255            * R/reader.R: Factored out reader and parser methods from
256            textdoccol.R.
257    
258            * R/source.R: Factored out Source methods from aobjects.R and
259            textdoccol.R.
260            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
261            feeds.
262    
263            * R/textdoccol.R (DirSource): Added support for recursive
264            traversal of directories.
265    
266    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
267    
268            * R/textdoccol.R ([[): Loads the document corpus automatically
269            into memory upon access.
270            (tm_transform, tm_filter): Removed several checks whether the
271            document is already loaded ([[ ensures this now).
272            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
273            mailing list archive.
274    
275    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
276    
277            * R/aobjects.R (TextDocument): Is now a virtual class.
278            (Source): Is now a virtual class.
279    
280    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
281    
282            * R/textdoccol.R (c): Support for an arbitrary number of document
283            collections.
284    
285    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
286    
287            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
288            append_meta and remove_meta.
289    
290            * R/textdoccol.R: Removed modify_metadata method.
291    
292            * R/textrepo.R: Removed modify_metadata method.
293    
294            * R/textdoccol.R (remove_meta): Supports removal of document
295            collection metadata and document (= in data frame) metadata.
296    
297    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
298    
299            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
300    
301            * data/crude.rda: Rebuilt.
302    
303            * data/acq.rda: Rebuilt.
304    
305            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
306    
307            * R/textdoccol.R ([): Bug fix for subsetting a document
308            collection's data frame.
309    
310    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
313            to s_filter.
314    
315            * R/textdoccol.R: Local text documents' metadata can now be copied
316            to a document collection's data frame with prescind_meta.
317    
318    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
319    
320            * R/: Text documents' slot metadata is now accessible in s_filter.
321    
322            * R/: Rewrote s_filter function (still has some restrictions).
323    
324    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/: Various fixes in handling metadata.
327    
328            * R/: Added update mechanism for text document collections.
329    
330    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * R/: Merging of document collections now creates a binary tree
333            for reconstructing merged document collections.
334    
335            * R/: Redesign of metadata for document collections.
336    
337    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * R/: Messages now use \code{ngettext}.
340    
341    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
342    
343            * R/: Added functions for modifying and removing metadata.
344    
345    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
346    
347            * man/: Updated some documentation.
348    
349            * R/: Corrected some connection issues.
350    
351            * inst/doc: Worked on the vignette.
352    
353    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
354    
355            * inst/: Added texts and started vignette.
356    
357            * R/: Final changes based upon David's comments.
358    
359    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * NAMESPACE: Corrected exports (generic methods need exportMethods
362            directives!).
363    
364    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * R/: Modified the TextDocCol constructor and various parsers. It
367            is now modular and supports various file formats via plugins (see
368            the new "Source" class).
369    
370    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * man/: Revised documentation after previous code changes.
373    
374    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * R/: Remaining changes as discussed with David.
377    
378    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/: Some changes as suggested by David. The rest will follow
381            within the next days.
382    
383    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * man/: Finished documentation.
386    
387    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * man/: Wrote some documentation.
390    
391    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/: Further syntactic sugar in the form of additional assignment and
394            accessor methods.
395    
396    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
397    
398            * R/: Syntactic sugar in the form of "length", "show" and "summary"
399            operators.
400    
401    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
402    
403            * R/: Various updates. Mainly on default operators ("[" or "c")
404            and dissimilarities.
405    
406    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
407    
408            * R/: Added similarity functions.
409    
410            * data/: Added English stopwords.
411    
412    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * data/: Examples compiled for new features.
415    
416            * R/: Changes due to new structure.
417    
418            * NAMESPACE: Corrected namespace to reflect new structure.
419    
420            * R/termdocmatrix.R: Adapted for new naming scheme.
421    
422    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * R/textdoccol.R: Adapted code for new class structure. Wrote
425            several transform and filter functions operating on text document
426            collections (alias text document databases).
427    
428            * R/aobjects.R: Adapted class structure with inheritance,
429            repositories and additional meta data. Loading files on demand is
430            now possible.
431    
432    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * R/: Some cosmetic cleanups.
435    
436            * inst/: Removed vignette on clustering. That and much more is now
437            described in the JSS paper on text mining. Based upon that
438            article, a more elaborate vignette will be incorporated in the future.
439    
440    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
441    
442            * R/: Updated generic S4 methods to comply with signature changes
443            in newer versions of R (> 2.3).
444    
445    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * ext/R/importRIS.R: Automatic RIS import is now possible.
448    
449    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
450    
451            * R/textdoccol.R: Added RIS HTML input format.
