[tm] Diff of /pkg/ChangeLog
Old: trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
New: trunk/tm/ChangeLog revision 756, Wed Jun 6 17:12:11 2007 UTC
2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/tm.Rnw: Fixed some typos in vignette.

2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (replaceWords): Added method to replace a set of
        words by a single word. Useful for synonyms.

2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/TermDocMatrix.Rd: Fixed documentation on Data slot.

2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (textvector): Small fix for dealing with empty
        vectors. Thanks to Ariel Maguyon for his error report.
        (removeSparseTerms): New function to remove columns from a
        term-document matrix exceeding a sparse factor.

2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.

2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/sFilter.Rd: Corrected documentation on statement format (use
        '==' instead of '=').

2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/aobjects.R (StructuredTextDocument): Inherits from
        TextDocument.

2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
        on sparse matrices as proposed by Martin Maechler.

2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
        \pkg{filehash} version makes them deprecated.

2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (textvector): Stemming is now performed before
        erasing stopwords.
        (weightMatrix): Adapted to handle sparse matrices.
        (TermDocMatrix): Sparse matrix is now efficiently built by
        direct stepwise insertion of row values into it.

2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
        due to ongoing problems. For our purposes the latter is as useful
        as the replaced package.

2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.

        * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.

2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
        languages with available stopwords.

2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/tm.Rnw: Minor corrections in the vignette.

2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Update to version 0.2, since a lot of new features
        have been integrated.

        * inst/stopwords: Updated existing stopwords and added stopwords
        for various other languages.

2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated documentation.

        * Work/testDb.R: Script to test database stuff.

        * R/: Fixed various database related bugs. Seems to be rather
        useable now, i.e., consider as alpha status for now.

2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Fixed some bugs related to database support.

2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Added a lot of examples to the manuals.

2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated parts of the documentation.

        * R/textdoccol.R (asPlain): Added conversion from newsgroup
        documents to plain text documents.

2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Finished experimental database support. Not yet
        intensively tested.

        * R/source.R: Now each source has a default reader.

        * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
        class anymore.

        * R/plaintextdoc.R: Custom show method for plain text documents.

        * R/aobjects.R: Added a class for structured text documents.

        * R/reader.R: Replaced remaining \code{parser} occurrences with
        \code{reader}.

        * R/textdoccol.R (summary): Indent tags.

        * R/textdoccol.R (removePunctuation): Transform method to remove
        punctuation marks.

2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (sFilter): Simplified sFilter significantly by
        using prescindMeta().

2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Improved database support.

2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.

        * R/resolve.R (resolveISOcode): Extracts the language from an ISO
        language code.

        * R/textdoccol.R (TextDocCol): Refactored several parser arguments
        into parserControl argument.

        * R/aobjects.R (TextDocument): Introduced the "Language" slot.

2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Work/tmDataSetup.R: The datasets acq and crude can now be
        created on the fly.

        * R/stopwords.R: Introduced a function returning the stopwords for
        a given language (English, German and French at the moment).

        * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
        otherwise falls back to Snowball package.

2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/dissimilarity-methods.Rd: Make clear that any method offered
        by "dists" from package "cba" can be used.

2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
        to Kurt's LaTeX suggestion. Removed points and underscores in
        variable names for consistent naming.

        * DESCRIPTION: Update to version 0.1-2.

        * man/TextRepository.Rd: Fixed bug in documentation.

2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Update to version 0.1-1.

2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
        wordStem.

2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Changes due to Kurt's review.

2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Implemented improvements based upon comments by David
        Meyer.

2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/: Rewrote vignette.

        * man/: Improved documentation.

2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated documentation.

        * DESCRIPTION: Changed package name to "tm". Updated version to
        0.1 for first CRAN release.

        * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
        list archive example.

        * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
        archive example.

        * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
        from (several mails per box) mbox format to (single mail per file)
        eml format.

2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * data/crude.rda: Rebuilt.

        * data/acq.rda: Rebuilt.

        * R/reader.R: Factored out reader and parser methods from
        textdoccol.R.

        * R/source.R: Factored out Source methods from aobjects.R and
        textdoccol.R.
        (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
        feeds.

        * R/textdoccol.R (DirSource): Added support for recursive
        traversal of directories.

2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R ([[): Loads the document corpus automatically
        into memory upon access.
        (tm_transform, tm_filter): Removed several checks whether the
        document is already loaded ([[ ensures this now).
        (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
        mailing list archive.

2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/aobjects.R (TextDocument): Is now a virtual class.
        (Source): Is now a virtual class.

2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (c): Support for an arbitrary number of document
        collections.

2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textrepo.R: Updated TextRepository (constructor), append_elem,
        append_meta and remove_meta.

        * R/textdoccol.R: Removed modify_metadata method.

        * R/textrepo.R: Removed modify_metadata method.

        * R/textdoccol.R (remove_meta): Supports removal of document
        collection metadata and document (= in data frame) metadata.

2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.

        * data/crude.rda: Rebuilt.

        * data/acq.rda: Rebuilt.

        * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.

        * R/textdoccol.R ([): Bug fix for subsetting a document
        collection's data frame.

2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Bug fixes in s_filter. Added full query support
        to s_filter.

        * R/textdoccol.R: Local text documents' metadata can now be copied
        to a document collection's data frame with prescind_meta.

2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Text documents' slot metadata is now accessible in s_filter.

        * R/: Rewrote s_filter function (still has some restrictions).

2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Various fixes in handling metadata.

        * R/: Added update mechanism for text document collections.

2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Merging of document collections now creates a binary tree
        for reconstructing merged document collections.

        * R/: Redesign of metadata for document collections.

2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Messages now use \code{ngettext}.

2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Added functions for modifying and removing metadata.

2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated some documentation.

        * R/: Corrected some connection issues.

        * inst/doc: Worked on the vignette.

2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/: Added texts and started vignette.

        * R/: Final changes based upon David's comments.

2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * NAMESPACE: Corrected exports (generic methods need exportMethods
        directives!).

2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Modified the TextDocCol constructor and various parsers. It
        is now modular and supports various file formats via plugins (see
        the new "Source" class).

2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Revised documentation after previous code changes.

2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Remaining changes as discussed with David.

2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Some changes as suggested by David. The rest will follow
        within the next few days.

2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Finished documentation.

2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Wrote some documentation.

2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Further syntactic sugar in form of additional assignment and
        accessor methods.

2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Syntactic sugar in form of "length", "show" and "summary"
        operators.

2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Diverse updates. Mainly on default operators ("[" or "c")
        and dissimilarities.

2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Added similarity functions.

        * data/: Added English stopwords.

2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * data/: Examples compiled for new features.

        * R/: Changes due to new structure.

        * NAMESPACE: Corrected namespace to reflect new structure.

        * R/termdocmatrix.R: Adapted for new naming scheme.

2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Adapted code for new class structure. Wrote
        several transform and filter functions operating on text document
        collections (alias text document databases).

        * R/aobjects.R: Adapted class structure with inheritance,
        repositories and additional meta data. Loading files on demand is
        now possible.

2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Some cosmetic cleanups.

        * inst/: Removed vignette on clustering. That and much more is now
        described in the JSS paper on text mining. Based upon that
        article an elaborated vignette will be incorporated in the future.

2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Updated generic S4 methods to comply with signature changes
        in newer versions of R (> 2.3).

2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * ext/R/importRIS.R: Automatic RIS import is now possible.

2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Added RIS HTML input format.

2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Removed bug that caused invalid text document
        collections when handling many input files.

2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Restructured and extended file import
        mechanism.

        * inst/doc/clustering.Rnw: Adapted vignette for use with
        ReutNews.rda

        * man/ReutNews.Rd: Documentation for ReutNews.rda

        * data/ReutNews.rda: A tiny Reuters21578 example data set.

2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/clustering.Rnw: Wrote a small vignette to present the
        clustering facilities of this package.

2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/aobjects.R: Changed package document structure to avoid class
        dependency problems.

2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Wrote a script for the ModLewis Split for the Reuters-21578 XML
        data set.

        * Finished documentation and reordered directory structure. Now "R
        CMD check textmin" works without errors.
