[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
trunk/tm/ChangeLog revision 757, Thu Jun 7 17:41:56 2007 UTC
# Line 1  Line 1 
1    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
4            documents. This simplifies some functions, e.g., asPlain.
5    
6    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
7    
8            * inst/doc/tm.Rnw: Fixed some typos in vignette.
9    
10    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
11    
12            * R/textdoccol.R (replaceWords): Added method to replace a set of
13            words by a single word. Useful for synonyms.
14    
15    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
16    
17            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
18    
19    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
20    
21            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
22            vectors. Thanks to Ariel Maguyon for his error report.
23            (removeSparseTerms): New function to remove columns from a
24            term-document matrix exceeding a sparse factor.
25    
26    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
27    
28            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
29    
30    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
31    
32            * man/sFilter.Rd: Corrected documentation on statement format (use
33            '==' instead of '=').
34    
35    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
36    
37            * R/aobjects.R (StructuredTextDocument): Inherits from
38            TextDocument.
39    
40    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
41    
42            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
43            on sparse matrices as proposed by Martin Maechler.
44    
45    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
46    
47            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
48            \pkg{filehash} version deprecates them.
49    
50    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
51    
52            * R/termdocmatrix.R (textvector): Stemming is now performed before
53            erasing stopwords.
54            (weightMatrix): Adapted to handle sparse matrices.
55            (TermDocMatrix): Sparse matrix is now efficiently built by
56            direct stepwise insertion of row values into it.
57    
58    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
59    
60            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
61            due to ongoing problems. For our purposes the latter is as useful
62            as the replaced package.
63    
64    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
65    
66            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
67    
68            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
69    
70    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
71    
72            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
73            languages with available stopwords.
74    
75    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
76    
77            * inst/doc/tm.Rnw: Minor corrections in the vignette.
78    
79    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
80    
81            * DESCRIPTION: Update to version 0.2, since a lot of new features
82            have been integrated.
83    
84            * inst/stopwords: Updated existing stopwords and added stopwords
85            for various other languages.
86    
87    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
88    
89            * man/: Updated documentation.
90    
91            * Work/testDb.R: Script to test database stuff.
92    
93            * R/: Fixed various database-related bugs. Seems to be rather
94            usable now, i.e., consider it alpha status for now.
95    
96    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
97    
98            * R/: Fixed some bugs related to database support.
99    
100    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
101    
102            * man/: Added a lot of examples to the manuals.
103    
104    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
105    
106            * man/: Updated parts of the documentation.
107    
108            * R/textdoccol.R (asPlain): Added conversion from newsgroup
109            documents to plain text documents.
110    
111    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
112    
113            * R/textdoccol.R: Finished experimental database support. Not yet
114            intensively tested.
115    
116            * R/source.R: Now each source has a default reader.
117    
118            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
119            class anymore.
120    
121            * R/plaintextdoc.R: Custom show method for plain text documents.
122    
123            * R/aobjects.R: Added a class for structured text documents.
124    
125            * R/reader.R: Replaced remaining \code{parser} occurrences with
126            \code{reader}.
127    
128            * R/textdoccol.R (summary): Indent tags.
129    
130            * R/textdoccol.R (removePunctuation): Transform method to remove
131            punctuation marks.
132    
133    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
134    
135            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
136            using prescindMeta().
137    
138    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
139    
140            * R/textdoccol.R: Improved database support.
141    
142    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
143    
144            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
145    
146            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
147            language code.
148    
149            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
150            into parserControl argument.
151    
152            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
153    
154    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
155    
156            * Work/tmDataSetup.R: The datasets acq and crude can now be
157            created on the fly.
158    
159            * R/stopwords.R: Introduced a function returning the stopwords for
160            a given language (English, German and French at the moment).
161    
162            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
163            otherwise falls back to Snowball package.
164    
165    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
166    
167            * man/dissimilarity-methods.Rd: Make clear that any method offered
168            by "dists" from package "cba" can be used.
169    
170    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
171    
172            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes bug according
173            to Kurt's LaTeX suggestion. Removed points and underscores in
174            variable names for consistent naming.
175    
176            * DESCRIPTION: Update to version 0.1-2.
177    
178            * man/TextRepository.Rd: Fixed bug in documentation.
179    
180    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
181    
182            * DESCRIPTION: Update to version 0.1-1.
183    
184    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
185    
186            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
187            wordStem.
188    
189    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
190    
191            * R/: Changes due to Kurt's review.
192    
193    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * R/: Implemented improvements based upon comments by David
196            Meyer.
197    
198    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
199    
200            * inst/doc/: Rewrote vignette.
201    
202            * man/: Improved documentation.
203    
204    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
205    
206            * man/: Updated documentation.
207    
208            * DESCRIPTION: Changed package name to "tm". Updated version to
209            0.1 for first CRAN release.
210    
211            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
212            list archive example.
213    
214            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
215            archive example.
216    
217            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
218            from (several mails per box) mbox format to (single mail per file)
219            eml format.
220    
221    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
222    
223            * data/crude.rda: Rebuilt.
224    
225            * data/acq.rda: Rebuilt.
226    
227            * R/reader.R: Factored out reader and parser methods from
228            textdoccol.R.
229    
230            * R/source.R: Factored out Source methods from aobjects.R and
231            textdoccol.R.
232            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
233            feeds.
234    
235            * R/textdoccol.R (DirSource): Added support for recursive
236            traversal of directories.
237    
238    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
239    
240            * R/textdoccol.R ([[): Loads the document corpus automatically
241            into memory upon access.
242            (tm_transform, tm_filter): Removed several checks whether the
243            document is already loaded ([[ ensures this now).
244            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
245            mailing list archive.
246    
247    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
248    
249            * R/aobjects.R (TextDocument): Is now a virtual class.
250            (Source): Is now a virtual class.
251    
252    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
253    
254            * R/textdoccol.R (c): Support for an arbitrary number of document
255            collections.
256    
257    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
258    
259            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
260            append_meta and remove_meta.
261    
262            * R/textdoccol.R: Removed modify_metadata method.
263    
264            * R/textrepo.R: Removed modify_metadata method.
265    
266            * R/textdoccol.R (remove_meta): Supports removal of document
267            collection metadata and document (= in data frame) metadata.
268    
269    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
270    
271            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
272    
273            * data/crude.rda: Rebuilt.
274    
275            * data/acq.rda: Rebuilt.
276    
277            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
278    
279            * R/textdoccol.R ([): Bug fix for subsetting a document
280            collection's data frame.
281    
282    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
283    
284            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
285            to s_filter.
286    
287            * R/textdoccol.R: Local text documents' metadata can now be copied
288            to a document collection's data frame with prescind_meta.
289    
290    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
291    
292            * R/: Text documents' slot metadata is now accessible in s_filter.
293    
294            * R/: Rewrote s_filter function (still has some restrictions).
295    
296    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
297    
298            * R/: Various fixes in handling metadata.
299    
300            * R/: Added update mechanism for text document collections.
301    
302    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
303    
304            * R/: Merging of document collections now creates a binary tree
305            for reconstructing merged document collections.
306    
307            * R/: Redesign of metadata for document collections.
308    
309    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
310    
311            * R/: Messages now use \code{ngettext}.
312    
313    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
314    
315            * R/: Added functions for modifying and removing metadata.
316    
317    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
318    
319            * man/: Updated some documentation.
320    
321            * R/: Corrected some connection issues.
322    
323            * inst/doc: Worked on the vignette.
324    
325    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
326    
327            * inst/: Added texts and started vignette.
328    
329            * R/: Final changes based upon David's comments.
330    
331    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
332    
333            * NAMESPACE: Corrected exports (generic methods need exportMethods
334            directives!).
335    
336    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
337    
338            * R/: Modified the TextDocCol constructor and various parsers. It
339            is now modular and supports various file formats via plugins (see
340            the new "Source" class).
341    
342    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
343    
344            * man/: Revised documentation after previous code changes.
345    
346    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/: Remaining changes as discussed with David.
349    
350    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * R/: Some changes as suggested by David. The rest will follow
353            within the next few days.
354    
355    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * man/: Finished documentation.
358    
359    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * man/: Wrote some documentation.
362    
363    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * R/: Further syntactic sugar in the form of additional assignment
366            and accessor methods.
367    
368    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * R/: Syntactic sugar in the form of "length", "show" and "summary"
371            operators.
372    
373    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
374    
375            * R/: Various updates, mainly to default operators ("[" or "c")
376            and dissimilarities.
377    
378    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/: Added similarity functions.
381    
382            * data/: Added English stopwords.
383    
384    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
385    
386            * data/: Examples compiled for new features.
387    
388            * R/: Changes due to new structure.
389    
390            * NAMESPACE: Corrected namespace to reflect new structure.
391    
392            * R/termdocmatrix.R: Adapted for new naming scheme.
393    
394    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/textdoccol.R: Adapted code for new class structure. Wrote
397            several transform and filter functions operating on text document
398            collections (alias text document databases).
399    
400            * R/aobjects.R: Adapted class structure with inheritance,
401            repositories and additional meta data. Loading files on demand is
402            now possible.
403    
404    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
405    
406            * R/: Some cosmetic cleanups.
407    
408            * inst/: Removed vignette on clustering. That and much more is now
409            described in the JSS paper on text mining. Based on that
410            article, a more elaborate vignette will be incorporated in the future.
411    
412    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/: Updated generic S4 methods to comply with signature changes
415            in newer versions of R (> 2.3).
416    
417    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
418    
419            * ext/R/importRIS.R: Automatic RIS import is now possible.
420    
421    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * R/textdoccol.R: Added RIS HTML input format.
424    
425    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/textdoccol.R: Removed bug that caused invalid text document
428            collections when handling many input files.
429    
430    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * R/textdoccol.R: Restructured and extended file import
433            mechanism.
434    
435            * inst/doc/clustering.Rnw: Adapted vignette for use with
436            ReutNews.rda.
437    
438            * man/ReutNews.Rd: Documentation for ReutNews.rda.
439    
440            * data/ReutNews.rda: A tiny Reuters21578 example data set.
441    
442    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
443    
444            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
445            clustering facilities of this package.
446    
447    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
448    
449            * R/aobjects.R: Changed package document structure to avoid class
450            dependency problems.
451    
452    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
453    
454            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
455            data set.
456    
457            * Finished documentation and reordered directory structure. Now "R
458            CMD check textmin" works without errors.
459    
