[tm] Diff of /trunk/tm/ChangeLog

From: trunk/R/trunk/ChangeLog revision 23, Sat Nov 19 18:25:41 2005 UTC
To:   trunk/tm/ChangeLog revision 747, Fri Apr 27 18:16:53 2007 UTC

2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
        latest \pkg{filehash} version deprecates them.

2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (textvector): Stemming is now performed before
        erasing stopwords.
        (weightMatrix): Adapted to handle sparse matrices.
        (TermDocMatrix): Sparse matrix is now efficiently built by
        direct stepwise insertion of row values into it.

2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
        due to ongoing problems. For our purposes the latter is as useful
        as the replaced package.

2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.

        * man/TermDocMatrix.Rd: Removed deprecated \code{language} argument.

2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
        languages with available stopwords.

2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/tm.Rnw: Minor corrections in the vignette.

2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Update to version 0.2, since a lot of new features
        have been integrated.

        * inst/stopwords: Updated existing stopwords and added stopwords
        for various other languages.

2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated documentation.

        * Work/testDb.R: Script to test database stuff.

        * R/: Fixed various database related bugs. Seems to be rather
        useable now, i.e., consider as alpha status for now.

2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Fixed some bugs related to database support.

2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Added a lot of examples to the manuals.

2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated parts of the documentation.

        * R/textdoccol.R (asPlain): Added conversion from newsgroup
        documents to plain text documents.

2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Finished experimental database support. Not yet
        intensively tested.

        * R/source.R: Now each source has a default reader.

        * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
        class anymore.

        * R/plaintextdoc.R: Custom show method for plain text documents.

        * R/aobjects.R: Added a class for structured text documents.

        * R/reader.R: Replaced remaining \code{parser} occurrences with
        \code{reader}.

        * R/textdoccol.R (summary): Indent tags.

        * R/textdoccol.R (removePunctuation): Transform method to remove
        punctuation marks.

2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (sFilter): Simplified sFilter significantly by
        using prescindMeta().

2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Improved database support.

2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.

        * R/resolve.R (resolveISOcode): Extracts the language from an ISO
        language code.

        * R/textdoccol.R (TextDocCol): Refactored several parser arguments
        into parserControl argument.

        * R/aobjects.R (TextDocument): Introduced the "Language" slot.

2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Work/tmDataSetup.R: The datasets acq and crude can now be
        created on the fly.

        * R/stopwords.R: Introduced a function returning the stopwords for
        a given language (English, German and French at the moment).

        * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
        otherwise falls back to the Snowball package.

2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/dissimilarity-methods.Rd: Make clear that any method offered
        by "dists" from package "cba" can be used.

2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes bug according
        to Kurt's LaTeX suggestion. Removed dots and underscores in
        variable names for consistent naming.

        * DESCRIPTION: Update to version 0.1-2.

        * man/TextRepository.Rd: Fixed bug in documentation.

2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * DESCRIPTION: Update to version 0.1-1.

2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
        wordStem.

2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Changes due to Kurt's review.

2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Implemented improvements based upon comments by David
        Meyer.

2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/: Rewrote vignette.

        * man/: Improved documentation.

2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated documentation.

        * DESCRIPTION: Changed package name to "tm". Updated version to
        0.1 for first CRAN release.

        * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
        list archive example.

        * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
        archive example.

        * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
        from (several mails per box) mbox format to (single mail per file)
        eml format.

2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * data/crude.rda: Rebuilt.

        * data/acq.rda: Rebuilt.

        * R/reader.R: Factored out reader and parser methods from
        textdoccol.R.

        * R/source.R: Factored out Source methods from aobjects.R and
        textdoccol.R.
        (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
        feeds.

        * R/textdoccol.R (DirSource): Added support for recursive
        traversal of directories.

2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R ([[): Loads the document corpus automatically
        into memory upon access.
        (tm_transform, tm_filter): Removed several checks whether the
        document is already loaded ([[ ensures this now).
        (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
        mailing list archive.

2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/aobjects.R (TextDocument): Is now a virtual class.
        (Source): Is now a virtual class.

2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (c): Support for an arbitrary number of document
        collections.

2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textrepo.R: Updated TextRepository (constructor), append_elem,
        append_meta and remove_meta.

        * R/textdoccol.R: Removed modify_metadata method.

        * R/textrepo.R: Removed modify_metadata method.

        * R/textdoccol.R (remove_meta): Supports removal of document
        collection metadata and document (= in data frame) metadata.

2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.

        * data/crude.rda: Rebuilt.

        * data/acq.rda: Rebuilt.

        * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.

        * R/textdoccol.R ([): Bug fix for subsetting a document
        collection's data frame.

2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Bug fixes in s_filter. Added full query support
        to s_filter.

        * R/textdoccol.R: Local text documents' metadata can now be copied
        to a document collection's data frame with prescind_meta.

2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Text documents' slot metadata is now accessible in s_filter.

        * R/: Rewrote s_filter function (still has some restrictions).

2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Various fixes in handling metadata.

        * R/: Added update mechanism for text document collections.

2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Merging of document collections now creates a binary tree
        for reconstructing merged document collections.

        * R/: Redesign of metadata for document collections.

2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Messages now use \code{ngettext}.

2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Added functions for modifying and removing metadata.

2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Updated some documentation.

        * R/: Corrected some connection issues.

        * inst/doc: Worked on the vignette.

2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/: Added texts and started vignette.

        * R/: Final changes based upon David's comments.

2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * NAMESPACE: Corrected exports (generic methods need exportMethods
        directives!).

2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Modified the TextDocCol constructor and various parsers. It
        is now modular and supports various file formats via plugins (see
        the new "Source" class).

2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Revised documentation after previous code changes.

2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Remaining changes as discussed with David.

2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Some changes as suggested by David. The rest will follow
        within the next few days.

2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Finished documentation.

2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * man/: Wrote some documentation.

2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Further syntactic sugar in form of additional assignment and
        accessor methods.

2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Syntactic sugar in form of "length", "show" and "summary"
        operators.

2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Various updates, mainly to default operators ("[" or "c")
        and dissimilarities.

2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Added similarity functions.

        * data/: Added English stopwords.

2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * data/: Examples compiled for new features.

        * R/: Changes due to new structure.

        * NAMESPACE: Corrected namespace to reflect new structure.

        * R/termdocmatrix.R: Adapted for new naming scheme.

2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Adapted code for new class structure. Wrote
        several transform and filter functions operating on text document
        collections (alias text document databases).

        * R/aobjects.R: Adapted class structure with inheritance,
        repositories and additional meta data. Loading files on demand is
        now possible.

2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Some cosmetic cleanups.

        * inst/: Removed vignette on clustering. That and much more is now
        described in the JSS paper on text mining. An elaborate vignette
        based upon that article will be incorporated in the future.

2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/: Updated generic S4 methods to comply with signature changes
        in newer versions of R (> 2.3).

2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * ext/R/importRIS.R: Automatic RIS import is now possible.

2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Added RIS HTML input format.

2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Removed bug that caused invalid text document
        collections when handling many input files.

2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Restructured and extended file import
        mechanism.

        * inst/doc/clustering.Rnw: Adapted vignette for use with
        ReutNews.rda.

        * man/ReutNews.Rd: Documentation for ReutNews.rda.

        * data/ReutNews.rda: A tiny Reuters21578 example data set.

2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * inst/doc/clustering.Rnw: Wrote a small vignette to present the
        clustering facilities of this package.

2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/aobjects.R: Changed package document structure to avoid class
        dependency problems.

2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Wrote a script for the ModLewis Split for the Reuters-21578 XML
        data set.

        * Finished documentation and reordered directory structure. Now "R
        CMD check textmin" works without errors.

2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * src/: Various splits can now be easily created for the
        Reuters21578 data set.

2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Updated documentation.

2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * Wrote R documentation for some classes and methods.

2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>

        * R/textdoccol.R: Constructor of textdoccol allows import of CSV
