[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 28, Tue Dec 6 13:46:33 2005 UTC
trunk/tm/ChangeLog revision 752, Sat May 19 22:39:04 2007 UTC
1    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
4            vectors. Thanks to Ariel Maguyon for his error report.
5            (removeSparseTerms): New function to remove columns from a
6            term-document matrix exceeding a sparse factor.
7    
8    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
9    
10            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
11    
12    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
13    
14            * man/sFilter.Rd: Corrected documentation on statement format (use
15            '==' instead of '=').
16    
17    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
18    
19            * R/aobjects.R (StructuredTextDocument): Inherits from
20            TextDocument.
21    
22    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
23    
24            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
25            on sparse matrices as proposed by Martin Maechler.
26    
27    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
28    
29            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
30            \pkg{filehash} version deprecates them.
31    
32    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
33    
34            * R/termdocmatrix.R (textvector): Stemming is now performed before
35            erasing stopwords.
36            (weightMatrix): Adapted to handle sparse matrices.
37            (TermDocMatrix): Sparse matrix is now efficiently built by
38            direct stepwise insertion of row values into it.
39    
40    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
41    
42            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
43            due to ongoing problems. For our purposes the latter is as useful
44            as the replaced package.
45    
46    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
47    
48            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
49    
50            * man/TermDocMatrix.Rd: Removed deprecated \code{language} argument.
51    
52    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
53    
54            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
55            languages with available stopwords.
56    
57    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
58    
59            * inst/doc/tm.Rnw: Minor corrections in the vignette.
60    
61    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
62    
63            * DESCRIPTION: Update to version 0.2, since a lot of new features
64            have been integrated.
65    
66            * inst/stopwords: Updated existing stopwords and added stopwords
67            for various other languages.
68    
69    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
70    
71            * man/: Updated documentation.
72    
73            * Work/testDb.R: Script to test database stuff.
74    
75            * R/: Fixed various database-related bugs. Seems to be rather
76            usable now, i.e., consider it alpha status for now.
77    
78    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
79    
80            * R/: Fixed some bugs related to database support.
81    
82    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
83    
84            * man/: Added a lot of examples to the manuals.
85    
86    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
87    
88            * man/: Updated parts of the documentation.
89    
90            * R/textdoccol.R (asPlain): Added conversion from newsgroup
91            documents to plain text documents.
92    
93    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
94    
95            * R/textdoccol.R: Finished experimental database support. Not yet
96            intensively tested.
97    
98            * R/source.R: Now each source has a default reader.
99    
100            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
101            class anymore.
102    
103            * R/plaintextdoc.R: Custom show method for plain text documents.
104    
105            * R/aobjects.R: Added a class for structured text documents.
106    
107            * R/reader.R: Replaced remaining \code{parser} occurrences with
108            \code{reader}.
109    
110            * R/textdoccol.R (summary): Indent tags.
111    
112            * R/textdoccol.R (removePunctuation): Transform method to remove
113            punctuation marks.
114    
115    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
116    
117            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
118            using prescindMeta().
119    
120    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
121    
122            * R/textdoccol.R: Improved database support.
123    
124    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
125    
126            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
127    
128            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
129            language code.
130    
131            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
132            into parserControl argument.
133    
134            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
135    
136    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
137    
138            * Work/tmDataSetup.R: The datasets acq and crude can now be
139            created on the fly.
140    
141            * R/stopwords.R: Introduced a function returning the stopwords for
142            a given language (English, German and French at the moment).
143    
144            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
145            otherwise falls back to Snowball package.
146    
147    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
148    
149            * man/dissimilarity-methods.Rd: Make clear that any method offered
150            by "dists" from package "cba" can be used.
151    
152    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
153    
154            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes bug according
155            to Kurt's LaTeX suggestion. Removed points and underscores in
156            variable names for consistent naming.
157    
158            * DESCRIPTION: Update to version 0.1-2.
159    
160            * man/TextRepository.Rd: Fixed bug in documentation.
161    
162    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
163    
164            * DESCRIPTION: Update to version 0.1-1.
165    
166    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
167    
168            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
169            wordStem.
170    
171    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
172    
173            * R/: Changes due to Kurt's review.
174    
175    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
176    
177            * R/: Implemented improvements based upon comments by David
178            Meyer.
179    
180    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
181    
182            * inst/doc/: Rewrote vignette.
183    
184            * man/: Improved documentation.
185    
186    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
187    
188            * man/: Updated documentation.
189    
190            * DESCRIPTION: Changed package name to "tm". Updated version to
191            0.1 for first CRAN release.
192    
193            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
194            list archive example.
195    
196            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
197            archive example.
198    
199            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
200            from (several mails per box) mbox format to (single mail per file)
201            eml format.
202    
203    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
204    
205            * data/crude.rda: Rebuilt.
206    
207            * data/acq.rda: Rebuilt.
208    
209            * R/reader.R: Factored out reader and parser methods from
210            textdoccol.R.
211    
212            * R/source.R: Factored out Source methods from aobjects.R and
213            textdoccol.R.
214            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
215            feeds.
216    
217            * R/textdoccol.R (DirSource): Added support for recursive
218            traversal of directories.
219    
220    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
221    
222            * R/textdoccol.R ([[): Loads the document corpus automatically
223            into memory upon access.
224            (tm_transform, tm_filter): Removed several checks whether the
225            document is already loaded ([[ ensures this now).
226            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
227            mailing list archive.
228    
229    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
230    
231            * R/aobjects.R (TextDocument): Is now a virtual class.
232            (Source): Is now a virtual class.
233    
234    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
235    
236            * R/textdoccol.R (c): Support for an arbitrary number of document
237            collections.
238    
239    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
240    
241            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
242            append_meta and remove_meta.
243    
244            * R/textdoccol.R: Removed modify_metadata method.
245    
246            * R/textrepo.R: Removed modify_metadata method.
247    
248            * R/textdoccol.R (remove_meta): Supports removal of document
249            collection metadata and document (= in data frame) metadata.
250    
251    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
252    
253            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
254    
255            * data/crude.rda: Rebuilt.
256    
257            * data/acq.rda: Rebuilt.
258    
259            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
260    
261            * R/textdoccol.R ([): Bug fix for subsetting a document
262            collection's data frame.
263    
264    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
265    
266            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
267            to s_filter.
268    
269            * R/textdoccol.R: Local text documents' metadata can now be copied
270            to a document collection's data frame with prescind_meta.
271    
272    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
273    
274            * R/: Text documents' slot metadata is now accessible in s_filter.
275    
276            * R/: Rewrote s_filter function (still has some restrictions).
277    
278    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
279    
280            * R/: Various fixes in handling metadata.
281    
282            * R/: Added update mechanism for text document collections.
283    
284    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
285    
286            * R/: Merging of document collections now creates a binary tree
287            for reconstructing merged document collections.
288    
289            * R/: Redesign of metadata for document collections.
290    
291    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
292    
293            * R/: Messages now use \code{ngettext}.
294    
295    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * R/: Added functions for modifying and removing metadata.
298    
299    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * man/: Updated some documentation.
302    
303            * R/: Corrected some connection issues.
304    
305            * inst/doc: Worked on the vignette.
306    
307    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
308    
309            * inst/: Added texts and started vignette.
310    
311            * R/: Final changes based upon David's comments.
312    
313    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
314    
315            * NAMESPACE: Corrected exports (generic methods need exportMethods
316            directives!).
317    
318    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
319    
320            * R/: Modified the TextDocCol constructor and various parsers. It
321            is now modular and supports various file formats via plugins (see
322            the new "Source" class).
323    
324    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * man/: Revised documentation after previous code changes.
327    
328    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
329    
330            * R/: Remaining changes as discussed with David.
331    
332    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
333    
334            * R/: Some changes as suggested by David. The rest will follow
335            within the next few days.
336    
337    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * man/: Finished documentation.
340    
341    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
342    
343            * man/: Wrote some documentation.
344    
345    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
346    
347            * R/: Further syntactic sugar in form of additional assignment and
348            accessor methods.
349    
350    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * R/: Syntactic sugar in form of "length", "show" and "summary"
353            operators.
354    
355    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/: Diverse updates. Mainly on default operators ("[" or "c")
358            and dissimilarities.
359    
360    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * R/: Added similarity functions.
363    
364            * data/: Added English stopwords.
365    
366    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
367    
368            * data/: Examples compiled for new features.
369    
370            * R/: Changes due to new structure.
371    
372            * NAMESPACE: Corrected namespace to reflect new structure.
373    
374            * R/termdocmatrix.R: Adapted for new naming scheme.
375    
376    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/textdoccol.R: Adapted code for new class structure. Wrote
379            several transform and filter functions operating on text document
380            collections (alias text document databases).
381    
382            * R/aobjects.R: Adapted class structure with inheritance,
383            repositories and additional meta data. Loading files on demand is
384            now possible.
385    
386    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
387    
388            * R/: Some cosmetic cleanups.
389    
390            * inst/: Removed vignette on clustering. That and much more is now
391            described in the JSS paper on text mining. Based upon that
392            article, a more elaborate vignette will be incorporated in the future.
393    
394    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/: Updated generic S4 methods to comply with signature changes
397            in newer versions of R (> 2.3).
398    
399    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
400    
401            * ext/R/importRIS.R: Automatic RIS import is now possible.
402    
403    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/textdoccol.R: Added RIS HTML input format.
406    
407    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/textdoccol.R: Removed bug that caused invalid text document
410            collections when handling many input files.
411    
412    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/textdoccol.R: Restructured and extended file import
415            mechanism.
416    
417            * inst/doc/clustering.Rnw: Adapted vignette for use with
418            ReutNews.rda
419    
420            * man/ReutNews.Rd: Documentation for ReutNews.rda
421    
422            * data/ReutNews.rda: A tiny Reuters21578 example data set.
423    
424    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
425    
426            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
427            clustering facilities of this package.
428    
429    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
430    
431            * R/aobjects.R: Changed package document structure to avoid class
432            dependency problems.
433    
434    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
435    
436            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
437            data set.
438    
439            * Finished documentation and reordered directory structure. Now "R
440            CMD check textmin" works without errors.
441    
