[tm] Diff of /trunk/tm/ChangeLog
trunk/R/trunk/ChangeLog revision 40, Tue Feb 14 15:02:45 2006 UTC
trunk/tm/ChangeLog revision 763, Wed Jul 11 11:56:44 2007 UTC
1    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/distmeasure.R (dissimilarity): Replaced dists call from
4            package cba by new dist call from package proxy.
5    
6    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
7    
8            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
9    
10    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
11    
12            * R/termdocmatrix.R: require() uses the quietly option to suppress
13            loading messages.
14    
15    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
16    
17            * R/dictionary.R: Added dictionary support.
18    
19    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
20    
21            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
22            documents. This simplifies some functions, e.g., asPlain.
23    
24    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
25    
26            * inst/doc/tm.Rnw: Fixed some typos in vignette.
27    
28    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
29    
30            * R/textdoccol.R (replaceWords): Added method to replace a set of
31            words by a single word. Useful for synonyms.
32    
33    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
34    
35            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
36    
37    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
38    
39            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
40            vectors. Thanks to Ariel Maguyon for his error report.
41            (removeSparseTerms): New function to remove those columns of a
42            term-document matrix whose sparsity exceeds a given sparse factor.
43    
44    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
45    
46            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
47    
48    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
49    
50            * man/sFilter.Rd: Corrected documentation on statement format (use
51            '==' instead of '=').
52    
53    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
54    
55            * R/aobjects.R (StructuredTextDocument): Inherits from
56            TextDocument.
57    
58    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
59    
60            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
61            on sparse matrices as proposed by Martin Maechler.
62    
63    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
64    
65            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
66            latest \pkg{filehash} version deprecates them.
67    
68    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
69    
70            * R/termdocmatrix.R (textvector): Stemming is now performed before
71            erasing stopwords.
72            (weightMatrix): Adapted to handle sparse matrices.
73            (TermDocMatrix): Sparse matrix is now efficiently built by
74            direct stepwise insertion of row values into it.
75    
76    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
77    
78            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
79            due to ongoing problems. For our purposes the latter is as useful
80            as the replaced package.
81    
82    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
83    
84            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
85    
86            * man/TermDocMatrix.Rd: Removed deprecated \code{language} argument.
87    
88    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
89    
90            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
91            languages with available stopwords.
92    
93    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
94    
95            * inst/doc/tm.Rnw: Minor corrections in the vignette.
96    
97    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
98    
99            * DESCRIPTION: Update to version 0.2, since a lot of new features
100            have been integrated.
101    
102            * inst/stopwords: Updated existing stopwords and added stopwords
103            for various other languages.
104    
105    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
106    
107            * man/: Updated documentation.
108    
109            * Work/testDb.R: Script to test database stuff.
110    
111            * R/: Fixed various database-related bugs. Seems to be rather
112            usable now, i.e., consider it alpha status for now.
113    
114    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
115    
116            * R/: Fixed some bugs related to database support.
117    
118    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
119    
120            * man/: Added a lot of examples to the manuals.
121    
122    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
123    
124            * man/: Updated parts of the documentation.
125    
126            * R/textdoccol.R (asPlain): Added conversion from newsgroup
127            documents to plain text documents.
128    
129    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
130    
131            * R/textdoccol.R: Finished experimental database support. Not yet
132            intensively tested.
133    
134            * R/source.R: Now each source has a default reader.
135    
136            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
137            class anymore.
138    
139            * R/plaintextdoc.R: Custom show method for plain text documents.
140    
141            * R/aobjects.R: Added a class for structured text documents.
142    
143            * R/reader.R: Replaced remaining \code{parser} occurrences with
144            \code{reader}.
145    
146            * R/textdoccol.R (summary): Indent tags.
147    
148            * R/textdoccol.R (removePunctuation): Transform method to remove
149            punctuation marks.
150    
151    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
152    
153            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
154            using prescindMeta().
155    
156    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
157    
158            * R/textdoccol.R: Improved database support.
159    
160    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
161    
162            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
163    
164            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
165            language code.
166    
167            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
168            into parserControl argument.
169    
170            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
171    
172    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
173    
174            * Work/tmDataSetup.R: The datasets acq and crude can now be
175            created on the fly.
176    
177            * R/stopwords.R: Introduced a function returning the stopwords for
178            a given language (English, German and French at the moment).
179    
180            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
181            otherwise falls back to Snowball package.
182    
183    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
184    
185            * man/dissimilarity-methods.Rd: Make clear that any method offered
186            by "dists" from package "cba" can be used.
187    
188    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
189    
190            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
191            to Kurt's latex suggestion. Removed points and underscores in
192            variable names for consistent naming.
193    
194            * DESCRIPTION: Update to version 0.1-2.
195    
196            * man/TextRepository.Rd: Fixed bug in documentation.
197    
198    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
199    
200            * DESCRIPTION: Update to version 0.1-1.
201    
202    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
203    
204            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
205            wordStem.
206    
207    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
208    
209            * R/: Changes due to Kurt's review.
210    
211    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
212    
213            * R/: Implemented improvements based upon comments by David
214            Meyer.
215    
216    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
217    
218            * inst/doc/: Rewrote vignette.
219    
220            * man/: Improved documentation.
221    
222    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
223    
224            * man/: Updated documentation.
225    
226            * DESCRIPTION: Changed package name to "tm". Updated version to
227            0.1 for first CRAN release.
228    
229            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
230            list archive example.
231    
232            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
233            archive example.
234    
235            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
236            from (several mails per box) mbox format to (single mail per file)
237            eml format.
238    
239    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
240    
241            * data/crude.rda: Rebuilt.
242    
243            * data/acq.rda: Rebuilt.
244    
245            * R/reader.R: Factored out reader and parser methods from
246            textdoccol.R.
247    
248            * R/source.R: Factored out Source methods from aobjects.R and
249            textdoccol.R.
250            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
251            feeds.
252    
253            * R/textdoccol.R (DirSource): Added support for recursive
254            traversal of directories.
255    
256    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/textdoccol.R ([[): Loads the document corpus automatically
259            into memory upon access.
260            (tm_transform, tm_filter): Removed several checks whether the
261            document is already loaded ([[ ensures this now).
262            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
263            mailing list archive.
264    
265    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
266    
267            * R/aobjects.R (TextDocument): Is now a virtual class.
268            (Source): Is now a virtual class.
269    
270    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
271    
272            * R/textdoccol.R (c): Support for an arbitrary number of document
273            collections.
274    
275    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
276    
277            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
278            append_meta and remove_meta.
279    
280            * R/textdoccol.R: Removed modify_metadata method.
281    
282            * R/textrepo.R: Removed modify_metadata method.
283    
284            * R/textdoccol.R (remove_meta): Supports removal of document
285            collection metadata and document (= in data frame) metadata.
286    
287    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
288    
289            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
290    
291            * data/crude.rda: Rebuilt.
292    
293            * data/acq.rda: Rebuilt.
294    
295            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
296    
297            * R/textdoccol.R ([): Bug fix for subsetting a document
298            collection's data frame.
299    
300    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
301    
302            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
303            to s_filter.
304    
305            * R/textdoccol.R: Local text documents' metadata can now be copied
306            to a document collection's data frame with prescind_meta.
307    
308    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
309    
310            * R/: Text documents' slot metadata is now accessible in s_filter.
311    
312            * R/: Rewrote s_filter function (still has some restrictions).
313    
314    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/: Various fixes in handling metadata.
317    
318            * R/: Added update mechanism for text document collections.
319    
320    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
321    
322            * R/: Merging of document collections now creates a binary tree
323            for reconstructing merged document collections.
324    
325            * R/: Redesign of metadata for document collections.
326    
327    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
328    
329            * R/: Messages now use \code{ngettext}.
330    
331    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
332    
333            * R/: Added functions for modifying and removing metadata.
334    
335    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * man/: Updated some documentation.
338    
339            * R/: Corrected some connection issues.
340    
341            * inst/doc: Worked on the vignette.
342    
343    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
344    
345            * inst/: Added texts and started vignette.
346    
347            * R/: Final changes based upon David's comments.
348    
349    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
350    
351            * NAMESPACE: Corrected exports (generic methods need exportMethods
352            directives!).
353    
354    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
355    
356            * R/: Modified the TextDocCol constructor and various parsers. It
357            is now modular and supports various file formats via plugins (see
358            the new "Source" class).
359    
360    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * man/: Revised documentation after previous code changes.
363    
364    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * R/: Remaining changes as discussed with David.
367    
368    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * R/: Some changes as suggested by David. The rest will follow
371            within the next few days.
372    
373    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
374    
375            * man/: Finished documentation.
376    
377    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
378    
379            * man/: Wrote some documentation.
380    
381    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
382    
383            * R/: Further syntactic sugar in the form of additional assignment and
384            accessor methods.
385    
386    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
387    
388            * R/: Syntactic sugar in the form of "length", "show" and "summary"
389            operators.
390    
391    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/: Various updates, mainly to default operators ("[" or "c")
394            and dissimilarities.
395    
396    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
397    
398            * R/: Added similarity functions.
399    
400            * data/: Added English stopwords.
401    
402    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * data/: Examples compiled for new features.
405    
406            * R/: Changes due to new structure.
407    
408            * NAMESPACE: Corrected namespace to reflect new structure.
409    
410            * R/termdocmatrix.R: Adapted for new naming scheme.
411    
412    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/textdoccol.R: Adapted code for new class structure. Wrote
415            several transform and filter functions operating on text document
416            collections (alias text document databases).
417    
418            * R/aobjects.R: Adapted class structure with inheritance,
419            repositories and additional meta data. Loading files on demand is
420            now possible.
421    
422    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
423    
424            * R/: Some cosmetic cleanups.
425    
426            * inst/: Removed vignette on clustering. That and much more is now
427            described in the JSS paper on text mining. Based upon that
428            article an elaborate vignette will be incorporated in the future.
429    
430    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * R/: Updated generic S4 methods to comply with signature changes
433            in newer versions of R (> 2.3).
434    
435    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * ext/R/importRIS.R: Automatic RIS import is now possible.
438    
439    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * R/textdoccol.R: Added RIS HTML input format.
