[tm] Diff of /pkg/ChangeLog

trunk/R/tm/ChangeLog revision 46, Wed Jul 5 18:08:41 2006 UTC
trunk/tm/ChangeLog revision 760, Thu Jun 21 22:40:15 2007 UTC
# Line 1  Line 1 
1    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/termdocmatrix.R: require() uses the quietly option to suppress
4            loading messages.
5    
6    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
7    
8            * R/dictionary.R: Added dictionary support.
9    
10    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
11    
12            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
13            documents. This simplifies some functions, e.g., asPlain.
14    
15    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
16    
17            * inst/doc/tm.Rnw: Fixed some typos in vignette.
18    
19    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
20    
21            * R/textdoccol.R (replaceWords): Added method to replace a set of
22            words by a single word. Useful for synonyms.
23    
24    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
25    
26            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
27    
28    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
29    
30            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
31            vectors. Thanks to Ariel Maguyon for his error report.
32            (removeSparseTerms): New function to remove columns from a
33            term-document matrix exceeding a sparse factor.
34    
35    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
36    
37            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
38    
39    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
40    
41            * man/sFilter.Rd: Corrected documentation on statement format (use
42            '==' instead of '=').
43    
44    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
45    
46            * R/aobjects.R (StructuredTextDocument): Inherits from
47            TextDocument.
48    
49    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
50    
51            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
52            on sparse matrices as proposed by Martin Maechler.
53    
54    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
55    
56            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
57            \pkg{filehash} version deprecates them.
58    
59    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
60    
61            * R/termdocmatrix.R (textvector): Stemming is now performed before
62            erasing stopwords.
63            (weightMatrix): Adapted to handle sparse matrices.
64            (TermDocMatrix): Sparse matrix is now efficiently built by
65            direct stepwise insertion of row values into it.
66    
67    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
68    
69            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
70            due to ongoing problems. For our purposes the latter is as useful
71            as the replaced package.
72    
73    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
74    
75            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
76    
77            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
78    
79    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
80    
81            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
82            languages with available stopwords.
83    
84    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
85    
86            * inst/doc/tm.Rnw: Minor corrections in the vignette.
87    
88    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
89    
90            * DESCRIPTION: Update to version 0.2, since a lot of new features
91            have been integrated.
92    
93            * inst/stopwords: Updated existing stopwords and added stopwords
94            for various other languages.
95    
96    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
97    
98            * man/: Updated documentation.
99    
100            * Work/testDb.R: Script to test database stuff.
101    
102            * R/: Fixed various database-related bugs. Seems to be rather
103            usable now, i.e., consider it alpha status for now.
104    
105    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
106    
107            * R/: Fixed some bugs related to database support.
108    
109    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
110    
111            * man/: Added a lot of examples to the manuals.
112    
113    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
114    
115            * man/: Updated parts of the documentation.
116    
117            * R/textdoccol.R (asPlain): Added conversion from newsgroup
118            documents to plain text documents.
119    
120    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
121    
122            * R/textdoccol.R: Finished experimental database support. Not yet
123            intensively tested.
124    
125            * R/source.R: Now each source has a default reader.
126    
127            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
128            class anymore.
129    
130            * R/plaintextdoc.R: Custom show method for plain text documents.
131    
132            * R/aobjects.R: Added a class for structured text documents.
133    
134            * R/reader.R: Replaced remaining \code{parser} occurrences with
135            \code{reader}.
136    
137            * R/textdoccol.R (summary): Indent tags.
138    
139            * R/textdoccol.R (removePunctuation): Transform method to remove
140            punctuation marks.
141    
142    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
143    
144            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
145            using prescindMeta().
146    
147    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
148    
149            * R/textdoccol.R: Improved database support.
150    
151    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
152    
153            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
154    
155            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
156            language code.
157    
158            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
159            into parserControl argument.
160    
161            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
162    
163    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
164    
165            * Work/tmDataSetup.R: The datasets acq and crude can now be
166            created on the fly.
167    
168            * R/stopwords.R: Introduced a function returning the stopwords for
169            a given language (English, German and French at the moment).
170    
171            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
172            otherwise falls back to Snowball package.
173    
174    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
175    
176            * man/dissimilarity-methods.Rd: Make clear that any method offered
177            by "dists" from package "cba" can be used.
178    
179    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
180    
181            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
182            to Kurt's LaTeX suggestion. Removed points and underscores in
183            variable names for consistent naming.
184    
185            * DESCRIPTION: Update to version 0.1-2.
186    
187            * man/TextRepository.Rd: Fixed bug in documentation.
188    
189    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
190    
191            * DESCRIPTION: Update to version 0.1-1.
192    
193    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
196            wordStem.
197    
198    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
199    
200            * R/: Changes due to Kurt's review.
201    
202    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
203    
204            * R/: Implemented improvements based upon comments by David
205            Meyer.
206    
207    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
208    
209            * inst/doc/: Rewrote vignette.
210    
211            * man/: Improved documentation.
212    
213    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
214    
215            * man/: Updated documentation.
216    
217            * DESCRIPTION: Changed package name to "tm". Updated version to
218            0.1 for first CRAN release.
219    
220            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
221            list archive example.
222    
223            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
224            archive example.
225    
226            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
227            from (several mails per box) mbox format to (single mail per file)
228            eml format.
229    
230    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
231    
232            * data/crude.rda: Rebuilt.
233    
234            * data/acq.rda: Rebuilt.
235    
236            * R/reader.R: Factored out reader and parser methods from
237            textdoccol.R.
238    
239            * R/source.R: Factored out Source methods from aobjects.R and
240            textdoccol.R.
241            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
242            feeds.
243    
244            * R/textdoccol.R (DirSource): Added support for recursive
245            traversal of directories.
246    
247    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
248    
249            * R/textdoccol.R ([[): Loads the document corpus automatically
250            into memory upon access.
251            (tm_transform, tm_filter): Removed several checks whether the
252            document is already loaded ([[ ensures this now).
253            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
254            mailing list archive.
255    
256    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/aobjects.R (TextDocument): Is now a virtual class.
259            (Source): Is now a virtual class.
260    
261    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
262    
263            * R/textdoccol.R (c): Support for an arbitrary number of document
264            collections.
265    
266    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
267    
268            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
269            append_meta and remove_meta.
270    
271            * R/textdoccol.R: Removed modify_metadata method.
272    
273            * R/textrepo.R: Removed modify_metadata method.
274    
275            * R/textdoccol.R (remove_meta): Supports removal of document
276            collection metadata and document (= in data frame) metadata.
277    
278    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
279    
280            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
281    
282            * data/crude.rda: Rebuilt.
283    
284            * data/acq.rda: Rebuilt.
285    
286            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
287    
288            * R/textdoccol.R ([): Bug fix for subsetting a document
289            collection's data frame.
290    
291    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
292    
293            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
294            to s_filter.
295    
296            * R/textdoccol.R: Local text documents' metadata can now be copied
297            to a document collection's data frame with prescind_meta.
298    
299    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * R/: Text documents' slot metadata is now accessible in s_filter.
302    
303            * R/: Rewrote s_filter function (still has some restrictions).
304    
305    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
306    
307            * R/: Various fixes in handling metadata.
308    
309            * R/: Added update mechanism for text document collections.
310    
311    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
312    
313            * R/: Merging of document collections now creates a binary tree
314            for reconstructing merged document collections.
315    
316            * R/: Redesign of metadata for document collections.
317    
318    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
319    
320            * R/: Messages now use \code{ngettext}.
321    
322    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/: Added functions for modifying and removing metadata.
325    
326    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * man/: Updated some documentation.
329    
330            * R/: Corrected some connection issues.
331    
332            * inst/doc: Worked on the vignette.
333    
334    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
335    
336            * inst/: Added texts and started vignette.
337    
338            * R/: Final changes based upon David's comments.
339    
340    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
341    
342            * NAMESPACE: Corrected exports (generic methods need exportMethods
343            directives!).
344    
345    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
346    
347            * R/: Modified the TextDocCol constructor and various parsers. It
348            is now modular and supports various file formats via plugins (see
349            the new "Source" class).
350    
351    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * man/: Revised documentation after previous code changes.
354    
355    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/: Remaining changes as discussed with David.
358    
359    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * R/: Some changes as suggested by David. The rest will follow
362            within the next few days.
363    
364    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * man/: Finished documentation.
367    
368    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * man/: Wrote some documentation.
371    
372    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * R/: Further syntactic sugar in form of additional assignment and
375            accessor methods.
376    
377    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
378    
379            * R/: Syntactic sugar in form of "length", "show" and "summary"
380            operators.
381    
382    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/: Diverse updates. Mainly on default operators ("[" or "c")
385            and dissimilarities.
386    
387    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * R/: Added similarity functions.
390    
391            * data/: Added English stopwords.
392    
393    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * data/: Examples compiled for new features.
396    
397            * R/: Changes due to new structure.
398    
399            * NAMESPACE: Corrected namespace to reflect new structure.
400    
401            * R/termdocmatrix.R: Adapted for new naming scheme.
402    
403    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * R/textdoccol.R: Adapted code for new class structure. Wrote
406            several transform and filter functions operating on text document
407            collections (alias text document databases).
408    
409            * R/aobjects.R: Adapted class structure with inheritance,
410            repositories and additional meta data. Loading files on demand is
411            now possible.
412    
413    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
414    
415            * R/: Some cosmetic cleanups.
416    
417            * inst/: Removed vignette on clustering. That and much more is now
418            described in the JSS paper on text mining. Based upon that
419            article an elaborated vignette will be incorporated in the future.
420    
421    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * R/: Updated generic S4 methods to comply with signature changes
