Diff of /trunk/tm/ChangeLog

trunk/R/trunk/ChangeLog revision 40, Tue Feb 14 15:02:45 2006 UTC
trunk/tm/ChangeLog revision 761, Tue Jul 10 14:59:57 2007 UTC
# Line 1  Line 1 
1    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
4    
5    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
6    
7            * R/termdocmatrix.R: require() uses the quietly option to suppress
8            loading messages.
9    
10    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
11    
12            * R/dictionary.R: Added dictionary support.
13    
14    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
15    
16            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
17            documents. This simplifies some functions, e.g., asPlain.
18    
19    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
20    
21            * inst/doc/tm.Rnw: Fixed some typos in vignette.
22    
23    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
24    
25            * R/textdoccol.R (replaceWords): Added method to replace a set of
26            words by a single word. Useful for synonyms.
27    
28    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
29    
30            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
31    
32    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
33    
34            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
35            vectors. Thanks to Ariel Maguyon for his error report.
36            (removeSparseTerms): New function to remove columns from a
37            term-document matrix exceeding a sparse factor.
38    
39    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
40    
41            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
42    
43    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
44    
45            * man/sFilter.Rd: Corrected documentation on statement format (use
46            '==' instead of '=').
47    
48    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
49    
50            * R/aobjects.R (StructuredTextDocument): Inherits from
51            TextDocument.
52    
53    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
54    
55            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
56            on sparse matrices as proposed by Martin Maechler.
57    
58    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
59    
60            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
61            latest \pkg{filehash} version deprecates them.
62    
63    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
64    
65            * R/termdocmatrix.R (textvector): Stemming is now performed before
66            erasing stopwords.
67            (weightMatrix): Adapted to handle sparse matrices.
68            (TermDocMatrix): Sparse matrix is now efficiently built by
69            direct stepwise insertion of row values into it.
70    
71    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
72    
73            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
74            due to ongoing problems. For our purposes the latter is as useful
75            as the replaced package.
76    
77    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
78    
79            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
80    
81            * man/TermDocMatrix.Rd: Removed deprecated \code{language} argument.
82    
83    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
84    
85            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
86            languages with available stopwords.
87    
88    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
89    
90            * inst/doc/tm.Rnw: Minor corrections in the vignette.
91    
92    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
93    
94            * DESCRIPTION: Update to version 0.2, since a lot of new features
95            have been integrated.
96    
97            * inst/stopwords: Updated existing stopwords and added stopwords
98            for various other languages.
99    
100    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
101    
102            * man/: Updated documentation.
103    
104            * Work/testDb.R: Script to test database stuff.
105    
106            * R/: Fixed various database related bugs. Seems to be rather
107            usable now, i.e., consider it alpha status for now.
108    
109    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
110    
111            * R/: Fixed some bugs related to database support.
112    
113    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
114    
115            * man/: Added a lot of examples to the manuals.
116    
117    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
118    
119            * man/: Updated parts of the documentation.
120    
121            * R/textdoccol.R (asPlain): Added conversion from newsgroup
122            documents to plain text documents.
123    
124    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
125    
126            * R/textdoccol.R: Finished experimental database support. Not yet
127            intensively tested.
128    
129            * R/source.R: Now each source has a default reader.
130    
131            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
132            class anymore.
133    
134            * R/plaintextdoc.R: Custom show method for plain text documents.
135    
136            * R/aobjects.R: Added a class for structured text documents.
137    
138            * R/reader.R: Replaced remaining \code{parser} occurrences with
139            \code{reader}.
140    
141            * R/textdoccol.R (summary): Indent tags.
142    
143            * R/textdoccol.R (removePunctuation): Transform method to remove
144            punctuation marks.
145    
146    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
147    
148            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
149            using prescindMeta().
150    
151    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
152    
153            * R/textdoccol.R: Improved database support.
154    
155    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
156    
157            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
158    
159            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
160            language code.
161    
162            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
163            into parserControl argument.
164    
165            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
166    
167    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
168    
169            * Work/tmDataSetup.R: The datasets acq and crude can now be
170            created on the fly.
171    
172            * R/stopwords.R: Introduced a function returning the stopwords for
173            a given language (English, German and French at the moment).
174    
175            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
176            otherwise falls back to Snowball package.
177    
178    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
179    
180            * man/dissimilarity-methods.Rd: Make clear that any method offered
181            by "dists" from package "cba" can be used.
182    
183    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
184    
185            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
186            to Kurt's latex suggestion. Removed points and underscores in
187            variable names for consistent naming.
188    
189            * DESCRIPTION: Update to version 0.1-2.
190    
191            * man/TextRepository.Rd: Fixed bug in documentation.
192    
193    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * DESCRIPTION: Update to version 0.1-1.
196    
197    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
198    
199            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
200            wordStem.
201    
202    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
203    
204            * R/: Changes due to Kurt's review.
205    
206    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
207    
208            * R/: Implemented improvements based upon comments by David
209            Meyer.
210    
211    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
212    
213            * inst/doc/: Rewrote vignette.
214    
215            * man/: Improved documentation.
216    
217    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
218    
219            * man/: Updated documentation.
220    
221            * DESCRIPTION: Changed package name to "tm". Updated version to
222            0.1 for first CRAN release.
223    
224            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
225            list archive example.
226    
227            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
228            archive example.
229    
230            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
231            from (several mails per box) mbox format to (single mail per file)
232            eml format.
233    
234    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
235    
236            * data/crude.rda: Rebuilt.
237    
238            * data/acq.rda: Rebuilt.
239    
240            * R/reader.R: Factored out reader and parser methods from
241            textdoccol.R.
242    
243            * R/source.R: Factored out Source methods from aobjects.R and
244            textdoccol.R.
245            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
246            feeds.
247    
248            * R/textdoccol.R (DirSource): Added support for recursive
249            traversal of directories.
250    
251    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
252    
253            * R/textdoccol.R ([[): Loads the document corpus automatically
254            into memory upon access.
255            (tm_transform, tm_filter): Removed several checks whether the
256            document is already loaded ([[ ensures this now).
257            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
258            mailing list archive.
259    
260    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/aobjects.R (TextDocument): Is now a virtual class.
263            (Source): Is now a virtual class.
264    
265    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
266    
267            * R/textdoccol.R (c): Support for an arbitrary number of document
268            collections.
269    
270    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
271    
272            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
273            append_meta and remove_meta.
274    
275            * R/textdoccol.R: Removed modify_metadata method.
276    
277            * R/textrepo.R: Removed modify_metadata method.
278    
279            * R/textdoccol.R (remove_meta): Supports removal of document
280            collection metadata and document (= in data frame) metadata.
281    
282    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
283    
284            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
285    
286            * data/crude.rda: Rebuilt.
287    
288            * data/acq.rda: Rebuilt.
289    
290            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
291    
292            * R/textdoccol.R ([): Bug fix for subsetting a document
293            collection's data frame.
294    
295    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
296    
297            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
298            to s_filter.
299    
300            * R/textdoccol.R: Local text documents' metadata can now be copied
301            to a document collection's data frame with prescind_meta.
302    
303    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
304    
305            * R/: Text documents' slot metadata is now accessible in s_filter.
306    
307            * R/: Rewrote s_filter function (still has some restrictions).
308    
309    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
310    
311            * R/: Various fixes in handling metadata.
312    
313            * R/: Added update mechanism for text document collections.
314    
315    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
316    
317            * R/: Merging of document collections now creates a binary tree
318            for reconstructing merged document collections.
319    
320            * R/: Redesign of metadata for document collections.
321    
322    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/: Messages now use \code{ngettext}.
325    
326    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * R/: Added functions for modifying and removing metadata.
329    
330    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * man/: Updated some documentation.
333    
334            * R/: Corrected some connection issues.
335    
336            * inst/doc: Worked on the vignette.
337    
338    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
339    
340            * inst/: Added texts and started vignette.
341    
342            * R/: Final changes based upon David's comments.
343    
344    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
345    
346            * NAMESPACE: Corrected exports (generic methods need exportMethods
347            directives!).
348    
349    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
350    
351            * R/: Modified the TextDocCol constructor and various parsers. It
352            is now modular and supports various file formats via plugins (see
353            the new "Source" class).
354    
355    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * man/: Revised documentation after previous code changes.
358    
359    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * R/: Remaining changes as discussed with David.
362    
363    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * R/: Some changes as suggested by David. The rest will follow
366            within the next days.
367    
368    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * man/: Finished documentation.
371    
372    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * man/: Wrote some documentation.
375    
376    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/: Further syntactic sugar in form of additional assignment and
379            accessor methods.
380    
381    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
382    
383            * R/: Syntactic sugar in form of "length", "show" and "summary"
384            operators.
385    
386    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
387    
388            * R/: Diverse updates. Mainly on default operators ("[" or "c")
389            and dissimilarities.
390    
391    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/: Added similarity functions.
394    
395            * data/: Added English stopwords.
396    
397    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
398    
399            * data/: Examples compiled for new features.
400    
401            * R/: Changes due to new structure.
402    
403            * NAMESPACE: Corrected namespace to reflect new structure.
404    
405            * R/termdocmatrix.R: Adapted for new naming scheme.
406    
407    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/textdoccol.R: Adapted code for new class structure. Wrote
410            several transform and filter functions operating on text document
411            collections (alias text document databases).
412    
413            * R/aobjects.R: Adapted class structure with inheritance,
414            repositories and additional meta data. Loading files on demand is
415            now possible.
416    
417    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
418    
419            * R/: Some cosmetic cleanups.
420    
421            * inst/: Removed vignette on clustering. That and much more is now
422            described in the JSS paper on text mining. Based upon that
423            article an elaborated vignette will be incorporated in the future.
424    
425    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/: Updated generic S4 methods to comply with signature changes
428            in newer versions of R (> 2.3)
429    
430    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * ext/R/importRIS.R: Automatic RIS import is now possible.
433    
434    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
435    
436            * R/textdoccol.R: Added RIS HTML input format.
