[tm] Diff of /trunk/tm/ChangeLog
trunk/R/trunk/ChangeLog revision 39, Sat Jan 21 09:37:39 2006 UTC
trunk/tm/ChangeLog revision 749, Tue May 8 17:26:09 2007 UTC
1    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/aobjects.R (StructuredTextDocument): Inherits from
4            TextDocument.
5    
6    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
7    
8            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
9            on sparse matrices as proposed by Martin Maechler.
10    
11    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
12    
13            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
14            latest \pkg{filehash} version deprecates them.
15    
16    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
17    
18            * R/termdocmatrix.R (textvector): Stemming is now performed before
19            erasing stopwords.
20            (weightMatrix): Adapted to handle sparse matrices.
21            (TermDocMatrix): Sparse matrix is now efficiently built by
22            direct stepwise insertion of row values into it.
23    
24    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
25    
26            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
27            due to ongoing problems. For our purposes the latter is as useful
28            as the replaced package.
29    
30    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
31    
32            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
33    
34            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
35    
36    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
37    
38            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
39            languages with available stopwords.
40    
41    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
42    
43            * inst/doc/tm.Rnw: Minor corrections in the vignette.
44    
45    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
46    
47            * DESCRIPTION: Update to version 0.2, since a lot of new features
48            have been integrated.
49    
50            * inst/stopwords: Updated existing stopwords and added stopwords
51            for various other languages.
52    
53    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
54    
55            * man/: Updated documentation.
56    
57            * Work/testDb.R: Script to test database stuff.
58    
59            * R/: Fixed various database-related bugs. Seems to be rather
60            usable now, i.e., consider it alpha status for now.
61    
62    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
63    
64            * R/: Fixed some bugs related to database support.
65    
66    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
67    
68            * man/: Added a lot of examples to the manuals.
69    
70    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
71    
72            * man/: Updated parts of the documentation.
73    
74            * R/textdoccol.R (asPlain): Added conversion from newsgroup
75            documents to plain text documents.
76    
77    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
78    
79            * R/textdoccol.R: Finished experimental database support. Not yet
80            intensively tested.
81    
82            * R/source.R: Now each source has a default reader.
83    
84            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
85            class anymore.
86    
87            * R/plaintextdoc.R: Custom show method for plain text documents.
88    
89            * R/aobjects.R: Added a class for structured text documents.
90    
91            * R/reader.R: Replaced remaining \code{parser} occurrences with
92            \code{reader}.
93    
94            * R/textdoccol.R (summary): Indent tags.
95    
96            * R/textdoccol.R (removePunctuation): Transform method to remove
97            punctuation marks.
98    
99    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
100    
101            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
102            using prescindMeta().
103    
104    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
105    
106            * R/textdoccol.R: Improved database support.
107    
108    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
109    
110            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
111    
112            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
113            language code.
114    
115            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
116            into parserControl argument.
117    
118            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
119    
120    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
121    
122            * Work/tmDataSetup.R: The datasets acq and crude can now be
123            created on the fly.
124    
125            * R/stopwords.R: Introduced a function returning the stopwords for
126            a given language (English, German and French at the moment).
127    
128            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
129            otherwise falls back to Snowball package.
130    
131    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
132    
133            * man/dissimilarity-methods.Rd: Make clear that any method offered
134            by "dists" from package "cba" can be used.
135    
136    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
137    
138            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
139            to Kurt's LaTeX suggestion. Removed points and underscores in
140            variable names for consistent naming.
141    
142            * DESCRIPTION: Update to version 0.1-2.
143    
144            * man/TextRepository.Rd: Fixed bug in documentation.
145    
146    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
147    
148            * DESCRIPTION: Update to version 0.1-1.
149    
150    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
151    
152            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
153            wordStem.
154    
155    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
156    
157            * R/: Changes due to Kurt's review.
158    
159    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
160    
161            * R/: Implemented improvements based upon comments by David
162            Meyer.
163    
164    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
165    
166            * inst/doc/: Rewrote vignette.
167    
168            * man/: Improved documentation.
169    
170    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
171    
172            * man/: Updated documentation.
173    
174            * DESCRIPTION: Changed package name to "tm". Updated version to
175            0.1 for first CRAN release.
176    
177            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
178            list archive example.
179    
180            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
181            archive example.
182    
183            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
184            from (several mails per box) mbox format to (single mail per file)
185            eml format.
186    
187    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
188    
189            * data/crude.rda: Rebuilt.
190    
191            * data/acq.rda: Rebuilt.
192    
193            * R/reader.R: Factored out reader and parser methods from
194            textdoccol.R.
195    
196            * R/source.R: Factored out Source methods from aobjects.R and
197            textdoccol.R.
198            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
199            feeds.
200    
201            * R/textdoccol.R (DirSource): Added support for recursive
202            traversal of directories.
203    
204    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
205    
206            * R/textdoccol.R ([[): Loads the document corpus automatically
207            into memory upon access.
208            (tm_transform, tm_filter): Removed several checks whether the
209            document is already loaded ([[ ensures this now).
210            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
211            mailing list archive.
212    
213    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
214    
215            * R/aobjects.R (TextDocument): Is now a virtual class.
216            (Source): Is now a virtual class.
217    
218    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
219    
220            * R/textdoccol.R (c): Support for an arbitrary number of document
221            collections.
222    
223    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
224    
225            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
226            append_meta and remove_meta.
227    
228            * R/textdoccol.R: Removed modify_metadata method.
229    
230            * R/textrepo.R: Removed modify_metadata method.
231    
232            * R/textdoccol.R (remove_meta): Supports removal of document
233            collection metadata and document (= in data frame) metadata.
234    
235    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
236    
237            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
238    
239            * data/crude.rda: Rebuilt.
240    
241            * data/acq.rda: Rebuilt.
242    
243            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
244    
245            * R/textdoccol.R ([): Bug fix for subsetting a document
246            collection's data frame.
247    
248    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
249    
250            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
251            to s_filter.
252    
253            * R/textdoccol.R: Local text documents' metadata can now be copied
254            to a document collection's data frame with prescind_meta.
255    
256    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/: Text documents' slot metadata is now accessible in s_filter.
259    
260            * R/: Rewrote s_filter function (still has some restrictions).
261    
262    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
263    
264            * R/: Various fixes in handling metadata.
265    
266            * R/: Added update mechanism for text document collections.
267    
268    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
269    
270            * R/: Merging of document collections now creates a binary tree
271            for reconstructing merged document collections.
272    
273            * R/: Redesign of metadata for document collections.
274    
275    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
276    
277            * R/: Messages now use \code{ngettext}.
278    
279    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
280    
281            * R/: Added functions for modifying and removing metadata.
282    
283    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
284    
285            * man/: Updated some documentation.
286    
287            * R/: Corrected some connection issues.
288    
289            * inst/doc: Worked on the vignette.
290    
291    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
292    
293            * inst/: Added texts and started vignette.
294    
295            * R/: Final changes based upon David's comments.
296    
297    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
298    
299            * NAMESPACE: Corrected exports (generic methods need exportMethods
300            directives!).
301    
302    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
303    
304            * R/: Modified the TextDocCol constructor and various parsers. It
305            is now modular and supports various file formats via plugins (see
306            the new "Source" class).
307    
308    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
309    
310            * man/: Revised documentation after previous code changes.
311    
312    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
313    
314            * R/: Remaining changes as discussed with David.
315    
316    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
317    
318            * R/: Some changes as suggested by David. The rest will follow
319            within the next few days.
320    
321    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
322    
323            * man/: Finished documentation.
324    
325    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
326    
327            * man/: Wrote some documentation.
328    
329    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * R/: Further syntactic sugar in the form of additional assignment and
332            accessor methods.
333    
334    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
335    
336            * R/: Syntactic sugar in the form of "length", "show" and "summary"
337            operators.
338    
339    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
340    
341            * R/: Diverse updates. Mainly on default operators ("[" or "c")
342            and dissimilarities.
343    
344    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
345    
346            * R/: Added similarity functions.
347    
348            * data/: Added English stopwords.
349    
350    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * data/: Examples compiled for new features.
353    
354            * R/: Changes due to new structure.
355    
356            * NAMESPACE: Corrected namespace to reflect new structure.
357    
358            * R/termdocmatrix.R: Adapted for new naming scheme.
359    
360    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * R/textdoccol.R: Adapted code for new class structure. Wrote
363            several transform and filter functions operating on text document
364            collections (alias text document databases).
365    
366            * R/aobjects.R: Adapted class structure with inheritance,
367            repositories and additional meta data. Loading files on demand is
368            now possible.
369    
370    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
371    
372            * R/: Some cosmetic cleanups.
373    
374            * inst/: Removed vignette on clustering. That and much more is now
375            described in the JSS paper on text mining. Based upon that
376            article an elaborated vignette will be incorporated in the future.
377    
378    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/: Updated generic S4 methods to comply with signature changes
381            in newer versions of R (> 2.3).
382    
383    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * ext/R/importRIS.R: Automatic RIS import is now possible.
386    
387    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * R/textdoccol.R: Added RIS HTML input format.
390    
391    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/textdoccol.R: Removed bug that caused invalid text document
