[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 17, Sat Nov 5 14:47:12 2005 UTC
trunk/tm/ChangeLog revision 744, Mon Apr 23 00:35:10 2007 UTC
# Line 1  Line 1 
1    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/termdocmatrix.R (textvector): Stemming is now performed before
4            erasing stopwords.
5            (weightMatrix): Adapted to handle sparse matrices.
6            (TermDocMatrix): Sparse matrix is now efficiently built by
7            direct stepwise insertion of row values into it.
8    
9    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
10    
11            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
12            due to ongoing problems. For our purposes the latter is as useful
13            as the replaced package.
14    
15    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
16    
17            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
18    
19            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
20    
21    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
22    
23            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
24            languages with available stopwords.
25    
26    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
27    
28            * inst/doc/tm.Rnw: Minor corrections in the vignette.
29    
30    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
31    
32            * DESCRIPTION: Update to version 0.2, since a lot of new features
33            have been integrated.
34    
35            * inst/stopwords: Updated existing stopwords and added stopwords
36            for various other languages.
37    
38    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
39    
40            * man/: Updated documentation.
41    
42            * Work/testDb.R: Script to test database stuff.
43    
44            * R/: Fixed various database-related bugs. Seems to be rather
45            usable now, i.e., consider it alpha status for now.
46    
47    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
48    
49            * R/: Fixed some bugs related to database support.
50    
51    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
52    
53            * man/: Added a lot of examples to the manuals.
54    
55    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
56    
57            * man/: Updated parts of the documentation.
58    
59            * R/textdoccol.R (asPlain): Added conversion from newsgroup
60            documents to plain text documents.
61    
62    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
63    
64            * R/textdoccol.R: Finished experimental database support. Not yet
65            intensively tested.
66    
67            * R/source.R: Now each source has a default reader.
68    
69            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
70            class anymore.
71    
72            * R/plaintextdoc.R: Custom show method for plain text documents.
73    
74            * R/aobjects.R: Added a class for structured text documents.
75    
76            * R/reader.R: Replaced remaining \code{parser} occurrences with
77            \code{reader}.
78    
79            * R/textdoccol.R (summary): Indent tags.
80    
81            * R/textdoccol.R (removePunctuation): Transform method to remove
82            punctuation marks.
83    
84    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
85    
86            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
87            using prescindMeta().
88    
89    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
90    
91            * R/textdoccol.R: Improved database support.
92    
93    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
94    
95            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
96    
97            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
98            language code.
99    
100            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
101            into parserControl argument.
102    
103            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
104    
105    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
106    
107            * Work/tmDataSetup.R: The datasets acq and crude can now be
108            created on the fly.
109    
110            * R/stopwords.R: Introduced a function returning the stopwords for
111            a given language (English, German and French at the moment).
112    
113            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
114            otherwise falls back to the Snowball package.
115    
116    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
117    
118            * man/dissimilarity-methods.Rd: Make clear that any method offered
119            by "dists" from package "cba" can be used.
120    
121    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
122    
123            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
124            to Kurt's latex suggestion. Removed points and underscores in
125            variable names for consistent naming.
126    
127            * DESCRIPTION: Update to version 0.1-2.
128    
129            * man/TextRepository.Rd: Fixed bug in documentation.
130    
131    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
132    
133            * DESCRIPTION: Update to version 0.1-1.
134    
135    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
136    
137            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
138            wordStem.
139    
140    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
141    
142            * R/: Changes due to Kurt's review.
143    
144    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
145    
146            * R/: Implemented improvements based upon comments by David
147            Meyer.
148    
149    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
150    
151            * inst/doc/: Rewrote vignette.
152    
153            * man/: Improved documentation.
154    
155    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
156    
157            * man/: Updated documentation.
158    
159            * DESCRIPTION: Changed package name to "tm". Updated version to
160            0.1 for first CRAN release.
161    
162            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
163            list archive example.
164    
165            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
166            archive example.
167    
168            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
169            from (several mails per box) mbox format to (single mail per file)
170            eml format.
171    
172    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
173    
174            * data/crude.rda: Rebuilt.
175    
176            * data/acq.rda: Rebuilt.
177    
178            * R/reader.R: Factored out reader and parser methods from
179            textdoccol.R.
180    
181            * R/source.R: Factored out Source methods from aobjects.R and
182            textdoccol.R.
183            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
184            feeds.
185    
186            * R/textdoccol.R (DirSource): Added support for recursive
187            traversal of directories.
188    
189    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
190    
191            * R/textdoccol.R ([[): Loads the document corpus automatically
192            into memory upon access.
193            (tm_transform, tm_filter): Removed several checks whether the
194            document is already loaded ([[ ensures this now).
195            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
196            mailing list archive.
197    
198    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
199    
200            * R/aobjects.R (TextDocument): Is now a virtual class.
201            (Source): Is now a virtual class.
202    
203    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
204    
205            * R/textdoccol.R (c): Support for an arbitrary number of document
206            collections.
207    
208    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
209    
210            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
211            append_meta and remove_meta.
212    
213            * R/textdoccol.R: Removed modify_metadata method.
214    
215            * R/textrepo.R: Removed modify_metadata method.
216    
217            * R/textdoccol.R (remove_meta): Supports removal of document
218            collection metadata and document (= in data frame) metadata.
219    
220    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
221    
222            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
223    
224            * data/crude.rda: Rebuilt.
225    
226            * data/acq.rda: Rebuilt.
227    
228            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
229    
230            * R/textdoccol.R ([): Bug fix for subsetting a document
231            collection's data frame.
232    
233    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
234    
235            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
236            to s_filter.
237    
238            * R/textdoccol.R: Local text documents' metadata can now be copied
239            to a document collection's data frame with prescind_meta.
240    
241    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
242    
243            * R/: Text documents' slot metadata is now accessible in s_filter.
244    
245            * R/: Rewrote s_filter function (still has some restrictions).
246    
247    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
248    
249            * R/: Various fixes in handling metadata.
250    
251            * R/: Added update mechanism for text document collections.
252    
253    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
254    
255            * R/: Merging of document collections now creates a binary tree
256            for reconstructing merged document collections.
257    
258            * R/: Redesign of metadata for document collections.
259    
260    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/: Messages now use \code{ngettext}.
263    
264    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
265    
266            * R/: Added functions for modifying and removing metadata.
267    
268    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
269    
270            * man/: Updated some documentation.
271    
272            * R/: Corrected some connection issues.
273    
274            * inst/doc: Worked on the vignette.
275    
276    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
277    
278            * inst/: Added texts and started vignette.
279    
280            * R/: Final changes based upon David's comments.
281    
282    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
283    
284            * NAMESPACE: Corrected exports (generic methods need exportMethods
285            directives!).
286    
287    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
288    
289            * R/: Modified the TextDocCol constructor and various parsers. It
290            is now modular and supports various file formats via plugins (see
291            the new "Source" class).
292    
293    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
294    
295            * man/: Revised documentation after previous code changes.
296    
297    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
298    
299            * R/: Remaining changes as discussed with David.
300    
301    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
302    
303            * R/: Some changes as suggested by David. The rest will follow
304            within the next days.
305    
306    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
307    
308            * man/: Finished documentation.
309    
310    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * man/: Wrote some documentation.
313    
314    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/: Further syntactic sugar in form of additional assignment and
317            accessor methods.
318    
319    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/: Syntactic sugar in form of "length", "show" and "summary"
322            operators.
323    
324    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/: Diverse updates. Mainly on default operators ("[" or "c")
327            and dissimilarities.
328    
329    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * R/: Added similarity functions.
332    
333            * data/: Added English stopwords.
334    
335    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * data/: Examples compiled for new features.
338    
339            * R/: Changes due to new structure.
340    
341            * NAMESPACE: Corrected namespace to reflect new structure.
342    
343            * R/termdocmatrix.R: Adapted for new naming scheme.
344    
345    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
346    
347            * R/textdoccol.R: Adapted code for new class structure. Wrote
348            several transform and filter functions operating on text document
349            collections (alias text document databases).
350    
351            * R/aobjects.R: Adapted class structure with inheritance,
352            repositories and additional meta data. Loading files on demand is
353            now possible.
354    
355    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/: Some cosmetic cleanups.
358    
359            * inst/: Removed vignette on clustering. That and much more is now
360            described in the JSS paper on text mining. Based upon that
361            article a more elaborate vignette will be incorporated in the future.
362    
363    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * R/: Updated generic S4 methods to comply with signature changes
366            in newer versions of R (> 2.3)
367    
368    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * ext/R/importRIS.R: Automatic RIS import is now possible.
371    
372    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * R/textdoccol.R: Added RIS HTML input format.
375    
376    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
377    
378            * R/textdoccol.R: Removed bug that caused invalid text document
379            collections when handling many input files.
380    
381    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
382    
383            * R/textdoccol.R: Restructured and extended file import
384            mechanism.
385    
386            * inst/doc/clustering.Rnw: Adapted vignette for use with
387            ReutNews.rda
388    
389            * man/ReutNews.Rd: Documentation for ReutNews.rda
390    
391            * data/ReutNews.rda: A tiny Reuters21578 example data set.
392    
393    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
396            clustering facilities of this package.
397    
398    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/aobjects.R: Changed package document structure to avoid class
401            dependency problems.
402    
403    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
404    
405            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
406            data set.
407    
408            * Finished documentation and reordered directory structure. Now "R
409            CMD check textmin" works without errors.
410    
411    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * src/: Various splits can now be easily created for the
414            Reuters21578 data set.
415    
416    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * Updated documentation.
419    
420    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * Wrote R documentation for some classes and methods.
423    
424    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
425    
426            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
427            files. See the questionnaire data/Umfrage.csv for such an example.
428            We are now able to import files in Reuters-21578 XML format.
429    
430            * Changed class interfaces in various files. Weighting of the text
431            matrix is now possible.
432    
433    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
434    
435            * R/textdoccol.R: One can build term-document matrices if
436            necessary (with buildTDM(...)) and fill the field tdm from a text
437            document collection with it.
438    
439            * R/textmatrix.R: Wrote S4 class for term-document matrices.
440    
441    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/textdoccol.R: We now can read in a whole XML file with several
444            news items.
445    
446    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * R/textdoccol.R: Set up an S4 class for a collection of text
