[tm] Diff of /trunk/tm/ChangeLog
Old: trunk/R/trunk/ChangeLog, revision 20, Tue Nov 8 16:40:52 2005 UTC
New: trunk/tm/ChangeLog, revision 758, Wed Jun 13 02:25:36 2007 UTC
1    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
2    
3            * R/dictionary.R: Added dictionary support.
4    
5    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
6    
7            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
8            documents. This simplifies some functions, e.g., asPlain.
9    
10    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
11    
12            * inst/doc/tm.Rnw: Fixed some typos in vignette.
13    
14    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
15    
16            * R/textdoccol.R (replaceWords): Added method to replace a set of
17            words by a single word. Useful for synonyms.
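
The synonym replacement described in this entry can be illustrated with a minimal sketch. It is written in Python rather than R, and the function name `replace_words`, its parameters, and the sample sentence are hypothetical; it is not the package's implementation of replaceWords.

```python
import re


def replace_words(text, words, by):
    """Replace every whole-word occurrence of any entry in `words` with `by`."""
    if not words:
        return text
    # Try longer alternatives first so multi-word entries win over shorter ones.
    alternatives = sorted((re.escape(w) for w in words), key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")
    # A callable replacement avoids backslash interpretation in `by`.
    return pattern.sub(lambda match: by, text)


# Hypothetical synonym set mapped onto a single term.
synonyms = ["petroleum", "crude oil"]
result = replace_words("crude oil and petroleum prices rose", synonyms, "oil")
print(result)
```

Matching on word boundaries is a design choice of this sketch: it keeps "oil" inside "oily" from being rewritten.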
18    
19    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
20    
21            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
22    
23    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
24    
25            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
26            vectors. Thanks to Ariel Maguyon for his error report.
27            (removeSparseTerms): New function to remove columns from a
28            term-document matrix exceeding a sparse factor.
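
As an illustration of the idea behind removeSparseTerms, the following Python sketch drops term columns whose share of zero entries exceeds a threshold. The function name, the dense list-of-lists layout (rows are documents, columns are terms), and the non-strict comparison are assumptions made for this example, not the package's code.

```python
def remove_sparse_terms(matrix, terms, sparse):
    """Drop term columns whose fraction of zero entries exceeds `sparse`."""
    n_docs = len(matrix)
    if n_docs == 0:
        return [], list(terms)
    keep = [
        j
        for j in range(len(terms))
        # Fraction of documents in which term j does not occur.
        if sum(1 for row in matrix if row[j] == 0) / n_docs <= sparse
    ]
    kept_terms = [terms[j] for j in keep]
    kept_matrix = [[row[j] for j in keep] for row in matrix]
    return kept_matrix, kept_terms


# Hypothetical counts: 4 documents, 3 terms.
terms = ["oil", "opec", "barrel"]
counts = [[2, 0, 1], [1, 0, 2], [3, 1, 0], [1, 0, 0]]
reduced, kept = remove_sparse_terms(counts, terms, 0.5)
print(kept)
```

Here "opec" is absent from three of four documents (0.75), so it is removed, while "barrel" sits exactly at the 0.5 threshold and is kept under the non-strict comparison assumed above.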
29    
30    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
31    
32            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
33    
34    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
35    
36            * man/sFilter.Rd: Corrected documentation on statement format (use
37            '==' instead of '=').
38    
39    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
40    
41            * R/aobjects.R (StructuredTextDocument): Inherits from
42            TextDocument.
43    
44    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
45    
46            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
47            on sparse matrices as proposed by Martin Maechler.
48    
49    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
50    
51            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the
52            latest \pkg{filehash} version deprecates them.
53    
54    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
55    
56            * R/termdocmatrix.R (textvector): Stemming is now performed before
57            erasing stopwords.
58            (weightMatrix): Adapted to handle sparse matrices.
59            (TermDocMatrix): Sparse matrix is now efficiently built by
60            direct stepwise insertion of row values into it.
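
The stepwise construction of a sparse term-document matrix mentioned in this entry can be sketched as follows. This is a Python illustration under stated assumptions: a dictionary keyed by (document index, term index) stands in for the sparse matrix, whitespace tokenization is used, and the name `build_sparse_tdm` is hypothetical. It does not reproduce the R implementation.

```python
from collections import Counter


def build_sparse_tdm(documents):
    """Return (entries, terms); entries maps (doc_index, term_index) to a count."""
    term_index = {}
    entries = {}
    for i, doc in enumerate(documents):
        # Insert one document row at a time; only non-zero counts are stored.
        for term, count in Counter(doc.lower().split()).items():
            j = term_index.setdefault(term, len(term_index))
            entries[(i, j)] = count
    terms = sorted(term_index, key=term_index.get)
    return entries, terms


entries, terms = build_sparse_tdm(["oil prices rise", "oil oil exports"])
print(terms)
print(entries)
```

Storing only non-zero cells means memory grows with the number of distinct (document, term) pairs rather than with the full matrix size.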
61    
62    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
63    
64            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
65            due to ongoing problems. For our purposes the latter is as useful
66            as the replaced package.
67    
68    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
69    
70            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
71    
72            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
73    
74    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
75    
76            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
77            languages with available stopwords.
78    
79    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
80    
81            * inst/doc/tm.Rnw: Minor corrections in the vignette.
82    
83    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
84    
85            * DESCRIPTION: Update to version 0.2, since a lot of new features
86            have been integrated.
87    
88            * inst/stopwords: Updated existing stopwords and added stopwords
89            for various other languages.
90    
91    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
92    
93            * man/: Updated documentation.
94    
95            * Work/testDb.R: Script to test database stuff.
96    
97            * R/: Fixed various database-related bugs. Database support now
98            seems fairly usable; consider it alpha status for now.
99    
100    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
101    
102            * R/: Fixed some bugs related to database support.
103    
104    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
105    
106            * man/: Added a lot of examples to the manuals.
107    
108    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
109    
110            * man/: Updated parts of the documentation.
111    
112            * R/textdoccol.R (asPlain): Added conversion from newsgroup
113            documents to plain text documents.
114    
115    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
116    
117            * R/textdoccol.R: Finished experimental database support. Not yet
118            intensively tested.
119    
120            * R/source.R: Now each source has a default reader.
121    
122            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
123            class anymore.
124    
125            * R/plaintextdoc.R: Custom show method for plain text documents.
126    
127            * R/aobjects.R: Added a class for structured text documents.
128    
129            * R/reader.R: Replaced remaining \code{parser} occurrences with
130            \code{reader}.
131    
132            * R/textdoccol.R (summary): Indent tags.
133    
134            * R/textdoccol.R (removePunctuation): Transform method to remove
135            punctuation marks.
136    
137    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
138    
139            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
140            using prescindMeta().
141    
142    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
143    
144            * R/textdoccol.R: Improved database support.
145    
146    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
147    
148            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
149    
150            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
151            language code.
152    
153            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
154            into parserControl argument.
155    
156            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
157    
158    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
159    
160            * Work/tmDataSetup.R: The datasets acq and crude can now be
161            created on the fly.
162    
163            * R/stopwords.R: Introduced a function returning the stopwords for
164            a given language (English, German and French at the moment)
165    
166            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
167            otherwise falls back to Snowball package.
168    
169    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
170    
171            * man/dissimilarity-methods.Rd: Make clear that any method offered
172            by "dists" from package "cba" can be used.
173    
174    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
175    
176            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
177            to Kurt's latex suggestion. Removed points and underscores in
178            variable names for consistent naming.
179    
180            * DESCRIPTION: Update to version 0.1-2.
181    
182            * man/TextRepository.Rd: Fixed bug in documentation.
183    
184    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
185    
186            * DESCRIPTION: Update to version 0.1-1.
187    
188    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
189    
190            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
191            wordStem.
192    
193    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * R/: Changes due to Kurt's review.
196    
197    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
198    
199            * R/: Implemented improvements based upon comments by David
200            Meyer.
201    
202    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
203    
204            * inst/doc/: Rewrote vignette.
205    
206            * man/: Improved documentation.
207    
208    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
209    
210            * man/: Updated documentation.
211    
212            * DESCRIPTION: Changed package name to "tm". Updated version to
213            0.1 for first CRAN release.
214    
215            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
216            list archive example.
217    
218            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
219            archive example.
220    
221            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
222            from (several mails per box) mbox format to (single mail per file)
223            eml format.
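
The mbox-to-eml conversion described in this entry amounts to splitting a mailbox at its "From " separator lines. The Python sketch below shows that idea only; the function name `split_mbox` and the sample mailbox are hypothetical, and it deliberately ignores details such as un-escaping ">From " lines inside message bodies.

```python
def split_mbox(mbox_text):
    """Split mbox text into a list of single-message strings (one per eml file)."""
    messages = []
    current = None
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From "):
            # A separator line starts a new message and is not part of it.
            if current is not None:
                messages.append("".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages


# Hypothetical two-message mailbox.
mbox = (
    "From alice@example.org Sat Dec 16 10:00:00 2006\n"
    "Subject: first\n\nHello\n\n"
    "From bob@example.org Sat Dec 16 11:00:00 2006\n"
    "Subject: second\n\nWorld\n"
)
mails = split_mbox(mbox)
print(len(mails))
```

Each returned string could then be written to its own file to obtain the single-mail-per-file layout the entry mentions.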
224    
225    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
226    
227            * data/crude.rda: Rebuilt.
228    
229            * data/acq.rda: Rebuilt.
230    
231            * R/reader.R: Factored out reader and parser methods from
232            textdoccol.R.
233    
234            * R/source.R: Factored out Source methods from aobjects.R and
235            textdoccol.R.
236            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
237            feeds.
238    
239            * R/textdoccol.R (DirSource): Added support for recursive
240            traversal of directories.
241    
242    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
243    
244            * R/textdoccol.R ([[): Loads the document corpus automatically
245            into memory upon access.
246            (tm_transform, tm_filter): Removed several checks whether the
247            document is already loaded ([[ ensures this now).
248            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
249            mailing list archive.
250    
251    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
252    
253            * R/aobjects.R (TextDocument): Is now a virtual class.
254            (Source): Is now a virtual class.
255    
256    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/textdoccol.R (c): Support for an arbitrary number of document
259            collections.
260    
261    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
262    
263            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
264            append_meta and remove_meta.
265    
266            * R/textdoccol.R: Removed modify_metadata method.
267    
268            * R/textrepo.R: Removed modify_metadata method.
269    
270            * R/textdoccol.R (remove_meta): Supports removal of document
271            collection metadata and document (= in data frame) metadata.
272    
273    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
274    
275            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
276    
277            * data/crude.rda: Rebuilt.
278    
279            * data/acq.rda: Rebuilt.
280    
281            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
282    
283            * R/textdoccol.R ([): Bug fix for subsetting a document
284            collection's data frame.
285    
286    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
287    
288            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
289            to s_filter.
290    
291            * R/textdoccol.R: Local text documents' metadata can now be copied
292            to a document collection's data frame with prescind_meta.
293    
294    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
295    
296            * R/: Text documents' slot metadata is now accessible in s_filter.
297    
298            * R/: Rewrote s_filter function (still has some restrictions).
299    
300    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
301    
302            * R/: Various fixes in handling metadata.
303    
304            * R/: Added update mechanism for text document collections.
305    
306    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
307    
308            * R/: Merging of document collections now creates a binary tree
309            for reconstructing merged document collections.
310    
311            * R/: Redesign of metadata for document collections.
312    
313    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
314    
315            * R/: Messages now use \code{ngettext}.
316    
317    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
318    
319            * R/: Added functions for modifying and removing metadata.
320    
321    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
322    
323            * man/: Updated some documentation.
324    
325            * R/: Corrected some connection issues.
326    
327            * inst/doc: Worked on the vignette.
328    
329    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
330    
331            * inst/: Added texts and started vignette.
332    
333            * R/: Final changes based upon David's comments.
334    
335    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * NAMESPACE: Corrected exports (generic methods need exportMethods
338            directives!).
339    
340    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
341    
342            * R/: Modified the TextDocCol constructor and various parsers. It
343            is now modular and supports various file formats via plugins (see
344            the new "Source" class).
345    
346    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * man/: Revised documentation after previous code changes.
349    
350    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * R/: Remaining changes as discussed with David.
353    
354    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
355    
356            * R/: Some changes as suggested by David. The rest will follow
357            within the next few days.
358    
359    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * man/: Finished documentation.
362    
363    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
364    
365            * man/: Wrote some documentation.
366    
367    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
368    
369            * R/: Further syntactic sugar in form of additional assignment and
370            accessor methods.
371    
372    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * R/: Syntactic sugar in form of "length", "show" and "summary"
375            operators.
376    
377    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
378    
379            * R/: Various updates. Mainly on default operators ("[" or "c")
380            and dissimilarities.
381    
382    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/: Added similarity functions.
385    
386            * data/: Added English stopwords.
387    
388    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
389    
390            * data/: Examples compiled for new features
391    
392            * R/: Changes due to new structure.
393    
394            * NAMESPACE: Corrected namespace to reflect new structure.
395    
396            * R/termdocmatrix.R: Adapted for new naming scheme.
397    
398    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/textdoccol.R: Adapted code for new class structure. Wrote
401            several transform and filter functions operating on text document
402            collections (alias text document databases).
403    
404            * R/aobjects.R: Adapted class structure with inheritance,
405            repositories and additional meta data. Loading files on demand is
406            now possible.
407    
408    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
409    
410            * R/: Some cosmetic cleanups.
411    
412            * inst/: Removed vignette on clustering. That and much more is now
413            described in the JSS paper on text mining. A more elaborate
414            vignette based on that article will be added in the future.
415    
416    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * R/: Updated generic S4 methods to comply with signature changes
419            in newer versions of R (> 2.3)
420    
421    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * ext/R/importRIS.R: Automatic RIS import is now possible.
424    
425    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * R/textdoccol.R: Added RIS HTML input format.
428    
429    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
430    
431            * R/textdoccol.R: Removed bug that caused invalid text document
432            collections when handling many input files.
433    
434    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
435    
436            * R/textdoccol.R: Restructured and extended file import
437            mechanism.
438    
439            * inst/doc/clustering.Rnw: Adapted vignette for use with
440            ReutNews.rda
441    
442            * man/ReutNews.Rd: Documentation for ReutNews.rda
443    
444            * data/ReutNews.rda: A tiny Reuters21578 example data set.
445    
446    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
449            clustering facilities of this package.
450    
451    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
452    
453            * R/aobjects.R: Changed package document structure to avoid class
454            dependency problems.
455    
456    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
457    
458            * Wrote a script for the ModLewis Split for the Reuters-21578 XML
459            data set.
460    
461            * Finished documentation and reordered directory structure. Now "R
462            CMD check textmin" works without errors.
463    
464    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
465    
466            * src/: Various splits can now be easily created for the
467            Reuters21578 data set.
468    
469    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
470    
471            * Updated documentation
472    
473    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
474    
475            * Wrote R documentation for some classes and methods.
476    
477    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
480            files. See the questionnaire data/Umfrage.csv for such an example.
481            We are now able to import files in Reuters-21578 XML format.
482    
483            * Changed class interfaces in various files. Weighting of the text
484            matrix is now possible.
485    
486    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
487    
488            * R/textdoccol.R: One can build term-document matrices if
