[tm] Diff of /pkg/ChangeLog

Old: trunk/R/trunk/ChangeLog, revision 17, Sat Nov 5 14:47:12 2005 UTC
New: pkg/ChangeLog, revision 981, Fri Aug 7 09:04:37 2009 UTC
1    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
4            postings are basically e-mails with some extra headers.
5    
6    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/transform.R: Move convertMboxEml, removeCitation,
9            removeMultipart, and removeSignature to the tm.plugin.mail package
10            since they are mainly utility functions (for handling e-mails) and
11            not very framework specific.
12    
13    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
14    
15            * man/: Fix documentation.
16    
17    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
18    
19            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
20            plain text document instead of an XML document for texts of the
21            Reuters-21578 dataset.
22    
23            * R/sparse.R: Removed since the slam package is now available on
24            CRAN.
25    
26            * DESCRIPTION (Depends): Add slam package.
27    
28    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
29    
30            * R/transform.R (stemDoc): Fix character(0) handling.
31    
32    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
33    
34            * R/doc.R (show): Pretty print.
35    
36    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
39            gracefully.
40    
41    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
42    
43            * R/corpus.R: Make corpus virtual. Implement corpus with standard
44            and permanent storage semantics.
45    
46            * DESCRIPTION: New major release. A *lot* of improvements.
47    
48    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
49    
50            * NAMESPACE: Export some simple_triplet_matrix functions.
51    
52    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/weight.R: Adapt tf-idf to new matrix format.
55    
56    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
57    
58            * R/matrix.R: Create two distinct classes for term-document and
59            document-term matrices.
60    
61    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
62    
63            * R/termdocmatrix.R: No longer use Matrix package. This reduces
64            package start-up time significantly.
65    
66    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
67    
68            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
69    
70    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
71    
72            * R/transform.R (tmReduce): Combine multiple maps into one
73            transformation.
74    
75    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
76    
77            * R/weight.R: Remove weightLogical since it does not return a
78            dgCMatrix.
79    
80            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
81            or TermDocumentMatrix instead.
82    
83    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
84    
85            * inst/doc/extensions.Rnw: Finished vignette.
86    
87    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
88    
89            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
90            DocumentTermMatrix representations.
91    
92    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
93    
94            * R/reader.R (readXML): New reader for arbitrary XML files.
95    
96    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
97    
98            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
99            (XMLSource): New XMLSource class for arbitrary XML files.
100            (Source): New slot Vectorized.
101    
102    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
103    
104            * R/reader.R (readTabular): Experimental reader for tabular data
105            structures which can be customized via user-defined mappings.
106    
107            * R/reader.R: Always use UTC time zone.
108    
109            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
110    
111    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
112    
113            * R/reader.R (readDOC): Options can be passed over to antiword.
114    
115            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
116            pdftotext.
117    
118    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
119    
120            * R/source.R (DirSource): Add pattern and ignore.case arguments
121            which are internally passed over to list.files().
122    
123    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
124    
125            * inst/doc/tm.Rnw: Suppress pointless loading message.
126    
127    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
128    
129            * DESCRIPTION: Speed up package loading (via moving packages not
130            strictly necessary for normal operation to Suggests instead of
131            Depends).
132    
133    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
134    
135            * R/reader.R (readNewsgroup): The date format is now configurable.
136    
137    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
138    
139            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
140    
141    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
144    
145    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/source.R (DataframeSource): New source class for data frames.
148    
149            * R/source.R: Fixed non-standard call evaluation.
150    
151    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
152    
153            * R/source.R (URISource): New source class for a single document.
154    
155    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
156    
157            * R/source.R: Refactoring.
158    
159    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
160    
161            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
162            Rmpi installations more gracefully.
163    
164    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
165    
166            * R/source.R (Source): Add Length slot.
167    
168    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
169    
170            * R/AAA.R: Unify duplicated .onLoad function.
171    
172    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
173    
174            * DESCRIPTION (Suggests): Added Rmpi.
175    
176    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/source.R (getElem): Fix 'no visible binding' warning.
179    
180            * man/WeightFunction.Rd: Fix signature.
181    
182    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
183    
184            * R/weight.R: Introduce name abbreviations for weighting functions.
185    
186    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
187    
188            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
189    
190            * R/cluster.R: Provide convenience functions for using an MPI
191            cluster.
192    
193            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
194            available.
195    
196            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
197            available.
198    
199    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
200    
201            * R/textdoccol.R (lapply): Removed debug print out.
202    
203    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
204    
205            * R/reader.R (readRCV1): Improved meta data extraction from
206            Reuters Corpus Volume 1 documents.
207    
208    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
209    
210            * R/transform.R: Ensure that all mappings preserve multiline
211            structures.
212    
213    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
214    
215            * R/filter.R: Every filter now has an attribute indicating whether
216            it should be applied at document level (doclevel).
217    
218            * R/textdoccol.R (tmFilter): Set searchFullText as new default
219            filter.
220    
221    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
222    
223            * R/transform.R (replacePatterns): Replaced removeWords by
224            replacePatterns. Suggested by Christian Buchta.
225    
226            * R/textdoccol.R (inspect): Improved formatting.
227    
228    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
229    
230            * inst/CITATION: Updated JSS article information.
231    
232            * R/textdoccol.R (setAs): Added coerce method from list to
233            corpus.
234    
235            * R/meta.R (meta): Improved meta data handling.
236    
237    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
238    
239            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
240            Christian Buchta.
241    
242            * inst/CITATION: Added template to include JSS article reference.
243    
244    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
245    
246            * R/textdoccol.R (tmMap): Introduced lazy mapping.
247    
248            * R/source.R: Added VectorSource.
249    
250    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
251    
252            * man/: Language codes should be in ISO 639-1 format.
253    
254            * R/textdoccol.R (asPlain): Preserve local meta data.
255    
256    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
257    
258            * R/textdoccol.R (writeCorpus): Function for writing a corpus
259            containing plain text documents to disk.
260    
261    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
262    
263            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
264            always set correctly.
265    
266            * R/textdoccol.R: Set load = TRUE as default for load on demand
267            since in most cases this is the wanted behaviour.
268    
269    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
270    
271            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
272    
273            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
274    
275    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
276    
277            * R/meta.R (meta): New function for consistent access to meta data
278            of document collections, repositories, and texts.
279    
280    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
281    
282            * R/: Better support for encodings.
283    
284    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
285    
286            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
287            selection when no reader argument is given.
288    
289    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
290    
291            * R/source.R (CSVSource): Now uses read.csv instead of scan
292            internally.
293    
294    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
295    
296            * R/reader.R (getReaders): Returns available reader functions.
297    
298            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
299            as default.
300    
301    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
302    
303            * R/stopwords.R (stopwords): Shortened code, removed codetools
304            variable warnings.
305    
306            * man/: Documentation for showMeta, added an example for tmMap.
307    
308            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
309            some minor typos fixed.
310    
311    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
312    
313            * R/aobjects.R (showMeta): Added method for pretty printing a
314            text document's meta data.
315    
316    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
317    
318            * R/textdoccol.R (TextDocCol): Better handling of empty
319            arguments.
320    
321            * NAMESPACE: Exported readDOC.
322    
323            * man/completeStems.Rd: Added an example.
324    
325    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
326    
327            * R/stopwords.R (stopwords): Look up .dat files at every
328            call. Allows users to modify stopword .dat files interactively.
329    
330    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * R/termdocmatrix.R (termFreq): Correct processing of empty
333            documents.
334    
335    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * man/: Updated documentation.
338    
339    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
340    
341            * R/complete.R (completeStems): Completes (heuristically) word
342            stems.
343    
344            * R/termdocmatrix.R (TermDocMatrix2): New modular
345            constructor.
346    
347            * NAMESPACE: Exported termFreq.
348    
349    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
350    
351            * R/reader.R (readDOC): Added MS Word reader (using antiword).
352    
353    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
354    
355            * R/weight.R: Weighting functions for TermDocMatrix.
356    
357    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
358    
359            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
360            functions for accessing dimension, column, and row names.
361    
362            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
363    
364    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
367    
368    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
371    
372    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
373    
374            * R/reader.R (readPDF): Removed manual checks for pdftotext and
375            pdfinfo. The system call gives a warning anyway.
376    
377    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
378    
379            * R/textdoccol.R (asPlain): Conversion from
380            StructuredTextDocuments to PlainTextDocuments.
381    
382    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
385            for accessing term-document matrices.
386    
387            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
388            are installed.
389    
390    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
391    
392            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
393            Christian Buchta.
394    
395    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
396    
397            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
398    
399    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
400    
401            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
402    
403            * R/reader.R (readPDF): Added PDF reader.
404    
405    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
406    
407            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
408    
409            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
410    
411            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
412    
413            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
414    
415    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
416    
417            * R/distmeasure.R (dissimilarity): Replaced dists call from
418            package cba by new dist call from package proxy.
419    
420    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
423    
424    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
425    
426            * R/termdocmatrix.R: require() uses the quietly option to suppress
427            loading messages.
428    
429    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
430    
431            * R/dictionary.R: Added dictionary support.
432    
433    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
434    
435            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
436            documents. This simplifies some functions, e.g., asPlain.
437    
438    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
439    
440            * inst/doc/tm.Rnw: Fixed some typos in vignette.
441    
442    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
443    
444            * R/textdoccol.R (replaceWords): Added method to replace a set of
445            words by a single word. Useful for synonyms.
446    
447    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
448    
449            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
450    
451    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
452    
453            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
454            vectors. Thanks to Ariel Maguyon for his error report.
455            (removeSparseTerms): New function to remove columns from a
456            term-document matrix exceeding a sparse factor.
457    
458    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
459    
460            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
461    
462    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
463    
464            * man/sFilter.Rd: Corrected documentation on statement format (use
465            '==' instead of '=').
466    
467    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
468    
469            * R/aobjects.R (StructuredTextDocument): Inherits from
470            TextDocument.
471    
472    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
475            on sparse matrices as proposed by Martin Maechler.
476    
477    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
480            \pkg{filehash} version deprecates them.
481    
482    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
483    
484            * R/termdocmatrix.R (textvector): Stemming is now performed before
485            erasing stopwords.
486            (weightMatrix): Adapted to handle sparse matrices.
487            (TermDocMatrix): Sparse matrix is now efficiently built by
488            direct stepwise insertion of row values into it.
489    
490    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
491    
492            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
493            due to ongoing problems. For our purposes the latter is as useful
494            as the replaced package.
495    
496    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
497    
498            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
499    
500            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
501    
502    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
505            languages with available stopwords.
506    
507    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
508    
509            * inst/doc/tm.Rnw: Minor corrections in the vignette.
510    
511    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
512    
513            * DESCRIPTION: Update to version 0.2, since a lot of new features
514            have been integrated.
515    
516            * inst/stopwords: Updated existing stopwords and added stopwords
517            for various other languages.
518    
519    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * man/: Updated documentation.
522    
523            * Work/testDb.R: Script to test database stuff.
524    
525            * R/: Fixed various database-related bugs. Seems to be rather
526            usable now; consider it alpha status for the time being.
527    
528    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
529    
530            * R/: Fixed some bugs related to database support.
531    
532    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
533    
534            * man/: Added a lot of examples to the manuals.
535    
536    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
537    
538            * man/: Updated parts of the documentation.
539    
540            * R/textdoccol.R (asPlain): Added conversion from newsgroup
541            documents to plain text documents.
542    
543    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
544    
545            * R/textdoccol.R: Finished experimental database support. Not yet
546            intensively tested.
547    
548            * R/source.R: Now each source has a default reader.
549    
550            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
551            class anymore.
552    
553            * R/plaintextdoc.R: Custom show method for plain text documents.
554    
555            * R/aobjects.R: Added a class for structured text documents.
556    
557            * R/reader.R: Replaced remaining \code{parser} occurrences with
558            \code{reader}.
559    
560            * R/textdoccol.R (summary): Indent tags.
561    
562            * R/textdoccol.R (removePunctuation): Transform method to remove
563            punctuation marks.
564    
565    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
566    
567            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
568            using prescindMeta().
569    
570    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
571    
572            * R/textdoccol.R: Improved database support.
573    
574    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
575    
576            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
577    
578            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
579            language code.
580    
581            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
582            into parserControl argument.
583    
584            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
585    
586    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
587    
588            * Work/tmDataSetup.R: The datasets acq and crude can now be
589            created on the fly.
590    
591            * R/stopwords.R: Introduced a function returning the stopwords for
592            a given language (English, German and French at the moment).
593    
594            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
595            otherwise falls back to Snowball package.
596    
597    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
598    
599            * man/dissimilarity-methods.Rd: Make clear that any method offered
600            by "dists" from package "cba" can be used.
601    
602    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
603    
604            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
605            to Kurt's LaTeX suggestion. Removed points and underscores in
606            variable names for consistent naming.
607    
608            * DESCRIPTION: Update to version 0.1-2.
609    
610            * man/TextRepository.Rd: Fixed bug in documentation.
611    
612    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
613    
614            * DESCRIPTION: Update to version 0.1-1.
615    
616    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
617    
618            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
619            wordStem.
620    
621    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
622    
623            * R/: Changes due to Kurt's review.
624    
625    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * R/: Implemented improvements based upon comments by David
628            Meyer.
629    
630    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
631    
632            * inst/doc/: Rewrote vignette.
633    
634            * man/: Improved documentation.
635    
636    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
637    
638            * man/: Updated documentation.
639    
640            * DESCRIPTION: Changed package name to "tm". Updated version to
641            0.1 for first CRAN release.
642    
643            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
644            list archive example.
645    
646            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
647            archive example.
648    
649            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
650            from (several mails per box) mbox format to (single mail per file)
651            eml format.
652    
653    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
654    
655            * data/crude.rda: Rebuilt.
656    
657            * data/acq.rda: Rebuilt.
658    
659            * R/reader.R: Factored out reader and parser methods from
660            textdoccol.R.
661    
662            * R/source.R: Factored out Source methods from aobjects.R and
663            textdoccol.R.
664            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
665            feeds.
666    
667            * R/textdoccol.R (DirSource): Added support for recursive
668            traversal of directories.
669    
670    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
671    
672            * R/textdoccol.R ([[): Loads the document corpus automatically
673            into memory upon access.
674            (tm_transform, tm_filter): Removed several checks whether the
675            document is already loaded ([[ ensures this now).
676            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
677            mailing list archive.
678    
679    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
680    
681            * R/aobjects.R (TextDocument): Is now a virtual class.
682            (Source): Is now a virtual class.
683    
684    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
685    
686            * R/textdoccol.R (c): Support for an arbitrary number of document
687            collections.
688    
689    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
690    
691            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
692            append_meta and remove_meta.
693    
694            * R/textdoccol.R: Removed modify_metadata method.
695    
696            * R/textrepo.R: Removed modify_metadata method.
697    
698            * R/textdoccol.R (remove_meta): Supports removal of document
699            collection metadata and document (= in data frame) metadata.
700    
701    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
702    
703            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
704    
705            * data/crude.rda: Rebuilt.
706    
707            * data/acq.rda: Rebuilt.
708    
709            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
710    
711            * R/textdoccol.R ([): Bug fix for subsetting a document
712            collection's data frame.
713    
714    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
715    
716            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
717            to s_filter.
718    
719            * R/textdoccol.R: Local text documents' metadata can now be copied
720            to a document collection's data frame with prescind_meta.
721    
722    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
723    
724            * R/: Text documents' slot metadata is now accessible in s_filter.
725    
726            * R/: Rewrote s_filter function (has still some restrictions).
727    
728    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
729    
730            * R/: Various fixes in handling metadata.
731    
732            * R/: Added update mechanism for text document collections.
733    
734    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
735    
736            * R/: Merging of document collections now creates a binary tree
737            for reconstructing merged document collections.
738    
739            * R/: Redesign of metadata for document collections.
740    
741    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
742    
743            * R/: Messages now use \code{ngettext}.
744    
745    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
746    
747            * R/: Added functions for modifying and removing metadata.
748    
749    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * man/: Updated some documentation.
752    
753            * R/: Corrected some connection issues.
754    
755            * inst/doc: Worked on the vignette.
756    
757    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
758    
759            * inst/: Added texts and started vignette.
760    
761            * R/: Final changes based upon David's comments.
762    
763    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
764    
765            * NAMESPACE: Corrected exports (generic methods need exportMethods
766            directives!).
767    
768    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
769    
770            * R/: Modified the TextDocCol constructor and various parsers. It
771            is now modular and supports various file formats via plugins (see
772            the new "Source" class).
773    
774    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * man/: Revised documentation after previous code changes.
777    
778    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
779    
780            * R/: Remaining changes as discussed with David.
781    
782    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
783    
784            * R/: Some changes as suggested by David. The rest will follow
785            within the next few days.
786    
787    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
788    
789            * man/: Finished documentation.
790    
791    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
792    
793            * man/: Wrote some documentation.
794    
795    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
796    
797            * R/: Further syntactic sugar in form of additional assignment and
798            accessor methods.
799    
800    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
801    
802            * R/: Syntactic sugar in form of "length", "show" and "summary"
803            operators.
804    
805    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
806    
807            * R/: Diverse updates. Mainly on default operators ("[" or "c")
808            and dissimilarities.
809    
810    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
811    
812            * R/: Added similarity functions.
813    
814            * data/: Added English stopwords.
815    
816    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
817    
818            * data/: Examples compiled for new features.
819    
820            * R/: Changes due to new structure.
821    
822            * NAMESPACE: Corrected namespace to reflect new structure.
823    
824            * R/termdocmatrix.R: Adapted for new naming scheme.
825    
826    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/textdoccol.R: Adapted code for new class structure. Wrote
829            several transform and filter functions operating on text document
830            collections (alias text document databases).
831    
832            * R/aobjects.R: Adapted class structure with inheritance,
833            repositories and additional meta data. Loading files on demand is
834            now possible.
835    
836    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
837    
838            * R/: Some cosmetic cleanups.
839    
840            * inst/: Removed vignette on clustering. That and much more is now
841            described in the JSS paper on text mining. Based upon that
842            article an elaborated vignette will be incorporated in the future.
843    
844    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
845    
846            * R/: Updated generic S4 methods to comply with signature changes
847            in newer versions of R (> 2.3)
848    
849    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
850    
851            * ext/R/importRIS.R: Automatic RIS import is now possible.
852    
853    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
854    
855            * R/textdoccol.R: Added RIS HTML input format.
856    
857    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
858    
859            * R/textdoccol.R: Removed bug that caused invalid text document
860            collections when handling many input files.
861    
862    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
863    
864            * R/textdoccol.R: Restructured and extended file import
865            mechanism.
866    
867            * inst/doc/clustering.Rnw: Adapted vignette for use with
868            ReutNews.rda
869    
870            * man/ReutNews.Rd: Documentation for ReutNews.rda
871    
872            * data/ReutNews.rda: A tiny Reuters21578 example data set.
873    
874    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
877            clustering facilities of this package.
878    
879    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
880    
881            * R/aobjects.R: Changed package document structure to avoid class
882            dependency problems.
883    
884    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
885    
886            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
887            data set.
888    
889            *  Finished documentation and reordered directory structure. Now "R
890            CMD check textmin" works without errors.
891    
892    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * src/: Various splits can now be easily created for the
895            Reuters21578 data set.
896    
897    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
898    
899            *  Updated documentation
900    
901    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
902    
903            *  Wrote R documentation for some classes and methods.
904    
905    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
906    
907            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
908            files. See the questionnaire data/Umfrage.csv for such an example.
909            We are now able to import files in Reuters-21578 XML format.
910    
911            *  Changed class interfaces in various files. Weighting of the text
912            matrix is now possible.
913    
914    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
915    
916            * R/textdoccol.R: One can build term-document matrices if
917            necessary (with buildTDM(...)) and fill the field tdm from a text
918            document collection with it.
919    
920            * R/textmatrix.R: Wrote S4 class for term-document matrices.
921    
922    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
923    
924            * R/textdoccol.R: We now can read in a whole XML file with several
925            news items.
926    
927    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
928    
929            * R/textdoccol.R: Set up an S4 class for a collection of text
