[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 17, Sat Nov 5 14:47:12 2005 UTC
pkg/ChangeLog revision 957, Fri Jun 12 12:47:57 2009 UTC
# Line 1  Line 1 
1    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/doc.R (show): Pretty print.
4    
5    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
8            gracefully.
9    
10    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/corpus.R: Make corpus virtual. Implement corpus with standard
13            and permanent storage semantics.
14    
15            * DESCRIPTION: New major release. A *lot* of improvements.
16    
17    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
18    
19            * NAMESPACE: Export some simple_triplet_matrix functions.
20    
21    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
22    
23            * R/weight.R: Adapt tf-idf to new matrix format.
24    
25    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
26    
27            * R/matrix.R: Create two distinct classes for term-document and
28            document-term matrices.
29    
30    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
31    
32            * R/termdocmatrix.R: No longer use Matrix package. This reduces
33            package start-up time significantly.
34    
35    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
36    
37            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
38    
39    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
40    
41            * R/transform.R (tmReduce): Combine multiple maps into one
42            transformation.
43    
44    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
45    
46            * R/weight.R: Remove weightLogical since it does not return a
47            dgCMatrix.
48    
49            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
50            or TermDocumentMatrix instead.
51    
52    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
53    
54            * inst/doc/extensions.Rnw: Finished vignette.
55    
56    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
57    
58            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
59            DocumentTermMatrix representations.
60    
61    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
62    
63            * R/reader.R (readXML): New reader for arbitrary XML files.
64    
65    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
66    
67            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
68            (XMLSource): New XMLSource class for arbitrary XML files.
69            (Source): New slot Vectorized.
70    
71    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
72    
73            * R/reader.R (readTabular): Experimental reader for tabular data
74            structures which can be customized via user-defined mappings.
75    
76            * R/reader.R: Always use UTC time zone.
77    
78            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
79    
80    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
81    
82            * R/reader.R (readDOC): Options can be passed over to antiword.
83    
84            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
85            pdftotext.
86    
87    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
88    
89            * R/source.R (DirSource): Add pattern and ignore.case arguments
90            which are internally passed over to list.files().
91    
92    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
93    
94            * inst/doc/tm.Rnw: Suppress pointless loading message.
95    
96    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
97    
98            * DESCRIPTION: Speed up package loading (via moving packages not
99            strictly necessary for normal operation to Suggests instead of
100            Depends).
101    
102    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
103    
104            * R/reader.R (readNewsgroup): The date format is now configurable.
105    
106    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
107    
108            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
109    
110    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
111    
112            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
113    
114    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
115    
116            * R/source.R (DataframeSource): New source class for data frames.
117    
118            * R/source.R: Fixed non-standard call evaluation.
119    
120    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
121    
122            * R/source.R (URISource): New source class for a single document.
123    
124    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
125    
126            * R/source.R: Refactoring.
127    
128    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
129    
130            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
131            Rmpi installations more gracefully.
132    
133    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
134    
135            * R/source.R (Source): Add Length slot.
136    
137    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
138    
139            * R/AAA.R: Unify duplicated .onLoad function.
140    
141    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
142    
143            * DESCRIPTION (Suggests): Added Rmpi.
144    
145    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
146    
147            * R/source.R (getElem): Fix 'no visible binding' warning.
148    
149            * man/WeightFunction.Rd: Fix signature.
150    
151    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
152    
153            * R/weight.R: Introduce name abbreviations for weighting functions.
154    
155    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
156    
157            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
158    
159            * R/cluster.R: Provide convenience functions for using an MPI
160            cluster.
161    
162            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
163            available.
164    
165            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
166            available.
167    
168    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
169    
170            * R/textdoccol.R (lapply): Removed debug print out.
171    
172    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
173    
174            * R/reader.R (readRCV1): Improved meta data extraction from
175            Reuters Corpus Volume 1 documents.
176    
177    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
178    
179            * R/transform.R: Ensure that all mappings preserve multiline
180            structures.
181    
182    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
183    
184            * R/filter.R: Every filter now has an attribute indicating whether
185            it should be applied at document level (doclevel).
186    
187            * R/textdoccol.R (tmFilter): Set searchFullText as new default
188            filter.
189    
190    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
191    
192            * R/transform.R (replacePatterns): Replaced removeWords by
193            replacePatterns. Suggested by Christian Buchta.
194    
195            * R/textdoccol.R (inspect): Improved formatting.
196    
197    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
198    
199            * inst/CITATION: Updated JSS article information.
200    
201            * R/textdoccol.R (setAs): Added coerce method from list to
202            corpus.
203    
204            * R/meta.R (meta): Improved meta data handling.
205    
206    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
207    
208            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
209            Christian Buchta.
210    
211            * inst/CITATION: Added template to include JSS article reference.
212    
213    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
214    
215            * R/textdoccol.R (tmMap): Introduced lazy mapping.
216    
217            * R/source.R: Added VectorSource.
218    
219    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
220    
221            * man/: Language codes should be in ISO 639-1 format.
222    
223            * R/textdoccol.R (asPlain): Preserve local meta data.
224    
225    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
226    
227            * R/textdoccol.R (writeCorpus): Function for writing a corpus
228            containing plain text documents to disk.
229    
230    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
231    
232            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
233            always set correctly.
234    
235            * R/textdoccol.R: Set load = TRUE as default for load on demand
236            since in most cases this is the desired behaviour.
237    
238    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
239    
240            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
241    
242            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
243    
244    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
245    
246            * R/meta.R (meta): New function for consistent access to meta data
247            of document collections, repositories, and texts.
248    
249    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
250    
251            * R/: Better support for encodings.
252    
253    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
254    
255            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
256            selection when no reader argument is given.
257    
258    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * R/source.R (CSVSource): Now uses read.csv instead of scan
261            internally.
262    
263    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
264    
265            * R/reader.R (getReaders): Returns available reader functions.
266    
267            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
268            as default.
269    
270    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
271    
272            * R/stopwords.R (stopwords): Shortened code, removed codetools
273            variable warnings.
274    
275            * man/: Documentation for showMeta, added an example for tmMap.
276    
277            * inst/doc/tm.Rnw: Updated vignette, comments on MS Word reader,
278            some minor typos fixed.
279    
280    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
281    
282            * R/aobjects.R (showMeta): Added method for pretty printing a
283            text document's meta data.
284    
285    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
286    
287            * R/textdoccol.R (TextDocCol): Better handling of empty
288            arguments.
289    
290            * NAMESPACE: Exported readDOC.
291    
292            * man/completeStems.Rd: Added an example.
293    
294    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
295    
296            * R/stopwords.R (stopwords): Look up .dat files at every
297            call. Allows users to modify stopword .dat files interactively.
298    
299    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * R/termdocmatrix.R (termFreq): Correct processing of empty
302            documents.
303    
304    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
305    
306            * man/: Updated documentation.
307    
308    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
309    
310            * R/complete.R (completeStems): Completes (heuristically) word
311            stems.
312    
313            * R/termdocmatrix.R (TermDocMatrix2): New modular
314            constructor.
315    
316            * NAMESPACE: Exported termFreq.
317    
318    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
319    
320            * R/reader.R (readDOC): Added MS Word reader (using antiword).
321    
322    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
323    
324            * R/weight.R: Weighting functions for TermDocMatrix.
325    
326    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
329            functions for accessing dimension, column, and row names.
330    
331            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
332    
333    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
334    
335            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
336    
337    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
338    
339            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
340    
341    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
342    
343            * R/reader.R (readPDF): Removed manual checks for pdftotext and
344            pdfinfo. The system call gives a warning anyway.
345    
346    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/textdoccol.R (asPlain): Conversion from
349            StructuredTextDocuments to PlainTextDocuments.
350    
351    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
352    
353            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
354            for accessing term-document matrices.
355    
356            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
357            are installed.
358    
359    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
360    
361            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
362            Christian Buchta.
363    
364    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
367    
368    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
369    
370            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
371    
372            * R/reader.R (readPDF): Added PDF reader.
373    
374    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
377    
378            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
379    
380            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
381    
382            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
383    
384    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
385    
386            * R/distmeasure.R (dissimilarity): Replaced dists call from
387            package cba by new dist call from package proxy.
388    
389    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
392    
393    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * R/termdocmatrix.R: require() uses the quietly option to suppress
396            loading messages.
397    
398    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/dictionary.R: Added dictionary support.
401    
402    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
405            documents. This simplifies some functions, e.g., asPlain.
406    
407    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * inst/doc/tm.Rnw: Fixed some typos in vignette.
410    
411    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * R/textdoccol.R (replaceWords): Added method to replace a set of
414            words by a single word. Useful for synonyms.
415    
416    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
419    
420    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
423            vectors. Thanks to Ariel Maguyon for his error report.
424            (removeSparseTerms): New function to remove columns from a
425            term-document matrix exceeding a sparse factor.
426    
427    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
428    
429            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
430    
431    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * man/sFilter.Rd: Corrected documentation on statement format (use
434            '==' instead of '=').
435    
436    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
437    
438            * R/aobjects.R (StructuredTextDocument): Inherits from
439            TextDocument.
440    
441    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
442    
443            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
444            on sparse matrices as proposed by Martin Maechler.
445    
446    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
447    
448            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
449            \pkg{filehash} version deprecates them.
450    
451    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
452    
453            * R/termdocmatrix.R (textvector): Stemming is now performed before
454            erasing stopwords.
455            (weightMatrix): Adapted to handle sparse matrices.
456            (TermDocMatrix): Sparse matrix is now efficiently built by
457            direct stepwise insertion of row values into it.
458    
459    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
462            due to ongoing problems. For our purposes the latter is as useful
463            as the replaced package.
464    
465    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
466    
467            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
468    
469            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
470    
471    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
472    
473            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
474            languages with available stopwords.
475    
476    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
477    
478            * inst/doc/tm.Rnw: Minor corrections in the vignette.
479    
480    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
481    
482            * DESCRIPTION: Update to version 0.2, since a lot of new features
483            have been integrated.
484    
485            * inst/stopwords: Updated existing stopwords and added stopwords
486            for various other languages.
487    
488    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
489    
490            * man/: Updated documentation.
491    
492            * Work/testDb.R: Script to test database stuff.
493    
494            * R/: Fixed various database related bugs. Seems to be rather
495            useable now, i.e., consider as alpha status for now.
496    
497    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/: Fixed some bugs related to database support.
500    
501    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
502    
503            * man/: Added a lot of examples to the manuals.
504    
505    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
506    
507            * man/: Updated parts of the documentation.
508    
509            * R/textdoccol.R (asPlain): Added conversion from newsgroup
510            documents to plain text documents.
511    
512    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
513    
514            * R/textdoccol.R: Finished experimental database support. Not yet
515            intensively tested.
516    
517            * R/source.R: Now each source has a default reader.
518    
519            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
520            class anymore.
521    
522            * R/plaintextdoc.R: Custom show method for plain text documents.
523    
524            * R/aobjects.R: Added a class for structured text documents.
525    
526            * R/reader.R: Replaced remaining \code{parser} occurrences with
527            \code{reader}.
528    
529            * R/textdoccol.R (summary): Indent tags.
530    
531            * R/textdoccol.R (removePunctuation): Transform method to remove
532            punctuation marks.
533    
534    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
535    
536            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
537            using prescindMeta().
538    
539    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
540    
541            * R/textdoccol.R: Improved database support.
542    
543    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
544    
545            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
546    
547            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
548            language code.
549    
550            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
551            into parserControl argument.
552    
553            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
554    
555    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
556    
557            * Work/tmDataSetup.R: The datasets acq and crude can now be
558            created on the fly.
559    
560            * R/stopwords.R: Introduced a function returning the stopwords for
561            a given language (English, German and French at the moment).
562    
563            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
564            otherwise falls back to Snowball package.
565    
566    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
567    
568            * man/dissimilarity-methods.Rd: Make clear that any method offered
569            by "dists" from package "cba" can be used.
570    
571    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
572    
573            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
574            to Kurt's LaTeX suggestion. Removed points and underscores in
575            variable names for consistent naming.
576    
577            * DESCRIPTION: Update to version 0.1-2.
578    
579            * man/TextRepository.Rd: Fixed bug in documentation.
580    
581    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
582    
583            * DESCRIPTION: Update to version 0.1-1.
584    
585    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
586    
587            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
588            wordStem.
589    
590    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
591    
592            * R/: Changes due to Kurt's review.
593    
594    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
595    
596            * R/: Implemented improvements based upon comments by David
597            Meyer.
598    
599    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
600    
601            * inst/doc/: Rewrote vignette.
602    
603            * man/: Improved documentation.
604    
605    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
606    
607            * man/: Updated documentation.
608    
609            * DESCRIPTION: Changed package name to "tm". Updated version to
610            0.1 for first CRAN release.
611    
612            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
613            list archive example.
614    
615            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
616            archive example.
617    
618            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
619            from (several mails per box) mbox format to (single mail per file)
620            eml format.
621    
622    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
623    
624            * data/crude.rda: Rebuilt.
625    
626            * data/acq.rda: Rebuilt.
627    
628            * R/reader.R: Factored out reader and parser methods from
629            textdoccol.R.
630    
631            * R/source.R: Factored out Source methods from aobjects.R and
632            textdoccol.R.
633            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
634            feeds.
635    
636            * R/textdoccol.R (DirSource): Added support for recursive
637            traversal of directories.
638    
639    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
640    
641            * R/textdoccol.R ([[): Loads the document corpus automatically
642            into memory upon access.
643            (tm_transform, tm_filter): Removed several checks whether the
644            document is already loaded ([[ ensures this now).
645            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
646            mailing list archive.
647    
648    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
649    
650            * R/aobjects.R (TextDocument): Is now a virtual class.
651            (Source): Is now a virtual class.
652    
653    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
654    
655            * R/textdoccol.R (c): Support for an arbitrary number of document
656            collections.
657    
658    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
659    
660            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
661            append_meta and remove_meta.
662    
663            * R/textdoccol.R: Removed modify_metadata method.
664    
665            * R/textrepo.R: Removed modify_metadata method.
666    
667            * R/textdoccol.R (remove_meta): Supports removal of document
668            collection metadata and document (= in data frame) metadata.
669    
670    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
671    
672            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
673    
674            * data/crude.rda: Rebuilt.
675    
676            * data/acq.rda: Rebuilt.
677    
678            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
679    
680            * R/textdoccol.R ([): Bug fix for subsetting a document
681            collection's data frame.
682    
683    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
686            to s_filter.
687    
688            * R/textdoccol.R: Local text documents' metadata can now be copied
689            to a document collection's data frame with prescind_meta.
690    
691    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * R/: Text documents' slot metadata is now accessible in s_filter.
694    
695            * R/: Rewrote s_filter function (still has some restrictions).
696    
697    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
698    
699            * R/: Various fixes in handling metadata.
700    
701            * R/: Added update mechanism for text document collections.
702    
703    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
704    
705            * R/: Merging of document collections now creates a binary tree
706            for reconstructing merged document collections.
707    
708            * R/: Redesign of metadata for document collections.
709    
710    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
711    
712            * R/: Messages now use \code{ngettext}.
713    
714    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
715    
716            * R/: Added functions for modifying and removing metadata.
717    
718    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
719    
720            * man/: Updated some documentation.
721    
722            * R/: Corrected some connection issues.
723    
724            * inst/doc: Worked on the vignette.
725    
726    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
727    
728            * inst/: Added texts and started vignette.
729    
730            * R/: Final changes based upon David's comments.
731    
732    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
733    
734            * NAMESPACE: Corrected exports (generic methods need exportMethods
735            directives!).
736    
737    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
738    
739            * R/: Modified the TextDocCol constructor and various parsers. It
740            is now modular and supports various file formats via plugins (see
741            the new "Source" class).
742    
743    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
744    
745            * man/: Revised documentation after previous code changes.
746    
747    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
748    
749            * R/: Remaining changes as discussed with David.
750    
751    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
752    
753            * R/: Some changes as suggested by David. The rest will follow
754            within the next days.
755    
756    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
757    
758            * man/: Finished documentation.
759    
760    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
761    
762            * man/: Wrote some documentation.
763    
764    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
765    
766            * R/: Further syntactic sugar in form of additional assignment and
767            accessor methods.
768    
769    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
770    
771            * R/: Syntactic sugar in form of "length", "show" and "summary"
772            operators.
773    
774    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * R/: Diverse updates. Mainly on default operators ("[" or "c")
777            and dissimilarities.
778    
779    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
780    
781            * R/: Added similarity functions.
782    
783            * data/: Added English stopwords.
784    
785    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
786    
787            * data/: Examples compiled for new features.
788    
789            * R/: Changes due to new structure.
790    
791            * NAMESPACE: Corrected namespace to reflect new structure.
792    
793            * R/termdocmatrix.R: Adapted for new naming scheme.
794    
795    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
796    
797            * R/textdoccol.R: Adapted code for new class structure. Wrote
798            several transform and filter functions operating on text document
799            collections (alias text document databases).
800    
801            * R/aobjects.R: Adapted class structure with inheritance,
802            repositories and additional meta data. Loading files on demand is
803            now possible.
804    
805    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
806    
807            * R/: Some cosmetic cleanups.
808    
809            * inst/: Removed vignette on clustering. That and much more is now
810            described in the JSS paper on text mining. Based upon that
811            article an elaborated vignette will be incorporated in the future.
812    
813    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
814    
815            * R/: Updated generic S4 methods to comply with signature changes
816            in newer versions of R (> 2.3)
817    
818    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
819    
820            * ext/R/importRIS.R: Automatic RIS import is now possible.
821    
822    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
823    
824            * R/textdoccol.R: Added RIS HTML input format.
825    
826    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
827    
828            * R/textdoccol.R: Removed bug that caused invalid text document
829            collections when handling many input files.
830    
831    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
832    
833            * R/textdoccol.R: Restructured and extended file import
834            mechanism.
835    
836            * inst/doc/clustering.Rnw: Adapted vignette for use with
837            ReutNews.rda
838    
839            * man/ReutNews.Rd: Documentation for ReutNews.rda
840    
841            * data/ReutNews.rda: A tiny Reuters21578 example data set.
842    
843    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
844    
845            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
846            clustering facilities of this package.
847    
848    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
849    
850            * R/aobjects.R: Changed package document structure to avoid class
851            dependency problems.
852    
853    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
854    
855            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
856            data set.
857    
858            *  Finished documentation and reordered directory structure. Now "R
859            CMD check textmin" works without errors.
860    
861    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * src/: Various splits can now be easily created for the
864            Reuters21578 data set.
865    
866    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
867    
868            *  Updated documentation
869    
870    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
871    
872            *  Wrote R documentation for some classes and methods.
873    
874    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
877            files. See the questionnaire data/Umfrage.csv for such an example.
878            We are now able to import files in Reuters-21578 XML format.
879    
880            *  Changed class interfaces in various files. Weighting of the text
881            matrix is now possible.
882    
883    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * R/textdoccol.R: One can build term-document matrices if
886            necessary (with buildTDM(...)) and fill the field tdm from a text
887            document collection with it.
888    
889            * R/textmatrix.R: Wrote S4 class for term-document matrices.
890    
891    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
892    
893            * R/textdoccol.R: We now can read in a whole XML file with several
894            news items.
895    
896    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
897    
898            * R/textdoccol.R: Set up an S4 class for a collection of text
