[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
pkg/ChangeLog revision 1020, Tue Nov 17 09:16:13 2009 UTC
1    2009-11-17  Ingo Feinerer  <feinerer@logic.at>
2    
3            * man/plot.Rd: Use \dontrun{} in \examples{} section in the hope
4            that CRAN Mac OS X builds do not fail any longer.
5    
6    2009-11-15  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/matrix.R (tokenize): Use scan(..., what = "character") instead
9            of RWeka::AlphabeticTokenizer() as default.
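
A minimal sketch of whitespace tokenization with base R's scan(), in the spirit of the entry above. The helper name tokenize_ws is hypothetical, and the text = and quiet = arguments are chosen here only to keep the example self-contained; the entry does not say which further arguments the package passes.

```r
# Hypothetical helper: split a string on whitespace using scan().
tokenize_ws <- function(x) {
  # what = "character" makes scan() return character tokens;
  # quiet = TRUE suppresses the "Read N items" message.
  scan(text = x, what = "character", quiet = TRUE)
}

tokens <- tokenize_ws("Text mining  with R")
print(tokens)
```

Runs of whitespace collapse into a single separator, so no empty tokens are produced.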
10    
11    2009-11-14  Ingo Feinerer  <feinerer@logic.at>
12    
13            * R/transform.R (removeWords.PlainTextDocument): Fix bug which
14            caused words at the beginning or the end of a line not to be removed. Do
15            not delete whitespace anymore.
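
The entry does not show the corrected code. As an illustration only, the sketch below removes whole words with word-boundary anchors, which also match at the start and end of a line, and leaves the surrounding whitespace in place. The helper name remove_words and the exact pattern are assumptions, not the package's implementation.

```r
# Hypothetical helper: delete whole-word matches, leaving whitespace untouched.
remove_words <- function(x, words) {
  # \\b matches at word boundaries, including the start and end of a string,
  # so words in the first or last position are matched too.
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  gsub(pattern, "", x, perl = TRUE)
}

lines <- c("the cat sat on the", "mat and the dog")
cleaned <- remove_words(lines, c("the", "and"))
print(cleaned)
```

Words containing regular expression metacharacters would need escaping before being pasted into the pattern.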
16    
17    2009-11-12  Ingo Feinerer  <feinerer@logic.at>
18    
19            * R/source.R (DirSource): Default to working directory if no path
20            is specified.
21    
22    2009-11-11  Ingo Feinerer  <feinerer@logic.at>
23    
24            * R/source.R (DirSource): Stop on empty directories.
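
The behaviour described in the two DirSource entries above (default to the working directory, stop on empty directories) can be sketched in base R as follows. The function list_documents and its error message are hypothetical and only illustrate the idea.

```r
# Hypothetical helper mirroring the two DirSource entries above.
list_documents <- function(directory = ".") {
  files <- list.files(directory, full.names = TRUE)
  # Fail early instead of building a source with nothing in it.
  if (length(files) == 0L)
    stop("empty directory: ", directory)
  files
}

# Demonstrate the failure on a freshly created, empty directory.
empty_dir <- tempfile("empty")
dir.create(empty_dir)
result <- tryCatch(list_documents(empty_dir),
                   error = function(e) conditionMessage(e))
print(result)
```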
25    
26    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
27    
28            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
29            named documents.
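
The entry does not say where the prefixes came from. One plausible source, shown here purely as an assumption, is that unlist() on a named list prepends the list names to the element names. The data below is made up for illustration.

```r
docs <- list(d1 = c(apple = 1, pear = 2), d2 = c(apple = 3))

# unlist() on a named list prepends the list names to the element names.
prefixed <- names(unlist(docs))

# Dropping the outer names first keeps the plain term names.
plain <- names(unlist(unname(docs)))

print(prefixed)
print(plain)
```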
30    
31    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
32    
33            * R/transform.R (removeWords): Improve regular expressions.
34    
35    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
36    
37            * R/meta.R (DublinCore): Allow lower case tags.
38    
39    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
40    
41            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
42            instead of x$children.
43    
44    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
45    
46            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
47    
48    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
49    
50            * R/: Use S3 instead of S4 class system.
51    
52    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
53    
54            * R/reader.R (readMail): Moved to tm.plugin.mail package.
55    
56    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
57    
58            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
59            postings are basically e-mails with some extra headers.
60    
61    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
62    
63            * R/transform.R: Move convertMboxEml, removeCitation,
64            removeMultipart, and removeSignature to the tm.plugin.mail package
65            since they are mainly utility functions (for handling e-mails) and
66            not very framework specific.
67    
68    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
69    
70            * man/: Fix documentation.
71    
72    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
73    
74            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
75            plain text document instead of an XML document for texts of the
76            Reuters-21578 dataset.
77    
78            * R/sparse.R: Removed since the slam package is now available on
79            CRAN.
80    
81            * DESCRIPTION (Depends): Add slam package.
82    
83    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
84    
85            * R/transform.R (stemDoc): Fix character(0) handling.
86    
87    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
88    
89            * R/doc.R (show): Pretty print.
90    
91    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
94            gracefully.
95    
96    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
97    
98            * R/corpus.R: Make corpus virtual. Implement corpus with standard
99            and permanent storage semantics.
100    
101            * DESCRIPTION: New major release. A *lot* of improvements.
102    
103    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
104    
105            * NAMESPACE: Export some simple_triplet_matrix functions.
106    
107    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
108    
109            * R/weight.R: Adapt tf-idf to new matrix format.
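
For readers unfamiliar with tf-idf, here is a small worked example on a dense base R matrix. It uses one common variant (raw term frequency times a base-2 inverse document frequency); the entry does not state which variant or matrix format the package uses, so both are assumptions here.

```r
# Term-document counts: rows are terms, columns are documents.
counts <- matrix(c(1, 0, 2,
                   1, 1, 0),
                 nrow = 3,
                 dimnames = list(c("apple", "banana", "cherry"), c("d1", "d2")))

# Document frequency: in how many documents each term occurs.
doc_freq <- rowSums(counts > 0)

# Inverse document frequency with a base-2 logarithm (one common variant).
idf <- log2(ncol(counts) / doc_freq)

# Each row is scaled by its term's idf (recycling runs down the rows).
tfidf <- counts * idf
print(tfidf)
```

A term that occurs in every document ("apple") gets weight zero, while terms confined to one document keep their counts.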
110    
111    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
112    
113            * R/matrix.R: Create two distinct classes for term-document and
114            document-term matrices.
115    
116    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
117    
118            * R/termdocmatrix.R: No longer use Matrix package. This reduces
119            package start-up time significantly.
120    
121    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
122    
123            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
124    
125    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
126    
127            * R/transform.R (tmReduce): Combine multiple maps into one
128            transformation.
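
Combining several maps into one transformation is essentially function composition. The sketch below shows the general idea with base R's Reduce(); combine_maps is a hypothetical name, and the order of application (left to right) is our choice, not a statement about tmReduce.

```r
# Hypothetical helper: fold a list of functions into a single function
# that applies them left to right.
combine_maps <- function(...) {
  maps <- list(...)
  function(x) Reduce(function(acc, f) f(acc), maps, x)
}

# Lower-case first, then strip punctuation.
clean <- combine_maps(tolower, function(x) gsub("[[:punct:]]", "", x))
result <- clean("Hello, World!")
print(result)
```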
129    
130    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
131    
132            * R/weight.R: Remove weightLogical since it does not return a
133            dgCMatrix.
134    
135            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
136            or TermDocumentMatrix instead.
137    
138    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
139    
140            * inst/doc/extensions.Rnw: Finished vignette.
141    
142    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
143    
144            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
145            DocumentTermMatrix representations.
146    
147    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
148    
149            * R/reader.R (readXML): New reader for arbitrary XML files.
150    
151    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
152    
153            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
154            (XMLSource): New XMLSource class for arbitrary XML files.
155            (Source): New slot Vectorized.
156    
157    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
158    
159            * R/reader.R (readTabular): Experimental reader for tabular data
160            structures which can be customized via user-defined mappings.
161    
162            * R/reader.R: Always use UTC time zone.
163    
164            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
165    
166    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/reader.R (readDOC): Options can be passed over to antiword.
169    
170            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
171            pdftotext.
172    
173    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
174    
175            * R/source.R (DirSource): Add pattern and ignore.case arguments
176            which are internally passed over to list.files().
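
Since the arguments are handed to list.files(), their effect can be tried directly in base R. The directory and file names below are made up for the example.

```r
# Create a scratch directory with mixed file names.
corpus_dir <- tempfile("corpus")
dir.create(corpus_dir)
invisible(file.create(file.path(corpus_dir, c("a.txt", "B.TXT", "notes.md"))))

# pattern is a regular expression; ignore.case makes it match "B.TXT" too.
txt_files <- list.files(corpus_dir, pattern = "\\.txt$", ignore.case = TRUE)
print(txt_files)
```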
177    
178    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
179    
180            * inst/doc/tm.Rnw: Suppress pointless loading message.
181    
182    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
183    
184            * DESCRIPTION: Speed up package loading (via moving packages not
185            strictly necessary for normal operation to Suggests instead of
186            Depends).
187    
188    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
189    
190            * R/reader.R (readNewsgroup): The date format is now configurable.
191    
192    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
193    
194            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
195    
196    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
197    
198            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
199    
200    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
201    
202            * R/source.R (DataframeSource): New source class for data frames.
203    
204            * R/source.R: Fixed non-standard call evaluation.
205    
206    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
207    
208            * R/source.R (URISource): New source class for a single document.
209    
210    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
211    
212            * R/source.R: Refactoring.
213    
214    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
215    
216            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
217            Rmpi installations more gracefully.
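
The general pattern of guarding a start-up action with tryCatch() might look like the sketch below. The helper try_start and the simulated failure are hypothetical; no Rmpi call is made, and the actual .onLoad code is not shown in the entry.

```r
# Hypothetical helper: run a start-up action, but never let it abort loading.
try_start <- function(start) {
  tryCatch(start(),
           error = function(e) {
             # Report the problem and carry on without a cluster.
             message("cluster not started: ", conditionMessage(e))
             NULL
           })
}

ok <- try_start(function() "cluster handle")
failed <- try_start(function() stop("MPI library not found"))
```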
218    
219    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
220    
221            * R/source.R (Source): Add Length slot.
222    
223    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
224    
225            * R/AAA.R: Unify duplicated .onLoad function.
226    
227    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
228    
229            * DESCRIPTION (Suggests): Added Rmpi.
230    
231    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
232    
233            * R/source.R (getElem): Fix 'no visible binding' warning.
234    
235            * man/WeightFunction.Rd: Fix signature.
236    
237    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
238    
239            * R/weight.R: Introduce name abbreviations for weighting functions.
240    
241    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
242    
243            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
244    
245            * R/cluster.R: Provide convenience functions for using a MPI
246            cluster.
247    
248            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
249            available.
250    
251            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
252            available.
253    
254    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
255    
256            * R/textdoccol.R (lapply): Removed debug print out.
257    
258    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * R/reader.R (readRCV1): Improved meta data extraction from
261            Reuters Corpus Volume 1 documents.
262    
263    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
264    
265            * R/transform.R: Ensure that all mappings preserve multiline
266            structures.
267    
268    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
269    
270            * R/filter.R: Every filter now has an attribute indicating whether
271            it should be applied at the document level (doclevel).
272    
273            * R/textdoccol.R (tmFilter): Set searchFullText as new default
274            filter.
275    
276    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
277    
278            * R/transform.R (replacePatterns): Replaced removeWords by
279            replacePatterns. Suggested by Christian Buchta.
280    
281            * R/textdoccol.R (inspect): Improved formatting.
282    
283    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
284    
285            * inst/CITATION: Updated JSS article information.
286    
287            * R/textdoccol.R (setAs): Added coerce method from list to
288            corpus.
289    
290            * R/meta.R (meta): Improved meta data handling.
291    
292    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
293    
294            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
295            Christian Buchta.
296    
297            * inst/CITATION: Added template to include JSS article reference.
298    
299    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * R/textdoccol.R (tmMap): Introduced lazy mapping.
302    
303            * R/source.R: Added VectorSource.
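
The tmMap entry above does not explain how lazy mapping works internally. As a rough illustration of the general idea only, the sketch below queues a mapping and applies it to a document when that document is requested. All names (new_lazy, lazy_map, get_doc) are hypothetical.

```r
# Hypothetical lazy container: documents plus a queue of pending mappings.
new_lazy <- function(docs) list(docs = docs, pending = list())

# Queue a mapping instead of running it over every document now.
lazy_map <- function(x, fun) {
  x$pending <- c(x$pending, list(fun))
  x
}

# Apply the queued mappings to one document at the moment it is requested.
get_doc <- function(x, i) {
  Reduce(function(doc, f) f(doc), x$pending, x$docs[[i]])
}

corpus <- new_lazy(list("First DOC", "Second DOC"))
corpus <- lazy_map(corpus, tolower)
second <- get_doc(corpus, 2)
print(second)
```

The stored documents stay untouched; only the requested one is transformed.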
304    
305    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
306    
307            * man/: Language codes should be in ISO 639-1 format.
308    
309            * R/textdoccol.R (asPlain): Preserve local meta data.
310    
311    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
312    
313            * R/textdoccol.R (writeCorpus): Function for writing a corpus
314            containing plain text documents to disk.
315    
316    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
317    
318            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
319            always set correctly.
320    
321            * R/textdoccol.R: Set load = TRUE as default for load on demand
322            since in most cases this is the wanted behaviour.
323    
324    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
327    
328            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
329    
330    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
331    
332            * R/meta.R (meta): New function for consistent access to meta data
333            of document collections, repositories, and texts.
334    
335    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
336    
337            * R/: Better support for encodings.
338    
339    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
340    
341            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
342            selection when no reader argument is given.
343    
344    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
345    
346            * R/source.R (CSVSource): Now uses read.csv instead of scan
347            internally.
348    
349    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
350    
351            * R/reader.R (getReaders): Returns available reader functions.
352    
353            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
354            as default.
355    
356    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
357    
358            * R/stopwords.R (stopwords): Shortened code, removed codetools
359            variable warnings.
360    
361            * man/: Documentation for showMeta, added an example for tmMap.
362    
363            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
364            some minor typos fixed.
365    
366    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
367    
368            * R/aobjects.R (showMeta): Added method for pretty printing a
369            text document's meta data.
370    
371    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
372    
373            * R/textdoccol.R (TextDocCol): Better handling of empty
374            arguments.
375    
376            * NAMESPACE: Exported readDOC.
377    
378            * man/completeStems.Rd: Added an example.
379    
380    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
381    
382            * R/stopwords.R (stopwords): Look up .dat files at every
383            call. Allows users to modify stopword .dat files interactively.
384    
385    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
386    
387            * R/termdocmatrix.R (termFreq): Correct processing of empty
388            documents.
389    
390    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
391    
392            * man/: Updated documentation.
393    
394    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * R/complete.R (completeStems): Completes (heuristically) word
397            stems.
398    
399            * R/termdocmatrix.R (TermDocMatrix2): New modular
400            constructor.
401    
402            * NAMESPACE: Exported termFreq.
403    
404    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
405    
406            * R/reader.R (readDOC): Added MS Word reader (using antiword).
407    
408    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
409    
410            * R/weight.R: Weighting functions for TermDocMatrix.
411    
412    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
415            functions for accessing dimension, column, and row names.
416    
417            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
418    
419    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
420    
421            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
422    
423    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
424    
425            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
426    
427    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
428    
429            * R/reader.R (readPDF): Removed manual checks for pdftotext and
430            pdfinfo. The system call gives a warning anyway.
431    
432    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
433    
434            * R/textdoccol.R (asPlain): Conversion from
435            StructuredTextDocuments to PlainTextDocuments.
436    
437    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
438    
439            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
440            for accessing term-document matrices.
441    
442            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
443            are installed.
444    
445    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
448            Christian Buchta.
449    
450    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
453    
454    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
457    
458            * R/reader.R (readPDF): Added PDF reader.
459    
460    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
463    
464            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
465    
466            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
467    
468            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
469    
470    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
471    
472            * R/distmeasure.R (dissimilarity): Replaced dists call from
473            package cba by new dist call from package proxy.
474    
475    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
476    
477            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
478    
479    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
480    
481            * R/termdocmatrix.R: require() uses the quietly option to suppress
482            loading messages.
483    
484    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
485    
486            * R/dictionary.R: Added dictionary support.
487    
488    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
489    
490            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
491            documents. This simplifies some functions, e.g., asPlain.
492    
493    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
494    
495            * inst/doc/tm.Rnw: Fixed some typos in vignette.
496    
497    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/textdoccol.R (replaceWords): Added method to replace a set of
500            words by a single word. Useful for synonyms.
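
A minimal sketch of replacing a set of synonyms by one word, assuming a regular expression approach; the helper replace_words and its argument names are hypothetical and need not match the method's signature.

```r
# Hypothetical helper: replace every whole-word match of `words` by `by`.
replace_words <- function(x, words, by) {
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  gsub(pattern, by, x, perl = TRUE)
}

result <- replace_words("a car or an automobile",
                        words = c("car", "automobile"),
                        by = "vehicle")
print(result)
```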
501    
502    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
505    
506    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
507    
508            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
509            vectors. Thanks to Ariel Maguyon for his error report.
510            (removeSparseTerms): New function to remove columns from a
511            term-document matrix exceeding a sparse factor.
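
To illustrate removing columns that exceed a sparse factor, the sketch below works on a dense base R matrix and drops every column whose share of zero entries is above the threshold. The helper name, the dense representation, and the exact comparison (kept when sparsity is at most the factor) are assumptions for this example.

```r
# Document-term counts: rows are documents, columns are terms.
dtm <- matrix(c(1, 2, 1, 3,
                0, 1, 0, 2,
                0, 0, 0, 1),
              nrow = 4,
              dimnames = list(paste0("d", 1:4), c("common", "medium", "rare")))

# Hypothetical helper: drop columns whose share of zero entries exceeds `sparse`.
drop_sparse_terms <- function(m, sparse) {
  sparsity <- colMeans(m == 0)
  m[, sparsity <= sparse, drop = FALSE]
}

reduced <- drop_sparse_terms(dtm, sparse = 0.6)
print(colnames(reduced))
```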
512    
513    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
514    
515            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
516    
517    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
518    
519            * man/sFilter.Rd: Corrected documentation on statement format (use
520            '==' instead of '=').
521    
522    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
523    
524            * R/aobjects.R (StructuredTextDocument): Inherits from
525            TextDocument.
526    
527    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
528    
529            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
530            on sparse matrices as proposed by Martin Maechler.
531    
532    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
533    
534            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
535            \pkg{filehash} version deprecates them.
536    
537    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
538    
539            * R/termdocmatrix.R (textvector): Stemming is now performed before
540            erasing stopwords.
541            (weightMatrix): Adapted to handle sparse matrices.
542            (TermDocMatrix): Sparse matrix is now efficiently built by
543            direct stepwise insertion of row values into it.
544    
545    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
546    
547            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
548            due to ongoing problems. For our purposes the latter is as useful
549            as the replaced package.
550    
551    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
552    
553            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
554    
555            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
556    
557    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
558    
559            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
560            languages with available stopwords.
561    
562    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
563    
564            * inst/doc/tm.Rnw: Minor corrections in the vignette.
565    
566    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
567    
568            * DESCRIPTION: Update to version 0.2, since a lot of new features
569            have been integrated.
570    
571            * inst/stopwords: Updated existing stopwords and added stopwords
572            for various other languages.
573    
574    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
575    
576            * man/: Updated documentation.
577    
578            * Work/testDb.R: Script to test database stuff.
579    
580            * R/: Fixed various database related bugs. Seems to be rather
581            useable now, i.e., consider as alpha status for now.
582    
583    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
584    
585            * R/: Fixed some bugs related to database support.
586    
587    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
588    
589            * man/: Added a lot of examples to the manuals.
590    
591    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
592    
593            * man/: Updated parts of the documentation.
594    
595            * R/textdoccol.R (asPlain): Added conversion from newsgroup
596            documents to plain text documents.
597    
598    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
599    
600            * R/textdoccol.R: Finished experimental database support. Not yet
601            intensively tested.
602    
603            * R/source.R: Now each source has a default reader.
604    
605            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
606            class anymore.
607    
608            * R/plaintextdoc.R: Custom show method for plain text documents.
609    
610            * R/aobjects.R: Added a class for structured text documents.
611    
612            * R/reader.R: Replaced remaining \code{parser} occurrences with
613            \code{reader}.
614    
615            * R/textdoccol.R (summary): Indent tags.
616    
617            * R/textdoccol.R (removePunctuation): Transform method to remove
618            punctuation marks.
619    
620    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
621    
622            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
623            using prescindMeta().
624    
625    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
626    
627            * R/textdoccol.R: Improved database support.
628    
629    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
630    
631            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
632    
633            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
634            language code.
635    
636            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
637            into parserControl argument.
638    
639            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
640    
641    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
642    
643            * Work/tmDataSetup.R: The datasets acq and crude can now be
644            created on the fly.
645    
646            * R/stopwords.R: Introduced a function returning the stopwords for
647            a given language (English, German and French at the moment).
648    
649            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
650            otherwise falls back to Snowball package.
651    
652    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
653    
654            * man/dissimilarity-methods.Rd: Make clear that any method offered
655            by "dists" from package "cba" can be used.
656    
657    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
658    
659            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
660            to Kurt's latex suggestion. Removed points and underscores in
661            variable names for consistent naming.
662    
663            * DESCRIPTION: Update to version 0.1-2.
664    
665            * man/TextRepository.Rd: Fixed bug in documentation.
666    
667    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
668    
669            * DESCRIPTION: Update to version 0.1-1.
670    
671    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
672    
673            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
674            wordStem.
675    
676    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
677    
678            * R/: Changes due to Kurt's review.
679    
680    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
681    
682            * R/: Implemented improvements based upon comments by David
683            Meyer.
684    
685    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
686    
687            * inst/doc/: Rewrote vignette.
688    
689            * man/: Improved documentation.
690    
691    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
692    
693            * man/: Updated documentation.
694    
695            * DESCRIPTION: Changed package name to "tm". Updated version to
696            0.1 for first CRAN release.
697    
698            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
699            list archive example.
700    
701            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
702            archive example.
703    
704            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
705            from (several mails per box) mbox format to (single mail per file)
706            eml format.
707    
708    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
709    
710            * data/crude.rda: Rebuilt.
711    
712            * data/acq.rda: Rebuilt.
713    
714            * R/reader.R: Factored out reader and parser methods from
715            textdoccol.R.
716    
717            * R/source.R: Factored out Source methods from aobjects.R and
718            textdoccol.R.
719            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
720            feeds.
721    
722            * R/textdoccol.R (DirSource): Added support for recursive
723            traversal of directories.
724    
725    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
726    
727            * R/textdoccol.R ([[): Loads the document corpus automatically
728            into memory upon access.
729            (tm_transform, tm_filter): Removed several checks whether the
730            document is already loaded ([[ ensures this now).
731            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
732            mailing list archive.
733    
734    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
735    
736            * R/aobjects.R (TextDocument): Is now a virtual class.
737            (Source): Is now a virtual class.
738    
739    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
740    
741            * R/textdoccol.R (c): Support for an arbitrary number of document
742            collections.
743    
744    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
747            append_meta and remove_meta.
748    
749            * R/textdoccol.R: Removed modify_metadata method.
750    
751            * R/textrepo.R: Removed modify_metadata method.
752    
753            * R/textdoccol.R (remove_meta): Supports removal of document
754            collection metadata and document (= in data frame) metadata.
755    
756    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
757    
758            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
759    
760            * data/crude.rda: Rebuilt.
761    
762            * data/acq.rda: Rebuilt.
763    
764            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
765    
766            * R/textdoccol.R ([): Bug fix for subsetting a document
767            collection's data frame.
768    
769    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
770    
771            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
772            to s_filter.
773    
774            * R/textdoccol.R: Local text documents' metadata can now be copied
775            to a document collection's data frame with prescind_meta.
776    
777    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
778    
779            * R/: Text documents' slot metadata is now accessible in s_filter.
780    
781            * R/: Rewrote s_filter function (still has some restrictions).
782    
783    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
784    
785            * R/: Various fixes in handling metadata.
786    
787            * R/: Added update mechanism for text document collections.
788    
789    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
790    
791            * R/: Merging of document collections now creates a binary tree
792            for reconstructing merged document collections.
793    
794            * R/: Redesign of metadata for document collections.
795    
796    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
797    
798            * R/: Messages now use \code{ngettext}.
799    
800    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
801    
802            * R/: Added functions for modifying and removing metadata.
803    
804    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
805    
806            * man/: Updated some documentation.
807    
808            * R/: Corrected some connection issues.
809    
810            * inst/doc: Worked on the vignette.
811    
812    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * inst/: Added texts and started vignette.
815    
816            * R/: Final changes based upon David's comments.
817    
818    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
819    
820            * NAMESPACE: Corrected exports (generic methods need exportMethods
821            directives!).
822    
823    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * R/: Modified the TextDocCol constructor and various parsers. It
826            is now modular and supports various file formats via plugins (see
827            the new "Source" class).
828    
829    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
830    
831            * man/: Revised documentation after previous code changes.
832    
833    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
834    
835            * R/: Remaining changes as discussed with David.
836    
837    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
838    
839            * R/: Some changes as suggested by David. The rest will follow
840            within the next few days.
841    
842    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
843    
844            * man/: Finished documentation.
845    
846    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
847    
848            * man/: Wrote some documentation.
849    
850    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
851    
852            * R/: Further syntactic sugar in the form of additional assignment
853            and accessor methods.
854    
855    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
856    
857            * R/: Syntactic sugar in the form of "length", "show" and "summary"
858            operators.
859    
860    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
861    
862            * R/: Various updates, mainly to default operators ("[" or "c")
863            and dissimilarities.
864    
865    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
866    
867            * R/: Added similarity functions.
868    
869            * data/: Added English stopwords.
870    
871    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
872    
873            * data/: Examples compiled for new features.
874    
875            * R/: Changes due to new structure.
876    
877            * NAMESPACE: Corrected namespace to reflect new structure.
878    
879            * R/termdocmatrix.R: Adapted for new naming scheme.
880    
881    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
882    
883            * R/textdoccol.R: Adapted code for new class structure. Wrote
884            several transform and filter functions operating on text document
885            collections (alias text document databases).
886    
887            * R/aobjects.R: Adapted class structure with inheritance,
888            repositories and additional metadata. Loading files on demand is
889            now possible.
890    
891    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
892    
893            * R/: Some cosmetic cleanups.
894    
895            * inst/: Removed vignette on clustering. That and much more is now
896            described in the JSS paper on text mining. A more elaborate
897            vignette based upon that article will be incorporated in the future.
898    
899    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
900    
901            * R/: Updated generic S4 methods to comply with signature changes
902            in newer versions of R (> 2.3).
903    
904    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
905    
906            * ext/R/importRIS.R: Automatic RIS import is now possible.
907    
908    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
909    
910            * R/textdoccol.R: Added RIS HTML input format.
911    
912    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
913    
914            * R/textdoccol.R: Removed bug that caused invalid text document
915            collections when handling many input files.
916    
917    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
918    
919            * R/textdoccol.R: Restructured and extended file import
