[tm] Diff of /pkg/ChangeLog

trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC pkg/ChangeLog revision 1015, Sat Nov 7 11:15:19 2009 UTC
1    2009-11-07  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/matrix.R (TermDocumentMatrix): Avoid prefixes originating from
4            named documents.
5    
6    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
7    
8            * R/transform.R (removeWords): Improve regular expressions.
9    
10    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/meta.R (DublinCore): Allow lower case tags.
13    
14    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
17            instead of x$children.
18    
19    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
22    
23    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
24    
25            * R/: Use S3 instead of S4 class system.
26    
27    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
28    
29            * R/reader.R (readMail): Moved to tm.plugin.mail package.
30    
31    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
32    
33            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
34            postings are basically e-mails with some extra headers.
35    
36    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/transform.R: Move convertMboxEml, removeCitation,
39            removeMultipart, and removeSignature to the tm.plugin.mail package
40            since they are mainly utility functions (for handling e-mails) and
41            not very framework specific.
42    
43    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
44    
45            * man/: Fix documentation.
46    
47    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
48    
49            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
50            plain text document instead of an XML document for texts of the
51            Reuters-21578 dataset.
52    
53            * R/sparse.R: Removed since the slam package is now available on
54            CRAN.
55    
56            * DESCRIPTION (Depends): Add slam package.
57    
58    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
59    
60            * R/transform.R (stemDoc): Fix character(0) handling.
61    
62    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
63    
64            * R/doc.R (show): Pretty print.
65    
66    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
67    
68            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
69            gracefully.
70    
71    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
72    
73            * R/corpus.R: Make corpus virtual. Implement corpus with standard
74            and permanent storage semantics.
75    
76            * DESCRIPTION: New major release. A *lot* of improvements.
77    
78    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
79    
80            * NAMESPACE: Export some simple_triplet_matrix functions.
81    
82    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
83    
84            * R/weight.R: Adapt tf-idf to new matrix format.
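
A rough illustration of tf-idf weighting on a sparse triplet representation follows. It is a Python sketch, not tm's R code. The function name `tfidf_triplets`, the triplet layout (term index `i`, document index `j`, count `v`), and the particular formula tf * log2(N / df) are assumptions made for this example; the entry above does not state which variant the package uses.

```python
import math
from collections import Counter

def tfidf_triplets(i, j, v, n_docs):
    """Weight (term i, document j, count v) triplets by tf * log2(N / df).

    Assumes each (term, document) pair is stored at most once and that
    zero counts are not stored, so counting term indices gives the
    document frequency of each term.
    """
    df = Counter(i)
    return [count * math.log2(n_docs / df[term]) for term, count in zip(i, v)]

# Term 0 occurs in both documents, term 1 only in document 0.
weights = tfidf_triplets(i=[0, 0, 1], j=[0, 1, 0], v=[2, 1, 3], n_docs=2)
```

A term that occurs in every document gets weight zero under this variant, which is why only the last triplet keeps a non-zero weight here.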
85    
86    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/matrix.R: Create two distinct classes for term-document and
89            document-term matrices.
90    
91    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
92    
93            * R/termdocmatrix.R: No longer use Matrix package. This reduces
94            package start-up time significantly.
95    
96    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
97    
98            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
99    
100    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/transform.R (tmReduce): Combine multiple maps into one
103            transformation.
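
The idea of combining several maps into one transformation can be sketched as a left-to-right function composition. The snippet below is Python for illustration only and is not tm's implementation; `compose_maps` and the two sample maps are hypothetical names.

```python
from functools import reduce

def compose_maps(*maps):
    """Return a single function that applies the given maps left to right."""
    def combined(text):
        return reduce(lambda acc, f: f(acc), maps, text)
    return combined

# Hypothetical sample maps: collapse whitespace, then lower-case.
def strip_whitespace(s):
    return " ".join(s.split())

clean = compose_maps(strip_whitespace, str.lower)
result = clean("  Text   Mining  ")
```

Composing once and applying the combined function means each document is visited a single time instead of once per map.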
104    
105    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
106    
107            * R/weight.R: Remove weightLogical since it does not return a
108            dgCMatrix.
109    
110            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
111            or TermDocumentMatrix instead.
112    
113    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
114    
115            * inst/doc/extensions.Rnw: Finished vignette.
116    
117    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
118    
119            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
120            DocumentTermMatrix representations.
121    
122    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
123    
124            * R/reader.R (readXML): New reader for arbitrary XML files.
125    
126    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
127    
128            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
129            (XMLSource): New XMLSource class for arbitrary XML files.
130            (Source): New slot Vectorized.
131    
132    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
133    
134            * R/reader.R (readTabular): Experimental reader for tabular data
135            structures which can be customized via user-defined mappings.
136    
137            * R/reader.R: Always use UTC time zone.
138    
139            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
140    
141    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
142    
143            * R/reader.R (readDOC): Options can be passed over to antiword.
144    
145            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
146            pdftotext.
147    
148    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
149    
150            * R/source.R (DirSource): Add pattern and ignore.case arguments
151            which are internally passed over to list.files().
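
As a sketch of what filtering a directory listing by a regular expression with optional case-insensitivity looks like, consider the Python snippet below. It is not the R code of DirSource or list.files(); the helper name `list_matching` and its parameters are assumptions chosen to mirror the two arguments named above.

```python
import os
import re

def list_matching(directory, pattern=None, ignore_case=False):
    """List file names in `directory`, optionally filtered by a regex."""
    names = sorted(os.listdir(directory))
    if pattern is None:
        return names
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # Keep a name if the pattern matches anywhere in it.
    return [name for name in names if regex.search(name)]
```

For example, `list_matching("texts", r"\.txt$", ignore_case=True)` would keep both `a.TXT` and `b.txt` in a hypothetical `texts` directory.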
152    
153    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
154    
155            * inst/doc/tm.Rnw: Suppress pointless loading message.
156    
157    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
158    
159            * DESCRIPTION: Speed up package loading (via moving packages not
160            strictly necessary for normal operation to Suggests instead of
161            Depends).
162    
163    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
164    
165            * R/reader.R (readNewsgroup): The date format is now configurable.
166    
167    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
168    
169            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
170    
171    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
172    
173            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
174    
175    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
176    
177            * R/source.R (DataframeSource): New source class for data frames.
178    
179            * R/source.R: Fixed non-standard call evaluation.
180    
181    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
182    
183            * R/source.R (URISource): New source class for a single document.
184    
185    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
186    
187            * R/source.R: Refactoring.
188    
189    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
192            Rmpi installations more gracefully.
193    
194    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
195    
196            * R/source.R (Source): Add Length slot.
197    
198    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
199    
200            * R/AAA.R: Unify duplicated .onLoad function.
201    
202    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
203    
204            * DESCRIPTION (Suggests): Added Rmpi.
205    
206    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
207    
208            * R/source.R (getElem): Fix 'no visible binding' warning.
209    
210            * man/WeightFunction.Rd: Fix signature.
211    
212    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
213    
214            * R/weight.R: Introduce name abbreviations for weighting functions.
215    
216    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
217    
218            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
219    
220            * R/cluster.R: Provide convenience functions for using an MPI
221            cluster.
222    
223            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
224            available.
225    
226            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
227            available.
228    
229    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
230    
231            * R/textdoccol.R (lapply): Removed debug print out.
232    
233    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
234    
235            * R/reader.R (readRCV1): Improved meta data extraction from
236            Reuters Corpus Volume 1 documents.
237    
238    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
239    
240            * R/transform.R: Ensure that all mappings preserve multiline
241            structures.
242    
243    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
244    
245            * R/filter.R: Every filter now has an attribute indicating whether
246            it should be applied at document level (doclevel).
247    
248            * R/textdoccol.R (tmFilter): Set searchFullText as new default
249            filter.
250    
251    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
252    
253            * R/transform.R (replacePatterns): Replaced removeWords by
254            replacePatterns. Suggested by Christian Buchta.
255    
256            * R/textdoccol.R (inspect): Improved formatting.
257    
258    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
259    
260            * inst/CITATION: Updated JSS article information.
261    
262            * R/textdoccol.R (setAs): Added coerce method from list to
263            corpus.
264    
265            * R/meta.R (meta): Improved meta data handling.
266    
267    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
268    
269            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
270            Christian Buchta.
271    
272            * inst/CITATION: Added template to include JSS article reference.
273    
274    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
275    
276            * R/textdoccol.R (tmMap): Introduced lazy mapping.
277    
278            * R/source.R: Added VectorSource.
279    
280    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
281    
282            * man/: Language codes should be in ISO 639-1 format.
283    
284            * R/textdoccol.R (asPlain): Preserve local meta data.
285    
286    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
287    
288            * R/textdoccol.R (writeCorpus): Function for writing a corpus
289            containing plain text documents to disk.
290    
291    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
292    
293            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
294            always set correctly.
295    
296            * R/textdoccol.R: Set load = TRUE as default for load on demand
297            since in most cases this is the wanted behaviour.
298    
299    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
300    
301            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
302    
303            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
304    
305    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
306    
307            * R/meta.R (meta): New function for consistent access to meta data
308            of document collections, repositories, and texts.
309    
310    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
311    
312            * R/: Better support for encodings.
313    
314    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
317            selection when no reader argument is given.
318    
319    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/source.R (CSVSource): Now uses read.csv instead of scan
322            internally.
323    
324    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
325    
326            * R/reader.R (getReaders): Returns available reader functions.
327    
328            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
329            as default.
330    
331    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
332    
333            * R/stopwords.R (stopwords): Shortened code, removed codetools
334            variable warnings.
335    
336            * man/: Documentation for showMeta, added an example for tmMap.
337    
338            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
339            some minor typos fixed.
340    
341    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
342    
343            * R/aobjects.R (showMeta): Added method for pretty printing a
344            text document's meta data.
345    
346    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
347    
348            * R/textdoccol.R (TextDocCol): Better handling of empty
349            arguments.
350    
351            * NAMESPACE: Exported readDOC.
352    
353            * man/completeStems.Rd: Added an example.
354    
355    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/stopwords.R (stopwords): Look up .dat files at every
358            call. Allows users to modify stopword .dat files interactively.
359    
360    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * R/termdocmatrix.R (termFreq): Correct processing of empty
363            documents.
364    
365    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
366    
367            * man/: Updated documentation.
368    
369    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * R/complete.R (completeStems): Completes (heuristically) word
372            stems.
373    
374            * R/termdocmatrix.R (TermDocMatrix2): New modular
375            constructor.
376    
377            * NAMESPACE: Exported termFreq.
378    
379    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
380    
381            * R/reader.R (readDOC): Added MS Word reader (using antiword).
382    
383    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
384    
385            * R/weight.R: Weighting functions for TermDocMatrix.
386    
387    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
390            functions for accessing dimension, column, and row names.
391    
392            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
393    
394    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
395    
396            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
397    
398    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
399    
400            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
401    
402    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/reader.R (readPDF): Removed manual checks for pdftotext and
405            pdfinfo. The system call gives a warning anyway.
406    
407    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/textdoccol.R (asPlain): Conversion from
410            StructuredTextDocuments to PlainTextDocuments.
411    
412    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
413    
414            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
415            for accessing term-document matrices.
416    
417            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
418            are installed.
419    
420    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
423            Christian Buchta.
424    
425    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
426    
427            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
428    
429    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
430    
431            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
432    
433            * R/reader.R (readPDF): Added PDF reader.
434    
435    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
436    
437            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
438    
439            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
440    
441            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
442    
443            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
444    
445    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * R/distmeasure.R (dissimilarity): Replaced dists call from
448            package cba by new dist call from package proxy.
449    
450    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
451    
452            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
453    
454    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/termdocmatrix.R: require() uses the quietly option to suppress
457            loading messages.
458    
459    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
460    
461            * R/dictionary.R: Added dictionary support.
462    
463    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
466            documents. This simplifies some functions, e.g., asPlain.
467    
468    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * inst/doc/tm.Rnw: Fixed some typos in vignette.
471    
472    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * R/textdoccol.R (replaceWords): Added method to replace a set of
475            words by a single word. Useful for synonyms.
476    
477    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
480    
481    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
484            vectors. Thanks to Ariel Maguyon for his error report.
485            (removeSparseTerms): New function to remove columns from a
486            term-document matrix exceeding a sparse factor.
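
Removing columns that exceed a sparsity threshold can be illustrated as follows. This is a Python sketch on a small dense matrix, not the package's implementation; the name `remove_sparse_columns`, the interpretation of the sparse factor as the share of zero entries per column, and the "less than or equal" comparison are assumptions for this example.

```python
def remove_sparse_columns(matrix, sparse):
    """Drop columns whose share of zero entries exceeds `sparse` (0 to 1)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    keep = [
        c for c in range(n_cols)
        if sum(1 for row in matrix if row[c] == 0) / n_rows <= sparse
    ]
    return [[row[c] for c in keep] for row in matrix]

counts = [
    [1, 0, 2],
    [3, 0, 0],
    [1, 1, 0],
]
# Columns 1 and 2 are two-thirds zeros, so only column 0 survives at 0.5.
reduced = remove_sparse_columns(counts, 0.5)
```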
487    
488    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
489    
490            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
491    
492    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
493    
494            * man/sFilter.Rd: Corrected documentation on statement format (use
495            '==' instead of '=').
496    
497    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/aobjects.R (StructuredTextDocument): Inherits from
500            TextDocument.
501    
502    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
505            on sparse matrices as proposed by Martin Maechler.
506    
507    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
508    
509            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
510            \pkg{filehash} version deprecates them.
511    
512    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
513    
514            * R/termdocmatrix.R (textvector): Stemming is now performed before
515            erasing stopwords.
516            (weightMatrix): Adapted to handle sparse matrices.
517            (TermDocMatrix): Sparse matrix is now efficiently built by
518            direct stepwise insertion of row values into it.
519    
520    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
521    
522            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
523            due to ongoing problems. For our purposes the latter is as useful
524            as the replaced package.
525    
526    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
527    
528            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
529    
530            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
531    
532    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
533    
534            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
535            languages with available stopwords.
536    
537    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
538    
539            * inst/doc/tm.Rnw: Minor corrections in the vignette.
540    
541    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
542    
543            * DESCRIPTION: Update to version 0.2, since a lot of new features
544            have been integrated.
545    
546            * inst/stopwords: Updated existing stopwords and added stopwords
547            for various other languages.
548    
549    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
550    
551            * man/: Updated documentation.
552    
553            * Work/testDb.R: Script to test database stuff.
554    
555            * R/: Fixed various database related bugs. Seems to be rather
556            useable now, i.e., consider as alpha status for now.
557    
558    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
559    
560            * R/: Fixed some bugs related to database support.
561    
562    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
563    
564            * man/: Added a lot of examples to the manuals.
565    
566    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
567    
568            * man/: Updated parts of the documentation.
569    
570            * R/textdoccol.R (asPlain): Added conversion from newsgroup
571            documents to plain text documents.
572    
573    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
574    
575            * R/textdoccol.R: Finished experimental database support. Not yet
576            intensively tested.
577    
578            * R/source.R: Now each source has a default reader.
579    
580            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
581            class anymore.
582    
583            * R/plaintextdoc.R: Custom show method for plain text documents.
584    
585            * R/aobjects.R: Added a class for structured text documents.
586    
587            * R/reader.R: Replaced remaining \code{parser} occurrences with
588            \code{reader}.
589    
590            * R/textdoccol.R (summary): Indent tags.
591    
592            * R/textdoccol.R (removePunctuation): Transform method to remove
593            punctuation marks.
594    
595    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
596    
597            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
598            using prescindMeta().
599    
600    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
601    
602            * R/textdoccol.R: Improved database support.
603    
604    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
605    
606            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
607    
608            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
609            language code.
610    
611            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
612            into parserControl argument.
613    
614            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
615    
616    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
617    
618            * Work/tmDataSetup.R: The datasets acq and crude can now be
619            created on the fly.
620    
621            * R/stopwords.R: Introduced a function returning the stopwords for
622            a given language (English, German and French at the moment).
623    
624            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
625            otherwise falls back to Snowball package.
626    
627    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
628    
629            * man/dissimilarity-methods.Rd: Make clear that any method offered
630            by "dists" from package "cba" can be used.
631    
632    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
633    
634            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
635            to Kurt's latex suggestion. Removed points and underscores in
636            variable names for consistent naming.
637    
638            * DESCRIPTION: Update to version 0.1-2.
639    
640            * man/TextRepository.Rd: Fixed bug in documentation.
641    
642    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
643    
644            * DESCRIPTION: Update to version 0.1-1.
645    
646    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
647    
648            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
649            wordStem.
650    
651    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
652    
653            * R/: Changes due to Kurt's review.
654    
655    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
656    
657            * R/: Implemented improvements based upon comments by David
658            Meyer.
659    
660    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
661    
662            * inst/doc/: Rewrote vignette.
663    
664            * man/: Improved documentation.
665    
666    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
667    
668            * man/: Updated documentation.
669    
670            * DESCRIPTION: Changed package name to "tm". Updated version to
671            0.1 for first CRAN release.
672    
673            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
674            list archive example.
675    
676            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
677            archive example.
678    
679            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
680            from (several mails per box) mbox format to (single mail per file)
681            eml format.
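
Splitting an mbox file (several mails per box) into one eml file per mail can be sketched with Python's standard `mailbox` module. This is an illustration of the conversion idea, not the R converter described above; the function name `mbox_to_eml` and the numbered output file names are assumptions.

```python
import mailbox
from pathlib import Path

def mbox_to_eml(mbox_path, out_dir):
    """Write each message of an mbox file to its own numbered .eml file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(mbox_path)
    try:
        # Number the output files from 1 in mailbox order.
        for n, message in enumerate(box, start=1):
            target = out / f"{n}.eml"
            target.write_bytes(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

The returned list of paths makes it easy to check that the number of eml files equals the number of mails in the box.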
682    
683    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * data/crude.rda: Rebuilt.
686    
687            * data/acq.rda: Rebuilt.
688    
689            * R/reader.R: Factored out reader and parser methods from
690            textdoccol.R.
691    
692            * R/source.R: Factored out Source methods from aobjects.R and
693            textdoccol.R.
694            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
695            feeds.
696    
697            * R/textdoccol.R (DirSource): Added support for recursive
698            traversal of directories.
699    
700    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
701    
702            * R/textdoccol.R ([[): Loads the document corpus automatically
703            into memory upon access.
704            (tm_transform, tm_filter): Removed several checks whether the
705            document is already loaded ([[ ensures this now).
706            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
707            mailing list archive.
708    
709    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
710    
711            * R/aobjects.R (TextDocument): Is now a virtual class.
712            (Source): Is now a virtual class.
713    
714    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
715    
716            * R/textdoccol.R (c): Support for an arbitrary number of document
717            collections.
718    
719    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
720    
721            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
722            append_meta and remove_meta.
723    
724            * R/textdoccol.R: Removed modify_metadata method.
725    
726            * R/textrepo.R: Removed modify_metadata method.
727    
728            * R/textdoccol.R (remove_meta): Supports removal of document
729            collection metadata and document (= in data frame) metadata.
730    
731    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
732    
733            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
734    
735            * data/crude.rda: Rebuilt.
736    
737            * data/acq.rda: Rebuilt.
738    
739            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
740    
741            * R/textdoccol.R ([): Bug fix for subsetting a document
742            collection's data frame.
743    
744    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
747            to s_filter.
748    
749            * R/textdoccol.R: Local text documents' metadata can now be copied
750            to a document collection's data frame with prescind_meta.
751    
752    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
753    
754            * R/: Text documents' slot metadata is now accessible in s_filter.
755    
756            * R/: Rewrote s_filter function (has still some restrictions).
757    
758    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
759    
760            * R/: Various fixes in handling metadata.
761    
762            * R/: Added update mechanism for text document collections.
763    
764    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
765    
766            * R/: Merging of document collections now creates a binary tree
767            for reconstructing merged document collections.
768    
769            * R/: Redesign of metadata for document collections.
770    
771    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
772    
773            * R/: Messages now use \code{ngettext}.
774    
775    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
776    
777            * R/: Added functions for modifying and removing metadata.
778    
779    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
780    
781            * man/: Updated some documentation.
782    
783            * R/: Corrected some connection issues.
784    
785            * inst/doc: Worked on the vignette.
786    
787    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
788    
789            * inst/: Added texts and started vignette.
790    
791            * R/: Final changes based upon David's comments.
792    
793    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
794    
795            * NAMESPACE: Corrected exports (generic methods need exportMethods
796            directives!).
797    
798    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
799    
800            * R/: Modified the TextDocCol constructor and various parsers. It
801            is now modular and supports various file formats via plugins (see
802            the new "Source" class).
803    
804    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
805    
806            * man/: Revised documentation after previous code changes.
807    
808    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
809    
810            * R/: Remaining changes as discussed with David.
811    
812    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * R/: Some changes as suggested by David. The rest will follow
815            within the next days.
816    
817    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
818    
819            * man/: Finished documentation.
820    
821    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
822    
823            * man/: Wrote some documentation.
824    
825    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
826    
827            * R/: Further syntactic sugar in form of additional assignment and
828            accessor methods.
829    
830    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
831    
832            * R/: Syntactic sugar in form of "length", "show" and "summary"
833            operators.
834    
835    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * R/: Diverse updates. Mainly on default operators ("[" or "c")
838            and dissimilarities.
839    
840    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
841    
842            * R/: Added similarity functions.
843    
844            * data/: Added English stopwords.
845    
846    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
847    
848            * data/: Examples compiled for new features.
849    
850            * R/: Changes due to new structure.
851    
852            * NAMESPACE: Corrected namespace to reflect new structure.
853    
854            * R/termdocmatrix.R: Adapted for new naming scheme.
855    
856    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
857    
858            * R/textdoccol.R: Adapted code for new class structure. Wrote
859            several transform and filter functions operating on text document
860            collections (alias text document databases).
861    
862            * R/aobjects.R: Adapted class structure with inheritance,
863            repositories and additional meta data. Loading files on demand is
864            now possible.
865    
866    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
867    
868            * R/: Some cosmetic cleanups.
869    
870            * inst/: Removed vignette on clustering. That and much more is now
871            described in the JSS paper on text mining. Based upon that
872            article an elaborated vignette will be incorporated in the future.
873    
874    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * R/: Updated generic S4 methods to comply with signature changes
877            in newer versions of R (> 2.3).
878    
879    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
880    
881            * ext/R/importRIS.R: Automatic RIS import is now possible.
882    
883    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
884    
885            * R/textdoccol.R: Added RIS HTML input format.
886    
887    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/textdoccol.R: Removed bug that caused invalid text document
890            collections when handling many input files.
891    
892    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
893    
894            * R/textdoccol.R: Restructured and extended file import
