[tm] Diff of /pkg/ChangeLog

Old: trunk/R/trunk/ChangeLog revision 37, Wed Jan 11 17:49:17 2006 UTC
New: pkg/ChangeLog revision 1013, Wed Oct 21 12:34:39 2009 UTC
1    2009-10-21  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/transform.R (removeWords): Improve regular expressions.
4    
5    2009-10-19  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/meta.R (DublinCore): Allow lower case tags.
8    
9    2009-10-09  Ingo Feinerer  <feinerer@logic.at>
10    
11            * R/source.R (GmaneSource, ReutersSource): Use xmlChildren(x)
12            instead of x$children.
13    
14    2009-09-15  Ingo Feinerer  <feinerer@logic.at>
15    
16            * R/preprocess.R (preprocessReut21578XML): Fix generated file names.
17    
18    2009-09-06  Ingo Feinerer  <feinerer@logic.at>
19    
20            * R/: Use S3 instead of S4 class system.
21    
22    2009-08-11  Ingo Feinerer  <feinerer@logic.at>
23    
24            * R/reader.R (readMail): Moved to tm.plugin.mail package.
25    
26    2009-07-04  Ingo Feinerer  <feinerer@logic.at>
27    
28            * R/reader.R (readNewsgroup): Rename to readMail as newsgroup
29            postings are basically e-mails with some extra headers.
30    
31    2009-07-03  Ingo Feinerer  <feinerer@logic.at>
32    
33            * R/transform.R: Move convertMboxEml, removeCitation,
34            removeMultipart, and removeSignature to the tm.plugin.mail package
35            since they are mainly utility functions (for handling e-mails) and
36            not very framework specific.
37    
38    2009-06-28  Ingo Feinerer  <feinerer@logic.at>
39    
40            * man/: Fix documentation.
41    
42    2009-06-26  Ingo Feinerer  <feinerer@logic.at>
43    
44            * R/reader.R (readReut21578XMLasPlain): New reader which returns a
45            plain text document instead of an XML document for texts of the
46            Reuters-21578 dataset.
47    
48            * R/sparse.R: Removed since the slam package is now available on
49            CRAN.
50    
51            * DESCRIPTION (Depends): Add slam package.
52    
53    2009-06-17  Ingo Feinerer  <feinerer@logic.at>
54    
55            * R/transform.R (stemDoc): Fix character(0) handling.
56    
57    2009-06-12  Ingo Feinerer  <feinerer@logic.at>
58    
59            * R/doc.R (show): Pretty print.
60    
61    2009-05-27  Ingo Feinerer  <feinerer@logic.at>
62    
63            * R/matrix.R (print.TermDocumentMatrix): Handle empty matrices
64            gracefully.
65    
66    2009-05-13  Ingo Feinerer  <feinerer@logic.at>
67    
68            * R/corpus.R: Make corpus virtual. Implement corpus with standard
69            and permanent storage semantics.
70    
71            * DESCRIPTION: New major release. A *lot* of improvements.
72    
73    2009-05-04  Ingo Feinerer  <feinerer@logic.at>
74    
75            * NAMESPACE: Export some simple_triplet_matrix functions.
76    
77    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
78    
79            * R/weight.R: Adapt tf-idf to new matrix format.
80    
81    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
82    
83            * R/matrix.R: Create two distinct classes for term-document and
84            document-term matrices.
85    
86    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/termdocmatrix.R: No longer use Matrix package. This reduces
89            package start-up time significantly.
90    
91    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
92    
93            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
94    
95    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
96    
97            * R/transform.R (tmReduce): Combine multiple maps into one
98            transformation.
99    
100    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/weight.R: Remove weightLogical since it does not return a
103            dgCMatrix.
104    
105            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
106            or TermDocumentMatrix instead.
107    
108    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
109    
110            * inst/doc/extensions.Rnw: Finished vignette.
111    
112    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
113    
114            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
115            DocumentTermMatrix representations.
116    
117    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
118    
119            * R/reader.R (readXML): New reader for arbitrary XML files.
120    
121    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
122    
123            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
124            (XMLSource): New XMLSource class for arbitrary XML files.
125            (Source): New slot Vectorized.
126    
127    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
128    
129            * R/reader.R (readTabular): Experimental reader for tabular data
130            structures which can be customized via user-defined mappings.
131    
132            * R/reader.R: Always use UTC time zone.
133    
134            * R/AAA.R (.onLoad): No longer try to start a MPI cluster.
135    
136    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
137    
138            * R/reader.R (readDOC): Options can be passed over to antiword.
139    
140            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
141            pdftotext.
142    
143    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
144    
145            * R/source.R (DirSource): Add pattern and ignore.case arguments
146            which are internally passed over to list.files().
147    
148    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
149    
150            * inst/doc/tm.Rnw: Suppress pointless loading message.
151    
152    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
153    
154            * DESCRIPTION: Speed up package loading (via moving packages not
155            strictly necessary for normal operation to Suggests instead of
156            Depends).
157    
158    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
159    
160            * R/reader.R (readNewsgroup): The date format is now configurable.
161    
162    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
163    
164            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
165    
166    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
167    
168            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
169    
170    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
171    
172            * R/source.R (DataframeSource): New source class for data frames.
173    
174            * R/source.R: Fixed non-standard call evaluation.
175    
176    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
177    
178            * R/source.R (URISource): New source class for a single document.
179    
180    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
181    
182            * R/source.R: Refactoring.
183    
184    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
185    
186            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
187            Rmpi installations more gracefully.
188    
189    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
190    
191            * R/source.R (Source): Add Length slot.
192    
193    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
194    
195            * R/AAA.R: Unify duplicated .onLoad function.
196    
197    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
198    
199            * DESCRIPTION (Suggests): Added Rmpi.
200    
201    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
202    
203            * R/source.R (getElem): Fix 'no visible binding' warning.
204    
205            * man/WeightFunction.Rd: Fix signature.
206    
207    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
208    
209            * R/weight.R: Introduce name abbreviations for weighting functions.
210    
211    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
212    
213            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
214    
215            * R/cluster.R: Provide convenience functions for using a MPI
216            cluster.
217    
218            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
219            available.
220    
221            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
222            available.
223    
224    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
225    
226            * R/textdoccol.R (lapply): Removed debug print out.
227    
228    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
229    
230            * R/reader.R (readRCV1): Improved meta data extraction from
231            Reuters Corpus Volume 1 documents.
232    
233    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
234    
235            * R/transform.R: Ensure that all mappings preserve multiline
236            structures.
237    
238    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
239    
240            * R/filter.R: Every filter now has an attribute indicating whether
241            it should be applied at the document level (doclevel).
242    
243            * R/textdoccol.R (tmFilter): Set searchFullText as new default
244            filter.
245    
246    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
247    
248            * R/transform.R (replacePatterns): Replaced removeWords by
249            replacePatterns. Suggested by Christian Buchta.
250    
251            * R/textdoccol.R (inspect): Improved formatting.
252    
253    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
254    
255            * inst/CITATION: Updated JSS article information.
256    
257            * R/textdoccol.R (setAs): Added coerce method from list to
258            corpus.
259    
260            * R/meta.R (meta): Improved meta data handling.
261    
262    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
263    
264            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
265            Christian Buchta.
266    
267            * inst/CITATION: Added template to include JSS article reference.
268    
269    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
270    
271            * R/textdoccol.R (tmMap): Introduced lazy mapping.
272    
273            * R/source.R: Added VectorSource.
274    
275    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
276    
277            * man/: Language codes should be in ISO 639-1 format.
278    
279            * R/textdoccol.R (asPlain): Preserve local meta data.
280    
281    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
282    
283            * R/textdoccol.R (writeCorpus): Function for writing a corpus
284            containing plain text documents to disk.
285    
286    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
287    
288            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
289            always set correctly.
290    
291            * R/textdoccol.R: Set load = TRUE as default for load on demand
292            since in most cases this is the wanted behaviour.
293    
294    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
295    
296            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
297    
298            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
299    
300    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
301    
302            * R/meta.R (meta): New function for consistent access to meta data
303            of document collections, repositories, and texts.
304    
305    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
306    
307            * R/: Better support for encodings.
308    
309    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
310    
311            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
312            selection when no reader argument is given.
313    
314    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
315    
316            * R/source.R (CSVSource): Now uses read.csv instead of scan
317            internally.
318    
319    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
320    
321            * R/reader.R (getReaders): Returns available reader functions.
322    
323            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
324            as default.
325    
326    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * R/stopwords.R (stopwords): Shortened code, removed codetools
329            variable warnings.
330    
331            * man/: Documentation for showMeta, added an example for tmMap.
332    
333            * inst/doc/tm.Rnw: Updated vignette, comments on MS Word reader,
334            some minor typos fixed.
335    
336    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
337    
338            * R/aobjects.R (showMeta): Added method for pretty printing a
339            text document's meta data.
340    
341    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
342    
343            * R/textdoccol.R (TextDocCol): Better handling of empty
344            arguments.
345    
346            * NAMESPACE: Exported readDOC.
347    
348            * man/completeStems.Rd: Added an example.
349    
350    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
351    
352            * R/stopwords.R (stopwords): Look up .dat files at every
353            call. Allows users to modify stopword .dat files interactively.
354    
355    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
356    
357            * R/termdocmatrix.R (termFreq): Correct processing of empty
358            documents.
359    
360    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
361    
362            * man/: Updated documentation.
363    
364    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * R/complete.R (completeStems): Completes (heuristically) word
367            stems.
368    
369            * R/termdocmatrix.R (TermDocMatrix2): New modular
370            constructor.
371    
372            * NAMESPACE: Exported termFreq.
373    
374    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
375    
376            * R/reader.R (readDOC): Added MS Word reader (using antiword).
377    
378    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/weight.R: Weighting functions for TermDocMatrix.
381    
382    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
385            functions for accessing dimension, column, and row names.
386    
387            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
388    
389    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
390    
391            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
392    
393    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
394    
395            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
396    
397    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
398    
399            * R/reader.R (readPDF): Removed manual checks for pdftotext and
400            pdfinfo. The system call gives a warning anyway.
401    
402    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
403    
404            * R/textdoccol.R (asPlain): Conversion from
405            StructuredTextDocuments to PlainTextDocuments.
406    
407    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
410            for accessing term-document matrices.
411    
412            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
413            are installed.
414    
415    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
416    
417            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
418            Christian Buchta.
419    
420    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
421    
422            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
423    
424    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
425    
426            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
427    
428            * R/reader.R (readPDF): Added PDF reader.
429    
430    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
431    
432            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
433    
434            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
435    
436            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
437    
438            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
439    
440    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
441    
442            * R/distmeasure.R (dissimilarity): Replaced dists call from
443            package cba by new dist call from package proxy.
444    
445    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
448    
449    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
450    
451            * R/termdocmatrix.R: require() uses the quietly option to suppress
452            loading messages.
453    
454    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
455    
456            * R/dictionary.R: Added dictionary support.
457    
458    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
459    
460            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
461            documents. This simplifies some functions, e.g., asPlain.
462    
463    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
464    
465            * inst/doc/tm.Rnw: Fixed some typos in vignette.
466    
467    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
468    
469            * R/textdoccol.R (replaceWords): Added method to replace a set of
470            words by a single word. Useful for synonyms.
471    
472    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
473    
474            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
475    
476    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
477    
478            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
479            vectors. Thanks to Ariel Maguyon for his error report.
480            (removeSparseTerms): New function to remove columns from a
481            term-document matrix exceeding a sparse factor.
482    
483    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
484    
485            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
486    
487    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
488    
489            * man/sFilter.Rd: Corrected documentation on statement format (use
490            '==' instead of '=').
491    
492    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
493    
494            * R/aobjects.R (StructuredTextDocument): Inherits from
495            TextDocument.
496    
497    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
498    
499            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
500            on sparse matrices as proposed by Martin Maechler.
501    
502    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
503    
504            * R/textdoccol.R: Removed \code{dbDisconnect} calls since the latest
505            \pkg{filehash} version deprecates them.
506    
507    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
508    
509            * R/termdocmatrix.R (textvector): Stemming is now performed before
510            erasing stopwords.
511            (weightMatrix): Adapted to handle sparse matrices.
512            (TermDocMatrix): Sparse matrix is now efficiently built by
513            direct stepwise insertion of row values into it.
514    
515    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
516    
517            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
518            due to ongoing problems. For our purposes the latter is as useful
519            as the replaced package.
520    
521    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
522    
523            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
524    
525            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
526    
527    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
528    
529            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
530            languages with available stopwords.
531    
532    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
533    
534            * inst/doc/tm.Rnw: Minor corrections in the vignette.
535    
536    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
537    
538            * DESCRIPTION: Update to version 0.2, since a lot of new features
539            have been integrated.
540    
541            * inst/stopwords: Updated existing stopwords and added stopwords
542            for various other languages.
543    
544    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
545    
546            * man/: Updated documentation.
547    
548            * Work/testDb.R: Script to test database stuff.
549    
550            * R/: Fixed various database related bugs. Seems to be rather
551            useable now, i.e., consider as alpha status for now.
552    
553    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
554    
555            * R/: Fixed some bugs related to database support.
556    
557    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
558    
559            * man/: Added a lot of examples to the manuals.
560    
561    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
562    
563            * man/: Updated parts of the documentation.
564    
565            * R/textdoccol.R (asPlain): Added conversion from newsgroup
566            documents to plain text documents.
567    
568    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
569    
570            * R/textdoccol.R: Finished experimental database support. Not yet
571            intensively tested.
572    
573            * R/source.R: Now each source has a default reader.
574    
575            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
576            class anymore.
577    
578            * R/plaintextdoc.R: Custom show method for plain text documents.
579    
580            * R/aobjects.R: Added a class for structured text documents.
581    
582            * R/reader.R: Replaced remaining \code{parser} occurrences with
583            \code{reader}.
584    
585            * R/textdoccol.R (summary): Indent tags.
586    
587            * R/textdoccol.R (removePunctuation): Transform method to remove
588            punctuation marks.
589    
590    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
591    
592            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
593            using prescindMeta().
594    
595    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
596    
597            * R/textdoccol.R: Improved database support.
598    
599    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
600    
601            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
602    
603            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
604            language code.
605    
606            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
607            into parserControl argument.
608    
609            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
610    
611    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
612    
613            * Work/tmDataSetup.R: The datasets acq and crude can now be
614            created on the fly.
615    
616            * R/stopwords.R: Introduced a function returning the stopwords for
617            a given language (English, German and French at the moment).
618    
619            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
620            otherwise falls back to Snowball package.
621    
622    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
623    
624            * man/dissimilarity-methods.Rd: Make clear that any method offered
625            by "dists" from package "cba" can be used.
626    
627    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
628    
629            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
630            to Kurt's LaTeX suggestion. Removed points and underscores in
631            variable names for consistent naming.
632    
633            * DESCRIPTION: Update to version 0.1-2.
634    
635            * man/TextRepository.Rd: Fixed bug in documentation.
636    
637    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
638    
639            * DESCRIPTION: Update to version 0.1-1.
640    
641    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
642    
643            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
644            wordStem.
645    
646    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
647    
648            * R/: Changes due to Kurt's review.
649    
650    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
651    
652            * R/: Implemented improvements based upon comments by David
653            Meyer.
654    
655    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
656    
657            * inst/doc/: Rewrote vignette.
658    
659            * man/: Improved documentation.
660    
661    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
662    
663            * man/: Updated documentation.
664    
665            * DESCRIPTION: Changed package name to "tm". Updated version to
666            0.1 for first CRAN release.
667    
668            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
669            list archive example.
670    
671            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
672            archive example.
673    
674            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
675            from (several mails per box) mbox format to (single mail per file)
676            eml format.
677    
678    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
679    
680            * data/crude.rda: Rebuilt.
681    
682            * data/acq.rda: Rebuilt.
683    
684            * R/reader.R: Factored out reader and parser methods from
685            textdoccol.R.
686    
687            * R/source.R: Factored out Source methods from aobjects.R and
688            textdoccol.R.
689            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
690            feeds.
691    
692            * R/textdoccol.R (DirSource): Added support for recursive
693            traversal of directories.
694    
695    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
696    
697            * R/textdoccol.R ([[): Loads the document corpus automatically
698            into memory upon access.
699            (tm_transform, tm_filter): Removed several checks whether the
700            document is already loaded ([[ ensures this now).
701            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
702            mailing list archive.
703    
704    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
705    
706            * R/aobjects.R (TextDocument): Is now a virtual class.
707            (Source): Is now a virtual class.
708    
709    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
710    
711            * R/textdoccol.R (c): Support for an arbitrary number of document
712            collections.
713    
714    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
715    
716            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
717            append_meta and remove_meta.
718    
719            * R/textdoccol.R: Removed modify_metadata method.
720    
721            * R/textrepo.R: Removed modify_metadata method.
722    
723            * R/textdoccol.R (remove_meta): Supports removal of document
724            collection metadata and document (= in data frame) metadata.
725    
726    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
727    
728            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
729    
730            * data/crude.rda: Rebuilt.
731    
732            * data/acq.rda: Rebuilt.
733    
734            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
735    
736            * R/textdoccol.R ([): Bug fix for subsetting a document
737            collection's data frame.
738    
739    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
740    
741            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
742            to s_filter.
743    
744            * R/textdoccol.R: Local text documents' metadata can now be copied
745            to a document collection's data frame with prescind_meta.
746    
747    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
748    
749            * R/: Text documents' slot metadata is now accessible in s_filter.
750    
751            * R/: Rewrote s_filter function (has still some restrictions).
752    
753    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
754    
755            * R/: Various fixes in handling metadata.
756    
757            * R/: Added update mechanism for text document collections.
758    
759    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
760    
761            * R/: Merging of document collections now creates a binary tree
762            for reconstructing merged document collections.
763    
764            * R/: Redesign of metadata for document collections.
765    
766    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
767    
768            * R/: Messages now use \code{ngettext}.
769    
770    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
771    
772            * R/: Added functions for modifying and removing metadata.
773    
774    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
775    
776            * man/: Updated some documentation.
777    
778            * R/: Corrected some connection issues.
779    
780            * inst/doc: Worked on the vignette.
781    
782    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
783    
784            * inst/: Added texts and started vignette.
785    
786            * R/: Final changes based upon David's comments.
787    
788    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
789    
790            * NAMESPACE: Corrected exports (generic methods need exportMethods
791            directives!).
792    
793    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
794    
795            * R/: Modified the TextDocCol constructor and various parsers. It
796            is now modular and supports various file formats via plugins (see
797            the new "Source" class).
798    
799    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
800    
801            * man/: Revised documentation after previous code changes.
802    
803    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
804    
805            * R/: Remaining changes as discussed with David.
806    
807    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
808    
809            * R/: Some changes as suggested by David. The rest will follow
810            within the next few days.
811    
812    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
813    
814            * man/: Finished documentation.
815    
816    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
817    
818            * man/: Wrote some documentation.
819    
820    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
821    
822            * R/: Further syntactic sugar in form of additional assignment and
823            accessor methods.
824    
825    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
826    
827            * R/: Syntactic sugar in form of "length", "show" and "summary"
828            operators.
829    
830    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
831    
832            * R/: Diverse updates. Mainly on default operators ("[" or "c")
833            and dissimilarities.
834    
835    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
836    
837            * R/: Added similarity functions.
838    
839            * data/: Added English stopwords.
840    
841    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
842    
843            * data/: Examples compiled for new features.
844    
845            * R/: Changes due to new structure.
846    
847            * NAMESPACE: Corrected namespace to reflect new structure.
848    
849            * R/termdocmatrix.R: Adapted for new naming scheme.
850    
851    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
852    
853            * R/textdoccol.R: Adapted code for new class structure. Wrote
854            several transform and filter functions operating on text document
855            collections (alias text document databases).
856    
857            * R/aobjects.R: Adapted class structure with inheritance,
858            repositories and additional meta data. Loading files on demand is
859            now possible.
860    
861    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
862    
863            * R/: Some cosmetic cleanups.
864    
865            * inst/: Removed vignette on clustering. That and much more is now
866            described in the JSS paper on text mining. Based upon that
867            article an elaborated vignette will be incorporated in the future.
868    
869    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
870    
871            * R/: Updated generic S4 methods to comply with signature changes
872            in newer versions of R (> 2.3).
873    
874    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
875    
876            * ext/R/importRIS.R: Automatic RIS import is now possible.
877    
878    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
879    
880            * R/textdoccol.R: Added RIS HTML input format.
881    
882    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
883    
884            * R/textdoccol.R: Removed bug that caused invalid text document
885            collections when handling many input files.
886    
887    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
888    
889            * R/textdoccol.R: Restructured and extended file import
