[tm] Diff of /pkg/ChangeLog


trunk/R/trunk/ChangeLog revision 17, Sat Nov 5 14:47:12 2005 UTC
pkg/ChangeLog revision 942, Tue Apr 28 11:02:24 2009 UTC
# Line 1  Line 1 
1    2009-04-28  Ingo Feinerer  <feinerer@logic.at>
2    
3            * R/weight.R: Adapt tf-idf to new matrix format.
4    
5    2009-04-27  Ingo Feinerer  <feinerer@logic.at>
6    
7            * R/matrix.R: Create two distinct classes for term-document and
8            document-term matrices.
9    
10    2009-04-26  Ingo Feinerer  <feinerer@logic.at>
11    
12            * R/termdocmatrix.R: No longer use Matrix package. This reduces
13            package start-up time significantly.
14    
15    2009-04-11  Ingo Feinerer  <feinerer@logic.at>
16    
17            * inst/doc/tm.Rnw: Fix code/documentation mismatch.
18    
19    2009-04-04  Ingo Feinerer  <feinerer@logic.at>
20    
21            * R/transform.R (tmReduce): Combine multiple maps into one
22            transformation.
23    
24    2009-04-03  Ingo Feinerer  <feinerer@logic.at>
25    
26            * R/weight.R: Remove weightLogical since it does not return a
27            dgCMatrix.
28    
29            * R/termdocmatrix.R: Removed TermDocMatrix. Use DocumentTermMatrix
30            or TermDocumentMatrix instead.
31    
32    2009-03-28  Ingo Feinerer  <feinerer@logic.at>
33    
34            * inst/doc/extensions.Rnw: Finished vignette.
35    
36    2009-03-27  Ingo Feinerer  <feinerer@logic.at>
37    
38            * R/termdocmatrix.R: Start to work on new TermDocumentMatrix and
39            DocumentTermMatrix representations.
40    
41    2009-03-23  Ingo Feinerer  <feinerer@logic.at>
42    
43            * R/reader.R (readXML): New reader for arbitrary XML files.
44    
45    2009-03-22  Ingo Feinerer  <feinerer@logic.at>
46    
47            * R/source.R (CSVSource): Defunct (use DataframeSource instead).
48            (XMLSource): New XMLSource class for arbitrary XML files.
49            (Source): New slot Vectorized.
50    
51    2009-03-21  Ingo Feinerer  <feinerer@logic.at>
52    
53            * R/reader.R (readTabular): Experimental reader for tabular data
54            structures which can be customized via user-defined mappings.
55    
56            * R/reader.R: Always use UTC time zone.
57    
58            * R/AAA.R (.onLoad): No longer try to start an MPI cluster.
59    
60    2009-03-20  Ingo Feinerer  <feinerer@logic.at>
61    
62            * R/reader.R (readDOC): Options can be passed over to antiword.
63    
64            * R/reader.R (readPDF): Options can be passed over to pdfinfo and
65            pdftotext.
66    
67    2009-03-10  Ingo Feinerer  <feinerer@logic.at>
68    
69            * R/source.R (DirSource): Add pattern and ignore.case arguments
70            which are internally passed over to list.files().
71    
72    2009-03-02  Ingo Feinerer  <feinerer@logic.at>
73    
74            * inst/doc/tm.Rnw: Suppress pointless loading message.
75    
76    2009-01-29  Ingo Feinerer  <feinerer@logic.at>
77    
78            * DESCRIPTION: Speed up package loading (via moving packages not
79            strictly necessary for normal operation to Suggests instead of
80            Depends).
81    
82    2009-01-08  Ingo Feinerer  <feinerer@logic.at>
83    
84            * R/reader.R (readNewsgroup): The date format is now configurable.
85    
86    2008-12-20  Ingo Feinerer  <feinerer@logic.at>
87    
88            * R/preprocess.R (convertMboxEml): Fix off-by-one error.
89    
90    2008-12-16  Ingo Feinerer  <feinerer@logic.at>
91    
92            * R/termdocmatrix.R (TermDocMatrix): Sort row indices.
93    
94    2008-12-06  Ingo Feinerer  <feinerer@logic.at>
95    
96            * R/source.R (DataframeSource): New source class for data frames.
97    
98            * R/source.R: Fixed non-standard call evaluation.
99    
100    2008-11-29  Ingo Feinerer  <feinerer@logic.at>
101    
102            * R/source.R (URISource): New source class for a single document.
103    
104    2008-11-27  Ingo Feinerer  <feinerer@logic.at>
105    
106            * R/source.R: Refactoring.
107    
108    2008-11-25  Ingo Feinerer  <feinerer@logic.at>
109    
110            * R/AAA.R (.onLoad, .Last): Use tryCatch() to handle misconfigured
111            Rmpi installations more gracefully.
112    
113    2008-11-08  Ingo Feinerer  <feinerer@logic.at>
114    
115            * R/source.R (Source): Add Length slot.
116    
117    2008-11-06  Ingo Feinerer  <feinerer@logic.at>
118    
119            * R/AAA.R: Unify duplicated .onLoad function.
120    
121    2008-11-03  Ingo Feinerer  <feinerer@logic.at>
122    
123            * DESCRIPTION (Suggests): Added Rmpi.
124    
125    2008-11-02  Ingo Feinerer  <feinerer@logic.at>
126    
127            * R/source.R (getElem): Fix 'no visible binding' warning.
128    
129            * man/WeightFunction.Rd: Fix signature.
130    
131    2008-08-03  Ingo Feinerer  <feinerer@logic.at>
132    
133            * R/weight.R: Introduce name abbreviations for weighting functions.
134    
135    2008-07-24  Ingo Feinerer  <feinerer@logic.at>
136    
137            * R/AAA.R (.onLoad, .Last): Start and stop MPI cluster.
138    
139            * R/cluster.R: Provide convenience functions for using an MPI
140            cluster.
141    
142            * R/termdocmatrix.R (TermDocMatrix): Use MPI cluster if
143            available.
144    
145            * R/textdoccol.R (tmIndex, tmFilter, tmMap): Use MPI cluster if
146            available.
147    
148    2008-07-17  Ingo Feinerer  <feinerer@logic.at>
149    
150            * R/textdoccol.R (lapply): Removed debug print out.
151    
152    2008-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
153    
154            * R/reader.R (readRCV1): Improved meta data extraction from
155            Reuters Corpus Volume 1 documents.
156    
157    2008-05-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
158    
159            * R/transform.R: Ensure that all mappings preserve multiline
160            structures.
161    
162    2008-05-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
163    
164            * R/filter.R: Every filter now has an attribute indicating whether
165            it should be applied at document level (doclevel).
166    
167            * R/textdoccol.R (tmFilter): Set searchFullText as new default
168            filter.
169    
170    2008-04-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
171    
172            * R/transform.R (replacePatterns): Replaced removeWords by
173            replacePatterns. Suggested by Christian Buchta.
174    
175            * R/textdoccol.R (inspect): Improved formatting.
176    
177    2008-04-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
178    
179            * inst/CITATION: Updated JSS article information.
180    
181            * R/textdoccol.R (setAs): Added coerce method from list to
182            corpus.
183    
184            * R/meta.R (meta): Improved meta data handling.
185    
186    2008-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
187    
188            * R/textdoccol.R (materialize, tmMap): Improvements suggested by
189            Christian Buchta.
190    
191            * inst/CITATION: Added template to include JSS article reference.
192    
193    2008-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
194    
195            * R/textdoccol.R (tmMap): Introduced lazy mapping.
196    
197            * R/source.R: Added VectorSource.
198    
199    2008-02-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
200    
201            * man/: Language codes should be in ISO 639-1 format.
202    
203            * R/textdoccol.R (asPlain): Preserve local meta data.
204    
205    2008-01-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
206    
207            * R/textdoccol.R (writeCorpus): Function for writing a corpus
208            containing plain text documents to disk.
209    
210    2008-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
211    
212            * R/termdocmatrix.R (TermDocMatrix): Ensure that dimnames are
213            always set correctly.
214    
215            * R/textdoccol.R: Set load = TRUE as default for load on demand
216            since in most cases this is the wanted behaviour.
217    
218    2008-01-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
219    
220            * R/: Renamed TextDocCol to Corpus, and Corpus to Content.
221    
222            * DESCRIPTION: Updated Version to 0.3 due to core name changes.
223    
224    2008-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
225    
226            * R/meta.R (meta): New function for consistent access to meta data
227            of document collections, repositories, and texts.
228    
229    2008-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
230    
231            * R/: Better support for encodings.
232    
233    2008-01-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
234    
235            * R/textdoccol.R (TextDocCol): Fixed bug regarding default reader
236            selection when no reader argument is given.
237    
238    2008-01-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
239    
240            * R/source.R (CSVSource): Now uses read.csv instead of scan
241            internally.
242    
243    2008-01-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
244    
245            * R/reader.R (getReaders): Returns available reader functions.
246    
247            * R/termdocmatrix.R (TermDocMatrix): Set new modular constructor
248            as default.
249    
250    2007-12-02  Ingo Feinerer  <h0125130@wu-wien.ac.at>
251    
252            * R/stopwords.R (stopwords): Shortened code, removed codetools
253            variable warnings.
254    
255            * man/: Documentation for showMeta, added an example for tmMap.
256    
257            * inst/doc/tm.Rnw: Updated vignette, comments on MS word reader,
258            some minor typos fixed.
259    
260    2007-12-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
261    
262            * R/aobjects.R (showMeta): Added method for pretty printing a
263            text document's meta data.
264    
265    2007-11-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
266    
267            * R/textdoccol.R (TextDocCol): Better handling of empty
268            arguments.
269    
270            * NAMESPACE: Exported readDOC.
271    
272            * man/completeStems.Rd: Added an example.
273    
274    2007-11-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
275    
276            * R/stopwords.R (stopwords): Look up .dat files at every
277            call. Allows users to modify stopword .dat files interactively.
278    
279    2007-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
280    
281            * R/termdocmatrix.R (termFreq): Correct processing of empty
282            documents.
283    
284    2007-10-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
285    
286            * man/: Updated documentation.
287    
288    2007-10-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
289    
290            * R/complete.R (completeStems): Completes (heuristically) word
291            stems.
292    
293            * R/termdocmatrix.R (TermDocMatrix2): New modular
294            constructor.
295    
296            * NAMESPACE: Exported termFreq.
297    
298    2007-10-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
299    
300            * R/reader.R (readDOC): Added MS Word reader (using antiword).
301    
302    2007-10-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
303    
304            * R/weight.R: Weighting functions for TermDocMatrix.
305    
306    2007-10-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
307    
308            * R/termdocmatrix.R (dimnames, colnames, rownames): Wrapper
309            functions for accessing dimension, column, and row names.
310    
311            * R/plot.R (plot.TermDocMatrix): Plot correlations between terms.
312    
313    2007-09-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
314    
315            * man/removePunctuation.Rd: Added documentation. Function also exported to NAMESPACE.
316    
317    2007-08-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
318    
319            * R/fungen.R: Use S4 class for function generators instead of S3 attributes.
320    
321    2007-07-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
322    
323            * R/reader.R (readPDF): Removed manual checks for pdftotext and
324            pdfinfo. The system call gives a warning anyway.
325    
326    2007-07-28  Ingo Feinerer  <h0125130@wu-wien.ac.at>
327    
328            * R/textdoccol.R (asPlain): Conversion from
329            StructuredTextDocuments to PlainTextDocuments.
330    
331    2007-07-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
332    
333            * R/termdocmatrix.R: Added convenience methods ("[", nrow, ncol)
334            for accessing term-document matrices.
335    
336            * inst/doc/tm.Rnw: readPDF is only called if pdftotext and pdfinfo
337            are installed.
338    
339    2007-07-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
340    
341            * R/termdocmatrix.R (TermDocMatrix): Improved efficiency. Kudos to
342            Christian Buchta.
343    
344    2007-07-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
345    
346            * inst/doc/tm.Rnw: Update vignette (readPDF, readHTML, preprocessReut21578XML).
347    
348    2007-07-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
349    
350            * R/reader.R (readHTML): Added very simple HTML reader to obtain StructuredTextDocuments.
351    
352            * R/reader.R (readPDF): Added PDF reader.
353    
354    2007-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
355    
356            * DESCRIPTION: Moved proxy from Depends to Imports to avoid name clashes.
357    
358            * inst/stopwords/english.dat: Added the term "yes" to stopwords.
359    
360            * R/termdocmatrix.R (dim): dim function for TermDocMatrix.
361    
362            * R/preprocess.R (convertMboxEml): Accepts gzipped mboxes.
363    
364    2007-07-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
365    
366            * R/distmeasure.R (dissimilarity): Replaced dists call from
367            package cba by new dist call from package proxy.
368    
369    2007-07-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
370    
371            * inst/doc/tm.Rnw: Described removeSparseTerms and Dictionary.
372    
373    2007-06-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
374    
375            * R/termdocmatrix.R: require() uses the quietly option to suppress
376            loading messages.
377    
378    2007-06-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
379    
380            * R/dictionary.R: Added dictionary support.
381    
382    2007-06-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
383    
384            * R/aobjects.R: Added classes for Reuters21578 XML and RCV1
385            documents. This simplifies some functions, e.g., asPlain.
386    
387    2007-06-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
388    
389            * inst/doc/tm.Rnw: Fixed some typos in vignette.
390    
391    2007-06-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
392    
393            * R/textdoccol.R (replaceWords): Added method to replace a set of
394            words by a single word. Useful for synonyms.
395    
396    2007-05-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
397    
398            * man/TermDocMatrix.Rd: Fixed documentation on Data slot.
399    
400    2007-05-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
401    
402            * R/termdocmatrix.R (textvector): Small fix for dealing with empty
403            vectors. Thanks to Ariel Maguyon for his error report.
404            (removeSparseTerms): New function to remove columns from a
405            term-document matrix exceeding a sparse factor.
406    
407    2007-05-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
408    
409            * man/tmUpdate.Rd: Corrected documentation on readerControl parameter.
410    
411    2007-05-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
412    
413            * man/sFilter.Rd: Corrected documentation on statement format (use
414            '==' instead of '=').
415    
416    2007-05-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
417    
418            * R/aobjects.R (StructuredTextDocument): Inherits from
419            TextDocument.
420    
421    2007-05-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
422    
423            * R/termdocmatrix.R (findFreqTerms): Perform efficient computation
424            on sparse matrices as proposed by Martin Maechler.
425    
426    2007-04-27  Ingo Feinerer  <h0125130@wu-wien.ac.at>
427    
428            * R/textdoccol.R: Removed \code{dbDisconnect} calls since last
429            \pkg{filehash} version makes them deprecated.
430    
431    2007-04-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
432    
433            * R/termdocmatrix.R (textvector): Stemming is now performed before
434            erasing stopwords.
435            (weightMatrix): Adapted to handle sparse matrices.
436            (TermDocMatrix): Sparse matrix is now efficiently built by
437            direct stepwise insertion of row values into it.
438    
439    2007-04-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
440    
441            * DESCRIPTION: Replaced \pkg{filehashSQLite} with \pkg{filehash}
442            due to ongoing problems. For our purposes the latter is as useful
443            as the replaced package.
444    
445    2007-04-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
446    
447            * man/TextDocCol.Rd: Replaced \code{readPlain} with \code{object@DefaultReader}.
448    
449            * man/TermDocMatrix.Rd: Remove deprecated \code{language} argument.
450    
451    2007-04-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
452    
453            * R/resolve.R (resolveISOCode): Added ISO 639-1 codes for
454            languages with available stopwords.
455    
456    2007-04-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
457    
458            * inst/doc/tm.Rnw: Minor corrections in the vignette.
459    
460    2007-04-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
461    
462            * DESCRIPTION: Update to version 0.2, since a lot of new features
463            have been integrated.
464    
465            * inst/stopwords: Updated existing stopwords and added stopwords
466            for various other languages.
467    
468    2007-04-10  Ingo Feinerer  <h0125130@wu-wien.ac.at>
469    
470            * man/: Updated documentation.
471    
472            * Work/testDb.R: Script to test database stuff.
473    
474            * R/: Fixed various database related bugs. Seems to be rather
475            useable now, i.e., consider as alpha status for now.
476    
477    2007-04-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
478    
479            * R/: Fixed some bugs related to database support.
480    
481    2007-04-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
482    
483            * man/: Added a lot of examples to the manuals.
484    
485    2007-04-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
486    
487            * man/: Updated parts of the documentation.
488    
489            * R/textdoccol.R (asPlain): Added conversion from newsgroup
490            documents to plain text documents.
491    
492    2007-04-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
493    
494            * R/textdoccol.R: Finished experimental database support. Not yet
495            intensively tested.
496    
497            * R/source.R: Now each source has a default reader.
498    
499            * R/reader.R: \code{FunctionGenerator} is now an attribute, not a
500            class anymore.
501    
502            * R/plaintextdoc.R: Custom show method for plain text documents.
503    
504            * R/aobjects.R: Added a class for structured text documents.
505    
506            * R/reader.R: Replaced remaining \code{parser} occurrences with
507            \code{reader}.
508    
509            * R/textdoccol.R (summary): Indent tags.
510    
511            * R/textdoccol.R (removePunctuation): Transform method to remove
512            punctuation marks.
513    
514    2007-03-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
515    
516            * R/textdoccol.R (sFilter): Simplified sFilter significantly by
517            using prescindMeta().
518    
519    2007-03-18  Ingo Feinerer  <h0125130@wu-wien.ac.at>
520    
521            * R/textdoccol.R: Improved database support.
522    
523    2007-03-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
524    
525            * R/termdocmatrix.R (TermDocMatrix): Uses sparse matrices.
526    
527            * R/resolve.R (resolveISOcode): Extracts the language from an ISO
528            language code.
529    
530            * R/textdoccol.R (TextDocCol): Refactored several parser arguments
531            into parserControl argument.
532    
533            * R/aobjects.R (TextDocument): Introduced the "Language" slot.
534    
535    2007-03-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
536    
537            * Work/tmDataSetup.R: The datasets acq and crude can now be
538            created on the fly.
539    
540            * R/stopwords.R: Introduced a function returning the stopwords for
541            a given language (English, German and French at the moment).
542    
543            * R/textdoccol.R (stemDoc): Stemming uses Rstem if available,
544            otherwise falls back to Snowball package.
545    
546    2007-01-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
547    
548            * man/dissimilarity-methods.Rd: Make clear that any method offered
549            by "dists" from package "cba" can be used.
550    
551    2007-01-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
552    
553            * inst/doc/tm.Rnw: Fixed quotes-appearing-as-boxes-bug according
554            to Kurt's latex suggestion. Removed points and underscores in
555            variable names for consistent naming.
556    
557            * DESCRIPTION: Update to version 0.1-2.
558    
559            * man/TextRepository.Rd: Fixed bug in documentation.
560    
561    2007-01-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
562    
563            * DESCRIPTION: Update to version 0.1-1.
564    
565    2007-01-09  Ingo Feinerer  <h0125130@wu-wien.ac.at>
566    
567            * R/textdoccol.R (stemDoc): Use Rstem::wordStem instead of
568            wordStem.
569    
570    2007-01-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
571    
572            * R/: Changes due to Kurt's review.
573    
574    2006-12-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
575    
576            * R/: Implemented improvements based upon comments by David
577            Meyer.
578    
579    2006-12-17  Ingo Feinerer  <h0125130@wu-wien.ac.at>
580    
581            * inst/doc/: Rewrote vignette.
582    
583            * man/: Improved documentation.
584    
585    2006-12-16  Ingo Feinerer  <h0125130@wu-wien.ac.at>
586    
587            * man/: Updated documentation.
588    
589            * DESCRIPTION: Changed package name to "tm". Updated version to
590            0.1 for first CRAN release.
591    
592            * inst/texts/gmane.comp.lang.r.general.mbox: mbox Gmane R mailing
593            list archive example.
594    
595            * inst/texts/gmane.comp.lang.r.gr.rdf: RSS Gmane R mailing list
596            archive example.
597    
598            * R/preprocess.R (convert_mbox_eml): A simple e-mail converter
599            from (several mails per box) mbox format to (single mail per file)
600            eml format.
601    
602    2006-12-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
603    
604            * data/crude.rda: Rebuilt.
605    
606            * data/acq.rda: Rebuilt.
607    
608            * R/reader.R: Factored out reader and parser methods from
609            textdoccol.R.
610    
611            * R/source.R: Factored out Source methods from aobjects.R and
612            textdoccol.R.
613            (GmaneRSource): Encapsulates Gmane R mailing list archive RSS
614            feeds.
615    
616            * R/textdoccol.R (DirSource): Added support for recursive
617            traversal of directories.
618    
619    2006-12-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
620    
621            * R/textdoccol.R ([[): Loads the document corpus automatically
622            into memory upon access.
623            (tm_transform, tm_filter): Removed several checks whether the
624            document is already loaded ([[ ensures this now).
625            (gmane_r_reader): Reader for RSS feeds as provided by the Gmane R
626            mailing list archive.
627    
628    2006-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
629    
630            * R/aobjects.R (TextDocument): Is now a virtual class.
631            (Source): Is now a virtual class.
632    
633    2006-12-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
634    
635            * R/textdoccol.R (c): Support for an arbitrary number of document
636            collections.
637    
638    2006-11-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
639    
640            * R/textrepo.R: Updated TextRepository (constructor), append_elem,
641            append_meta and remove_meta.
642    
643            * R/textdoccol.R: Removed modify_metadata method.
644    
645            * R/textrepo.R: Removed modify_metadata method.
646    
647            * R/textdoccol.R (remove_meta): Supports removal of document
648            collection metadata and document (= in data frame) metadata.
649    
650    2006-11-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
651    
652            * R/textdoccol.R (append_doc): Bug fix for handling empty metadata.
653    
654            * data/crude.rda: Rebuilt.
655    
656            * data/acq.rda: Rebuilt.
657    
658            * inst/doc/textmin.Rnw: Updated vignette to reflect code changes.
659    
660            * R/textdoccol.R ([): Bug fix for subsetting a document
661            collection's data frame.
662    
663    2006-11-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
664    
665            * R/textdoccol.R: Bug fixes in s_filter. Added full query support
666            to s_filter.
667    
668            * R/textdoccol.R: Local text documents' metadata can now be copied
669            to a document collection's data frame with prescind_meta.
670    
671    2006-11-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
672    
673            * R/: Text documents' slot metadata is now accessible in s_filter.
674    
675            * R/: Rewrote s_filter function (still has some restrictions).
676    
677    2006-11-20  Ingo Feinerer  <h0125130@wu-wien.ac.at>
678    
679            * R/: Various fixes in handling metadata.
680    
681            * R/: Added update mechanism for text document collections.
682    
683    2006-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
684    
685            * R/: Merging of document collections now creates a binary tree
686            for reconstructing merged document collections.
687    
688            * R/: Redesign of metadata for document collections.
689    
690    2006-11-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
691    
692            * R/: Messages now use \code{ngettext}.
693    
694    2006-11-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
695    
696            * R/: Added functions for modifying and removing metadata.
697    
698    2006-11-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
699    
700            * man/: Updated some documentation.
701    
702            * R/: Corrected some connection issues.
703    
704            * inst/doc: Worked on the vignette.
705    
706    2006-10-31  Ingo Feinerer  <h0125130@wu-wien.ac.at>
707    
708            * inst/: Added texts and started vignette.
709    
710            * R/: Final changes based upon David's comments.
711    
712    2006-10-29  Ingo Feinerer  <h0125130@wu-wien.ac.at>
713    
714            * NAMESPACE: Corrected exports (generic methods need exportMethods
715            directives!).
716    
717    2006-10-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
718    
719            * R/: Modified the TextDocCol constructor and various parsers. It
720            is now modular and supports various file formats via plugins (see
721            the new "Source" class).
722    
723    2006-10-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
724    
725            * man/: Revised documentation after previous code changes.
726    
727    2006-10-23  Ingo Feinerer  <h0125130@wu-wien.ac.at>
728    
729            * R/: Remaining changes as discussed with David.
730    
731    2006-10-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
732    
733            * R/: Some changes as suggested by David. The rest will follow
734            within the next few days.
735    
736    2006-09-26  Ingo Feinerer  <h0125130@wu-wien.ac.at>
737    
738            * man/: Finished documentation.
739    
740    2006-09-25  Ingo Feinerer  <h0125130@wu-wien.ac.at>
741    
742            * man/: Wrote some documentation.
743    
744    2006-09-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
745    
746            * R/: Further syntactic sugar in form of additional assignment and
747            accessor methods.
748    
749    2006-09-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
750    
751            * R/: Syntactic sugar in form of "length", "show" and "summary"
752            operators.
753    
754    2006-08-24  Ingo Feinerer  <h0125130@wu-wien.ac.at>
755    
756            * R/: Diverse updates. Mainly on default operators ("[" or "c")
757            and dissimilarities.
758    
759    2006-08-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
760    
761            * R/: Added similarity functions.
762    
763            * data/: Added English stopwords.
764    
765    2006-08-07  Ingo Feinerer  <h0125130@wu-wien.ac.at>
766    
767            * data/: Examples compiled for new features.
768    
769            * R/: Changes due to new structure.
770    
771            * NAMESPACE: Corrected namespace to reflect new structure.
772    
773            * R/termdocmatrix.R: Adapted for new naming scheme.
774    
775    2006-08-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
776    
777            * R/textdoccol.R: Adapted code for new class structure. Wrote
778            several transform and filter functions operating on text document
779            collections (alias text document databases).
780    
781            * R/aobjects.R: Adapted class structure with inheritance,
782            repositories and additional meta data. Loading files on demand is
783            now possible.
784    
785    2006-07-13  Ingo Feinerer  <h0125130@wu-wien.ac.at>
786    
787            * R/: Some cosmetic cleanups.
788    
789            * inst/: Removed vignette on clustering. That and much more is now
790            described in the JSS paper on text mining. Based upon that
791            article an elaborated vignette will be incorporated in the future.
792    
793    2006-07-01  Ingo Feinerer  <h0125130@wu-wien.ac.at>
794    
795            * R/: Updated generic S4 methods to comply with signature changes
796            in newer versions of R (> 2.3).
797    
798    2006-03-12  Ingo Feinerer  <h0125130@wu-wien.ac.at>
799    
800            * ext/R/importRIS.R: Automatic RIS import is now possible.
801    
802    2006-02-14  Ingo Feinerer  <h0125130@wu-wien.ac.at>
803    
804            * R/textdoccol.R: Added RIS HTML input format.
805    
806    2006-01-21  Ingo Feinerer  <h0125130@wu-wien.ac.at>
807    
808            * R/textdoccol.R: Removed bug that caused invalid text document
809            collections when handling many input files.
810    
811    2006-01-11  Ingo Feinerer  <h0125130@wu-wien.ac.at>
812    
813            * R/textdoccol.R: Restructured and extended file import
814            mechanism.
815    
816            * inst/doc/clustering.Rnw: Adapted vignette for use with
817            ReutNews.rda
818    
819            * man/ReutNews.Rd: Documentation for ReutNews.rda
820    
821            * data/ReutNews.rda: A tiny Reuters21578 example data set.
822    
823    2005-12-22  Ingo Feinerer  <h0125130@wu-wien.ac.at>
824    
825            * inst/doc/clustering.Rnw: Wrote a small vignette to present the
826            clustering facilities of this package.
827    
828    2005-12-15  Ingo Feinerer  <h0125130@wu-wien.ac.at>
829    
830            * R/aobjects.R: Changed package document structure to avoid class
831            dependency problems.
832    
833    2005-12-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
834    
835            *  Wrote a script for the ModLewis Split for the Reuters-21578 XML
836            data set.
837    
838            *  Finished documentation and reordered directory structure. Now "R
839            CMD check textmin" works without errors.
840    
841    2005-12-04  Ingo Feinerer  <h0125130@wu-wien.ac.at>
842    
843            * src/: Various splits can now be easily created for the
844            Reuters21578 data set.
845    
846    2005-12-03  Ingo Feinerer  <h0125130@wu-wien.ac.at>
847    
848            *  Updated documentation
849    
850    2005-11-30  Ingo Feinerer  <h0125130@wu-wien.ac.at>
851    
852            *  Wrote R documentation for some classes and methods.
853    
854    2005-11-19  Ingo Feinerer  <h0125130@wu-wien.ac.at>
855    
856            * R/textdoccol.R: Constructor of textdoccol allows import of CSV
857            files. See the questionnaire data/Umfrage.csv for such an example.
858            We are now able to import files in Reuters-21578 XML format.
859    
860            *  Changed class interfaces in various files. Weighting of the text
861            matrix is now possible.
862    
863    2005-11-08  Ingo Feinerer  <h0125130@wu-wien.ac.at>
864    
865            * R/textdoccol.R: One can build term-document matrices if
866            necessary (with buildTDM(...)) and fill the field tdm from a text
867            document collection with it.
868    
869            * R/textmatrix.R: Wrote S4 class for term-document matrices.
870    
871    2005-11-06  Ingo Feinerer  <h0125130@wu-wien.ac.at>
872    
873            * R/textdoccol.R: We now can read in a whole XML file with several
874            news items.
875    
876    2005-11-05  Ingo Feinerer  <h0125130@wu-wien.ac.at>
877    
878            * R/textdoccol.R: Set up an S4 class for a collection of text
