Forum: help
RE: Interpreting output tables from ensemble model scores (table vs graph) | By: Maya Guéguen on 2022-03-01 12:58 | [forum:49338]

Hi Luis,

It's good that you mention it, because there is actually a differential treatment depending on the data. Currently, if the data is given as an array, a presence is defined when op >= threshold. But if it is a raster, the conversion is made with reclassify, and then it is op > threshold. I'm correcting this right away, so that both data inputs return the same result: a presence is defined when op >= threshold.

Also, I released a new version of biomod2 (4.0), available on GitHub. You can install it with the following commands:

    library(devtools)
    install_github("biomodhub/biomod2")

and check the new documentation here: https://biomodhub.github.io/biomod2/index.html

Help messages should also be reported on the GitHub issue tracker, in order to gather everything in the same place :)
https://github.com/biomodhub/biomod2/issues

Cheers,
Maya
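
A minimal sketch of the corrected rule, with toy values on biomod2's usual 0-1000 probability scale (illustrative only, not the package internals):

    # Presence is defined when op >= threshold, for arrays and rasters alike
    op <- c(120, 499, 500, 731)   # occurrence probabilities
    threshold <- 500
    as.integer(op >= threshold)   # 0 0 1 1 (a value equal to the cutoff counts as presence)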

RE: Interpreting output tables from ensemble model scores (table vs graph) | By: Luís Santiago on 2022-03-01 12:45 | [forum:49337]

Hi Maya,

I have another question. I have been getting an error message when using BinaryTransformation(), but I believe there is a way around it if I reclassify the occurrence probability (op) raster myself. Hence, I would like to ask you the following: for the cutoff parameter, i.e. the threshold value that defines presence or absence in binary maps, is a presence defined when op >= threshold, or when op > threshold?

Thanks in advance.

Regards,
Luis
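
A hedged sketch of such a reclassification workaround using raster::reclassify(); the object names and the dummy raster are illustrative, and right = FALSE is what makes op >= cutoff count as a presence:

    library(raster)
    # Illustrative stand-in for an occurrence-probability raster (0-1000 scale)
    op_raster <- raster(nrows = 10, ncols = 10)
    values(op_raster) <- runif(ncell(op_raster), 0, 1000)
    cutoff <- 500  # e.g. the cutoff reported in the evaluation table
    rcl <- matrix(c(-Inf, cutoff, 0,   # [-Inf, cutoff) -> absence (0)
                    cutoff,  Inf, 1),  # [cutoff,  Inf) -> presence (1)
                  ncol = 3, byrow = TRUE)
    # right = FALSE closes intervals on the left, so cells with op >= cutoff become 1
    bin_raster <- reclassify(op_raster, rcl, right = FALSE)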

RE: Interpreting output tables from ensemble model scores (table vs graph) | By: Maya Guéguen on 2022-02-17 14:30 | [forum:49324]

Hi Luis,

The value returned when ROC is selected is indeed an AUC value. More exactly, the ROC curve is calculated with the roc function from the pROC package, and the AUC value is obtained by applying the auc function from the pROC package to the resulting roc object.

As for the model score graph, I'm really sorry about that. The metric names were factors, resulting in a mismatch between names and values... Thank you for spotting this. But remember that the point corresponds to the mean value, while the "errorbar" represents the standard deviation (not the min and max values). The error will be corrected in the development version that is soon to be released (it can be installed with the following command lines, but might still present some errors):

    library(devtools)
    install_github("biomodhub/biomod2", ref = "devel_cleaning")

Sorry again for the inconvenience, and thank you for your patience.

Maya
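
A minimal, self-contained sketch of the pROC calls described above, on toy data (not biomod2 output):

    library(pROC)
    obs  <- c(0, 0, 1, 1, 0, 1, 1, 0)                   # observed presence/absence
    pred <- c(0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3)  # predicted probabilities
    roc_obj <- roc(response = obs, predictor = pred)    # build the ROC curve
    auc(roc_obj)  # the single summary value reported as "ROC" in the score table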

RE: Interpreting output tables from ensemble model scores (table vs graph) | By: Luís Santiago on 2022-02-07 17:44 | [forum:49304]

Dear Maya,

Some other things on my model score tables and graph caught my attention, and I feel they still fit in the scope of this topic.

My first question is whether the ROC value is equivalent to the AUC. I am asking this because, since the ROC is a curve, when one wants to focus on a specific point of the curve, that point is characterised by two coordinates (xx and yy), not by one single value. Hence, I believe the actual ROC value in the model scores graph and table corresponds to an AUC. I am not sure about this, though, and would like to kindly ask you to confirm it, or correct my interpretation.

Secondly, I would like to ask about the score graph and table outputs. As you can see in the model score graph I attached, the TSS max value is roughly 0.5 and the ROC max value is beyond 0.7. Because ROC's value is higher than TSS's, I chose the former in the evaluation metric and evaluation metric quality threshold arguments:

    # Ensembling
    tapir2010_ensemble <- BIOMOD_EnsembleModeling(
      modeling.output = tapir2010_models,
      chosen.models = 'all',
      em.by = 'all',
      eval.metric = 'ROC',
      eval.metric.quality.threshold = 0.7,
      models.eval.meth = c('TSS', 'ROC'),
      prob.mean = FALSE,
      prob.cv = FALSE,
      prob.ci = FALSE,
      committee.averaging = FALSE,
      prob.mean.weight = TRUE,
      VarImport = 10)

The resulting score table (also attached) shows the values for ROC and TSS, reflecting the evaluation metric I selected in the piece of code above. This table shows ROC and TSS values of 0.852 and 0.591, respectively. I don't really understand these values, as in the model score graph in the same screenshot, the average and standard deviation of ROC and TSS do not reach those two values.

Thank you so much in advance!

Regards,
Luis

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Luís Santiago on 2022-02-01 13:09 | [forum:49296]

Thank you so much for your help Maya!

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Maya Guéguen on 2022-02-01 13:07 | [forum:49295]

Yes, exactly :)

Cheers,
Maya

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Luís Santiago on 2022-02-01 12:57 | [forum:49294]

I think it is clearer, yes. So what you are basically saying is that, since there is not only one "recipe" to build models, choosing the number of tables we want (via the eval.metric argument) gives us a wider range of options, from which the user can select the best values for each parameter (testing.data, cutoff, sensitivity and specificity).

Thank you!
Luis

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Maya Guéguen on 2022-02-01 12:36 | [forum:49293]

Okay, so I got it wrong the first time, sorry for that. Here is the explanation:

By setting eval.metric = c('TSS', 'ROC'), you ask for your ensemble models to be built according to two filters, which are not combined but separate, meaning that you will in fact get 2 ensemble models:
- one selecting only the single models that meet the TSS threshold
- another selecting only the single models that meet the ROC threshold

This explains your two evaluation tables. Then, in each table, you indeed get the corresponding values for the evaluation metrics you asked to be computed over your ensemble models with models.eval.meth (here again, TSS and ROC).

Sorry for mixing things up... Is it clearer now?

Maya
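
A hedged sketch of how to list the two resulting ensemble models and their evaluation scores, using biomod2 accessor functions applied to the object from Luis's code in the next post (the exact return format varies across biomod2 versions):

    # Both ...EMwmeanByTSS... and ...EMwmeanByROC... should appear here
    get_built_models(tapir2010_ensemble)
    # One set of evaluation scores (ROC, TSS, cutoffs, ...) per ensemble model
    get_evaluations(tapir2010_ensemble)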

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Luís Santiago on 2022-02-01 12:28 | [forum:49292]

No problem at all Maya, perhaps I was not very clear. Sure, here is my code:

    # Ensembling
    tapir2010_ensemble <- BIOMOD_EnsembleModeling(
      modeling.output = tapir2010_models,
      chosen.models = 'all',
      em.by = 'all',
      eval.metric = c('TSS', 'ROC'),
      eval.metric.quality.threshold = c(0.7, 0.7),
      models.eval.meth = c('TSS', 'ROC'),
      prob.mean = FALSE,
      prob.cv = FALSE,
      prob.ci = FALSE,
      committee.averaging = FALSE,
      prob.mean.weight = TRUE,
      VarImport = 10
    )

If it helps, in my previous post I also attached a screenshot of the output with the ensemble model score tables I was referring to.

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Maya Guéguen on 2022-02-01 12:22 | [forum:49291]

Hello Luis,

I'm sorry, my mistake. Could you show me the code you used to create your ensemble model, please? What did you pass as the value of the eval.metric argument?

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Luís Santiago on 2022-01-31 10:12 | [forum:49290]

Hi Maya,

Thanks for your reply. I don't think I fully understood it, though. Take the image I just attached, for instance. I do understand that, since I chose ROC and TSS, I have one table with those two metrics. What I don't understand is why I have two tables: Tapirus.10_EMwmeanByTSS... and Tapirus.10_EMwmeanByROC...?

RE: Interpreting output tables from ensemble model scores (why more than one table?) | By: Maya Guéguen on 2022-01-31 08:18 | [forum:49288]

Hello Luis,

The cutoff is indeed the threshold defining the values above (below) which probability values are transformed into presences (absences). There are two tables because you asked for this binary transformation to be done with both evaluation metrics, TSS and ROC. The first table corresponds to ensemble models whose probability values have been transformed into binary values by optimizing the TSS value; the second table corresponds to the same transformation, but optimizing the ROC value.

Maya
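
A minimal sketch of applying such a cutoff with biomod2's BinaryTransformation(), with illustrative values (biomod2 probabilities are usually on a 0-1000 integer scale; the cutoff shown is made up):

    library(biomod2)
    probs  <- c(120, 480, 515, 730)      # occurrence probabilities, 0-1000 scale
    cutoff <- 515                        # e.g. the cutoff column of the TSS table
    BinaryTransformation(probs, cutoff)  # expected: 0 0 1 1 (presence when op >= cutoff)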

Interpreting output tables from ensemble model scores (why more than one table?) | By: Luís Santiago on 2022-01-28 16:04 | [forum:49286]

Hello,

I would like to kindly ask for some help interpreting the values in my ensemble model score tables (I am attaching a screenshot).

First, I would like to ask (double-check) whether the cutoff parameter concerns the threshold that defines presence or absence in binary maps.

My second question regards the evaluation metrics I am using: ROC and TSS. Because I am using them both, I expected to have only one table with the values of ROC and TSS. Instead, I got the two tables in the attached screenshot:

    Tapirus.10_EMwmeanByTSS_mergedAlgo_mergedRun_mergedData
    Tapirus.10_EMwmeanByROC_mergedAlgo_mergedRun_mergedData

These two have slightly different values for ROC and TSS. My first guess would be to use the values in the TSS table, but I would like to double-check that my guess is right and to understand why two tables are produced instead of one.

Thanks in advance.
Luis