-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
potential parents with a relative cooccurence < 1.0 are never accepted #8
Comments
From the README:
Contrary to what is indicated in the README, it seems that when using the default I wrote a toy-example to demonstrate this behavior: library(lulu)
library(dplyr) # lulu requires dplyr
matchlist_name <- data.frame(x = "B" , y = "A", z = 99.0)
otutable_name <- data.frame(
row.names = c("A", "B"),
s01 = c(0, 1), # <= 'B' present, 'A' absent
s02 = c(9, 1),
s03 = c(9, 1),
s04 = c(9, 1),
s05 = c(9, 1),
s06 = c(9, 1),
s07 = c(9, 1),
s08 = c(9, 1),
s09 = c(9, 1),
s10 = c(9, 1),
s11 = c(9, 1),
s12 = c(9, 1),
s13 = c(9, 1),
s14 = c(9, 1),
s15 = c(9, 1),
s16 = c(9, 1),
s17 = c(9, 1),
s18 = c(9, 1),
s19 = c(9, 1),
s20 = c(9, 1),
s21 = c(9, 1),
s22 = c(9, 0)) # <= 'A' present, 'B' absent
## Note that parent 'A' is absent in one sample where 'B is present',
## and present in one sample where 'B' is absent (same total
## spread). The relative co-occurence is 20 / 21 = 0.95238, which is
## greater than 0.95, the default threshold value.
## bug: no merging with default parameters
## (minimum_ratio = 0)
lulu::lulu(otutable_name, matchlist_name)
## with minimum_ratio_type = "avg": merging as expected
## (minimum_ratio (average) is 8.57)
lulu::lulu(otutable_name, matchlist_name, minimum_ratio_type = "avg")
quit(save = "no") |
Dear @tobiasgf
I think there is a undesired behavior with
minimum_ratio
whenminimum_ratio_type == "min"
, and to some degree whenminimum_ratio_type == "avg"
. It happens in the code below:When
minimum_relative_cooccurence
is smaller than 1.0 (default is 0.95), the potential parent OTU can be absent from some samples where the daughter OTU is present. A missing parent means that the ratio for that particular sample will be zero. This is were the issue is, the wayminimum_ratio
is computed, it will identify the samples where the father is missing as the ones with the smallest ratio (zero). This is belowminimum_ratio
and the father OTU is rejected, despite usingminimum_relative_cooccurence
.To solve that, the minimum_ratio should be searched among the non-null ratio values, not all ratio values.
When
minimum_ratio_type == "avg"
, I think the average should be calculated on non-zeros ratio values too.The text was updated successfully, but these errors were encountered: