
Fixed: limit on number of packages + add rworkflows #67

Open
wants to merge 3 commits into master

Conversation

bschilder

  • Fix cran_downloads and prevent it from failing with many packages
    (limit on number of packages as argument to cran_downloads, #56).
  • Speed up cran_downloads with parallelisation.
  • Add a new unit test for cran_downloads with a large number of packages.
  • Remove \dontrun{} from the cran_downloads and cran_top_downloads examples
    (it's unclear why it was there).
  • New support functions: message_parallel, split_batches (a sketch of the
    batching helper follows this list).
  • Replace travis (failing) and pkgdown with rworkflows.
  • Update Authors in DESCRIPTION to reflect contributions and make fields explicit.
  • Add node_modules$ to .Rbuildignore.
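
For reference, here's a minimal sketch of what the batching helper looks like conceptually (the real split_batches is in this PR's diff; the body below is illustrative):

```r
# Illustrative sketch only -- the actual split_batches is in the PR diff.
# Split a vector into batches of at most `batch_size` elements.
split_batches <- function(x, batch_size = 100) {
  split(x, ceiling(seq_along(x) / batch_size))
}

# e.g. 26 letters -> batches of 10, 10, and 6
str(split_batches(letters, batch_size = 10))
```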

All you need to do on your end, @gaborcsardi, is add a GitHub token as a repo secret named "PAT_GITHUB" to the cranlogs repo.

Best,
Brian

@bschilder
Author

bschilder commented Dec 20, 2022

Heads up @gaborcsardi, cranlogs.org seems to be down today, so none of the functions will work until it's back up:
https://cranlogs.r-pkg.org/

[Screenshot: 2022-12-20 at 10:40:45]

Site is back up now.

@bschilder
Author

bschilder commented Jan 3, 2023

@gaborcsardi have you (or any other maintainers of cranlogs) had a chance to look this over?

Thanks,
Brian

@gaborcsardi
Contributor

Thanks for the PR, but I am not going to merge this, sorry.

I prefer using our workflows instead of rworkflows.

As for the parallelization, I prefer not to use parallel for HTTP.

@bschilder
Author

> Thanks for the PR, but I am not going to merge this, sorry.

Ok, I suppose that's up to you, but would you mind providing some justification for the benefit of the users who requested some of these features? e.g. #56

> I prefer using our workflows instead of rworkflows.

It seems your workflows have been failing for several years now, unless I'm missing something. If you were to fix this, I think it would be a decent alternative.

[Screenshot: 2023-01-03 at 11:05:07]

> As for the parallelization, I prefer not to use parallel for HTTP.

Is there a technical reason for this? I'd be curious to know myself! Similarly, is there an alternative implementation that would suit this purpose better?

If you do not wish to support parallelization at all, you have some other options available to you (in order of preference):

  1. Simply replace parallel::mclapply with lapply (see the sketch after this
     list). This will at least resolve the bug described in limit on number of
     packages as argument to cran_downloads #56.
  2. Add some documentation or warnings to notify users that cranlogs is
     currently unable to request info for more than ~X packages at a time.
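
As a sketch of option 1 (assuming the split_batches helper from this PR and a character vector pkgs of package names; batch_size is illustrative):

```r
# Query each batch sequentially -- same fix for #56, no extra processes.
batches <- split_batches(pkgs, batch_size = 100)
res <- lapply(batches, function(b) cranlogs::cran_downloads(packages = b))
downloads <- do.call(rbind, res)
```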

Happy to adjust the PR as needed, just let me know what you'd like to accomplish with this package.

@gaborcsardi
Contributor

> It seems your workflows have been failing for several years now, unless I'm missing something. If you were to fix this, I think it would be a decent alternative.

I prefer using usethis::use_tidy_github_actions()

> Is there a technical reason for this?

Yes, several. Using multiple processes for concurrent HTTP is an anti-pattern. mclapply() is also unsafe on macOS (search for "fork without exec on macOS"), and does not work on Windows. Here is a good implementation of parallel HTTP with curl: https://github.com/r-lib/rcmdcheck/blob/0cb1073db067d59f2659e600678758f25e00a512/R/http.R
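
For illustration, a minimal sketch of that pattern with curl's multi interface on a single process (fetch_all, the URLs, and the connection limit are placeholders here, not the rcmdcheck code):

```r
library(curl)

# Run many HTTP requests concurrently without forking extra processes.
fetch_all <- function(urls, host_con = 6) {
  pool <- curl::new_pool(host_con = host_con)
  results <- vector("list", length(urls))
  lapply(seq_along(urls), function(i) {
    curl::multi_add(
      curl::new_handle(url = urls[[i]]),
      done = function(res) results[[i]] <<- rawToChar(res$content),
      fail = function(err) warning("request failed: ", err, call. = FALSE),
      pool = pool
    )
  })
  # Drives all transfers in this one process; works on Windows and macOS alike.
  curl::multi_run(pool = pool)
  results
}
```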

However, in general, I am not sure if I want to provide a quick way for people to "rank" all CRAN packages according to downloads. I don't really want people to think that this is an indication of package quality.

@bschilder
Author

> > It seems your workflows have been failing for several years now, unless I'm missing something. If you were to fix this, I think it would be a decent alternative.

> I prefer using usethis::use_tidy_github_actions()

I haven't come across this one! I'll check it out, though. As you may already be aware, there's also biocthis, just FYI.

Even so, if you're no longer using Travis, perhaps it's time to retire it from here?

> > Is there a technical reason for this?

> Yes, several. Using multiple processes for concurrent HTTP is an anti-pattern. mclapply() is also unsafe on macOS (search for "fork without exec on macOS"), and does not work on Windows. Here is a good implementation of parallel HTTP with curl: https://github.com/r-lib/rcmdcheck/blob/0cb1073db067d59f2659e600678758f25e00a512/R/http.R

This is great information to have! Thanks for the explanation.

> However, in general, I am not sure if I want to provide a quick way for people to "rank" all CRAN packages according to downloads. I don't really want people to think that this is an indication of package quality.

Hm, I hadn't thought of it that way. I personally don't agree with this reasoning, as I think it's best to give users the complete data and let them decide how to use it responsibly (perhaps with a disclaimer in the documentation). For example, in my case I'm using downloads to assess package usage, but I wouldn't think to use them as a metric of quality (and I'm not sure how many other users would).

Would you mind updating #56 either way, to let the other users know of your decision? Thanks!
