Bug: Executing httr::GET() in parallel results in an error when no single-core GET request was issued before. #749

Closed
rkrug opened this issue Nov 12, 2023 · 7 comments

Comments


rkrug commented Nov 12, 2023

macOS Sonoma, MacBook Pro, M1 Pro chip

Using parallel::mclapply() to run GET requests in parallel results in errors on all cores.

After a single-core request has been executed once, the error disappears.

r$> library(httr)

r$> parallel::mclapply(1:2, function(x){httr::GET("http://openalex.org/")})
objc[45797]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[45797]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[45796]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[45796]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[[1]]
NULL

[[2]]
NULL

Warning message:
In parallel::mclapply(1:2, function(x) { :
  scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected

r$> httr::GET("http://openalex.org/")
Response [https://openalex.org/]
  Date: 2023-11-12 11:16
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB


r$> parallel::mclapply(1:2, function(x){httr::GET("http://openalex.org/")})
[[1]]
Response [https://openalex.org/]
  Date: 2023-11-12 11:17
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB


[[2]]
Response [https://openalex.org/]
  Date: 2023-11-12 11:17
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB



r$> sessioninfo::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Sonoma 14.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2023-11-12
 pandoc   3.1.9 @ /opt/homebrew/bin/pandoc

─ Packages ──────────────────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
 curl          5.1.0   2023-10-02 [1] CRAN (R 4.3.1)
 httr        * 1.4.7   2023-08-15 [1] CRAN (R 4.3.0)
 jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.3.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
 rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)

 [1] /Users/rainerkrug/R/library/aarch64-apple-darwin20/4.3
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library

─────────────────────────────────────────────────────────────────────────────────────────────

r$>
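
The session above suggests a workaround: issue one request in the parent process before forking. A minimal sketch of that pattern follows. `get_in_parallel` is a hypothetical helper name, and the assumption that the warm-up helps because the relevant macOS classes are then initialized before fork() is a guess, not something confirmed in this report.

```r
# Hypothetical helper: perform one request in the parent process first,
# then fan out with fork-based parallelism via parallel::mclapply().
get_in_parallel <- function(urls, cores = 2L) {
  # Warm-up request in the parent; its result is discarded.
  invisible(httr::GET(urls[[1]]))

  # Forked workers now perform the actual requests.
  parallel::mclapply(urls, function(u) httr::GET(u), mc.cores = cores)
}

# Usage (needs network access and the httr package):
# resps <- get_in_parallel(c("http://openalex.org/", "http://openalex.org/"))
```

This only mirrors what was observed on the machine described here; it may not hold on other macOS or R versions.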
Author

rkrug commented Nov 12, 2023

Based on, among others, https://community.rstudio.com/t/running-parallel-on-mac/142580/6, I have set OBJC_DISABLE_INITIALIZE_FORK_SAFETY to YES in the environment:

Sys.getenv("OBJC_DISABLE_INITIALIZE_FORK_SAFETY")
[1] "YES"

But no change.
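
Since the crash happens in forked children, one way to sidestep it may be to avoid fork() altogether. The sketch below uses a socket (PSOCK) cluster from the parallel package, which starts fresh R processes instead of forking. This is a swapped-in technique that was not tried in this thread, and `psock_lapply` and `fetch_one` are hypothetical names.

```r
# Hypothetical worker, defined at top level so that only the function
# itself (not a large enclosing environment) is sent to the workers.
fetch_one <- function(url) httr::GET(url)

# Hypothetical helper: apply `fun` to each element of `xs` on a socket
# cluster. makeCluster() defaults to type "PSOCK", i.e. new R processes.
psock_lapply <- function(xs, fun, workers = 2L) {
  cl <- parallel::makeCluster(workers)
  on.exit(parallel::stopCluster(cl), add = TRUE)  # always shut workers down
  parallel::parLapply(cl, xs, fun)
}

# Usage (needs network access and httr installed):
# resps <- psock_lapply(c("http://openalex.org/", "http://openalex.org/"), fetch_one)
```

Starting socket workers is slower than forking, so this mainly makes sense when the requests themselves dominate the run time.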

Member

hadley commented Nov 14, 2023

httr has been superseded by httr2, so no further development work will happen. I'd recommend giving httr2::req_perform_parallel() a go since it does parallel requests in a way that actually works (i.e. using curl's parallel request facilities).
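
As a rough illustration of that suggestion, a minimal sketch assuming a recent httr2 release (one that provides `req_perform_parallel()` with an `on_error` argument). `fetch_parallel` is a hypothetical wrapper name, and the URL in the usage comment is the one from the report above.

```r
# Hypothetical wrapper around httr2's parallel request facility.
fetch_parallel <- function(urls) {
  # Build one request object per URL.
  reqs <- lapply(urls, httr2::request)

  # Perform all requests concurrently in a single R process; with
  # on_error = "continue", failed requests come back as error objects
  # instead of aborting the whole batch.
  httr2::req_perform_parallel(reqs, on_error = "continue")
}

# Usage (needs network access and the httr2 package):
# resps <- fetch_parallel(c("https://openalex.org/", "https://openalex.org/"))
# ok <- httr2::resps_successes(resps)          # keep only successful responses
# vapply(ok, httr2::resp_status, integer(1))   # HTTP status codes
```

Because everything runs in one process, no fork() is involved, so the Objective-C fork-safety check shown in the report should not be triggered.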

hadley closed this as completed Nov 14, 2023
Author

rkrug commented Nov 14, 2023

Thanks - I'll look into httr2. Although httr2::req_perform_parallel() is unfortunately not an option, as the call is in a package.

Member

hadley commented Nov 14, 2023

Why isn't it an option?

Author

rkrug commented Nov 14, 2023

It is not my package....

Author

rkrug commented Nov 14, 2023

Also, parallel calls can cause problems due to the restrictions of that specific API, so they need to be handled with care.
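
One generic way to stay within such limits is to space out calls on the client side. The sketch below uses base R only. `throttled_lapply` is a hypothetical helper, and the interval is an arbitrary placeholder, not a documented limit of any particular API.

```r
# Hypothetical helper: call `fun` on each element of `xs` sequentially,
# leaving at least `min_interval` seconds between the starts of two calls.
throttled_lapply <- function(xs, fun, min_interval = 0.5) {
  results <- vector("list", length(xs))
  last_start <- -Inf  # no previous call yet, so the first one never waits

  for (i in seq_along(xs)) {
    now <- proc.time()[["elapsed"]]
    wait <- min_interval - (now - last_start)
    if (wait > 0) Sys.sleep(wait)

    last_start <- proc.time()[["elapsed"]]
    # Assign via a one-element list so a NULL result does not drop the slot.
    results[i] <- list(fun(xs[[i]]))
  }
  results
}

# Usage (needs network access and httr installed):
# resps <- throttled_lapply(c("http://openalex.org/", "http://openalex.org/"),
#                           function(u) httr::GET(u), min_interval = 1)
```

This trades throughput for predictability: requests run one at a time, so the rate is easy to reason about.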

Member

hadley commented Nov 14, 2023

Oh got it.
