ARROW-15622: [R] Implement union_all and union for arrow_dplyr_query #13090
wjones127 wants to merge 7 commits into apache:master
Some example usage:

library(arrow)
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#> timestamp
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
tab1 <- arrow_table(x = 1:3)
tab2 <- arrow_table(x = 2:4, y = c("a", "b", "c"))
tab1 |>
  mutate(y = "a") |>
  union_all(tab2) |>
  collect()
#> # A tibble: 6 × 2
#>       x y
#>   <int> <chr>
#> 1     2 a
#> 2     3 b
#> 3     4 c
#> 4     1 a
#> 5     2 a
#> 6     3 a
tab1 |>
  mutate(y = "a") |>
  union_all(tab2) |>
  arrange(x, y) |>
  collect()
#> # A tibble: 6 × 2
#>       x y
#>   <int> <chr>
#> 1     1 a
#> 2     2 a
#> 3     2 a
#> 4     3 a
#> 5     3 b
#> 6     4 c
tab1 |>
  mutate(y = "a") |>
  dplyr::union(tab2) |>
  arrange(x, y) |>
  collect()
#> # A tibble: 5 × 2
#>       x y
#>   <int> <chr>
#> 1     1 a
#> 2     2 a
#> 3     3 a
#> 4     3 b
#> 5     4 c

Created on 2022-05-09 by the reprex package (v2.0.1) |
|
The timeouts on RTools mingw seem to be random. I ran locally on mingw64 and was able to get it to build and pass all R tests. |
Probably namespace collision: |
nealrichardson left a comment
A couple of suggestions on the tests but otherwise LGTM, thanks!
Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com>
|
Benchmark runs are scheduled for baseline = 6576aa0 and contender = d889ade. d889ade is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
|
['Python', 'R'] benchmarks have high level of regressions. |
…Hub issue numbers (#34260)

Rewrite the Jira issue numbers to the GitHub issue numbers, so that the GitHub issue numbers are automatically linked to the issues by pkgdown's auto-linking feature.

Issue numbers have been rewritten based on the following correspondence. Also, the pkgdown settings have been changed and updated to link to GitHub.

I generated the Changelog page using the `pkgdown::build_news()` function and verified that the links work correctly.

---
ARROW-6338 #5198
ARROW-6364 #5201
ARROW-6323 #5169
ARROW-6278 #5141
ARROW-6360 #5329
ARROW-6533 #5450
ARROW-6348 #5223
ARROW-6337 #5399
ARROW-10850 #9128
ARROW-10624 #9092
ARROW-10386 #8549
ARROW-6994 #23308
ARROW-12774 #10320
ARROW-12670 #10287
ARROW-16828 #13484
ARROW-14989 #13482
ARROW-16977 #13514
ARROW-13404 #10999
ARROW-16887 #13601
ARROW-15906 #13206
ARROW-15280 #13171
ARROW-16144 #13183
ARROW-16511 #13105
ARROW-16085 #13088
ARROW-16715 #13555
ARROW-16268 #13550
ARROW-16700 #13518
ARROW-16807 #13583
ARROW-16871 #13517
ARROW-16415 #13190
ARROW-14821 #12154
ARROW-16439 #13174
ARROW-16394 #13118
ARROW-16516 #13163
ARROW-16395 #13627
ARROW-14848 #12589
ARROW-16407 #13196
ARROW-16653 #13506
ARROW-14575 #13160
ARROW-15271 #13170
ARROW-16703 #13650
ARROW-16444 #13397
ARROW-15016 #13541
ARROW-16776 #13563
ARROW-15622 #13090
ARROW-18131 #14484
ARROW-18305 #14581
ARROW-18285 #14615

* Closes: #33631

Authored-by: SHIMA Tatsuya <ts1s1andn@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This PR adds support for dplyr::union and dplyr::union_all. Not sure why, but I find I must use the fully qualified name dplyr::union, or else I get an error.