Support wildcard select on multiple column using joins #4840
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -150,13 +150,23 @@ pub fn expand_wildcard(schema: &DFSchema, plan: &LogicalPlan) -> Result<Vec<Expr | |
| let using_columns = plan.using_columns()?; | ||
| let columns_to_skip = using_columns | ||
| .into_iter() | ||
| - // For each USING JOIN condition, only expand to one column in projection | ||
| + // For each USING JOIN condition, only expand to one of each join column in projection | ||
| .flat_map(|cols| { | ||
| let mut cols = cols.into_iter().collect::<Vec<_>>(); | ||
| // sort join columns to make sure we consistently keep the same | ||
| // qualified column | ||
| cols.sort(); | ||
| - cols.into_iter().skip(1) | ||
| + let mut out_column_names: HashSet<String> = HashSet::new(); | ||
| + cols.into_iter() | ||
| + .filter_map(|c| { | ||
| + if out_column_names.contains(&c.name) { | ||
| + Some(c) | ||
| + } else { | ||
| + out_column_names.insert(c.name); | ||
| + None | ||
| + } | ||
| + }) | ||
| + .collect::<Vec<_>>() | ||
| }) | ||
| .collect::<HashSet<_>>(); | ||
@@ -186,7 +196,6 @@ pub fn expand_wildcard(schema: &DFSchema, plan: &LogicalPlan) -> Result<Vec<Expr | |
| pub fn expand_qualified_wildcard( | ||
| qualifier: &str, | ||
| schema: &DFSchema, | ||
| - plan: &LogicalPlan, | ||
| ) -> Result<Vec<Expr>> { | ||
| let qualified_fields: Vec<DFField> = schema | ||
| .fields_with_qualified(qualifier) | ||
@@ -198,9 +207,14 @@ pub fn expand_qualified_wildcard( | |
| "Invalid qualifier {qualifier}" | ||
| ))); | ||
| } | ||
| - let qualifier_schema = | ||
| + let qualified_schema = | ||
| DFSchema::new_with_metadata(qualified_fields, schema.metadata().clone())?; | ||
| - expand_wildcard(&qualifier_schema, plan) | ||
| + // if qualified, allow all columns in output (i.e. ignore using column check) | ||
| + Ok(qualified_schema | ||
| + .fields() | ||
| + .iter() | ||
| + .map(|f| Expr::Column(f.qualified_column())) | ||
| + .collect::<Vec<Expr>>()) | ||
|
Comment on lines +210 to +217 (Contributor, Author):
This is an extra fix. I observed that in PostgreSQL, if you have the following query: select a.*, b.*, c.*
from categories a
join categories b using (category_id)
join categories c using (category_id)
;then |
||
| } | ||
| /// (expr, "is the SortExpr for window (either comes from PARTITION BY or ORDER BY columns)") | ||
The main fix is here. The old code kept only the first sorted column and skipped all the rest, which assumed a USING join on a single column. The new code tracks which column names have already been seen and skips only the later duplicates, so exactly one set of the join columns is output.
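
To make the difference concrete, here is a minimal standalone sketch of the skip logic for one USING condition. It is not the DataFusion code itself: the `Column` struct, the `col` helper, and the `columns_to_skip` function are hypothetical stand-ins written for illustration, and the real `Column` type has its own fields and ordering.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for DataFusion's `Column`; assumed shape, illustration only.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct Column {
    relation: Option<String>,
    name: String,
}

// Hypothetical helper to build a qualified column.
fn col(relation: &str, name: &str) -> Column {
    Column {
        relation: Some(relation.to_string()),
        name: name.to_string(),
    }
}

/// For the columns of one USING condition, return the columns that a bare `*`
/// should omit: every column whose unqualified name was already seen.
fn columns_to_skip(cols: HashSet<Column>) -> Vec<Column> {
    let mut cols: Vec<Column> = cols.into_iter().collect();
    // Sort so the same qualified column is kept on every run.
    cols.sort();
    let mut seen: HashSet<String> = HashSet::new();
    cols.into_iter()
        // `insert` returns false when the name is already present,
        // so only later duplicates of a name are kept in the skip list.
        .filter(|c| !seen.insert(c.name.clone()))
        .collect()
}

fn main() {
    // Columns of a two-column USING join: a JOIN b USING (id, region)
    let using: HashSet<Column> = [
        col("a", "id"),
        col("b", "id"),
        col("a", "region"),
        col("b", "region"),
    ]
    .into_iter()
    .collect();

    let skipped = columns_to_skip(using);
    println!("{skipped:?}");
}
```

With this input, the sorted order is `a.id`, `a.region`, `b.id`, `b.region`, so the sketch skips `b.id` and `b.region`. The previous `skip(1)` approach would also have put `a.region` in the skip set, leaving only `a.id` in the expanded projection.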