Add an InSet function as an optimized version for IN #2093

@yjshen

TPC-DS has many queries with IN predicates whose list elements are all constants. Implementing an InSet function for this all-constants case would be low-hanging fruit.

The implementation could use either a hash table or a chain of if-elif-else comparisons, chosen according to the length and the type of the constant list.

Q8:

 WHERE substr(ca_zip, 1, 5) IN (
               '24128','76232','65084','87816','83926','77556','20548',
               '26231','43848','15126','91137','61265','98294','25782',
               '17920','18426','98235','40081','84093','28577','55565',
               '17183','54601','67897','22752','86284','18376','38607',
               '45200','21756','29741','96765','23932','89360','29839',
                ......

Q53:

  WHERE ss_item_sk = i_item_sk AND
    ss_sold_date_sk = d_date_sk AND
    ss_store_sk = s_store_sk AND
    d_month_seq IN (1200, 1200 + 1, 1200 + 2, 1200 + 3, 1200 + 4, 1200 + 5, 1200 + 6,
                          1200 + 7, 1200 + 8, 1200 + 9, 1200 + 10, 1200 + 11) AND
    ((i_category IN ('Books', 'Children', 'Electronics') AND
      i_class IN ('personal', 'portable', 'reference', 'self-help') AND
      i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7',
                  'exportiunivamalg #9', 'scholaramalgamalg #9'))
      OR
      (i_category IN ('Women', 'Music', 'Men') AND
        i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND
        i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1',
                    'importoamalg #1')))
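
The hash table versus comparison chain choice described above could be sketched roughly as follows. This is only an illustration, not an existing API: the `InSet` enum, its `new`, `contains`, and `eval` methods, and the `LINEAR_SCAN_MAX` cutoff of 8 are all hypothetical names and values. The sketch selects a strategy by list length only, ignores the element type, and omits NULL handling.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical cutoff: lists at or below this length are scanned linearly.
/// The value 8 is an assumption for illustration, not a measured threshold.
const LINEAR_SCAN_MAX: usize = 8;

/// A pre-built membership test for an IN list whose elements are all constants.
pub enum InSet<T> {
    /// Short list: compare against each constant in turn (the if-elif-else chain).
    Linear(Vec<T>),
    /// Long list: probe a hash table that is built once, up front.
    Hashed(HashSet<T>),
}

impl<T: Eq + Hash> InSet<T> {
    /// Build the set once from the constant list, picking a strategy by length.
    pub fn new(constants: Vec<T>) -> Self {
        if constants.len() <= LINEAR_SCAN_MAX {
            InSet::Linear(constants)
        } else {
            InSet::Hashed(constants.into_iter().collect())
        }
    }

    /// Return true if `value` equals one of the constants.
    pub fn contains(&self, value: &T) -> bool {
        match self {
            InSet::Linear(items) => items.iter().any(|c| c == value),
            InSet::Hashed(set) => set.contains(value),
        }
    }

    /// Evaluate the predicate over a whole column (NULL handling omitted).
    pub fn eval(&self, column: &[T]) -> Vec<bool> {
        column.iter().map(|v| self.contains(v)).collect()
    }
}
```

With this cutoff, the three-element `i_category` list from Q53 would take the linear path, while the twelve `d_month_seq` values would be hashed, assuming expressions such as `1200 + 1` have already been folded to constants.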
