How to make a JOIN between two collections?


I see that there is no JOIN statement in RAW SQL.

If I have one collection( record(key: string, v1: int ) ) and one collection ( record( key: string, v2: int) )

How can I manage to create collection( record( key: string, v1: int, v2: int )) ?

If key is missing in either collection then the corresponding value is set to 0 (v1=0 or v2=0).

Thanks for your help !


I understand you’d like to have a query that ends up displaying each key (of both datasets) once, and reports the matching values (of both datasets too) in distinct columns, with a default value (zero) when no match is found.

An option is to collect the individual keys from both datasets, and compute the needed values using nested queries.

Here is how to compute the distinct keys (SELECT DISTINCT).

            FROM key IN (
              (SELECT key1 FROM dataset1)
              UNION ALL
              (SELECT key2 FROM dataset2)

Here’s how to walk that list of keys and report the matches from each dataset:

  // custom nested queries on datasets
  ISNULL((SELECT FIRST(value1) FROM dataset1 WHERE key1 = key), 0) AS v1, 
  ISNULL((SELECT FIRST(value2) FROM dataset2 WHERE key2 = key), 0) AS v2 
FROM key IN keys
  • The SELECT FIRST(value) .. WHERE is going to match the key and return the first match, null when nothing is found,
  • The wrapping with ISNULL(..., 0) takes care of replacing the possibly null value by a default value (in your case, zero),
  • I used SELECT FIRST because I understood each key would be once in each dataset, and that would be a good way to report the value when the key exists. If there are several values, it would still pick “the first” it finds. If that is a problem, using FIRST isn’t the only option. One could collect the list of values list using SELECT value FROM ... WHERE, the largest one (MAX), the average one (AVG), etc.

Let me know if that corresponds to what you need. Thanks!

Hi @bgaidioz ! That’s exactly what I was looking for! Awesome. Thanks !