Ad

How To Pluck The Oldest Child Record From A Join In BigQuery SQL Where Ordered By Date And Limit 1

I'm analyzing Shopify Schema Data with BiqQuery in GCP and trying to join the Shopify Customers data to the child record that is Orders data so that I can find the very first Order for each Customer created in a certain time, but limit that to 1...like order by created_at asc, limit 1. I'm not really able to figure this out

Select

customers.created_at as customer,
customers.orders_count,
orders.created_at,
orders.order_number

FROM `shopxxx.shopify.customers` as customers

Join  `shopxxx.shopify.orders` as orders

on customer = orders.customer

WHERE
 customers.orders_count  != 0 
AND customers.created_at > "2018-09-12 00:00:00 UTC" and  customers.created_at < "2019-09-12 23:59:59 UTC"

and  orders.source_name != 'web' 


order by  customers.created_at desc
Ad

Answer

Below is for BigQuery Standard SQL

#standardSQL
SELECT
  customers.customer_id,
  customers.created_at AS customer,
  customers.orders_count,
  ARRAY_AGG(STRUCT(orders.created_at AS created_at, orders.order_number AS order_number) ORDER BY orders.created_at LIMIT 1)[OFFSET(0)].*
FROM `shopxxx.shopify.customers` AS customers
JOIN `shopxxx.shopify.orders` AS orders
ON customers.customer_id = orders.customer_id
WHERE customers.orders_count  != 0 
AND customers.created_at > "2018-09-12 00:00:00 UTC" AND  customers.created_at < "2019-09-12 23:59:59 UTC"
AND  orders.source_name != 'web' 
GROUP BY 1, 2, 3
ORDER BY customers.created_at DESC  

or different approach

#standardSQL
SELECT
  customers.customer_id,
  customers.created_at AS customer,
  customers.orders_count,
  orders.created_at AS created_at, 
  orders.order_number AS order_number
FROM `shopxxx.shopify.customers` AS customers
JOIN (
  SELECT customer_id, ARRAY_AGG(STRUCT(created_at, order_number) ORDER BY created_at LIMIT 1)[OFFSET(0)].*
  FROM `shopxxx.shopify.orders` 
  WHERE source_name != 'web' GROUP BY customer_id
) AS orders
ON customers.customer_id = orders.customer_id
WHERE customers.orders_count  != 0 
AND customers.created_at > "2018-09-12 00:00:00 UTC" AND  customers.created_at < "2019-09-12 23:59:59 UTC"
ORDER BY customers.created_at DESC
Ad
source: stackoverflow.com
Ad