Ad

A Table Left Outer Joining With Itself: How Does This Work?

I'm reading the book SQL Antipatterns and trying to understand its example of building a "tree" with a self-referential table such as

                         Comments
-------------------------------------------------------------------
  comment_id | parent_id | author |      comment
-------------------------------------------------------------------
       1          NULL     Fran      What's the cause of this bug?
       2            1      Ollie     I think it's a null pointer.
       3            2      Fran      No, I checked for that.
       4            1      Kukla     We need to check for invalid input.
       5            4      Ollie     Yes, that's a bug.
       6            4      Fran      Yes, please add a check.
       7            6      Kukla     That fixed it.

So the book says

You can retrive a comment and its immediate children using a relatively simple query:

SELECT c1.*, c2.*
FROM Comments c1 LEFT OUTER JOIN Comments c2
  ON c2.parent_id = c1.comment_id

I'm trying to understand how this works. The way I've always understood left outer joins between a table t1 and t2 is that you take all the rows from t1 and for the rows where the ON clause is not satisfied you fill in NULL for the columns from the second table. In this case there is only one table, but I can imagine that the query is taking place with two tables where the second one is a copy of the first. Still, I don't understand how that query returns

two levels of the tree.

What exactly is the resulting table and can you walk me through the logic of how it is resulted?

Ad

Answer

The resulting table will be the following:

                         Comments
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
  c1.comment_id | c1.parent_id | c1.author |  c1.comment                          | c2.comment_id | c2.parent_id | c2.author | c2.comment                         | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
       1               NULL      Fran         What's the cause of this bug?             2               1         Ollie        I think it's a null pointer.
       1               NULL      Fran         What's the cause of this bug?             4               1         Kukla        We need to check for invalid input. 
       2               1         Ollie        I think it's a null pointer.              3               2         Fran         No, I checked for that.                                                      
       3               2         Fran         No, I checked for that.                   NULL            NULL      NULL         NULL                                
       4               1         Kukla        We need to check for invalid input.       5               4         Ollie        Yes, that's a bug.                 
       4               1         Kukla        We need to check for invalid input.       6               4         Fran         Yes, please add a check.           
       5               4         Ollie        Yes, that's a bug.                        NULL            NULL      NULL         NULL                                
       6               4         Fran         Yes, please add a check.                  7               6         Kukla        That fixed it.
       7               6         Kukla        That fixed it.                            NULL            NULL      NULL         NULL

In the ON clause, we have c2.parent_id = c1.comment_id. This will mean that the "right" table (c2)'s parent.id will join with the "left" table (c1)'s comment_id.

The table branches out to itself by mapping each row with its child comments. The result on the right side (c2) will be all the children of c1 entries, which repeat for each child. Since we're doing a left join, rows without children will simply return NULL on the c2 columns.

Ad
source: stackoverflow.com
Ad