Using MySQL INNER JOIN in CakePHP Pagination

First of all, I've got to hand it over to Matt he really did a BBBIIIGGG favor to the CakePHP community by publishing his guide to advanced CakePHP Techniques. This guide / book will give a great insight into the framework to anyone who is a seasoned programmer and is picking up Cake for the first or so time. And I'm going to be floating ideas to compliment the advanced techniques and all in all promote good programming practices of what I'm aware of.
Now, I've seen a lot of shitty code when it comes to CakePHP. Yes, I've even seen mysql_query() calls in views ! ... yes, I've lived that day and kept my sanity in tact. But I can't blame the programmers too because obviously they were newbies and were under a lot of pressure to "get things going" by their blood sucking employers. Anyhoo ... this might be the subject of another post, BUT I really had to get it out of my system ... *phew* ... feel so light now
So, MySQL INNER JOINS ... when should you use them ? - simple answer: when you want to filter out data in your result set. And it's quicker than filtering out results in the "WHERE" clause. Don't have any metrics to show to support this conclusion right now, but I speak in the light of many tests I've conducted on large datasets. A more simpler theory is that the "WHERE" clause needs to filter out a lot more rows in a result-set obtained as a result of using "LEFT JOIN". CakePHP's logic however is sound to use LEFT JOIN as the intention is not to filter out the records, it's merely to include whichever records belongs to the conditions you supply. That's why it's "Containable" behavior is so cool (special thanks to Felix on that for maturing it and making a part of the Cake's core).
The more you familiarize yourself with Cake's datasource classes the better. The most excellent example was published on the bakery by Nate on how to use JOINs in CakePHP. I think this should be made part of the documentation too. This could actually make you get rid of "overriding" Controller::paginate() function. When you come to know about the flexibility offerred by the datasource class you love it even more - a simple example:

  1.  
  2. class PostsController extends AppController {
  3.  
  4. public function by_tag ( $tag ) {
  5. /**
  6.   * This will fetch Posts tagged $tag (say, 'PHP')
  7.   */
  8. $this->paginate['Post'] = array(
  9. 'limit' => 10
  10. , 'contain' => ''
  11. , 'conditions' => array(
  12. 'Post.published' => 1
  13. )
  14. , 'fields' => array('Post.*', 'Tag.*')
  15. , 'joins' => array(
  16. array(
  17. 'table' => 'posts_tags'
  18. , 'type' => 'INNER'
  19. , 'alias' => 'PostTag'
  20. , 'conditions' => array(
  21. 'Post.id = PostTag.post_id'
  22. )
  23. )
  24. , array(
  25. 'table' => 'tags'
  26. , 'alias' => 'Tag'
  27. , 'type' => 'INNER'
  28. , 'conditions' => array(
  29. "PostTag.tag_id = Tag.id AND Tag.name = '$tag'"
  30. )
  31. )
  32. )
  33. );
  34.  
  35. $data = $this->paginate('Post');
  36. $this->set(compact('data'));
  37. }
  38. }
  39.  

This is just a simple example of what you can achieve by adding simple joins in your Model::find() conditions and of course in the paginate part. I've stretched it a bit further. I've actually used sub-queries and sub-joins, really complex stuff when paginating some complex data sets. Thanks to the 'joins' I never had to override the Controller::paginate() method ever. Just for the sake of example, let's say I want to retrieve posts tagged in 'PHP' and 'CakePHP' written by users who have a rating above 3. Of course this can be done in other ways, here is one using a sub-query join in CakePHP elegantly:

  1.  
  2.  
  3. // work this out wherever you want it - in your model or controller
  4. // but if I were you, I'd put this in my model
  5.  
  6. $join = "SELECT posts.id AS POST_ID FROM posts JOIN authors ON (posts.author_id = authors.id)";
  7. $join = $join.' '."JOIN users_ratings ON (authors.user_id = users_ratings.user_id AND users_ratings.rating > 3)"
  8. $join = $join.' '."WHERE 1=1";
  9.  
  10. // in your controller
  11. $this->paginate['Post'] = array(
  12. 'limit' => 10
  13. , 'contain' => ''
  14. , 'conditions' => array(
  15. 'Post.published' => 1
  16. )
  17. , 'fields' => array('Post.*', 'Tag.*')
  18. , 'joins' => array(
  19. array(
  20. 'table' => 'posts_tags'
  21. , 'type' => 'INNER'
  22. , 'alias' => 'PostTag'
  23. , 'conditions' => array(
  24. 'Post.id = PostTag.post_id'
  25. )
  26. )
  27. , array(
  28. 'table' => 'tags'
  29. , 'alias' => 'Tag'
  30. , 'type' => 'INNER'
  31. , 'conditions' => array(
  32. "PostTag.tag_id = Tag.id AND Tag.name IN('PHP', 'CakePHP')"
  33. )
  34. )
  35. , array(
  36. 'table' => '('.$join.')'
  37. , 'alias' => 'FILTERED_RESULTS'
  38. , 'type' => 'INNER'
  39. , 'conditions' => array(
  40. "Post.id = FILTERED_RESULTS.POST_ID"
  41. )
  42. )
  43. )
  44. );
  45.  

And this will elegantly filter out the posts you need
Conclusion:  you can really write any kind of a query and really devise a condition based system that would add filters auto-magically. (I will present such a system in another post) - Remember, CakePHP is all about auto-magic ! ... which is actually the culmination of "convention over configuration" so use it to the fullest !