Sort search results by relevance

When creating a search function for your website, it is fairly important to sort results by relevance. I had to do this last week, so I might as well share how I did it.
I wanted to search the News model with simple keywords. Here’s my News table :

CREATE TABLE `news` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`slug` varchar(100) NOT NULL,
`created` datetime NOT NULL,
`published` tinyint(1) NOT NULL,
`title` varchar(150) NOT NULL,
`description` varchar(255) NOT NULL,
`text` text,
PRIMARY KEY (`id`),
KEY `slug` (`slug`)
);

In order to keep beautiful URLs, I made a simple function in NewsController that redirects to the search function using the data submitted in post. The format is param:value/param2:value

function searchToUrl(){
foreach($this->data['News'] as $paramKey=>$paramValue){
if ($paramValue!=null && $paramValue!='0')
$redirectStr .= $paramKey.':'.urlencode($paramValue).'/';
}
$this->redirect(array('action'=>'search', $redirectStr));
}

The search function I wrote checks for every keyword submitted and awards points when the keyword is found. Every word is worth 3 points when found in the title, 1 point when found in the main text. 4 points is also awarded for the exact match in the text and 10 points for the exact match in the title. The search function I wrote does more than just keyword search, but that’s not the point of this post, so we’ll skip it for now (if you’d like to see it all, ask away and I’ll write something about it).

//Part of the search() function in the NewsController
$conds = array();
$this->paginate = array(
'News'=>
array(
'recursive'=> 1,
'limit'=>15,
'fields'=> array('DISTINCT id', 'slug', 'title_fr','classification',
'text_fr', 'Reference.name', 'datepublished')
),
'Country'=>
array(
'recurisve'=>-1,
'fields'=>array('id', 'name_fr'))
);
 
if(isset($this->params['named']['keywords'])){
//Replaces ' by '', to avoid breaking the SQL queries
$this->params['named']['keywords'] = str_replace("'", "''", $this->params['named']['keywords']);
//Split de keywords in an array
$keywords = preg_split('/[\s,]+/',$this->params['named']['keywords'],-1,PREG_SPLIT_NO_EMPTY);
$condsKw = null;
//Relevance Order is a pseudo col in the SQL queries that is going to be used for order.
$relOrd = '';
foreach($keywords as $keyword){
//Title
$condsKw[]= "title LIKE '%$keyword%' ";
//Each keyword in title is worth 3 points.
$relOrd .= "+ (CASE WHEN title LIKE '%$keyword%' THEN 3 ELSE 0 END) ";
 
//Text
$condsKw[]= "text LIKE '%$keyword%' ";
//Each keyword in text is worth 1 points.
$relOrd .= "+ (CASE WHEN text LIKE '%$keyword%' THEN 1 ELSE 0 END) ";
}
//Must be at least 3 words and 7 chars to receive points for exact string match
//This can obviously be tailored to fit your needs.
if(count($keywords) -> 2 && strlen($this->params['named']['keywords']) -> 6 ){
//10 points if the exact match is found in the title
$relOrd .= "+ (CASE WHEN title LIKE '%".$this->params['named']['keywords']."%' THEN 10 ELSE 0 END) ";
 
//4 points if the exact match is found in the text
$relOrd .= "+ (CASE WHEN text LIKE '%".$this->params['named']['keywords']."%' THEN 4 ELSE 0 END) ";
}
//Formats then adds the pseudo column to the field list.
$relOrd = '(' . substr($relOrd, 1) . ') AS Relevance';
$this->paginate['News']['fields'][] = $relOrd;
$conds['AND'][] = join(' OR ',$condsKw);
 
//Sets the order if no other order is defined in the params.
if(!isset($this->params['named']['sort'])){
$this->paginate['News']['order'] = array('Relevance'=>'DESC');
}
//Sets keywords for use in the view, this can be used to highlight the keywords in the search results
$this->set('kwArr', $keywords);
 
//Preserves the data in the html form.
$this->data['News'] = $this->params['named'];
 
$this->set('news', $this->paginate('News', $conds));

I think everything else should be fairly straightforward (the view etc.) If not, ask away in a comment!