Pattern matching is to check for the presence of some string in a given text. You must be wondering what I am intending to convey by the term fast in the title of this article.
Actually, PHP is already well enriched with many string manipulation functions including preg_match at apex , all contributing good for pattern matching. But they all seem to be very slow and with low performance when need to scale big, the pattern matching.
Implement Fast Pattern Matching Using PHP
Like if one needs to search for multiple words or patterns simultaneously in a text, existing PHP functions fail to provide desired performance.
PHP has added a wonderful package under its stock with name
“PHP fast pattern matching”
The full code can be downloaded from here:
http://www.phpclasses.org/browse/package/8994/download/zip.html
The main file inside this package is fastpm.php which contains the definition of the main class facilitating the fast pattern matching php.
fastpm.php
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 |
<?php /*************************************************************** * * (c) 2015 Chi Hoang ([email protected]) * All rights reserved * ***************************************************************/ namespace Fastpm; define ( "EMPTY_NODE", "0" ); define ( "START_CHAR_COUNT","0" ); class Ternarytrie { var $head; var $result; function insert ( &$n, $payload, $pos, $is_leaf=false ) { if ( ! is_object ( $n ) ) { $n = new Node ( $payload->payload [ $pos ] ); } if ( ord ( $payload->payload [ $pos ] ) < ord ( $n->char ) ) { $this->insert ( $n->left, $payload, $pos, $is_leaf ); } else if ( ord ( $payload->payload [ $pos ] ) > ord ( $n->char )) { $this->insert ( $n->right, $payload, $pos, $is_leaf ); } else { if ( $pos+1 == strlen ( $payload->payload ) ) { $n->word = $payload; if ($is_leaf == false) { $n->is_leaf = false; } } else { $this->insert ( $n->mid, $payload, $pos+1, $is_leaf ); } } } ////////////////////////////////////////////////////////////////////////// function search ( &$n, $key, $pos ) { if ( !is_object ( $n ) ) return EMPTY_NODE; if ( ord ( $key[ $pos ] ) < ord ( $n->char ) ) { $this->search ( $n->left, $key, $pos ); } else if ( ord ( $key [ $pos ] ) > ord ( $n->char ) ) { $this->search ( $n->right, $key, $pos ); } else { if ( $pos+1 == strlen ( $key ) && $n->is_leaf==false) { $this->result[] = $n->word->get(); } else if ($n->is_leaf==false) { $this->search ( $n->mid, $key, $pos+1 ); $this->result[] = $n->word->get(); } else { $this->search ( $n->mid, $key, $pos+1 ); } } } } class Payload { var $payload; public function __construct ( $string ) { $this->payload = $string; } public function get () { return $this->payload; } }; /////////////////////////////////////////////////////////////////////////////////////////// class Node { var $is_leaf; var $left, $mid, $right; var $word, $char; public function __construct ( $char=null ) { if ( $char == null ) { $this->left = EMPTY_NODE; $this->right = EMPTY_NODE; $this->mid = EMPTY_NODE; $this->is_leaf = false; $this->word = $this->char = ""; } else { $this->char = $char; $this->is_leaf = true; } } public function __unset ( $name ) { echo "$name"; } } class Fastpm extends Ternarytrie { public function add ( $key ) { $this->insert ( $this->head, new Payload ( $key ), START_CHAR_COUNT ); for ($i=(strlen($key))-1;$i>=0;$i--) { $this->insert($this->head, new Payload(substr($key,$i,strlen($key))), 0, 1); } } public function match ( $key ) { for ($i=0;$i<strlen($key);$i++) { $result = $this->search ( $this->head, substr($key,$i,strlen($key)) , START_CHAR_COUNT ); } return implode(",",$this->result); } } ?> |
and here is the example using this wonderful class.
Example.php
1 2 3 4 5 6 7 8 9 10 11 12 |
<?php require_once ( "fastpm.php" ); $tree = new Fastpm\Fastpm (); $tree->add("computer"); $tree->add("of"); $tree->add("match"); $tree->add("pattern"); echo $tree->match("In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures."); ?> |
Here is the Output:
computer, pattern, match, of, of, of, of, pattern, pattern, match, pattern, of
The Output Reveals Some Interesting Points Here:
- Output displays all the search patterns if present, with each pattern displayed exactly the number of times, it is present in the given text.
- The output exactly follows the order in which the pattern is present in the given text.
Now have a try with this package for Fast Pattern Matching in PHP and tell me isn’t it fast?
Hopefully you would like this. Thanks