
Regexes for dumb substring searches aren't that terrible.

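The test itself is pretty dumb, but it may still provide meaningful data. It times a plain `in` check against a precompiled regex over every word in the system dictionary.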

    import re
    import time

    # System word list, one word per line.
    d = '/usr/share/dict/british-english'
    with open(d, 'r') as f:
        words = f.readlines()

    test_string = 'e'
    # Compile once, outside the timed loops.
    test_regex = re.compile(test_string)

    stamp_start = time.time()

    print("Simple substring test")
    for word in words:
        _ = test_string in word

    stamp_mid = time.time()

    print("Regex test")
    for word in words:
        _ = test_regex.search(word)

    stamp_end = time.time()

    print("Substring time was %s" % (stamp_mid - stamp_start,))
    print("Regex time was     %s" % (stamp_end - stamp_mid,))

The output is fairly consistent: the simple substring test is about twice as fast as the regex.

Simple substring test
Regex test
Substring time was 0.0137870311737
Regex time was     0.0306131839752
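
Both timings above are only a few hundredths of a second from a single pass, so timer noise could matter. A rough sketch of the same comparison using the standard `timeit` module, which repeats each scan and keeps the best run, might look like this. The small repeated word list is a made-up stand-in for the dictionary file so the sketch runs anywhere, and `substring_scan` and `regex_scan` are hypothetical helper names, not part of the original test. `re.escape` is added on the assumption that the search string should always be treated as a literal.

```python
import re
import timeit

# Hypothetical stand-in for the dictionary file, so the sketch runs anywhere.
words = ["apple", "banana", "cherry", "fig", "kiwi", "plum"] * 10000

test_string = "e"
# re.escape keeps the pattern literal even if test_string has metacharacters.
test_regex = re.compile(re.escape(test_string))


def substring_scan():
    """Count words containing test_string using the `in` operator."""
    return sum(1 for word in words if test_string in word)


def regex_scan():
    """Count words matched by the precompiled regex."""
    return sum(1 for word in words if test_regex.search(word))


# repeat() returns one total time per run; the minimum is the least noisy.
substring_time = min(timeit.repeat(substring_scan, number=5, repeat=3))
regex_time = min(timeit.repeat(regex_scan, number=5, repeat=3))

print("Substring time was %s" % substring_time)
print("Regex time was     %s" % regex_time)
```

The two helpers return match counts so it is easy to confirm both approaches agree before trusting the timings. The absolute numbers will differ from the ones above, since the word list and the measurement method are different.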

MeidokonWiki: furinkan/python/substring_searching (last edited 2013-06-18 08:50:09 by furinkan)