Recipe 1.24. Making Some Strings Case-Insensitive
Credit: Dale Strickland-Clark, Peter Cogolo, Mark McMahon
Problem
You want to treat some strings so that all comparisons and lookups
are case-insensitive, while all other uses of the strings preserve
the original case.
Solution
The best solution is to wrap the specific strings in question in a
suitable subclass of str:
class iStr(str):
    """
    Case insensitive string class.
    Behaves just like str, except that all comparisons and lookups
    are case insensitive.
    """
    def __init__(self, *args):
        self._lowered = str.lower(self)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, str.__repr__(self))
    def __hash__(self):
        return hash(self._lowered)
    def lower(self):
        return self._lowered
def _make_case_insensitive(name):
    ''' wrap one method of str into an iStr one, case-insensitive '''
    str_meth = getattr(str, name)
    def x(self, other, *args):
        ''' try lowercasing 'other', which is typically a string, but
            be prepared to use it as-is if lowering gives problems,
            since strings CAN be correctly compared with non-strings.
        '''
        try: other = other.lower()
        except (TypeError, AttributeError, ValueError): pass
        return str_meth(self._lowered, other, *args)
    # in Python 2.4, only, add the statement: x.func_name = name
    setattr(iStr, name, x)
# apply the _make_case_insensitive function to specified methods
for name in 'eq lt le gt ge ne cmp contains'.split():
    _make_case_insensitive('__%s__' % name)
for name in 'count endswith find index rfind rindex startswith'.split():
    _make_case_insensitive(name)
# note that we don't modify methods 'replace', 'split', 'strip', ...
# of course, you can add modifications to them, too, if you prefer that.
del _make_case_insensitive    # remove helper function, not needed any more
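To see the effect in practice, here is a minimal sketch of the class in use. It is an adaptation, not the recipe's exact code: it assumes Python 3, so the list of special methods omits cmp (Python 3 strings have no __cmp__), and it sets x.__name__ where the recipe's comment mentions x.func_name. The sample strings are purely illustrative.

```python
class iStr(str):
    """Case-insensitive string: comparisons and lookups ignore case."""
    def __init__(self, *args):
        self._lowered = str.lower(self)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, str.__repr__(self))
    def __hash__(self):
        return hash(self._lowered)
    def lower(self):
        return self._lowered

def _make_case_insensitive(name):
    str_meth = getattr(str, name)
    def x(self, other, *args):
        try:
            other = other.lower()
        except (TypeError, AttributeError, ValueError):
            pass
        return str_meth(self._lowered, other, *args)
    x.__name__ = name
    setattr(iStr, name, x)

# assumption: Python 3, which has no str.__cmp__, so 'cmp' is left out
for name in 'eq lt le gt ge ne contains'.split():
    _make_case_insensitive('__%s__' % name)
for name in 'count endswith find index rfind rindex startswith'.split():
    _make_case_insensitive(name)
del _make_case_insensitive

s = iStr('Hello World')
print(s)                          # Hello World (original case preserved)
print(s == 'hello world')         # True
print('WORLD' in s)               # True
print(s.find('WORLD'))            # 6

d = {iStr('Content-Type'): 'text/plain'}
print(d[iStr('content-type')])    # text/plain
```

Because __hash__ is computed from the lowercased text, two iStr instances that differ only in case land on the same dictionary key, which is what makes the lookup in the last line succeed.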
Discussion
Some implementation choices in class iStr are worthy of notice. First,
we choose to generate the lowercase version once and for all, in
method __init__, since we envision that in typical uses of iStr
instances, this version will be required repeatedly. We hold that
version in an attribute that is private, but not overly so (i.e., has
a name that begins with one underscore, not two), because if iStr gets
subclassed (e.g., to make a more extensive version that also offers
case-insensitive splitting, replacing, etc., as the comment in the
"Solution" suggests), iStr's subclasses are quite likely to want to
access this crucial "implementation detail" of superclass iStr!
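As an illustration of why a subclass might want _lowered, the sketch below adds a case-insensitive split. The subclass name iStrSplit and its method isplit are hypothetical, invented for this example, and the iStr shown is a pared-down stand-in that only caches the lowercase copy. The code also assumes that lowercasing does not change the string's length, which holds for ASCII text but not for every Unicode character.

```python
class iStr(str):
    """Pared-down stand-in for the recipe's class: it only caches _lowered."""
    def __init__(self, *args):
        self._lowered = str.lower(self)

class iStrSplit(iStr):
    """Hypothetical subclass adding a case-insensitive split."""
    def isplit(self, sep):
        sep = sep.lower()
        if not sep:
            raise ValueError('empty separator')
        pieces, start = [], 0
        while True:
            # search the cached lowercase copy, but slice the original text
            # (assumes both have the same length, as they do for ASCII)
            hit = self._lowered.find(sep, start)
            if hit < 0:
                break
            pieces.append(self[start:hit])
            start = hit + len(sep)
        pieces.append(self[start:])
        return pieces

parts = iStrSplit('oneANDtwoAndthree').isplit('and')
print(parts)    # ['one', 'two', 'three']
```

The subclass never recomputes the lowercase text; it reuses the copy that the superclass stored at construction time.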
We do not offer "case-insensitive" versions of such methods as
replace, because it's anything but clear what kind of input-output
relation we might want to establish in the general case.
Application-specific subclasses may therefore be the way to provide
this functionality in ways appropriate to a given application. For
example, since the replace method is not wrapped, calling replace on
an instance of iStr returns an instance of str, not of iStr. If that
is a problem in your application, you may want to wrap all iStr
methods that return strings, simply to ensure that the results are
made into instances of iStr. For that purpose, you need another,
separate helper function, similar but not identical to the
_make_case_insensitive one shown in the "Solution":
def _make_return_iStr(name):
    str_meth = getattr(str, name)
    def x(*args):
        return iStr(str_meth(*args))
    setattr(iStr, name, x)
and you need to call this helper function _make_return_iStr on all
the names of relevant string methods that return strings, such as:
for name in 'center ljust rjust strip lstrip rstrip'.split():
    _make_return_iStr(name)
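The difference the wrapping makes can be checked directly. The following sketch, which assumes Python 3 and uses a pared-down iStr that only caches _lowered, records the type returned by strip before and after the helper is applied. The sample string is illustrative only.

```python
class iStr(str):
    """Pared-down stand-in for the recipe's class: it only caches _lowered."""
    def __init__(self, *args):
        self._lowered = str.lower(self)

def _make_return_iStr(name):
    str_meth = getattr(str, name)
    def x(*args):
        return iStr(str_meth(*args))
    setattr(iStr, name, x)

s = iStr('  Spam  ')
before = type(s.strip())     # plain str: strip is not wrapped yet

for name in 'center ljust rjust strip lstrip rstrip'.split():
    _make_return_iStr(name)

after = s.strip()            # now goes through the wrapper
print(before.__name__, type(after).__name__)    # str iStr
print(after._lowered)                           # spam
```

Since the wrapper builds a fresh iStr, the result gets its own _lowered attribute and keeps its case-insensitive behavior through further method calls.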
Strings have about 20 methods (including special methods such as
__add__ and __mul__) that you should consider wrapping in this way.
You can also wrap in this way some additional methods, such as split
and join, which may require special handling, and others, such as
encode and decode, that you cannot deal with unless you also define a
case-insensitive unicode subtype. In practice, one can hope that not
every single one of these methods will prove problematic in a typical
application. However, as you can see, the very functional richness of
Python strings makes it a bit of work to customize string subtypes
fully, in a general way without depending on the needs of a specific
application.
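A rough sketch of what wrapping special methods and split might involve follows. It is one possible approach under stated assumptions, not the recipe's own code: it assumes Python 3, uses a pared-down iStr, and varies the helper slightly so that a NotImplemented result from a special method is passed through rather than converted to a string. The _split function is a hypothetical example of the "special handling" a list-returning method needs.

```python
class iStr(str):
    """Pared-down stand-in for the recipe's class: it only caches _lowered."""
    def __init__(self, *args):
        self._lowered = str.lower(self)

def _make_return_iStr(name):
    str_meth = getattr(str, name)
    def x(*args):
        result = str_meth(*args)
        # variation on the recipe's helper: special methods may return
        # NotImplemented, which must be passed through unchanged
        if result is NotImplemented:
            return result
        return iStr(result)
    setattr(iStr, name, x)

for name in '__add__ __mul__ __rmul__'.split():
    _make_return_iStr(name)

def _split(self, *args):
    # split returns a list, so each piece is converted individually
    return [iStr(piece) for piece in str.split(self, *args)]
iStr.split = _split

joined = iStr('Spam') + 'Eggs'
tripled = iStr('ab') * 3
words = iStr('Spam Eggs').split()
print(type(joined).__name__, type(tripled).__name__)    # iStr iStr
print([type(w).__name__ for w in words])                # ['iStr', 'iStr']
```

Note that a plain str on the left of + still yields a plain str, because str defines no __radd__ to wrap; covering that case would need a hand-written method.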
The implementation of iStr is careful to avoid the boilerplate code
(meaning repetitious and therefore bug-prone code) that we'd need if
we just overrode each needed method of str in the normal way, with def
statements in the class body. A custom metaclass or other such
advanced technique would offer no special advantage in this case, so
the boilerplate avoidance is simply obtained with one helper function
that generates and installs wrapper closures, and two loops using that
function, one for normal methods and one for special ones. The loops
need to be placed after the class statement, as we do in this recipe's
Solution, because they need