Write a
ci_string class that is identical to the standard
std::string class but that is case-insensitive in the same
way as the commonly provided extension stricmp().
The "how can I make a case-insensitive string?"
question is so common that it probably deserves its own FAQ—hence
this Item.
Here's what we want to achieve:
ci_string s( "AbCdE" );
// case insensitive
//
assert( s == "abcde" );
assert( s == "ABCDE" );
// still case-preserving, of course
//
assert( strcmp( s.c_str(), "AbCdE" ) == 0 );
assert( strcmp( s.c_str(), "abcde" ) != 0 );
The key here is to understand what a
string actually is in Standard C++. If you look in your
trusty string header, you'll see something like this:
typedef basic_string<char> string;
So string isn't a class written on its own; it's a
typedef for a specialization of a class template. In turn, the
basic_string<> template is declared as follows,
possibly with additional implementation-specific template
parameters:
template<class charT,
         class traits = char_traits<charT>,
         class Allocator = allocator<charT> >
    class basic_string;
So "string" really means
"basic_string<char, char_traits<char>,
allocator<char> >," possibly with additional
defaulted template parameters specific to the implementation you're
using. We don't need to worry about the allocator part,
but the key here is the char_traits part, because
char_traits defines how characters interact—and
compare!
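The equivalence can be checked at compile time. The following is a minimal sketch, assuming a compiler with C++11 support for static_assert and the type_traits header; it is not part of the original Item.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// std::string is only a typedef, so both spellings name the very same type.
static_assert(
    std::is_same< std::string,
                  std::basic_string< char,
                                     std::char_traits<char>,
                                     std::allocator<char> > >::value,
    "std::string is basic_string<char> with the default traits and allocator" );
```

If the two names denoted different types, this translation unit would fail to compile.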
So let's compare strings.
basic_string supplies useful comparison functions that let
you test whether one string is equal to another, less
than another, and so on. These string comparison functions
are built on top of character comparison functions supplied in the
char_traits template. In particular, the
char_traits template supplies character comparison
functions named eq() and lt() for equality and
less-than comparisons, and compare() and find()
functions to compare and search sequences of characters.
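To see what the default rules do, the four functions can be called directly. This is a small sketch, assuming an ASCII-compatible execution character set; the names default_traits and show_default_traits are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef std::char_traits<char> default_traits;

// Hypothetical demo: the default traits treat case as significant.
void show_default_traits()
{
    assert(  default_traits::eq( 'a', 'a' ) );
    assert( !default_traits::eq( 'a', 'A' ) );   // differs only in case
    assert(  default_traits::lt( 'A', 'a' ) );   // true in ASCII: 65 < 97

    // compare() looks at exactly n characters.
    assert( default_traits::compare( "abc", "abd", 3 ) <  0 );
    assert( default_traits::compare( "abc", "abd", 2 ) == 0 );

    // find() returns a pointer to the first match, or null if there is none.
    const char* s = "hello";
    assert( default_traits::find( s, 5, 'l' ) == s + 2 );
    assert( default_traits::find( s, 5, 'L' ) == 0 );
}
```

Each of these is a static member function, which is why a replacement traits class can hide them one by one.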
If we want these to behave differently, all we
have to do is supply a different traits class in place of char_traits<char>.
Here's the easiest way:
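struct ci_char_traits : public char_traits<char>
// just inherit all the other functions
// that we don't need to replace
{
    // toupper() requires a value representable as
    // unsigned char, so convert before calling it
    static int upper( char c )
    { return toupper( static_cast<unsigned char>(c) ); }
    static bool eq( char c1, char c2 )
    { return upper(c1) == upper(c2); }
    static bool lt( char c1, char c2 )
    { return upper(c1) < upper(c2); }
    static int compare( const char* s1,
                        const char* s2,
                        size_t n )
    { return memicmp( s1, s2, n ); }
    // memicmp() is a nonstandard extension: use it
    // if available on your platform,
    // otherwise you can roll your own
    static const char*
    find( const char* s, size_t n, char a )
    {
        // n is unsigned, as in char_traits<char>::find(),
        // so count down without going below zero
        for( ; n > 0; --n, ++s )
        {
            if( upper(*s) == upper(a) ) return s;
        }
        return 0;
    }
};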
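Where memicmp() is not available, rolling your own takes only a few lines. The function below is one possible sketch, not a standard facility; the name ci_compare is hypothetical, and it mirrors memcmp() by returning a negative, zero, or positive value.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical stand-in for the nonstandard memicmp():
// compares the first n characters of s1 and s2, ignoring case.
int ci_compare( const char* s1, const char* s2, std::size_t n )
{
    for( ; n > 0; --n, ++s1, ++s2 )
    {
        // Convert through unsigned char first: passing a negative
        // char value to toupper() is undefined behavior.
        const int c1 = std::toupper( static_cast<unsigned char>( *s1 ) );
        const int c2 = std::toupper( static_cast<unsigned char>( *s2 ) );
        if( c1 != c2 )
        {
            return c1 < c2 ? -1 : 1;
        }
    }
    return 0;   // all n characters matched
}
```

A traits class could then forward its compare() to this function instead of calling memicmp().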
And finally, the key that brings it all
together:
typedef basic_string<char, ci_char_traits> ci_string;
All we've done is create a typedef
named ci_string that operates exactly like the standard
string (after all, in most respects it is the standard string), except that
it uses ci_char_traits instead of
char_traits<char> to get its character comparison
rules. Since we've handily made the ci_char_traits rules
case-insensitive, we've made ci_string itself
case-insensitive, without any further surgery—that is, we have a
case-insensitive string without having touched
basic_string at all. Now that's extensibility.
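Putting the pieces together, here is a self-contained sketch that uses only standard library calls. It assumes a hand-written loop in place of memicmp() and adds a small helper named upper that is not in the listing above; check_ci_string is a hypothetical demo function.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

struct ci_char_traits : public std::char_traits<char>
{
    // Helper (an addition for this sketch): uppercase a char safely.
    static int upper( char c )
    { return std::toupper( static_cast<unsigned char>( c ) ); }

    static bool eq( char c1, char c2 ) { return upper( c1 ) == upper( c2 ); }
    static bool lt( char c1, char c2 ) { return upper( c1 ) <  upper( c2 ); }

    // Portable replacement for memicmp().
    static int compare( const char* s1, const char* s2, std::size_t n )
    {
        for( ; n > 0; --n, ++s1, ++s2 )
        {
            if( upper( *s1 ) != upper( *s2 ) )
                return upper( *s1 ) < upper( *s2 ) ? -1 : 1;
        }
        return 0;
    }

    static const char* find( const char* s, std::size_t n, char a )
    {
        for( ; n > 0; --n, ++s )
        {
            if( upper( *s ) == upper( a ) ) return s;
        }
        return 0;
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;

// Hypothetical demo: comparisons ignore case, the stored text does not change.
void check_ci_string()
{
    ci_string s( "AbCdE" );
    assert( s == "abcde" );
    assert( s == "ABCDE" );
    assert( std::strcmp( s.c_str(), "AbCdE" ) == 0 );
    assert( ci_string( "apple" ) < ci_string( "BANANA" ) );
}
```

Ordering is case-insensitive too, so "apple" sorts before "BANANA" here, although a plain string would order them the other way around in ASCII.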