This week I had to register a new data SIM card for my iPad. This SIM card is from T-Mobile and affords 200 MB of data per month for free, which is a very good deal indeed. As part of the online setup process, you have to provide a contact telephone number. I duly entered my 10-digit phone number in the following format:

555 123 4567

This is surely one of the most common ways of specifying a (US) phone number, with spaces separating the area code (555) and exchange number (123). Yes, this is a fictional number. I think it is also not unreasonable to expect some people to format the same telephone number in any of the following ways:

555-123-4567

(555) 123 4567

+1 555 123 4567

It is clear, though, that T-Mobile cannot deal with such complexities because, as soon as I tried submitting the web form, this happened:

Yes, I was greeted with "Enter Digits Only For Phone Number". It is 2013, and regular expressions (a computational way of recognizing certain patterns of text) have been around since the 1960s. Perhaps T-Mobile doesn't know about them.

Extracting just the digits from a series of characters that might represent a phone number is an exercise frequently used when teaching regular expressions. A Google search for removing spaces from phone numbers using regular expressions returns over 11 million results. Perhaps T-Mobile has not seen any of these pages.

In the Unix & Perl course which I co-teach, we might suggest that students could strip non-numeric characters from a phone number using something like the following:

my $number = "555 123 4567";
$number =~ s/[^\d]//g;

The second line uses a negated character class to match anything that isn't a digit (\d) and globally (g) replaces each match with nothing (the empty replacement between the last two slashes).
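
To show how little code this takes, here is a minimal sketch that applies the same substitution to all four formats listed earlier. The subroutine name normalize_us_phone is hypothetical, and the extra steps (dropping a leading country code of 1 and checking for exactly 10 digits) are my own assumptions about what a web form might want, not anything T-Mobile is known to do.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a US phone number to its 10 digits.
# Returns undef if the input does not contain exactly 10 digits
# (after removing an optional leading country code of 1).
sub normalize_us_phone {
    my ($input) = @_;

    # Copy the input, then strip every non-digit; \D is equivalent to [^\d]
    (my $digits = $input) =~ s/\D//g;

    # Assumption: an 11-digit number starting with 1 carries the US country code
    $digits =~ s/^1(?=\d{10}$)//;

    return length($digits) == 10 ? $digits : undef;
}

# Try each of the formats mentioned in the post
for my $raw ("555 123 4567", "555-123-4567", "(555) 123 4567", "+1 555 123 4567") {
    my $clean = normalize_us_phone($raw);
    print "$raw => ", (defined $clean ? $clean : "invalid"), "\n";
}
```

Each of the four inputs should reduce to the same string, 5551234567, which is presumably all the form wanted in the first place.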