A friend of mine gave me a call a while back to help come up with an efficient way of making customer IDs.

The problem is they have an odd system for this.

It goes something like this:

- The customer ID is the customer’s last name truncated to 6 characters
- If that’s taken replace the last position with a 0
- This continues until you replace the last two characters and so on

So you’d start off with SMITH, then SMITH0 … SMITH9, progress to SMIT00 … SMIT99 and so on.

The simple solution is to simply iterate through all the names looking for the first free one.

This would work well for not-so-common names. Smith, on the other hand, would be just about pessimal because if you already have 500 Smiths in your system, you will now incur at least 501 queries to find the first free one.

There has to be a better way.

Of course, there *is* a better way. :-)

Basically this winds up being a two-part solution. This assumes that a name can’t have a number in it so you either have to drop numbers or fire some validation up front.

First, find out how many number spots are taken up:

int GetNumDigits(string name) {
for (int numDigits = 0; numDigits < 6; numDigits++) {
string testKey = FormatKey(name, 0, numDigits);
if (!customer.HasKey(testKey)) {
return numDigits-1;
}
}
throw new Exception("Keyspace Exhausted");
}

Now, all we have to do is find the max number in there:

int FindMaxKey(string name, int digits, int digitsLeft,
int maxRadix) {
int baseNum = maxRadix * 10;
for (int i = 0; i < 9; i++) {
string testKey = FormatKey(name,
(Math.Pow(10, digitsLeft-1) * baseNum) + i,
digits);
if (!customer.exists(testKey) {
if (digitsLeft > 1) {
return FindMaxKey(name, digits, digitsLeft - 1,
baseNum + (i-1));
}
else {
return baseNum + (i-1);
}
}
}
if (digitsLeft > 1) {
return FindMaxKey(name, digits, digitsLeft - 1, baseNum + 9);
}
else {
return baseNum + 9;
}
}

Let’s assume that SMI091 is the current max ID for “Smith” for this example. We call GetNumDigits(“Smith”) and that will return 3. Next we call FindMaxKey(“Smith, 3, 3, 0);

This is a recursive function. At first it’ll look for SMI000 and find it. SMI100 will not be found. Now we recurse: FindMaxKey(“Smith”, 3, 2, 0). After that FindMaxKey(“Smith”, 3, 1, 9). Finally it returns 91.

All that’s left at this time is add one and you have it done!

The worst case is 999999. This will incur only 6 + 9*6 queries.

### Like this:

Like Loading...

*Related*

It would only 6 + 9*6 queries if you queried each one one at a time. But you could also keep a table of the highest post-value (999999, 50, whatever) and simply know with one query, or alternately pick a random number and kind of do a sorta-kinda binary search. First search for the endpoint (maybe by adding another 9):

JSmith9

JSmith99

JSmith999

[etc]

Once you find one that doesn’t exist, like JSmith9999, and we know JSmith999 exists, then just check the halfway point for existence, etc etc, much like a binary search of a sorted list.

Hmm… Another interesting idea… Scrapple came up with a cool SQL idea that I’m going to write up and see if I can get a handle on how SQL Server does the query optimization on it.

The first thing I would think about with the solution you proposed is that you can have things like “Smith” and “Smiley” start to intersect because they both start with “Smi” — “SMI000” is a valid one for both of them. (The field is limited to 6 characters for arbitrary reasons)

I think a likely run time on the one I posted would be closer to 1-10 queries for the /normal/ cases as opposed to the pessimal case where /all/ the IDs are used up. 60 queries is a lot better than 1,000,000 queries. :-)