Monday, March 22, 2010

The mysterious Data Center Technical Specialist certification

On March 4, I received an email from Novell Technical Training that I had received a Novell certification for "Data Center Technical Specialist". This came as a surprise to me because I had not applied for this certification, taken any tests for this certification, not had I even heardof this certification.

Due to a cross marketing agreement with the Linux Professional Institute, I had applied for and received the Novell Certified Linux Administrator certification a few weeks prior. This seemed legitimate to me, as I had extensive experience with SUSE Linux and my Linux skills are still sharp. I continue to perform Linux server administration as part of my daily work.

However, I am not quite sure what the Data Center Technical Specialist is supposed to represent. Confused, I wrote to Novell Training asking what the certification meant. I received this equally mysterious reply:
Thank you for contacting Novell Training Services. You have received the certification as part of some changes we have made recently to our partner requirements. As part if these changes, some of the exams/certifications you have now count toward the new certification.

I searched the official Novell Certification web site, and this certification does not appear anywhere. I suspect, but can't confirm that it may be part of the Solution Provider program. As such, it is probably more of a value to Novell sales than to an individual technician. I remain somewhat baffled.

Wednesday, March 3, 2010

Fuzzy string matching in PostgreSQL

A recent project required me to use fuzzy string matching, or sound alike matching, in an application that searched a list of names. It turns out there is a contrib module for the PostgreSQL database called fuzzystrmatch that provides several different matching algorithms.

The task at hand involved rewriting a legacy application, originally in PICK, in Ruby on Rails. The PICK application used a soundex search to find names of people that sounded like the search string.

Three algorithms are available as PostgreSQL functions (after installation of the fuzzystrmatch module). They are soundex(), levenshtein(), and metaphone().

Both soundex and metaphone convert a string into character codes. Soundex uses 4 characters and metaphone uses a configurable number of characters. Levenshtein directly compares two strings and returns an integer indicating how well the two strings match.

After some trial and error, I found that metaphone produced better results than soundex. I didn't test the Levenshtein function.

To improve the results, I added a classic substring search using ILIKE. The combination of ILIKE and metaphone gave me a broad, but reasonably accurate fuzzy string search.