0 votes

Efficy calculates the value for the PHONETIC field using the NAME field as input parameter. I can see some correlation with the SQL Server function Soundex, but it's clearly not the same algorithm.

We have an existing Oracle database with millions of contacts that were initially inserted using SQL scripts. For all contacts that didn't receive an update yet, the PHONETIC column is still empty.

Is there a way to update this column without passing through the API, like with a SQL function?

NAME          FIRSTNAME PHONETIC     Soundex(NAME)
------------- --------- ------------ -------------
Pauwels       Ludovic   p042         P420
Van der Aalst Tine      v05930690423 V500
Simpson       Homer     s051205      S512
Peeters       Frans     p03062       P362
Van der Elst  Mieke     v05930690423 V500
Simpson       Bart      s051205      S512

Used SQL query:

select K_CONTACT, NAME, FIRSTNAME, PHONETIC, 
soundex(NAME) as 'Soundex(NAME)' from CONTACTS
asked in Database by (7.4k points)

1 Answer

+1 vote
Best answer

I'm not a specialist of this part of the code, but I can tell you there is a specific Efficy code, using the algorithm described in http://en.wikipedia.org/wiki/Soundex, and adding the following:

  • improvement for vowels: coded in 0
  • digits coded in 7
  • other character encoded in 9, otherwise a Unicode character gives #0 and truncates the result
  • some consonants are replaced by others
  • "ph" replaced by "f", "y" replaced by "i"
  • "стч" and "тч" replaced by "щ"
  • limited to 12 chars rather than 4
  • Cyrillic and Arabic characters added
  • Japanese and Chinese can not have a Soundex, they do not compose words with letters
answered by (968 points)
selected by
Well, thanks for the detailed answer Anthony. It's quite clear now that trying to reproduce the exact same code using SQL functions will be a complex and time consuming job. We will check our other options
Anthony,

Can we access the Efficy Phonetic calculation function from the API? I'm considering the idea of processing a CSV file with columns K_CONTACT, NAME and producing an output file K_CONTACT, PHONETIC.

Because there is no openeditcontext required, it should be quite fast.

When the CSV is created, we can use SQL to update the million of records
Unfortunately the "StrSoundex" function is currently not published for scripting.
We can easily add this in a future build.
I guess that the customer doesn't use Efficy 11 yet, we should discuss how we can provide you with this function.
The simplest would be to provide it in a DataSynchroRemote and/or DataSynchroEmbedded of Efficy 11, it's an executable on it's own that doesn't require us to upgrade something.

With the embedded, we can do the update direclty.

Would there be a cost involved? Should we commercially discuss it with the client?
The commercial discussion would be more expensive that actually doing the job :).
We'll send you a DataSynchroRemote of Efficy 11 with the additional function "ComputeSoundex" before the end of the week.
Well, that's the spirit. Thanks Robert!
1,249 questions
1,519 answers
1,859 comments
328 users