FuzzyString-UDF - fuzzy string comparison and search in string arrays


This UDF provides algorithms for fuzzy string comparison and the associated similarity search in string arrays.
It offers functions for character-based comparison, for phonetic comparison of words, and for the geometric distance between characters on a keyboard.

In this way, typing errors can be recognized, similar-sounding words can be detected and other spellings of words can be included in further processing.

The current function list of the UDF:

--------- fuzzy array handling:
_FS_ArraySearchFuzzy           - finds similar entries for a search value in an array
_FS_ArrayToPhoneticGroups      - groups the values of an array according to their phonetics

--------- character-based metrics:
_FS_Levenshtein                - calculate the Levenshtein distance between two strings
_FS_OSA                        - calculate the OSA ("optimal string alignment") distance between two strings
_FS_Hamming                    - calculate the Hamming distance between two strings
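
For readers unfamiliar with these three metrics, here is a minimal Python sketch of what each one measures. It is not the UDF's AutoIt code: the function names are hypothetical, and raising an error for unequal lengths in the Hamming case is an assumption, since the UDF may handle that differently.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption: unequal lengths are rejected; the UDF may behave differently.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa(a: str, b: str) -> int:
    """Levenshtein extended by swapping two adjacent characters."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Transposition of two neighbouring characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

The difference between the last two shows up on a swapped pair of letters: `levenshtein("ab", "ba")` is 2, while `osa("ab", "ba")` is 1, which is why OSA tends to suit typing errors better.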

--------- phonetic metrics:
_FS_Soundex_getCode            - calculate the Soundex code for a given word to represent its pronunciation in English
_FS_Soundex_distance           - calculate the Soundex pattern for both input values
_FS_SoundexGerman_getCode      - calculate the modified Soundex code for the German language for a given word to represent its pronunciation in German
_FS_SoundexGerman_distance     - calculate the SoundexGerman pattern for both input values
_FS_Cologne_getCode            - calculate the Cologne phonetics code for the German language for a given word to represent its pronunciation in German
_FS_Cologne_distance           - calculate the Cologne phonetics distance between both input values
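
To illustrate the idea behind the phonetic codes, the following Python sketch implements the commonly described American Soundex rules. It is an illustration only, not the UDF's code: the name `soundex` is hypothetical, and the German Soundex variant and Cologne phonetics use different rule tables that are not shown here.

```python
# Consonant groups that share a digit in classic American Soundex.
_CODES = {}
for _digit, _letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(word: str) -> str:
    """Return the 4-character Soundex code, or "" if word has no ASCII letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last = _CODES.get(letters[0], "")  # the first letter also blocks repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        digit = _CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        last = digit  # a vowel resets this to "", so repeats after it count
    return (code + "000")[:4]  # pad with zeros, cut to letter + three digits
```

Words that sound alike end up with the same code, for example `soundex("Robert")` and `soundex("Rupert")` both give `"R163"`, so a comparison of codes can stand in for a comparison of pronunciation.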

--------- key-position based metrics:
_FS_Keyboard_GetLayout         - return a map of character coordinates for use in _FS_Keyboard_Distance_Chars()
_FS_Keyboard_Distance_Chars    - calculate the geometric key spacing between two characters on a keyboard
_FS_Keyboard_Distance_Strings  - calculate the keyboard distance between two strings
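
The key-position idea can be sketched in Python as follows. Everything here is an assumption for illustration: the QWERTY rows, the stagger offsets, and the names `keyboard_layout` and `key_distance` are hypothetical and do not reflect the layout data or signatures the UDF actually uses.

```python
import math

# Hypothetical QWERTY letter rows; each lower row is shifted right by an
# assumed stagger offset measured in key widths.
_ROWS = (("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75))


def keyboard_layout() -> dict:
    """Map each letter to an (x, y) coordinate measured in key widths."""
    layout = {}
    for y, (keys, offset) in enumerate(_ROWS):
        for x, ch in enumerate(keys):
            layout[ch] = (x + offset, float(y))
    return layout


def key_distance(a: str, b: str, layout: dict) -> float:
    """Euclidean distance between the keys for characters a and b."""
    x1, y1 = layout[a.lower()]
    x2, y2 = layout[b.lower()]
    return math.hypot(x2 - x1, y2 - y1)
```

With such a map, neighbouring keys like `q` and `w` are 1.0 apart while `q` and `p` are 9.0 apart, so a substitution between neighbours can be treated as a more plausible typo than one across the keyboard.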


>>source code and download on GitHub<<


Nice collection of fuzzy algorithms, thanks for sharing. One thing I noticed that could be tidied up: the bigint 9223372036854700000 is used several times, so it might be declared once as a global constant.



Added to the wiki :)
