Posted

This UDF provides algorithms for fuzzy string comparison and for similarity search in string arrays based on them.
It offers functions for character-based comparison, for comparing the phonetics of words, and for measuring the geometric distance between characters on a keyboard.

This makes it possible to recognize typing errors, detect similar-sounding words, and include alternative spellings of a word in further processing.
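
To illustrate the idea of a similarity search over an array, here is a minimal Python sketch. It is not the UDF's AutoIt code: the names `levenshtein` and `search_fuzzy`, the `max_distance` parameter and the case folding are assumptions made only for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and substitutions
    needed to turn a into b (classic two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def search_fuzzy(values, needle, max_distance=2):
    """Return (distance, value) pairs within max_distance, closest first."""
    hits = [(levenshtein(needle.lower(), v.lower()), v) for v in values]
    return sorted(h for h in hits if h[0] <= max_distance)


names = ["Meier", "Mayer", "Schmidt", "Maier", "Müller"]
result = search_fuzzy(names, "Meyer", max_distance=1)
print(result)
```

With a threshold of 1, only the spellings that differ from "Meyer" by a single character are returned; raising the threshold widens the net at the cost of more false positives.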

The current function list of the UDF:

--------- fuzzy array handling:
_FS_ArraySearchFuzzy           - finds similar entries for a search value in an array
_FS_ArrayToPhoneticGroups      - groups the values of an array according to their phonetics

--------- character-based metrics:
_FS_Levenshtein                - calculate the Levenshtein distance between two strings
_FS_OSA                        - calculate the OSA ("optimal string alignment") distance between two strings
_FS_Hamming                    - calculate the Hamming distance between two strings
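
As a rough illustration of how these metrics differ, the following Python sketch implements the textbook definitions of the Hamming and OSA distances. It is not the UDF's code; the function names `hamming` and `osa` are chosen only for this example.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


def osa(a: str, b: str) -> int:
    """Optimal string alignment: Levenshtein plus the swap of two
    adjacent characters, where no substring is edited more than once."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


print(hamming("form", "from"), osa("form", "from"))
```

The swapped letters in "form" / "from" count as two mismatches for Hamming but as a single transposition for OSA, which is why OSA tends to suit typical typing errors better.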

--------- phonetic metrics:
_FS_Soundex_getCode            - calculate the Soundex code for a given word to represent the pronunciation in English
_FS_Soundex_distance           - calculate the Soundex pattern for both input values
_FS_SoundexGerman_getCode      - calculate the modified Soundex code for the German language for a given word to represent the pronunciation in German
_FS_SoundexGerman_distance     - calculate the SoundexGerman pattern for both input values
_FS_Cologne_getCode            - calculate the Cologne phonetics code for the German language for a given word to represent the pronunciation in German
_FS_Cologne_distance           - calculate the Cologne phonetics distance between both input values
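
For readers unfamiliar with phonetic codes, here is a small Python sketch of the commonly documented American Soundex rules. It is an illustration only, not the UDF's implementation, and it does not cover the German variant or Cologne phonetics; the name `soundex` and the handling of non-ASCII letters (they are simply dropped) are assumptions for this example.

```python
# Consonant groups that share a Soundex digit (1 to 6).
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Return the 4-character Soundex code: first letter plus three digits."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    last = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        if digit and digit != last:
            code += digit
        if c not in "HW":  # H and W do not separate equal codes, vowels do
            last = digit
    return (code + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

Words that sound alike in English, such as "Robert" and "Rupert", end up with the same code, so comparing codes gives a cheap "sounds similar" test.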

--------- key-position based metrics:
_FS_Keyboard_GetLayout         - return a map with the coordinates of the characters for use in _FS_Keyboard_Distance_Chars()
_FS_Keyboard_Distance_Chars    - calculate the geometric key spacing between two characters on a keyboard
_FS_Keyboard_Distance_Strings  - calculate the keyboard distance between two strings
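
The key-position idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UDF's code: it assumes a US QWERTY layout, a made-up horizontal stagger per row (`OFFSETS`), plain Euclidean distance in key widths, and the hypothetical names `keyboard_layout` and `char_distance`.

```python
import math

ROWS = ("1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm")
OFFSETS = (0.0, 0.5, 0.75, 1.25)  # assumed row stagger, in key widths


def keyboard_layout() -> dict:
    """Map each character to an (x, y) coordinate on the keyboard."""
    layout = {}
    for y, (row, offset) in enumerate(zip(ROWS, OFFSETS)):
        for x, char in enumerate(row):
            layout[char] = (x + offset, float(y))
    return layout


def char_distance(a: str, b: str, layout: dict) -> float:
    """Euclidean distance between the keys of two characters."""
    (x1, y1), (x2, y2) = layout[a.lower()], layout[b.lower()]
    return math.hypot(x2 - x1, y2 - y1)


LAYOUT = keyboard_layout()
print(char_distance("a", "s", LAYOUT), char_distance("a", "p", LAYOUT))
```

Neighbouring keys such as "a" and "s" are one key width apart, while "a" and "p" are far apart, so a substitution of "s" for "a" is a much more plausible slip of the finger.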


>>source code and download on GitHub<<

Posted

Nice collection of fuzzy algorithms, thanks for sharing. One thing I noticed that could be tidied up: the bigint 9223372036854700000 is used several times, so it could be declared once as a global constant.

Posted

Added to the wiki :)

