fmendi (Author), posted January 26, 2021

> 5 hours ago, Nine said:
> That seems to work fine if you correctly list all acronyms:
>
>     #include <Constants.au3>
>     #include <Array.au3>
>
>     Local $aAddr = [ _
>         "Frank Conte 2133 Elm St. Gainesville, FL, 45679.", _
>         "Frank F. Conte 123 First Rd. New York NY, 12345.", _
>         "Frank Conte 2133 Elm blvd Gainesville fl, 45679.", _
>         "Frank Conte 2133 Elm av Gaines Town, FL, 45679"]
>
>     Local $aComp
>     For $sAddr In $aAddr
>         $aComp = StringRegExp($sAddr, "(?si)([^\d]+)(.*(?:st|av|blvd|rd))\.?\h*(.*?),?\h*([A-Z]{2}),\h*(\d+)", 1)
>         _ArrayDisplay($aComp)
>     Next

Very impressive, as usual! Thanks again. This last iteration covers most of the usual variations.

fmendi (Author), posted January 26, 2021

> 19 hours ago, Nine said:
> That seems to work fine if you correctly list all acronyms:
>
>     #include <Constants.au3>
>     #include <Array.au3>
>
>     Local $aAddr = [ _
>         "Frank Conte 2133 Elm St. Gainesville, FL, 45679.", _
>         "Frank F. Conte 123 First Rd. New York NY, 12345.", _
>         "Frank Conte 2133 Elm blvd Gainesville fl, 45679.", _
>         "Frank Conte 2133 Elm av Gaines Town, FL, 45679"]
>
>     Local $aComp
>     For $sAddr In $aAddr
>         $aComp = StringRegExp($sAddr, "(?si)([^\d]+)(.*(?:st|av|blvd|rd))\.?\h*(.*?),?\h*([A-Z]{2}),\h*(\d+)", 1)
>         _ArrayDisplay($aComp)
>     Next

@Nine Thank you so much again. This works incredibly well for all of the examples provided and for other variations, except when there is no comma after the state, e.g.

    John Conte, 2113 Elm St Gainesville, FL 45679

or when there are no commas at all:

    John Conte 2113 Elm St Gainesville FL 45679

seadoggie01, posted January 26, 2021

> 52 minutes ago, fmendi said:
> except if there is no comma after the State

So make the comma in his RegEx optional:

    (?si)([^\d]+)(.*(?:st|av|blvd|rd))\.?\h*(.*?),?\h*([A-Z]{2}),?\h*(\d+)

All my code provided is Public Domain... but it may not work. Use it, change it, break it, whatever you want.

My Humble Contributions:
- Personal Function Documentation - A personal HelpFile for your functions
- Acro.au3 UDF - Automating Acrobat Pro
- ToDo Finder - Find #ToDo: lines in your scripts
- UI-SimpleWrappers UDF - Use UI Automation more Simply-er
- KeePass UDF - Automate KeePass, a password manager
- InputBoxes - Simple Input boxes for various variable types

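As a quick check of the optional-comma change, here is a minimal sketch that runs the revised pattern against fmendi's two problem addresses and one of Nine's original samples. It assumes the same `StringRegExp` call with flag 1 used earlier in the thread; writing to the console with `_ArrayToString` instead of `_ArrayDisplay` is only a convenience for this illustration, not something the posters used.

```autoit
#include <Array.au3>

; Pattern from the thread, with the comma after the state made optional (,?)
Local Const $sPattern = "(?si)([^\d]+)(.*(?:st|av|blvd|rd))\.?\h*(.*?),?\h*([A-Z]{2}),?\h*(\d+)"

; fmendi's two problem cases plus one of the original samples
Local $aAddr = [ _
        "John Conte, 2113 Elm St Gainesville, FL 45679", _
        "John Conte 2113 Elm St Gainesville FL 45679", _
        "Frank Conte 2133 Elm St. Gainesville, FL, 45679."]

Local $aComp
For $sAddr In $aAddr
    ; Flag 1 returns an array of the captured groups, or sets @error if nothing matches
    $aComp = StringRegExp($sAddr, $sPattern, 1)
    If @error Then
        ConsoleWrite("No match: " & $sAddr & @CRLF)
    Else
        ; Join the five groups with "|" so they are easy to read in the console
        ConsoleWrite(_ArrayToString($aComp, "|") & @CRLF)
    EndIf
Next
```

Because the first group is `[^\d]+`, it captures everything up to the first digit, so the name comes back with its trailing space (or ", ") still attached and may need trimming before use.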
fmendi (Author), posted January 26, 2021

> 3 hours ago, seadoggie01 said:
> So make the comma in his RegEx optional
>
>     (?si)([^\d]+)(.*(?:st|av|blvd|rd))\.?\h*(.*?),?\h*([A-Z]{2}),?\h*(\d+)

Absolutely, thank you, thank you! I've learned a lot from this exercise, mostly that my brainpower is very limited when it comes to figuring this stuff out. I have two degrees, the second in Medicine, but that means nothing to regex. LOL. I now have the perfect time-saving script for auto-filling PayPal's address portal. Thanks again to everyone who helped.

seadoggie01, posted January 26, 2021

> 2 hours ago, fmendi said:
> my brainpower is very limited

Me too! What helps is to break it down into little pieces. The (?x) flag can be helpful if you might want to change your RegEx later...

    (?six)                        # RegEx Flags
    ([^\d]+)                      # Person's Name
    (.*(?:st|av|blvd|rd))\.?\h*   # House number and Street
    (.*?),?\h*                    # City
    ([A-Z]{2}),?\h*               # State
    (\d+)                         # Zip code

Just remember that you need to escape spaces! With (?x), unescaped whitespace in the pattern is ignored, so a literal space has to be written as an escaped space or as \h.

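To use the commented (?x) layout above from an AutoIt script, the pattern has to be assembled as a string. The following is one possible sketch, not seadoggie01's own code: it assumes each fragment is joined with `@LF` so that every `#` comment ends at the end of its own line, and the variable names are made up for the illustration.

```autoit
#include <Array.au3>

; Each fragment ends with @LF so that a # comment stops at the end of its own line
Local Const $sPattern = _
        "(?six)                        # flags: dot matches newline, ignore case, extended" & @LF & _
        "([^\d]+)                      # person's name" & @LF & _
        "(.*(?:st|av|blvd|rd))\.?\h*   # house number and street" & @LF & _
        "(.*?),?\h*                    # city" & @LF & _
        "([A-Z]{2}),?\h*               # state" & @LF & _
        "(\d+)                         # zip code"

; One of Nine's sample addresses from earlier in the thread
Local $aComp = StringRegExp("Frank F. Conte 123 First Rd. New York NY, 12345.", $sPattern, 1)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; Print the captured groups separated by "|"
    ConsoleWrite(_ArrayToString($aComp, "|") & @CRLF)
EndIf
```

The padding spaces inside each fragment are harmless because of (?x); the pattern itself only relies on `\h` for horizontal whitespace, so nothing in it needs an escaped space.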
fmendi (Author), posted January 27, 2021

Divide and conquer! Makes sense in any complex situation.

Confuzzled, posted January 28, 2021

In the industry this is called 'data washing', and there are many tools, well known to marketers and advertisers, for massaging your data into a consistent form.