iamtheky, posted August 24, 2016
I'm just saying that, given how generic the criteria have to be to capture every nationality's flavor of phone number, it will be difficult to come up with many different failure criteria, so build those first and work from there.

czardas (author), posted August 24, 2016
I think any number between 1 and approximately 15 digits long should be allowed.

SadBunny, posted August 24, 2016
Well, not to bash the challenger or anything, but if a customer came to me or my team with requirements this vague and was not willing to discuss better specifications, I wouldn't even accept the job. Too much risk of the work turning out not to satisfy the customer, too much dependence on my crystal ball.

czardas (author), posted August 24, 2016 (edited August 31, 2016)
Let's assume the phone numbers exist. Is that any better? Longish ones will undoubtedly begin with zero. Four international call prefixes exist: 00, 0011, 010, 011 [Edit: actually it's more complicated than that]. Stop complaining about specs: phone numbers can be found all over the internet. You are just not thinking outside the box.

orbs, posted August 24, 2016
ok, feel free to bash me in... how about this:
step 1: traverse the array and strip all non-numeric characters from each element.
step 2: strip all non-numeric characters from the input.
step 3: traverse the array and try to match the input to each element from the right-hand side of the string (start from the more specific digit sequence and work toward the more general one).
any good?

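The three steps above could be sketched roughly as follows in AutoIt. This is only a minimal illustration, not orbs's own code: `_DigitsOnly` and `_FindBySuffix` are hypothetical helper names, the three phone book entries are sample values, and returning a "|"-separated list of indices is an arbitrary choice made for the example.

```autoit
; Minimal sketch of the three steps; _DigitsOnly and _FindBySuffix are
; hypothetical helper names, not part of any UDF.
Local $aPhoneBook = ['+44 207 882 8565', '(01) 882 8565', '0161 934 6534']
Local $sInput = '882-8565'

ConsoleWrite(_FindBySuffix($aPhoneBook, $sInput) & @CRLF)

; Steps 1 and 2: keep digits only (\D matches any non-digit character).
Func _DigitsOnly($sText)
    Return StringRegExpReplace($sText, '\D', '')
EndFunc   ;==>_DigitsOnly

; Step 3: compare from the right-hand side. Returns a "|"-separated list of
; matching indices, or an empty string when nothing matches.
Func _FindBySuffix(ByRef $aEntries, $sQuery)
    Local $sDigits = _DigitsOnly($sQuery), $sFound = ''
    If StringLen($sDigits) = 0 Then Return ''
    For $i = 0 To UBound($aEntries) - 1
        ; == forces a string comparison
        If StringRight(_DigitsOnly($aEntries[$i]), StringLen($sDigits)) == $sDigits Then
            If StringLen($sFound) > 0 Then $sFound &= '|'
            $sFound &= $i
        EndIf
    Next
    Return $sFound
EndFunc   ;==>_FindBySuffix
```

A plain suffix comparison like this cannot recognise the same number written with and without a leading zero or an international prefix, so it is a starting point rather than a full solution.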
czardas (author), posted August 24, 2016 (edited August 29, 2016)
Quoting orbs (8/24/2016, 9:53 PM): "any good?"
Very good indeed. At the point where a digit does not match, there are only a very small number of possibilities. Either you ran out of digits in the shorter version, or you hit a zero as the first character (internal dialing), or you found a 1 that does not coincide with a zero within the international code prefix; otherwise it's a mismatch. Country codes will always match. That's how far I got with it anyway.
Edit: This is not quite correct (amended later).

iamtheky, posted August 24, 2016
Ahh, I was thinking the input could be random and we were trying to discern whether it was a phone number... You are presuming every string is an attempt at a phone number, because those rules will still match random input like addresses or SSNs if you strip the non-numeric characters.

czardas (author), posted August 24, 2016
Nooo. You just want to find it if you can.

orbs, posted August 24, 2016 (edited August 24, 2016)
Quoting iamtheky (6 minutes earlier): "... those rules will still match random input ..."
well, if the user is insane enough to submit the input "hi!, i'm no.6 and my age is 41, i was born in 1975" then why wouldn't they expect the match 641-1975 (if it exists in the array)?
EDIT: of course you can sanitize the input (to some extent), for example by disallowing letters, or by allowing only commonly-used phone number delimiters (space, brackets, hyphen...)

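The sanitizing idea in the edit above might look something like the following sketch. Everything here is an assumption made for illustration: `_LooksLikePhoneInput` is a hypothetical helper, the allowed delimiter set (space, plus, brackets, dot, hyphen) is a guess at "commonly-used" delimiters, and the digit limits simply reuse the 1 to roughly 15 digits suggested earlier in the thread.

```autoit
; Sketch only: _LooksLikePhoneInput is a hypothetical helper, and the
; allowed delimiter set (space, +, brackets, dot, hyphen) is an assumption.
Func _LooksLikePhoneInput($sInput, $iMinDigits = 1, $iMaxDigits = 15)
    ; reject anything other than digits and the allowed delimiters
    If Not StringRegExp($sInput, '^[\d +().\-]+$') Then Return False
    ; count the digits that remain once the delimiters are removed
    Local $iDigits = StringLen(StringRegExpReplace($sInput, '\D', ''))
    Return ($iDigits >= $iMinDigits And $iDigits <= $iMaxDigits)
EndFunc   ;==>_LooksLikePhoneInput

ConsoleWrite(_LooksLikePhoneInput('+44 (0)161 934 6534') & @CRLF)
ConsoleWrite(_LooksLikePhoneInput("hi!, i'm no.6 and my age is 41, i was born in 1975") & @CRLF)
```

A check like this only filters out input that contains letters or other unexpected characters; it does not tell a phone number apart from any other short string of digits.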
iamtheky, posted August 24, 2016 (edited August 24, 2016)
What if they enter
123-45-6789
or
19078 XXX st. Apt. 1904-2
Especially if the goal is "find it if you can", you may not want to strip all non-numeric characters without checking some stuff first, is all. We have no example string from which to find a phone number, so I'm still just guessing...

czardas (author), posted August 24, 2016 (edited August 24, 2016)
Quoting iamtheky (6 minutes earlier): "19078 XXX st. Apt. 1904-2"
FAIL. Unless you want to cover random text inserts within phone numbers. XXX is probably a variable covering several numbers. You can look for "st. Apt." using StringInStr() instead.

orbs, posted August 24, 2016
well, if the array contains something like "+1 112 345 6789" then it will match the first input, and that is a good result (if i get the initial intention right...). the second input is by no means a phone number, and the user, who ought to know that he is looking for a phone number, should not expect any coherent result.

orbs, posted August 24, 2016
also, if you want to allow wildcards, then let the user type in a well-known character (e.g. a question mark, or an X as in your example); then, when you strip non-numeric characters, leave that one in and use a regular expression (StringRegExp) to make the match.

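One possible reading of the wildcard suggestion, as a hedged sketch: `_WildcardMatchEnd` is a hypothetical helper, and treating both "?" and "X" as a stand-in for exactly one digit is an assumption, since the post does not say how many digits a wildcard should cover.

```autoit
; Sketch only: _WildcardMatchEnd is a hypothetical helper; treating "?" and
; "X" as single-digit wildcards is an assumption.
Func _WildcardMatchEnd($sEntry, $sQuery)
    ; keep digits and wildcard characters in the query, digits only in the entry
    Local $sPattern = StringRegExpReplace($sQuery, '[^\d?Xx]', '')
    Local $sDigits = StringRegExpReplace($sEntry, '\D', '')
    If StringLen($sPattern) = 0 Then Return False
    ; each wildcard becomes a one-digit class; "$" anchors the match at the
    ; right-hand side of the entry
    $sPattern = StringRegExpReplace($sPattern, '[?Xx]', '[0-9]') & '$'
    Return StringRegExp($sDigits, $sPattern) = 1
EndFunc   ;==>_WildcardMatchEnd

ConsoleWrite(_WildcardMatchEnd('+1-541-754-3012', '754-30X2') & @CRLF)
ConsoleWrite(_WildcardMatchEnd('+1-541-754-3012', '754-31?2') & @CRLF)
```

Because the query is reduced to digits and wildcards before it is turned into a pattern, no other regular expression syntax typed by the user can reach StringRegExp.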
JohnOne, posted August 24, 2016
I still don't get the gig; all I find is 882 8565 at indices 16 and 32, and 7543010 at index 26.

czardas (author), posted August 25, 2016
They don't all match. When dialing Manchester from abroad, you have to drop the zero in the regional code 0161.

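The zero-dropping rule can be illustrated with a small sketch. `_ToInternational` is a hypothetical helper, and it assumes a single leading "0" trunk prefix, which holds for UK numbers such as the Manchester example but not for every country; the country code 44 is passed in by the caller rather than detected.

```autoit
; Sketch only: _ToInternational is a hypothetical helper. It assumes a single
; leading "0" trunk prefix, which is true for UK numbers but not universal.
Func _ToInternational($sNational, $sCountryCode)
    Local $sDigits = StringRegExpReplace($sNational, '\D', '')
    ; drop one leading zero, if present
    If StringLeft($sDigits, 1) == '0' Then $sDigits = StringTrimLeft($sDigits, 1)
    Return $sCountryCode & $sDigits
EndFunc   ;==>_ToInternational

ConsoleWrite(_ToInternational('0161 934 6534', '44') & @CRLF) ; 441619346534
```

This is why a straight right-hand comparison of 0161 934 6534 against the same number dialed from abroad fails at the zero: the international form continues with the country code where the national form has the trunk zero.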
orbs, posted August 26, 2016 (edited August 26, 2016)
i get the same results as JohnOne, and i'm not sure what you expect further. here's my code - excluding the wildcard sample:

    #include <Array.au3>

    ; tester input
    Global $aPhone = [ _
            '+262 692 12 03 00', '1800 251 996', '+1 994 951 0197', _
            '091 535 98 91 61', '2397865', '08457 128276', _
            '348476300192', '05842 361774', '0-800-022-5649', _
            '15499514891', '0096 363 0949', '04813137349', _
            '06620 220168', '07766 554433', '047 845 44 22 94', _
            '0435 773 4859', '(01) 882 8565', '00441619346434', _
            '09314 367090', '0 164 268 0887', '0590995603', _
            '991', '0267 746 3393', '064157526153', _
            '0 719 829 7756', '+1-541-754-3012', '+441347543010', _
            '03890 978398', '(31) 10 7765420', '020 8568 6646', _
            '0161 934 6534', '0 637 915 1283', '+44 207 882 8565', _
            '0800 275002', '0750 646 9746', '982-714-3119', _
            '000 300 74 52 40', '023077529227', '1 758 441 0611', _
            '0183 233 0151', '02047092863', '+44 20 7946 0321', _
            '04935 410618', '048 257 67 60 79']

    Global $aQuery = [ _
            '882 8565', _
            '123 8762', _
            '7543010', _
            '07843 543287', _
            '00441619346534', _
            '0015417543012']

    ; declare the match results array: rows = phone numbers, columns = queries
    Global $aMatch[UBound($aPhone) + 1][UBound($aQuery) + 1]

    ; populate headers (rows and columns) and strip non-numeric characters
    For $iPhone = 0 To UBound($aPhone) - 1
        $aMatch[$iPhone + 1][0] = _StringStripNonNumeric($aPhone[$iPhone])
    Next
    For $iQuery = 0 To UBound($aQuery) - 1
        $aMatch[0][$iQuery + 1] = _StringStripNonNumeric($aQuery[$iQuery])
    Next

    ; match
    For $iPhone = 1 To UBound($aMatch) - 1
        For $iQuery = 1 To UBound($aMatch, 2) - 1
            If _StringMatchEnd($aMatch[$iPhone][0], $aMatch[0][$iQuery]) Then $aMatch[$iPhone][$iQuery] = 'MATCH'
        Next
    Next

    ; re-populate headers with original values for display
    For $iPhone = 0 To UBound($aPhone) - 1
        $aMatch[$iPhone + 1][0] = $aPhone[$iPhone]
    Next
    For $iQuery = 0 To UBound($aQuery) - 1
        $aMatch[0][$iQuery + 1] = $aQuery[$iQuery]
    Next

    ; display match results
    _ArrayDisplay($aMatch)

    ; functions
    Func _StringStripNonNumeric($sString)
        Local $sResult = ''
        For $i = 1 To StringLen($sString)
            If StringIsDigit(StringMid($sString, $i, 1)) Then $sResult &= StringMid($sString, $i, 1)
        Next
        Return $sResult
    EndFunc   ;==>_StringStripNonNumeric

    Func _StringMatchEnd($sString, $sSubstr)
        If StringRight($sString, StringLen($sSubstr)) = $sSubstr Then Return True
        Return False
    EndFunc   ;==>_StringMatchEnd

czardas (author), posted August 26, 2016 (edited August 26, 2016, reason: bugfix)
Like @JLogan3o13, I originally thought this would be quite easy. It turns out to be less trivial than it sounds. Perhaps I didn't explain the task clearly enough. If the phone numbers are real, the following code should find them (most of the time anyway - see next post). If @jchd would be kind enough to squash it with a crazy regexp, I would love to see that. I prefer to have the process broken down into more understandable steps, at least to begin with. I haven't thoroughly tested the code yet (posted quickly so you will hopefully get the idea).

    #include <Array.au3>

    ; MsgBox(0, "", TelCompare('010 44 161 882 8565', '0011 44 161 882 8565'))

    Local $aArray = _
            ['+262 692 12 03 00', '1800 251 996', '+1 994 951 0197', _
            '091 535 98 91 61', '2397865', '08457 128276', _
            '348476300192', '05842 361774', '0-800-022-5649', _
            '15499514891', '0096 363 0949', '04813137349', _
            '06620 220168', '07766 554433', '047 845 44 22 94', _
            '0435 773 4859', '(01) 882 8565', '00441619346434', _
            '09314 367090', '0 164 268 0887', '0590995603', _
            '991', '0267 746 3393', '064157526153', _
            '0 719 829 7756', '+1-541-754-3012', '+441347543010', _
            '03890 978398', '(31) 10 7765420', '020 8568 6646', _
            '0161 934 6534', '0 637 915 1283', '+44 207 882 8565', _
            '0800 275002', '0750 646 9746', '982-714-3119', _
            '000 300 74 52 40', '023077529227', '1 758 441 0611', _
            '0183 233 0151', '02047092863', '+44 20 7946 0321', _
            '04935 410618', '048 257 67 60 79']

    Global $aQuery = [ _
            '882 8565', _
            '123 8762', _
            '7543010', _
            '07843 543287', _
            '00441619346534', _
            '0015417543012']

    For $i = 0 To UBound($aQuery) - 1
        For $j = 0 To UBound($aArray) - 1
            If TelCompare($aQuery[$i], $aArray[$j]) Then
                ConsoleWrite($aQuery[$i] & " = " & $aArray[$j] & " found at index: " & $j & @LF)
            EndIf
        Next
    Next

    _ArrayDisplay($aArray)

    Func TelCompare($sTelNum1, $sTelNum2, $iMinMatch = 3) ; , $iMaxLen = 25 probably
        ; get rid of typical delimiters
        $sTelNum1 = StringRegExpReplace($sTelNum1, '[ \+\(\)\-]', '')
        $sTelNum2 = StringRegExpReplace($sTelNum2, '[ \+\(\)\-]', '')
        If $sTelNum1 = $sTelNum2 Then Return True ; no need to go any further

        Local $iLen1 = StringLen($sTelNum1), $iLen2 = StringLen($sTelNum2), $vTemp
        If $iLen2 < $iLen1 Then ; make $sTelNum1 the shorter number
            $vTemp = $iLen1
            $iLen1 = $iLen2
            $iLen2 = $vTemp
            $vTemp = $sTelNum1
            $sTelNum1 = $sTelNum2
            $sTelNum2 = $vTemp
        EndIf

        If $iLen1 <= $iMinMatch Then Return False ; insufficient information
        If StringRight($sTelNum1, $iMinMatch) <> StringRight($sTelNum2, $iMinMatch) Then Return False ; minimum match failed

        $sTelNum1 = StringReverse($sTelNum1) ; to simplify parsing later
        $sTelNum2 = StringReverse($sTelNum2) ; ditto

        ; the algorithm [international dialing codes all begin with zero]
        Local $sDigit1, $sDigit2
        For $i = $iMinMatch + 1 To $iLen1
            $sDigit1 = StringMid($sTelNum1, $i, 1)
            $sDigit2 = StringMid($sTelNum2, $i, 1)
            If $sDigit1 <> $sDigit2 Then ; let's find out why
                Local $iOffSet = $iLen2 - $iLen1
                If $i = $iLen1 Then ; we have reached the first digit
                    If $sDigit1 = "0" Then ; maybe omitted in $sTelNum2 or different international dialing code
                        ; test the first zero omission theory with country codes (reversed)
                        If StringRegExp(StringRight($sTelNum2, $iOffSet + 1), '(\d){1,3}(00|1100|010|110)?') Then Return True
                        ; next test international dialing codes (reversed)
                        Return ($iOffSet < 3 And StringRegExp($sTelNum2, '(1100|10|110)\z')) ? True : False
                    EndIf
                Else ; test international dialing codes (reversed)
                    If $sDigit1 = "1" Then ; must have failed to match a zero within the international code
                        ; since the shorter number contains 1 in the international dialing code:
                        Return ($iOffSet = 1 And StringRight($sTelNum2, 4) = '1100') ? True : False
                    EndIf
                EndIf
            EndIf
        Next
        Return True
    EndFunc   ;==>TelCompare

Here are the results:

    882 8565 = (01) 882 8565 found at index: 16
    882 8565 = +44 207 882 8565 found at index: 32
    7543010 = +441347543010 found at index: 26
    00441619346534 = 0161 934 6534 found at index: 30
    0015417543012 = +1-541-754-3012 found at index: 25

It contains a bug (logic flaw).

czardas (author), posted August 26, 2016 (edited August 26, 2016)
It seems I didn't do enough research, but it's going in the right direction: http://www.onesimcard.com/how-to-dial/

orbs, posted August 26, 2016 (edited August 26, 2016)
ok, i took this approach: following my previous code, i changed the match criterion from True/False to a matching score [0..1] - the more trailing digits of the query that match the phone book entry, the higher the score. i initially assumed a score of >0.7 would count as a match, and after testing i found the threshold to be >0.3. try this:

    #include <Array.au3>

    ; tester input
    Global $aPhone = [ _
            '+262 692 12 03 00', '1800 251 996', '+1 994 951 0197', _
            '091 535 98 91 61', '2397865', '08457 128276', _
            '348476300192', '05842 361774', '0-800-022-5649', _
            '15499514891', '0096 363 0949', '04813137349', _
            '06620 220168', '07766 554433', '047 845 44 22 94', _
            '0435 773 4859', '(01) 882 8565', '00441619346434', _
            '09314 367090', '0 164 268 0887', '0590995603', _
            '991', '0267 746 3393', '064157526153', _
            '0 719 829 7756', '+1-541-754-3012', '+441347543010', _
            '03890 978398', '(31) 10 7765420', '020 8568 6646', _
            '0161 934 6534', '0 637 915 1283', '+44 207 882 8565', _
            '0800 275002', '0750 646 9746', '982-714-3119', _
            '000 300 74 52 40', '023077529227', '1 758 441 0611', _
            '0183 233 0151', '02047092863', '+44 20 7946 0321', _
            '04935 410618', '048 257 67 60 79']

    Global $aQuery = [ _
            '882 8565', _
            '123 8762', _
            '7543010', _
            '07843 543287', _
            '00441619346534', _
            '0015417543012']

    Global $iScoreThreshold = 0.3

    ; declare the match results array: rows = phone numbers, columns = queries
    Global $aMatch[UBound($aPhone) + 1][UBound($aQuery) + 1]

    ; populate headers (rows and columns) and strip non-numeric characters
    For $iPhone = 0 To UBound($aPhone) - 1
        $aMatch[$iPhone + 1][0] = _StringStripNonNumeric($aPhone[$iPhone])
    Next
    For $iQuery = 0 To UBound($aQuery) - 1
        $aMatch[0][$iQuery + 1] = _StringStripNonNumeric($aQuery[$iQuery])
    Next

    ; match
    For $iPhone = 1 To UBound($aMatch) - 1
        For $iQuery = 1 To UBound($aMatch, 2) - 1
            If _StringMatchEnd($aMatch[$iPhone][0], $aMatch[0][$iQuery]) > $iScoreThreshold Then $aMatch[$iPhone][$iQuery] = 'MATCH'
        Next
    Next

    ; re-populate headers with original values for display
    For $iPhone = 0 To UBound($aPhone) - 1
        $aMatch[$iPhone + 1][0] = $aPhone[$iPhone]
    Next
    For $iQuery = 0 To UBound($aQuery) - 1
        $aMatch[0][$iQuery + 1] = $aQuery[$iQuery]
    Next

    ; display match results
    _ArrayDisplay($aMatch)

    ; functions
    Func _StringStripNonNumeric($sString)
        Local $sResult = ''
        For $i = 1 To StringLen($sString)
            If StringIsDigit(StringMid($sString, $i, 1)) Then $sResult &= StringMid($sString, $i, 1)
        Next
        Return $sResult
    EndFunc   ;==>_StringStripNonNumeric

    ; returns the fraction of the query's digits (counted from the right) that match the end of $sString
    Func _StringMatchEnd($sString, $sSubstr)
        Local $iScore = 0
        Local $iScorePerChar = 1 / StringLen($sSubstr)
        For $i = 1 To StringLen($sSubstr)
            If StringRight($sString, $i) = StringRight($sSubstr, $i) Then $iScore = $iScorePerChar * $i
        Next
        Return $iScore
    EndFunc   ;==>_StringMatchEnd

pluto41, posted August 26, 2016 (edited August 26, 2016)
I think if you want to do the job right you have to include every country's numbering convention in the source code.
https://en.wikipedia.org/wiki/Category:Telephone_numbers_by_country
And even then, how can you be sure it's a valid phone number? It seems to me there is no easy way, with service numbers and all (which are also valid numbers).
I have looked (a little) into the wiki pages to see whether it's possible to gather the information about the number conventions automatically, but as far as I can see it's not that easy, as every page has a different layout.
Some more info: https://en.wikipedia.org/wiki/National_conventions_for_writing_telephone_numbers

    #include <StringConstants.au3> ; defines $STR_REGEXPARRAYGLOBALMATCH
    $aStrPlainNumber = StringRegExp($strPhoneNumber, '\d+', $STR_REGEXPARRAYGLOBALMATCH) ; strip all non-number characters (returns an array of the digit groups)
