czardas
Posted February 26, 2016 (edited)

Well it's certainly a challenge to me. There are four calls to StringRegExpReplace() in the following script and I would like to combine the patterns, or use better patterns, to reduce the number of calls. Even combining two of the regular expressions would be an improvement.

The rules are very simple. Given a decimal string, strip out non-essential zeros (or add them) so that the string can be compared with the result of executing it. Whether the two are equal gives strong evidence that numeric conversion is accurate, inaccurate or sometimes inappropriate. The string may begin with a minus sign and a decimal point may appear anywhere thereafter. No other symbols appear in the input. The string can be any length.

I have added the array output at the end of the code. The code requires ArrayWorkshop (in my sig). I think it makes it easier to understand written this way. I have added comments to explain what I expect each regular expression to do. The digit 1 in the samples could be any non-zero digit 1-9.

    ; THREE RULES [... '\A\-?(\d*\.?\d+|\d+\.)\z' ...matches input]
    ; 1. The input string (digits) must contain at least one non-zero value. [1]
    ; 2. A single period may appear anywhere in the string. [.1]
    ; 3. The string can be negative. [-.1]
    ; The output string should be modified to become equal to itself after execution and then tested.
    ; No alpha characters are allowed. [!1.0e+19] ==> that's a different module

    #include <Array.au3>
    #include 'ArrayWorkshop.au3'

    ; column headers
    Local $aTest = ['Sample', 'Strip leading zeros?', 'Prefix zero?', 'Strip trailing zeros?', 'Strip trailing period?', "Trust it?"]

    ; data to modify
    Local $aSample = _
            ['000.01000', _
            '-001', _
            '1.0', _
            '100', _
            '001.100', _
            '-001.100', _
            '1.', _
            '0001000', _
            '-0001000', _
            '00.001', _
            '.001', _
            '.000000001', _ ; edge case
            '11111111111111111111', _ ; out of bounds
            '-01.']

    _PreDim($aTest, 2, True) ; set column headers
    _ArrayAttach($aTest, $aSample) ; add the sample data

    For $i = 1 To UBound($aTest) -1
        ; here you can see the effect of each regular expression
        $aTest[$i][1] = StringRegExpReplace($aTest[$i][0], '(\A\-?)(0+)(.*\z)', '\1\3') ; strip leading zeros 000xx|-000xx ==> xx|-xx
        $aTest[$i][2] = StringRegExpReplace($aTest[$i][1], '(\A\-?)(\..*\z)', '${1}0\2') ; prefix zero [or not]? .xxx|-.xxx ==> 0.xxx|-0.xxx
        $aTest[$i][3] = StringRegExpReplace($aTest[$i][2], '(\A.+\.)(.*[^0])?(0+\z)', '\1\2') ; strip trailing zeros? x.xx000|-x.xx000 ==> x.xx|-x.xx
        $aTest[$i][4] = StringRegExpReplace($aTest[$i][3], '(\A.+)(\.\z)', '\1') ; strip trailing period? xxx. ==> xxx
        $aTest[$i][5] = (StringCompare(Execute($aTest[$i][4]), $aTest[$i][4]) = 0) ; What does the interpreter make of it? / Do you trust the conversion?
    Next
    _ArrayDisplay($aTest)

    #cs - RESULTS
    Sample - Strip leading zeros? - Prefix zero? - Strip trailing zeros? - Strip trailing period? - Trust it?
    000.01000           |.01000              |0.01000             |0.01                |0.01                |True
    -001                |-1                  |-1                  |-1                  |-1                  |True
    1.0                 |1.0                 |1.0                 |1.                  |1                   |True
    100                 |100                 |100                 |100                 |100                 |True
    001.100             |1.100               |1.100               |1.1                 |1.1                 |True
    -001.100            |-1.100              |-1.100              |-1.1                |-1.1                |True
    1.                  |1.                  |1.                  |1.                  |1                   |True
    0001000             |1000                |1000                |1000                |1000                |True
    -0001000            |-1000               |-1000               |-1000               |-1000               |True
    00.001              |.001                |0.001               |0.001               |0.001               |True
    .001                |.001                |0.001               |0.001               |0.001               |True
    .000000001          |.000000001          |0.000000001         |0.000000001         |0.000000001         |False
    11111111111111111111|11111111111111111111|11111111111111111111|11111111111111111111|11111111111111111111|False
    -01.                |-1.                 |-1.                 |-1.                 |-1                  |True
    #ce

Edited February 26, 2016 by czardas

operator64 ArrayWorkshop
jguinch
Posted February 26, 2016 (edited)

I'm not sure I understand the result you expect...

    #include <Array.au3>
    #include 'ArrayWorkshop.au3'

    ; column headers
    Local $aTest = ['Sample', "Adding zeros", "Strip un-necessary zeros", "Trust it?"]

    ; data to modify
    Local $aSample = _
            ['000.01000', _
            '-001', _
            '1.0', _
            '100', _
            '001.100', _
            '-001.100', _
            '1.', _
            '0001000', _
            '-0001000', _
            '00.001', _
            '.001', _
            '.000000001', _ ; edge case
            '11111111111111111111', _ ; out of bounds
            '-01.']

    _PreDim($aTest, 2, True) ; set column headers
    _ArrayAttach($aTest, $aSample) ; add the sample data

    For $i = 1 To UBound($aTest) -1
        ; here you can see the effect of each regular expression
        ; prefix a zero when the digits start with a period: .xxx|-.xxx ==> 0.xxx|-0.xxx
        $aTest[$i][1] = StringRegExpReplace($aTest[$i][0], "^-?\K(?=\.)", "0")
        ; strip leading zeros (keeping one before a period), a trailing "." or ".000",
        ; and zeros after the last non-zero decimal; the final $ leaves interior zeros (1.1001) alone
        $aTest[$i][2] = StringRegExpReplace($aTest[$i][1], "^-?\K0+(?=[1-9]|0\.?)|\.0*$|\.\d*[1-9]\K0+$", "")
        $aTest[$i][3] = (StringCompare(Execute($aTest[$i][2]), $aTest[$i][2]) = 0)
    Next
    _ArrayDisplay($aTest)

Edited February 26, 2016 by jguinch

Network configuration UDF, _DirGetSizeByExtension, _UninstallList, Firefox Configuration, Array multi-dimensions, Printer Management UDF
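For readers who want the two-call approach as a reusable routine, here is a minimal sketch. The function name _NormalizeDecimal and its @error convention are assumptions for illustration only, not part of ArrayWorkshop or of either poster's code. The validation pattern is the one quoted in the first post's comments, and the last alternative of the second pattern is anchored with $ so that only zeros at the very end of the string are removed.

```autoit
; Hypothetical helper (name and error convention are assumptions, not from the thread).
; Returns the normalized string. If the input is not a plain decimal string,
; sets @error = 1 and returns the input unchanged.
Func _NormalizeDecimal($sNumber)
    ; validation pattern taken from the comments in the first post
    If Not StringRegExp($sNumber, '\A\-?(\d*\.?\d+|\d+\.)\z') Then Return SetError(1, 0, $sNumber)

    ; prefix a zero when the digits start with a period: .xxx|-.xxx ==> 0.xxx|-0.xxx
    $sNumber = StringRegExpReplace($sNumber, '^-?\K(?=\.)', '0')

    ; strip leading zeros, a trailing "." or ".000", and zeros after the last non-zero decimal
    Return StringRegExpReplace($sNumber, '^-?\K0+(?=[1-9]|0\.?)|\.0*$|\.\d*[1-9]\K0+$', '')
EndFunc

; usage: normalize one of the sample strings and print the result
ConsoleWrite(_NormalizeDecimal('-001.100') & @CRLF)
```

Wrapping the calls this way keeps the number of StringRegExpReplace() calls at two per string; the extra StringRegExp() check can be dropped if the input has already been validated elsewhere.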
czardas
Posted February 26, 2016 (edited)

Thanks jguinch, that's a big help. I think you've pretty much understood the idea and included some things I didn't think of, or haven't used before. The string will have already been checked to make sure it only contains digits and the two other symbols (minus sign and decimal point).

I'm writing a numeric sort algorithm. The fewer strings there are, the faster processing goes, because numbers can be compared against one another very quickly. I want to be able to sort googol sized integers if need be - so just using Number() is out of the question, and the method used will depend on which data types are being compared and their magnitude. The preprocessing above is relevant for all recognized strings. The ones that cannot easily be converted to numbers will be processed more slowly as strings: comparing them in different ways. I hope that explains why I want this.

Edit: I'll be testing numbers against numbers, strings against strings and numbers against strings. Strings can be integers or floats, and numbers can be of any type. The method for each comparison is already worked out and requires strings first to be formatted as above.

Edited February 26, 2016 by czardas
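To illustrate why the formatting step matters for string-against-string comparison, here is one possible sketch for integer strings only. It is not czardas's method, which the thread does not show; the function name _CompareIntStrings is hypothetical. It assumes both inputs are already normalized as above (no leading zeros, no decimal point, optional leading minus sign, at least one non-zero digit), so that a longer digit string always has the larger magnitude.

```autoit
; Hypothetical helper (not from the thread): compare two normalized integer
; strings such as '-1000' or '11111111111111111111' without numeric conversion.
; Returns -1 if $sA < $sB, 0 if they are equal, 1 if $sA > $sB.
Func _CompareIntStrings($sA, $sB)
    Local $bNegA = (StringLeft($sA, 1) = '-')
    Local $bNegB = (StringLeft($sB, 1) = '-')

    ; different signs: the negative value is the smaller one
    If $bNegA And Not $bNegB Then Return -1
    If $bNegB And Not $bNegA Then Return 1

    ; same sign: drop the minus signs and compare magnitudes
    If $bNegA Then
        $sA = StringTrimLeft($sA, 1)
        $sB = StringTrimLeft($sB, 1)
    EndIf

    Local $iResult = 0
    If StringLen($sA) > StringLen($sB) Then
        ; with no leading zeros, more digits means a larger magnitude
        $iResult = 1
    ElseIf StringLen($sA) < StringLen($sB) Then
        $iResult = -1
    Else
        ; equal length: string order of the digits matches numeric order
        Local $iCmp = StringCompare($sA, $sB, 1)
        If $iCmp > 0 Then $iResult = 1
        If $iCmp < 0 Then $iResult = -1
    EndIf

    ; for two negative values the order of the magnitudes is reversed
    If $bNegA Then $iResult = -$iResult
    Return $iResult
EndFunc

; usage: compare the out-of-bounds sample with a shorter integer string
ConsoleWrite(_CompareIntStrings('11111111111111111111', '9999999999') & @CRLF)
```

The length check only works because leading zeros have been stripped first; with unnormalized input such as '0001000' it would give the wrong order. Floats would need the integer and fractional parts compared separately, which this sketch does not attempt.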
jguinch
Posted February 26, 2016

OK, I understand (I think). Nice project. Also, I like your ArrayWorkshop UDF.