gcue (Author), posted June 4, 2015

of course - thank you again for your valuable time

czardas, posted June 4, 2015

I added another edit to my previous post. Please look again.

gcue (Author), posted June 4, 2015 (edited)

if the filename appears to have more than 1 date - then ignore

czardas, posted June 5, 2015 (edited)

It has taken me a while to get my head around this, and I'm still not 100% sure about the results. With so many possible variants, things get a little complicated. From what I understand, you are looking for a fuzzy algorithm. There is always a chance that a match is not actually a date, and the more variants you allow, the greater the risk of a false positive result.

Anyway, after some analysis I came up with the idea of searching each variant from the left until a plausible date is encountered, then continuing in the same vein to try to find a time stamp afterwards.

[EDIT - with numeric strings] Only ONE delimiter is allowed between the date and the time. [When a date and time contain valid delimiters, the previous rule is discarded.] Times are accurate to within 60 seconds if no false positive result occurs.

I notice that you give some formats which include characters not allowed in file names. For this reason I have separated the delimiter options and created an exception so the code can be modified fairly easily. I have not compared your original results with mine. I rewrote the whole thing and it's a bit rough and ready in places. I think it's closer to what you want, but you will have to test it more thoroughly. The code may form the basis of a better version later. I guess the regular expressions could possibly be improved. It seems to be working as it stands.

#include <Array.au3>
#include <Date.au3>
#include <DTC.au3>

Local $aTest = dates_array()

; This code formats the results from _ExtractDate():
Local $sExtracted, $aException
For $i = 0 To UBound($aTest) -1
    $sExtracted = _ExtractDate($aTest[$i][0])
    If Not @error Then
        $sExtracted = StringRegExpReplace(StringReplace($sExtracted, ' ', '_'), '[/\:]', '')
        $sExtracted = _Date_Time_Convert($sExtracted, "yyyyMMdd_HHmmss", "MM/dd/yyyy hh:mm TT")

        ; An exception is made for AM and PM
        $aException = StringRegExp($aTest[$i][0], '(?i)(?: )(\d{1,2})(?:\:)(\d{1,2})( [AP]M)',3)
        If IsArray($aException) Then $sExtracted = StringLeft($sExtracted, 10) & _
                ' ' & StringFormat('%02i:%02i', $aException[0], $aException[1]) & $aException[2]

        $aTest[$i][1] = $sExtracted
    Else
        $aTest[$i][2] = "ERROR"
    EndIf
Next
_ArrayDisplay($aTest)

; This function returns dates in the format yyyy/MM/DD HH:MM:SS
Func _ExtractDate($sString)
    Local $sDateDelim = '-/_', $sTimeDelim = '-._' ; Delimiter options - can be modified.

    Local $sMDY = '(?:.*?)(\d{1,2})([\Q' & $sDateDelim & '\E])(\d{1,2})(\g2)(\d{4})' ; Formats *M?D?YYYY, *M?DD?YYYY, *MM?D?YYYY, *MM?DD?YYYY
    Local $sYMD = '(?:.*?)(\d{4})([\Q' & $sDateDelim & '\E])(\d{1,2})(\g2)(\d{1,2})' ; Formats *YYYY?M?D, *YYYY?M?DD, *YYYY?MM?D, *YYYY?MM?DD
    Local $sHMS = '(?:.*?)(\d{1,2})([\Q' & $sTimeDelim & '\E])(\d{1,2})(\g7)?(\d{1,2})?' ; Formats *HH?MM?SS
    Local $sDDD = '(?:\D*)(\d{8,})(?:\D)?(\d{4,6})?' ; Just Digits *YYYYMMDD?HHMMSS, *YYYYMMDD
    Local $sYY = '(?:\D*)(\d{4})(?:\D|\z)' ; Just 4 digits *YYYY

    Local $aFormat[6] = [$sYMD & $sHMS, $sMDY & $sHMS, $sYMD, $sMDY, $sDDD, $sYY] ; Formats $sYMD & $sHMS, $sMDY & $sHMS, $sYMD, $sMDY, $sDDD, $sYY

    Local $aSRE, $sDate, $sTime = ' 00:00:00'
    For $i = 0 To 5
        $sDate = ''
        $aSRE = StringRegExp($sString, $aFormat[$i], 3)
        If Not @error Then ; Match found
            Switch $i
                Case 0 To 3
                    $sDate = (Mod($i, 2)) ? _
                            $aSRE[4] & '/' & StringFormat('%02i/%02i', $aSRE[0], $aSRE[2]) : _
                            $aSRE[0] & '/' & StringFormat('%02i/%02i',$aSRE[2], $aSRE[4])

                    If _DateIsValid($sDate) Then
                        If $i < 2 Then
                            $sTime = ' ' & StringFormat('%02i:%02i', $aSRE[5], $aSRE[7]) & ':' & '00'
                            If Not _DateIsValid($sDate & $sTime) Then $sTime = ' 00:00:00'
                        EndIf
                        ExitLoop
                    EndIf

                Case 4 ; *YYYYMMDD*HHMMSS, *YYYYMMDD
                    For $j = 1 To StringLen($aSRE[0]) - 7
                        $sYear = StringMid($aSRE[0], $j, 4)
                        If Not __IsWithin50Years($sYear) Then ContinueLoop

                        $sDate = $sYear & '/' & StringMid($aSRE[0], $j + 4, 2) & '/' & StringMid($aSRE[0], $j + 6, 2)
                        If _DateIsValid($sDate) Then ; Search for a valid time
                            $sTime = ' ' & StringMid($aSRE[0], $j + 8, 2) & ':' & StringMid($aSRE[0], $j + 10, 2) & ':00'
                            If Stringlen($sTime) = 9 And _DateIsValid($sDate & $sTime) Then
                                ExitLoop 2
                            Else
                                If UBound($aSRE) > 1 Then
                                    $sTime = ' ' & StringLeft($aSRE[1], 2) & ':' & StringMid($aSRE[1], 3, 2) & ':00'
                                    If Not _DateIsValid($sDate & $sTime) Then $sTime = ' 00:00:00'
                                Else
                                    $sTime = ' 00:00:00'
                                EndIf
                                ExitLoop 2
                            EndIf
                        EndIf
                    Next

                Case 5 ; *YYYY
                    If __IsWithin50Years($aSRE[0]) Then
                        $sDate = $aSRE[0] & '/01/01'
                        $sTime = ' 00:00:00'
                        ExitLoop
                    EndIf
            EndSwitch
        EndIf
    Next

    $sDate &= $sTime
    If StringLen($sDate) <> 19 Then Return SetError(1) ; No date found.
    Return $sDate
EndFunc ;==> _ExtractDate

; This time window can easily be extended to further in the past.
Func __IsWithin50Years($vYear, $iRange = 50)
    Local $iCurrentYear = @YEAR
    Return $vYear < $iCurrentYear And $iCurrentYear - $vYear < $iRange
EndFunc ;==> __IsWithin50Years

#Region - original test data
Func dates_array()
    Local $array[65][3]

    ;resolved
    $array[0][0] = "2/3/2012 8:38 PM"
    $array[1][0] = "2/03/2012 08:38 PM"
    $array[2][0] = "02/3/2012 8:38 AM"
    $array[3][0] = "11/03/2012 8:38 AM"
    $array[4][0] = "11/03/2012 08:38 AM"
    $array[5][0] = "2012-12-30_14-48-34_90"
    $array[6][0] = "2012_12_30_14_48_34_90"
    $array[7][0] = "2012-12-30-14-48-34-90"
    $array[8][0] = "2012-12-30 14-48-34-90"
    $array[9][0] = "2015-04-29 03.46.36"
    $array[10][0] = "2015_04_29 03.46.36"
    $array[11][0] = "12-26-2012-bridge(1)"
    $array[12][0] = "12_26_2012-bridge(1)"
    $array[13][0] = "12-26-2012"
    $array[14][0] = "12_26_2012"
    $array[15][0] = "IMG00136-20100524-0109"
    $array[16][0] = "IMG00136_20100524_0109"
    $array[17][0] = "IMG_20000526_100019_402"
    $array[18][0] = "IMG-20120615-00028"
    $array[19][0] = "IMG_20120615_00028"
    $array[20][0] = "Texas-20111117-00060"
    $array[21][0] = "Texas_20111117_00060"
    $array[22][0] = "Southwest San Marcos Valley-20111110-00046"
    $array[23][0] = "Southwest San Marcos Valley_20111110_00046"
    $array[24][0] = "Long Island-Laketown-20110526-00023"
    $array[25][0] = "Long Island-Laketown_20110526_00023"
    $array[26][0] = "20141119_193702"
    $array[27][0] = "20141119-193702"

    ;still need to resolve - RESOLVED
    $array[28][0] = "2014071495201859"
    $array[29][0] = "2013072695195930"
    $array[30][0] = "IMG-20140619-WA0000"
    $array[31][0] = "IMG-20140402-WA0000"
    $array[32][0] = "VID-20141002-WA0001"
    $array[33][0] = "VID-20141009-WA0004"
    $array[34][0] = "IMG95201405169510533295434"
    $array[35][0] = "IMG95201310319519475695780"
    $array[36][0] = "IMG952014050695205100"
    $array[37][0] = "IMG952013010695192927"
    $array[38][0] = "Resampled952012-07-099515-09-279577"
    $array[39][0] = "Resampled952012-05-169519-32-049577"
    $array[40][0] = "Resampled952012-05-129518-02-1795365"
    $array[41][0] = "Resampled952012-06-109513-34-0395360"
    $array[42][0] = "IMG_20141003_244125_273"
    $array[43][0] = "IMG_20141003_244129_571"
    $array[44][0] = "2012-07-149519"
    $array[45][0] = "VID_20120415103537718"
    $array[46][0] = "VID_20120415103537718"
    $array[47][0] = "VN_20120520103037802"
    $array[48][0] = "VN_20121005215040254"
    $array[49][0] = "PicStory-2012-04-01-02-53"
    $array[50][0] = "2012-12-209510-42-3195121"
    $array[51][0] = "2012-12-219512-05-0395507"
    $array[52][0] = "2014-08-259507.27.29"
    $array[53][0] = "2013-01-29"

    ;should not match
    $array[54][0] = "0623112010"
    $array[55][0] = "0710122020"
    $array[56][0] = "0710122022"
    $array[57][0] = "0710122024"
    $array[58][0] = "0710122026"
    $array[59][0] = "0710122020"
    $array[60][0] = "0710122022"
    $array[61][0] = "0710122023a"
    $array[62][0] = "0710122024"
    $array[63][0] = "0710122026"
    $array[64][0] = "13659097338151"

    Return $array
EndFunc ;==>dates_array
#EndRegion

Edited June 5, 2015 by czardas: Missing a space in line 83

gcue (Author), posted June 5, 2015

czardas! wow this looks complex yet concise. truly an amazing job. thank you thank you thank you! you have no idea how much this has helped me!

czardas, posted June 5, 2015

I'm happy if you are able to make use of it. Something like this can never be an exact science. Although I'm not entirely sure what you are making, I would like to hear how you get on with it. Cheers!
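
One detail of the patterns in czardas's script may be worth isolating: the `(\g2)` backreference appears to require the second date delimiter to be the same character as the first, so mixed delimiters are rejected. The snippet below is only a sketch of that idea, not part of the posted script. It uses the date part of `$sYMD` with the delimiter class written out, and `_DescribeMatch` is a hypothetical helper name introduced for the illustration.

```autoit
; Sketch only: the date part of the YMD pattern, with the delimiter set '-/_' inlined.
; _DescribeMatch is a hypothetical helper, not part of the original script.
Local $sYMD = '(\d{4})([\Q-/_\E])(\d{1,2})(\g2)(\d{1,2})'
Local $aNames[3] = ["2012-12-30", "2012_12_30", "2012-12_30"]

For $i = 0 To UBound($aNames) - 1
    ConsoleWrite($aNames[$i] & " -> " & _DescribeMatch($aNames[$i], $sYMD) & @CRLF)
Next

; Returns "YYYY/M/D" from the captured groups, or "no match".
Func _DescribeMatch($sString, $sPattern)
    Local $aSRE = StringRegExp($sString, $sPattern, 1) ; 1 = return the groups of the first match
    If @error Then Return "no match"
    ; Groups: [0] year, [1] first delimiter, [2] month, [3] repeated delimiter, [4] day
    Return $aSRE[0] & "/" & $aSRE[2] & "/" & $aSRE[4]
EndFunc   ;==>_DescribeMatch
```

The first two names should print 2012/12/30, while "2012-12_30" should print "no match" because the second delimiter differs from the first.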

gcue (Author), posted June 9, 2015

this is working great! except i noticed something weird... ive noticed formats which resolve fine sometimes but for some reason the same formats do not resolve other times

for instance array elements 26 and 27 resolve great but if you will note array elements 65-81 do not

also array elements 42 and 43 resolve but elements 82 and 83 do not

#include <Array.au3>
#include <Date.au3>
#include <DTC.au3>

Local $aTest = dates_array()

; This code formats the results from _ExtractDate():
Local $sExtracted, $aException
For $i = 0 To UBound($aTest) -1
    $sExtracted = _ExtractDate($aTest[$i][0])
    If Not @error Then
        $aTest[$i][1] = $sExtracted
    Else
        $aTest[$i][2] = "ERROR"
    EndIf
Next
_ArrayDisplay($aTest)

; This function returns dates in the format yyyy/MM/DD HH:MM:SS
Func _ExtractDate($sString)
    Local $sDateDelim = '-/_', $sTimeDelim = '-._' ; Delimiter options - can be modified.

    Local $sMDY = '(?:.*?)(\d{1,2})([\Q' & $sDateDelim & '\E])(\d{1,2})(\g2)(\d{4})' ; Formats *M?D?YYYY, *M?DD?YYYY, *MM?D?YYYY, *MM?DD?YYYY
    Local $sYMD = '(?:.*?)(\d{4})([\Q' & $sDateDelim & '\E])(\d{1,2})(\g2)(\d{1,2})' ; Formats *YYYY?M?D, *YYYY?M?DD, *YYYY?MM?D, *YYYY?MM?DD
    Local $sHMS = '(?:.*?)(\d{1,2})([\Q' & $sTimeDelim & '\E])(\d{1,2})(\g7)?(\d{1,2})?' ; Formats *HH?MM?SS
    Local $sDDD = '(?:\D*)(\d{8,})(?:\D)?(\d{4,6})?' ; Just Digits *YYYYMMDD?HHMMSS, *YYYYMMDD
    Local $sYY = '(?:\D*)(\d{4})(?:\D|\z)' ; Just 4 digits *YYYY

    Local $aFormat[6] = [$sYMD & $sHMS, $sMDY & $sHMS, $sYMD, $sMDY, $sDDD, $sYY] ; Formats $sYMD & $sHMS, $sMDY & $sHMS, $sYMD, $sMDY, $sDDD, $sYY

    Local $aSRE, $sDate, $sTime = ' 00:00:00'
    For $i = 0 To 5
        $sDate = ''
        $aSRE = StringRegExp($sString, $aFormat[$i], 3)
        If Not @error Then ; Match found
            Switch $i
                Case 0 To 3
                    $sDate = (Mod($i, 2)) ? _
                            $aSRE[4] & '/' & StringFormat('%02i/%02i', $aSRE[0], $aSRE[2]) : _
                            $aSRE[0] & '/' & StringFormat('%02i/%02i',$aSRE[2], $aSRE[4])

                    If _DateIsValid($sDate) Then
                        If $i < 2 Then
                            $sTime = ' ' & StringFormat('%02i:%02i', $aSRE[5], $aSRE[7]) & ':' & '00'
                            If Not _DateIsValid($sDate & $sTime) Then $sTime = ' 00:00:00'
                        EndIf
                        ExitLoop
                    EndIf

                Case 4 ; *YYYYMMDD*HHMMSS, *YYYYMMDD
                    For $j = 1 To StringLen($aSRE[0]) - 7
                        $sYear = StringMid($aSRE[0], $j, 4)
                        If Not __IsWithin50Years($sYear) Then ContinueLoop

                        $sDate = $sYear & '/' & StringMid($aSRE[0], $j + 4, 2) & '/' & StringMid($aSRE[0], $j + 6, 2)
                        If _DateIsValid($sDate) Then ; Search for a valid time
                            $sTime = ' ' & StringMid($aSRE[0], $j + 8, 2) & ':' & StringMid($aSRE[0], $j + 10, 2) & ':00'
                            If Stringlen($sTime) = 9 And _DateIsValid($sDate & $sTime) Then
                                ExitLoop 2
                            Else
                                If UBound($aSRE) > 1 Then
                                    $sTime = ' ' & StringLeft($aSRE[1], 2) & ':' & StringMid($aSRE[1], 3, 2) & ':00'
                                    If Not _DateIsValid($sDate & $sTime) Then $sTime = ' 00:00:00'
                                Else
                                    $sTime = ' 00:00:00'
                                EndIf
                                ExitLoop 2
                            EndIf
                        EndIf
                    Next

                Case 5 ; *YYYY
                    If __IsWithin50Years($aSRE[0]) Then
                        $sDate = $aSRE[0] & '/01/01'
                        $sTime = ' 00:00:00'
                        ExitLoop
                    EndIf
            EndSwitch
        EndIf
    Next

    $sDate &= $sTime
    If StringLen($sDate) <> 19 Then Return SetError(1) ; No date found.

    $sDate = StringRegExpReplace(StringReplace($sDate, ' ', '_'), '[/\:]', '')
    $sDate = _Date_Time_Convert($sDate, "yyyyMMdd_HHmmss", "MM/dd/yyyy hh:mm TT")

    ; An exception is made for AM and PM
    $aException = StringRegExp($sString, '(?i)(?: )(\d{1,2})(?:\:)(\d{1,2})( [AP]M)',3)
    If IsArray($aException) Then $sDate = StringLeft($sDate, 10) & _
            ' ' & StringFormat('%02i:%02i', $aException[0], $aException[1]) & $aException[2]

    Return $sDate
EndFunc ;==> _ExtractDate

; This time window can easily be extended to further in the past.
Func __IsWithin50Years($vYear, $iRange = 50)
    Local $iCurrentYear = @YEAR
    Return $vYear < $iCurrentYear And $iCurrentYear - $vYear < $iRange
EndFunc ;==> __IsWithin50Years

#Region - original test data
Func dates_array()
    Local $array[84][3]

    ;resolved
    $array[0][0] = "2/3/2012 8:38 PM"
    $array[1][0] = "2/03/2012 08:38 PM"
    $array[2][0] = "02/3/2012 8:38 AM"
    $array[3][0] = "11/03/2012 8:38 AM"
    $array[4][0] = "11/03/2012 08:38 AM"
    $array[5][0] = "2012-12-30_14-48-34_90"
    $array[6][0] = "2012_12_30_14_48_34_90"
    $array[7][0] = "2012-12-30-14-48-34-90"
    $array[8][0] = "2012-12-30 14-48-34-90"
    $array[9][0] = "2015-04-29 03.46.36"
    $array[10][0] = "2015_04_29 03.46.36"
    $array[11][0] = "12-26-2012-bridge(1)"
    $array[12][0] = "12_26_2012-bridge(1)"
    $array[13][0] = "12-26-2012"
    $array[14][0] = "12_26_2012"
    $array[15][0] = "IMG00136-20100524-0109"
    $array[16][0] = "IMG00136_20100524_0109"
    $array[17][0] = "IMG_20000526_100019_402"
    $array[18][0] = "IMG-20120615-00028"
    $array[19][0] = "IMG_20120615_00028"
    $array[20][0] = "Texas-20111117-00060"
    $array[21][0] = "Texas_20111117_00060"
    $array[22][0] = "Southwest San Marcos Valley-20111110-00046"
    $array[23][0] = "Southwest San Marcos Valley_20111110_00046"
    $array[24][0] = "Long Island-Laketown-20110526-00023"
    $array[25][0] = "Long Island-Laketown_20110526_00023"
    $array[26][0] = "20141119_193702"
    $array[27][0] = "20141119-193702"

    ;still need to resolve - RESOLVED
    $array[28][0] = "2014071495201859"
    $array[29][0] = "2013072695195930"
    $array[30][0] = "IMG-20140619-WA0000"
    $array[31][0] = "IMG-20140402-WA0000"
    $array[32][0] = "VID-20141002-WA0001"
    $array[33][0] = "VID-20141009-WA0004"
    $array[34][0] = "IMG95201405169510533295434"
    $array[35][0] = "IMG95201310319519475695780"
    $array[36][0] = "IMG952014050695205100"
    $array[37][0] = "IMG952013010695192927"
    $array[38][0] = "Resampled952012-07-099515-09-279577"
    $array[39][0] = "Resampled952012-05-169519-32-049577"
    $array[40][0] = "Resampled952012-05-129518-02-1795365"
    $array[41][0] = "Resampled952012-06-109513-34-0395360"
    $array[42][0] = "IMG_20141003_244125_273"
    $array[43][0] = "IMG_20141003_244129_571"
    $array[44][0] = "2012-07-149519"
    $array[45][0] = "VID_20120415103537718"
    $array[46][0] = "VID_20120415103537718"
    $array[47][0] = "VN_20120520103037802"
    $array[48][0] = "VN_20121005215040254"
    $array[49][0] = "PicStory-2012-04-01-02-53"
    $array[50][0] = "2012-12-209510-42-3195121"
    $array[51][0] = "2012-12-219512-05-0395507"
    $array[52][0] = "2014-08-259507.27.29"
    $array[53][0] = "2013-01-29"

    ;should not match
    $array[54][0] = "0623112010"
    $array[55][0] = "0710122020"
    $array[56][0] = "0710122022"
    $array[57][0] = "0710122024"
    $array[58][0] = "0710122026"
    $array[59][0] = "0710122020"
    $array[60][0] = "0710122022"
    $array[61][0] = "0710122023a"
    $array[62][0] = "0710122024"
    $array[63][0] = "0710122026"
    $array[64][0] = "13659097338151"

    ;new
    $array[65][0] = "20150102_171408"
    $array[66][0] = "20150104_174204"
    $array[67][0] = "20150104_174353"
    $array[68][0] = "20150104_175104"
    $array[69][0] = "20150104_181751"
    $array[70][0] = "20150102_171408"
    $array[71][0] = "20150104_174204"
    $array[72][0] = "20150104_174353"
    $array[73][0] = "20150104_175104"
    $array[74][0] = "20150104_181751"
    $array[75][0] = "20150104_184735"
    $array[76][0] = "20150209_200557"
    $array[77][0] = "20150313_200638"
    $array[78][0] = "20150313_200914"
    $array[79][0] = "20150313_201126"
    $array[80][0] = "20150418_201504"
    $array[81][0] = "20150419_100142"
    $array[82][0] = "IMG_20150219_121547_663"
    $array[83][0] = "IMG_20150219_145706_239"

    Return $array
EndFunc ;==>dates_array
#EndRegion

by the way i moved this section to the _ExtractDate function - hope you dont mind - dont think that's making the difference right?

$sDate = StringRegExpReplace(StringReplace($sDate, ' ', '_'), '[/\:]', '')
$sDate = _Date_Time_Convert($sDate, "yyyyMMdd_HHmmss", "MM/dd/yyyy hh:mm TT")

; An exception is made for AM and PM
$aException = StringRegExp($sString, '(?i)(?: )(\d{1,2})(?:\:)(\d{1,2})( [AP]M)',3)
If IsArray($aException) Then $sDate = StringLeft($sDate, 10) & _
        ' ' & StringFormat('%02i:%02i', $aException[0], $aException[1]) & $aException[2]

thanks for your help!

czardas, posted June 9, 2015 (edited)

Duh, my fault. The problem is in the function __IsWithin50Years(). I overlooked a silly thing by focusing more attention on the other function. Replace it with the following version:

; This time window can easily be extended to further in the past.
Func __IsWithin50Years($vYear, $iRange = 50)
    Local $iCurrentYear = @YEAR
    Return $vYear <= $iCurrentYear And $iCurrentYear - $vYear < $iRange ; Modified
EndFunc ;==> __IsWithin50Years
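
For anyone following along, the difference between the two versions seems to come down to whether the current year itself counts as "within range". The thread dates from 2015, which would explain why the 2015 file names failed while the 2014 ones resolved. The sketch below compares both conditions side by side. The function names `_WindowOld` and `_WindowNew` and the explicit `$iCurrentYear` parameter are hypothetical, added here only so the result does not depend on the system clock.

```autoit
; Sketch only: _WindowOld and _WindowNew are hypothetical names, and the current year
; is passed in explicitly (the posted function reads it from @YEAR instead).
Func _WindowOld($iYear, $iCurrentYear, $iRange = 50)
    ; Original condition: a year equal to the current year is rejected.
    Return $iYear < $iCurrentYear And $iCurrentYear - $iYear < $iRange
EndFunc   ;==>_WindowOld

Func _WindowNew($iYear, $iCurrentYear, $iRange = 50)
    ; Modified condition: the current year is accepted as well.
    Return $iYear <= $iCurrentYear And $iCurrentYear - $iYear < $iRange
EndFunc   ;==>_WindowNew

ConsoleWrite("2014 old: " & _WindowOld(2014, 2015) & @CRLF) ; True
ConsoleWrite("2015 old: " & _WindowOld(2015, 2015) & @CRLF) ; False
ConsoleWrite("2015 new: " & _WindowNew(2015, 2015) & @CRLF) ; True
ConsoleWrite("2016 new: " & _WindowNew(2016, 2015) & @CRLF) ; False, future years stay excluded
```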

gcue (Author), posted June 9, 2015

worked like a charm!

noticed it somehow got a date from element 54..

0623112010 | 01/01/2010 12:00 AM

czardas, posted June 9, 2015 (edited)

I was never quite happy with including a single 4 digit year. Try replacing line 29 (the $sYY definition) with this:

Local $sYY = '(?:\A|\D)(\d{4})(?:\D|\z)' ; Just 4 digits *YYYY

It should still be able to match names like myFile2015.txt, but not myFile20155.txt. It would be less inclined towards false positives if these four digit matches were excluded from the results altogether.

Edit: I just modified the above expression once again.
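
To see what the change does, the two `$sYY` expressions can be run against element 54 and the two file names mentioned in the post. This is a rough sketch rather than part of the script; `_FirstCapture` is a hypothetical helper that returns the first captured group or "(none)".

```autoit
; Sketch only: compares the original and the revised 4-digit-year patterns.
; _FirstCapture is a hypothetical helper, not part of the original script.
Local $sOld = '(?:\D*)(\d{4})(?:\D|\z)'   ; original: any 4 digits followed by a non-digit or the end
Local $sNew = '(?:\A|\D)(\d{4})(?:\D|\z)' ; revised: the 4 digits must also start the string or follow a non-digit
Local $aNames[3] = ["0623112010", "myFile2015.txt", "myFile20155.txt"]

For $i = 0 To UBound($aNames) - 1
    ConsoleWrite($aNames[$i] & ": old=" & _FirstCapture($aNames[$i], $sOld) & _
            ", new=" & _FirstCapture($aNames[$i], $sNew) & @CRLF)
Next
; 0623112010: old=2010, new=(none)
; myFile2015.txt: old=2015, new=2015
; myFile20155.txt: old=0155, new=(none)

Func _FirstCapture($sString, $sPattern)
    Local $aMatch = StringRegExp($sString, $sPattern, 1) ; 1 = return the groups of the first match
    If @error Then Return "(none)"
    Return $aMatch[0]
EndFunc   ;==>_FirstCapture
```

With the original pattern, the trailing "2010" in "0623112010" is picked up as a year, which matches the 01/01/2010 result reported for element 54. The revised pattern only accepts a run of exactly four digits.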

gcue (Author), posted June 9, 2015

perfect! thank you VERY much!!!

czardas, posted June 9, 2015 (edited)

I hope you saw my last edit. Fuzzy algorithms like this often require some testing before you get exactly what you want.

gcue (Author), posted June 9, 2015

got it - thanks for the heads up!