timevex Posted November 20, 2009 Share Posted November 20, 2009 I have a large set of tab separated and comma separated files that I'd like import into an sqlite data base using sqlite.au3 however I'm having trouble splitting the fields correctly due to separators within double quoted text. example;Field1,"Field 2", Field312334,"Quoted Text","city,state"Does anyone know a clean method to do string split that ignores double quoted delimiters?Would replacing quoted delimiters with a temporary string via a regular expressions replace work? I'm horrible at regular expressions, would anyone be able to provide the regular expression syntax to replace delimiters inside double quotes? Example; String: (1234,"quoted text string","quoted string containing a delimiter, example")SearchString: any double quoted commasReplaceString: "!+!"Output: (1234,"quoted text string","quoted string containing a delimiter!+! example")Thoughts?Thanks in advance!Tim Link to comment Share on other sites More sharing options...
weaponx Posted November 20, 2009 Share Posted November 20, 2009 (edited) expandcollapse popup$aRow = _StringSplitRegExp($sRow, ',(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))') #cs ---------------------------------------------------------------------------- AutoIt Version: 3.3.0.0 Author: WeaponX Script Function: Split string on regular expression Parameters: String = String to be split Pattern = Pattern to split on IncludeMatch = True / False - Indicates whether or not to include the match in the return (back-reference) Count = Number of splits to perform #ce ---------------------------------------------------------------------------- Func _StringSplitRegExp($sString, $sPattern, $sIncludeMatch = false, $iCount = 0) ;All matches will be replaced with this string Local $sReservedPattern = Chr(0) ;Local $sReservedPattern = "#" Local $sReplacePattern = $sReservedPattern ;Modify the reserve pattern to include back-reference If $sIncludeMatch Then $sReplacePattern = "$0" & $sReplacePattern ;Replace all occurences of the search pattern with a replace string $sTemp = StringRegExpReplace($sString, $sPattern, $sReplacePattern, $iCount) ;Consolewrite($sTemp & @CRLF) ;Strip trailing character if it matches the reserved pattern If StringRight($sTemp, 1) = $sReservedPattern Then $sTemp = StringTrimRight($sTemp, 1) ;Split string using entire reserved string $aResult = StringSplit($sTemp, $sReservedPattern, 1) Return $aResult EndFunc Edited November 20, 2009 by weaponx maniootek 1 Link to comment Share on other sites More sharing options...
timevex Posted November 20, 2009 Author Share Posted November 20, 2009 WeaponX,That is absolutely beautiful! I need to work with CSV's and tab separated files, Could you provide a regular expression string that will split on TAB's but not on double quoted TABs? Thank you!Tim Link to comment Share on other sites More sharing options...
Authenticity Posted November 21, 2009 Share Posted November 21, 2009 #include <Array.au3> Local $avStrings[3][3] = [['Field1,"Field 2", Field3,"try," "to,", ",beat this, ", one', ',', '"'], ["12334 'Quoted Text' 'city state'", @TAB, "'"], ["@Lorem#ispm@#dolor#@sit amet@", '#', '@']] Local $avMatches For $i = 0 To UBound($avStrings)-1 $avMatches = _StringSplitSV($avStrings[$i][0], $avStrings[$i][1], $avStrings[$i][2]) If Not @error Then _ArrayDisplay($avMatches) Next Func _StringSplitSV($sString, $sSepChr = ",", $sSpecialChr = '"') Return StringRegExp($sString, "\G(?:\Q" & $sSepChr & "\E|^)((?>[^\Q" & $sSepChr & $sSpecialChr & "\E]*(?:" & $sSpecialChr & "[^\Q" & $sSpecialChr & "\E]*" & $sSpecialChr & ")?)+)", 3) EndFunc Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now