Damein Posted July 14, 2016

Not sure what to label this, and it may be easier to use StringRegExp, but I'm still a noob at that ^^;

I need a way to compare two text files and find similarities. I know some tools for this already exist, but I wanted one specific to what I need. Sadly, the text files can take various formats. Here are the variants:

A: plain_text_here number
B: plain_text_here "number"
C: plain_text_here 'number'
D: plain_text_here"number"
E: plain_text_here'number'
F: plain_text_here"number" // text here
G: plain_text_here'number' // text here
H: plain_text_here "number" "number"
I: plain_text_here 'number 'number'

So I need a way to split off the plain_text_here part and all the possible number scenarios, then group them back up to check against each other. Each variant is on its own line, so the first thing I did was _FileReadToArray and then start cycling through the lines.

Example file:

plain_text_here "1"
plain_text_here "2"
plain_text_here '3'
plain_text_here '4' '5'
plain_text_here"6"

I started doing it with just StringSplit but soon got lost in all the variants, and wanted to see if anyone knows of a better/cleaner way. Thanks in advance!
orbs Posted July 14, 2016

After reading the lines into an array, try this on each line:

step 1: trim any trailing comment (beginning with // ).
step 2: use StringSplit with several delimiters simultaneously. The delimiters are whitespace, single quote, and double quote.
step 3: loop from the final substring backward, and stop when the substring is neither empty nor a number (meaning you have reached the plain_text_here substring). During the loop, ignore anything that is not a number.

If the plain_text_here substring does not contain whitespace or quotes, you can make it easier: loop forward, starting at the 2nd substring.
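The steps above can be sketched roughly as follows. This is only a minimal sketch, not orbs' own code: `_NormalizeLine` is a hypothetical helper name, and it assumes that plain_text_here contains no whitespace, quotes, or `//`, so the simpler forward loop applies.

```autoit
; Hypothetical helper (not from the thread): turns any of the variants
; into "plain_text_here number [number ...]".
Func _NormalizeLine($sLine)
    ; Step 1: trim a trailing comment that begins with //
    Local $iPos = StringInStr($sLine, "//")
    If $iPos > 0 Then $sLine = StringLeft($sLine, $iPos - 1)

    ; Step 2: split on space, tab, single quote and double quote at once.
    ; With the default flag, every character of the delimiter string
    ; acts as a separate delimiter.
    Local $aParts = StringSplit($sLine, " " & @TAB & "'" & '"')

    ; Step 3 (forward variant): element 1 is assumed to be plain_text_here;
    ; after that, keep only the substrings that consist entirely of digits.
    Local $sResult = $aParts[1]
    For $i = 2 To $aParts[0]
        If StringRegExp($aParts[$i], "^\d+$") Then $sResult &= " " & $aParts[$i]
    Next
    Return $sResult
EndFunc

ConsoleWrite(_NormalizeLine('plain_text_here"6" // text here') & @CRLF)
ConsoleWrite(_NormalizeLine("plain_text_here '4' '5'") & @CRLF)
```

Adjacent delimiters produce empty substrings, which is why the loop tests each piece instead of assuming that every piece is a number.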
Moderators Melba23 Posted July 14, 2016

Damein,

A regex would seem the logical way to do this. Here is my poor effort, which requires 2 passes:

    Global $aList[10] = [9]
    $aList[1] = "plain_1_text_here 11"
    $aList[2] = 'plain_2_text_here "2"'
    $aList[3] = "plain_3_text_here '33'"
    $aList[4] = 'plain_4_text_here"4"'
    $aList[5] = "plain_5_text_here'55'"
    $aList[6] = 'plain_6_text_here"6" // text here'
    $aList[7] = "plain_7_text_here'77' // text here"
    $aList[8] = 'plain_8_text_here "8" "88"'
    $aList[9] = "plain_9_text_here '9 '99"

    For $i = 1 To $aList[0]
        ; Pass 1: everything before the first whitespace or quote (\x22 = ", \x27 = ')
        $sExtract_Text = StringRegExpReplace($aList[$i], "^(?U)(.*)[\s\x22\x27].*$", "$1")
        ; Pass 2: the rest of the line; \Q...\E keeps the extracted text literal inside the pattern
        $sExtract_Numbers = StringRegExpReplace($aList[$i], "^\Q" & $sExtract_Text & "\E(.*)$", "$1")
        ; Collect every run of digits from the rest of the line
        $aNumbers = StringRegExp($sExtract_Numbers, "\d++", 3)
        For $j = 0 To UBound($aNumbers) - 1
            $sExtract_Text &= " " & $aNumbers[$j]
        Next
        ConsoleWrite($sExtract_Text & @CRLF)
    Next

No doubt a guru will come along soon and show us how to do it in one.

M23
iamtheky Posted July 14, 2016

    #include <Array.au3>

    Global $aList[10] = [9]
    $aList[1] = "plain_1_text_here 11"
    $aList[2] = 'plain_2_text_here "2"'
    $aList[3] = "plain_3_text_here '33'"
    $aList[4] = 'plain_4_text_here"4"'
    $aList[5] = "plain_5_text_here'55'"
    $aList[6] = 'plain_6_text_here"6" // text here'
    $aList[7] = "plain_7_text_here'77' // text here"
    $aList[8] = 'plain_8_text_here "8" "88"'
    $aList[9] = "plain_9_text_here '9 '99"

    For $i = 1 To $aList[0]
        $sLine = StringReplace($aList[$i], "'", "")                    ; strip single quotes
        $sLine = StringReplace($sLine, '"', '')                         ; strip double quotes
        $sLine = StringRegExpReplace($sLine, "//(.*)", "")             ; drop a trailing // comment
        $sLine = StringRegExpReplace($sLine, "(\d+\s)\d+", "$1")       ; number, whitespace, number: drop the second number
        $aList[$i] = StringRegExpReplace($sLine, "(\D)(\d)", "$1 $2")  ; put a space before any digit that follows a non-digit
    Next

    _ArrayDisplay($aList)

edit: I am not the guru who was spoken of earlier, and my solution is janky at best.
SadBunny Posted July 14, 2016

Not sure what you mean by plain_text, but I gather it's pretty much anything before the first delimiter, where the delimiter is any space, single quote or double quote.

Second, I assume that in:

Quote
I: plain_text_here 'number 'number'

... there should be a single quote after that first number?

Third, not sure what you want with those "// text here" parts, but I'm guessing you want to just leave them in, and if so, that a trailing space wouldn't matter?

Maybe something like this?

    #include <MsgBoxConstants.au3>
    #include <StringConstants.au3>

    Global $aList[10] = [9]
    $aList[1] = "plain_1_text_here 11"
    $aList[2] = 'plain_2_text_here "2"'
    $aList[3] = "plain_3_text_here '33'"
    $aList[4] = 'plain_4_text_here"4"'
    $aList[5] = "plain_5_text_here'55'"
    $aList[6] = 'plain_6_text_here"6" // text here'
    $aList[7] = "plain_7_text_here'77' // text here"
    $aList[8] = 'plain_8_text_here "8" "88"'
    $aList[9] = "plain_9_text_here '9' '99'"

    For $i = 1 To $aList[0]
        ; Collapse every run of spaces and quotes into a single space
        $extract = StringRegExpReplace($aList[$i], "[ '""]+", " ")
        ConsoleWrite("|" & $extract & "|" & @CRLF)
    Next

This simply replaces any run of delimiters (spaces, single quotes, double quotes) with a single space. It leaves the "// text here" in, and it leaves a space after most strings. But it is very simple and maybe that's enough. I put pipe symbols around the results so you can see where the spaces are.
mikell Posted July 14, 2016

Assuming that "plain_text_here" contains no space(s), this *should* work:

    #include <Array.au3>

    Global $aList[10] = [9]
    $aList[1] = "plain_1_text_here 11"
    $aList[2] = 'plain_2_text_here "2"'
    $aList[3] = "plain_3_text_here '33'"
    $aList[4] = 'plain_4_text_here"4"'
    $aList[5] = "plain_5_text_here'55'"
    $aList[6] = 'plain_6_text_here"6" // text here'
    $aList[7] = "plain_7_text_here'77' // text here"
    $aList[8] = 'plain_8_text_here "8" "88"'
    $aList[9] = "plain_9_text_here '9 '99"

    For $i = 1 To $aList[0]
        $aList[$i] = StringRegExpReplace($aList[$i], '^\w+\K|[\s\D]+', " ")
    Next

    _ArrayDisplay($aList)
jguinch Posted July 14, 2016

One-shot split:

    #include <Array.au3>

    Global $aList[10] = [9]
    $aList[1] = "plain_1_text_here 11"
    $aList[2] = 'plain_2_text_here "2"'
    $aList[3] = "plain_3_text_here '33'"
    $aList[4] = 'plain_4_text_here"4"'
    $aList[5] = "plain_5_text_here'55'"
    $aList[6] = 'plain_6_text_here"6" // text here'
    $aList[7] = "plain_7_text_here'77' // text here"
    $aList[8] = 'plain_8_text_here "8" "88"'
    $aList[9] = "plain_9_text_here '9 '99"

    For $i = 1 To $aList[0]
        ; (?|...) is a branch reset group: both alternatives capture into group 1
        $aRes = StringRegExp($aList[$i], "(?|^(\w+)|(\d+))", 3)
        _ArrayDisplay($aRes)
    Next

Sorry, edited: my first version was horrible.
mikell Posted July 14, 2016

jguinch,

Is the reset really necessary?

    $aRes = StringRegExp($aList[$i], "^\w+|\d+", 3)
jguinch Posted July 14, 2016

Absolutely not. I started with the idea of using groups, so ....
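To tie the simplified pattern back to the original goal of comparing two files, something along these lines might work. It is a sketch under assumptions, not code from the thread: `_LineToKey` and `_LoadKeys` are hypothetical helper names, `first.txt` and `second.txt` are placeholder paths, and plain_text_here is assumed to be a single run of word characters at the start of the line. The `//` comment is trimmed first so that digits inside a comment are not picked up as numbers.

```autoit
#include <File.au3>

Local $oFirst = _LoadKeys("first.txt")   ; placeholder paths
Local $oSecond = _LoadKeys("second.txt")
For $sKey In $oFirst.Keys()
    If $oSecond.Exists($sKey) Then ConsoleWrite("In both files: " & $sKey & @CRLF)
Next

; Hypothetical helper: normalizes one line to "plain_text number [number ...]".
Func _LineToKey($sLine)
    ; Drop a trailing // comment
    $sLine = StringRegExpReplace($sLine, "//.*$", "")
    ; Leading word, then every run of digits (flag 3 = array of global matches)
    Local $aParts = StringRegExp($sLine, "^\w+|\d+", 3)
    If @error Then Return "" ; nothing matched on this line
    Local $sKey = $aParts[0]
    For $i = 1 To UBound($aParts) - 1
        $sKey &= " " & $aParts[$i]
    Next
    Return $sKey
EndFunc

; Hypothetical helper: reads a file and stores each normalized line as a
; dictionary key (the value is the line number of its first occurrence).
Func _LoadKeys($sFile)
    Local $aLines, $oKeys = ObjCreate("Scripting.Dictionary")
    If Not _FileReadToArray($sFile, $aLines) Then Return $oKeys
    For $i = 1 To $aLines[0]
        Local $sKey = _LineToKey($aLines[$i])
        If $sKey <> "" And Not $oKeys.Exists($sKey) Then $oKeys.Add($sKey, $i)
    Next
    Return $oKeys
EndFunc
```

Because every variant is reduced to the same key, `plain_text_here"6"` in one file and `plain_text_here '6' // text here` in the other would be reported as a match.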