Marlo Posted November 18, 2012 (edited)

So I have a ~6 MB file that is formatted like so:

{
"realm":{"name":"Someserver","slug":"someserver"},
"side1":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},
"Side2":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},
"Side3":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]}
}

Bear in mind that the file can often contain 50k lines of this stuff. I started by reading the file line by line and parsing each line with a simple RegExp that extracts the basic info and pushes it into an in-memory SQLite database, but even so it takes upwards of 30 seconds to process a whole file (at about 30-50% CPU usage).

Here is the RegExp I used:

^.*?{.?auc":(\d*).*?"item":(\d*),"owner":"([\w]+)","bid":(\d*),"buyout":(\d*),"quantity":(\d*),"timeLeft":"([a-zA-Z_]+)"}

I am new to RegExp, so my method is probably very bad :/

Does anyone know a better way for me to do this? My way feels far too clunky.
Edited November 18, 2012 by Marlo
KaFu Posted November 18, 2012 (edited)

Reading the file line by line is a bottleneck. A simple fix is to use _FileReadToArray(), which should be able to handle 6 MB input files. Loop through the resulting array and apply your RegEx to each line.

For larger files I would recommend writing your own parser that reads, e.g., 1 MB chunks: look for the last line break in the buffer (StringInStr with occurrence -1), parse the data up to that point, transfer the rest to a new buffer, and read the next 1 MB chunk. Splitting the lines with a RegExp should already be quite fast.

Edited November 18, 2012 by KaFu
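The chunked approach described above could look roughly like the following. This is only a sketch under assumptions: the file name, the $CHUNK constant and the _ParseChunk() helper are made up for illustration, and the owner group uses [^"]+ rather than \w+ so that names like "SomeNáme" are not skipped. The pattern assumes the fields appear in exactly the order shown in the sample.

```autoit
; Sketch: read a large file in ~1 MB chunks and parse only complete lines.
; $sPath, $CHUNK and _ParseChunk() are illustrative names, not from the thread.
Local Const $CHUNK = 1024 * 1024
Local $sPath = @ScriptDir & "\auctions.json" ; hypothetical file name

Local $hFile = FileOpen($sPath, 0) ; 0 = read mode
If $hFile = -1 Then
    MsgBox(16, "Error", "Cannot open " & $sPath)
    Exit 1
EndIf

Local $sRest = "", $sBuffer, $iPos
While 1
    $sBuffer = FileRead($hFile, $CHUNK)
    If @error And $sBuffer = "" Then ExitLoop ; end of file (or read error)
    $sBuffer = $sRest & $sBuffer
    ; Occurrence -1 searches from the right: position of the last line break.
    $iPos = StringInStr($sBuffer, @LF, 0, -1)
    If $iPos = 0 Then
        $sRest = $sBuffer ; no complete line yet, keep reading
        ContinueLoop
    EndIf
    _ParseChunk(StringLeft($sBuffer, $iPos)) ; complete lines only
    $sRest = StringMid($sBuffer, $iPos + 1)  ; partial last line waits for the next chunk
WEnd
If $sRest <> "" Then _ParseChunk($sRest) ; whatever is left after the final chunk
FileClose($hFile)

Func _ParseChunk($sText)
    ; Flag 3 returns all capture groups of all matches in one flat array.
    Local $aFields = StringRegExp($sText, _
            '"auc":(\d+),"item":(\d+),"owner":"([^"]+)","bid":(\d+),' & _
            '"buyout":(\d+),"quantity":(\d+),"timeLeft":"(\w+)"', 3)
    If @error Then Return ; no auction lines in this chunk
    ; Every 7 consecutive elements describe one auction.
    For $i = 0 To UBound($aFields) - 1 Step 7
        ; $aFields[$i] = auc, $aFields[$i + 2] = owner, ... store the row here
    Next
    ConsoleWrite("Parsed " & UBound($aFields) / 7 & " auctions" & @CRLF)
EndFunc
```

Because each chunk is cut at a line break, no auction record is ever split across two regex calls, so the pattern does not need to handle partial lines.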
AZJIO Posted November 18, 2012 (edited)

#include <Array.au3>

$sText = _
        '{' & @CRLF & _
        '"realm":{"name":"Someserver","slug":"someserver"},' & @CRLF & _
        '"side1":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},' & @CRLF & _
        '"Side2":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},' & @CRLF & _
        @CRLF & _
        '"Side3":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]}' & @CRLF & _
        '}'
; MsgBox(0, "Message", $sText)

; Flag 3 returns every capture group of every match in one flat array
$aText = StringRegExp($sText, '(?m)^.*?{"auc":(\d*).*?"item":(\d*),"owner":"([\w]+)","bid":(\d*),"buyout":(\d*),"quantity":(\d*),"timeLeft":"(\w+)"}', 3)
If Not @error Then
    $n = UBound($aText)
    ; One row per auction (7 fields each); element [0][0] holds the row count
    Local $aText2D[$n / 7 + 1][7] = [[$n / 7]]
    For $i = 0 To $n - 1 Step 7
        $d = $i / 7 + 1
        $aText2D[$d][0] = $aText[$i]
        $aText2D[$d][1] = $aText[$i + 1]
        $aText2D[$d][2] = $aText[$i + 2]
        $aText2D[$d][3] = $aText[$i + 3]
        $aText2D[$d][4] = $aText[$i + 4]
        $aText2D[$d][5] = $aText[$i + 5]
        $aText2D[$d][6] = $aText[$i + 6]
    Next
    ; Display only when the array exists, i.e. when the RegExp matched
    _ArrayDisplay($aText2D, 'Array')
EndIf

Edited November 18, 2012 by AZJIO
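To run the same idea against the real file instead of a hard-coded string, something like the sketch below might do. Note the assumptions: the file name is hypothetical, and the owner group is swapped to [^"]+ because the original sample contains "SomeNáme", and depending on the AutoIt version and RegExp settings, \w may only match ASCII word characters. The pattern also assumes the fields always come in the order shown in the sample.

```autoit
; Sketch: one global RegExp pass over the whole file.
; The file name is hypothetical; [^"]+ for the owner is a swapped-in alternative to \w+.
Local $sText = FileRead(@ScriptDir & "\auctions.json")
If @error Then Exit 1 ; file missing or unreadable

Local $sPattern = '\{"auc":(\d+),"item":(\d+),"owner":"([^"]+)","bid":(\d+),' & _
        '"buyout":(\d+),"quantity":(\d+),"timeLeft":"(\w+)"\}'
Local $aFlat = StringRegExp($sText, $sPattern, 3) ; 3 = array of all global matches
If @error Then Exit 2 ; nothing matched

; Seven captured fields per auction, in the order they appear in the pattern.
ConsoleWrite("Auctions found: " & UBound($aFlat) / 7 & @CRLF)
ConsoleWrite("First owner: " & $aFlat[2] & @CRLF)
```

A single FileRead plus one StringRegExp call avoids the per-line overhead entirely, which is likely where most of the 30 seconds went; for a 6 MB file the whole text fits in memory comfortably.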