Finding combinations and remove line


Untested code that reads the data file and a file with the patterns, then writes all undeleted records to an output file.

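#include <File.au3>
Global $aDeletePatterns[1], $sDataFile, $sNumbers
; Read the file that contains the patterns to delete (one pattern per line)
If Not _FileReadToArray("DeletePatterns.txt", $aDeletePatterns) Then Exit MsgBox(16, "Error", "Unable to read DeletePatterns.txt")
; Build each regular expression once instead of once per data record.
; Any number of values per pattern is allowed; blank pattern lines are ignored.
For $j = 1 To $aDeletePatterns[0]
    $sNumbers = StringStripWS($aDeletePatterns[$j], 7)
    If $sNumbers <> "" Then $sNumbers = "\b" & StringReplace($sNumbers, " ", "\b.+\b") & "\b"
    $aDeletePatterns[$j] = $sNumbers
Next
Global $aDataFile[1]
; Read the file that contains all data records
$hFileIn = FileOpen("DataFile.txt")
If $hFileIn = -1 Then Exit MsgBox(16, "Error", "Unable to open DataFile.txt")
$sDataFile = FileRead($hFileIn)
FileClose($hFileIn)
$aDataFile = StringSplit($sDataFile, @LF)
$hFileOut = FileOpen("DataFileOut.txt", 2) ; 2 = write mode, erase previous contents
For $i = 1 To $aDataFile[0]
    For $j = 1 To $aDeletePatterns[0]
        If $aDeletePatterns[$j] = "" Then ContinueLoop
        ; Delete the record if it contains all numbers of the pattern, in order
        If StringRegExp($aDataFile[$i], $aDeletePatterns[$j]) = 1 Then
            $aDataFile[$i] = ""
            ExitLoop
        EndIf
    Next
    If $aDataFile[$i] <> "" Then FileWriteLine($hFileOut, $aDataFile[$i])
Next
FileClose($hFileOut)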
Edited by water

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


It's time for me to go to bed (it's 1:30 am now). Maybe we'll find a solution tomorrow.


I haven't read the whole thread, but I'd hazard a guess that your text file is rather large. In that case I suggest you split the file into smaller chunks and run the code on each of the smaller files. Otherwise I don't know how you can get around the memory limitations. I may be mistaken, but that's the only reason I can think of for this message appearing when you run water's code.
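
A minimal sketch of the chunking idea, in case it helps. The function name `_SplitFileByLines`, the chunk file naming and the chunk size are all made up for illustration and are not from anyone's script in this thread. The source file is read line by line, so it is never held in memory as a whole.

```autoit
; Hypothetical helper: split $sSource into numbered chunk files
; ($sChunkPrefix & "1.txt", "2.txt", ...) of at most $iLinesPerChunk lines each.
; Returns the number of chunk files written, or 0 with @error set on failure.
Func _SplitFileByLines($sSource, $sChunkPrefix, $iLinesPerChunk)
    Local $hIn = FileOpen($sSource, 0) ; 0 = read mode
    If $hIn = -1 Then Return SetError(1, 0, 0)

    Local $iChunk = 0, $iLines = 0, $hOut = -1, $sLine
    While 1
        $sLine = FileReadLine($hIn)
        If @error Then ExitLoop ; end of file (or read error)

        ; Start a new chunk file when none is open yet or the current one is full
        If $hOut = -1 Or $iLines >= $iLinesPerChunk Then
            If $hOut <> -1 Then FileClose($hOut)
            $iChunk += 1
            $iLines = 0
            $hOut = FileOpen($sChunkPrefix & $iChunk & ".txt", 2) ; 2 = overwrite
            If $hOut = -1 Then
                FileClose($hIn)
                Return SetError(2, 0, 0)
            EndIf
        EndIf

        FileWriteLine($hOut, $sLine)
        $iLines += 1
    WEnd

    If $hOut <> -1 Then FileClose($hOut)
    FileClose($hIn)
    Return $iChunk
EndFunc

; Example call (file names and chunk size are assumptions):
Local $iChunks = _SplitFileByLines("DataFile.txt", "DataFile_part", 1000000)
ConsoleWrite("Chunks written: " & $iChunks & @LF)
```

Each chunk can then be processed on its own and the output files appended to one another afterwards.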

Edited by czardas

There should not be any memory error with this example because the "big" file is not put into an array.

#include <File.au3>
#include <Array.au3>

Local $sFileName = "combo.txt" ; Full path of file name unless script in same directory.
Local $sNewFileName = "combobackup.txt" ; Output file: the lines that remain after deletion.

Local $sDeleteLines = _    ;
        "01 02 14 37" & @CRLF & _
        "01 03 06 07 43"

Local $aLinesEx = StringSplit(StringStripCR($sDeleteLines), @LF, 2)
Local $aMatchLineToDel
;_ArrayDisplay($aLinesEx)

Local $fileRead = FileOpen($sFileName, 0) ; Open to read
If $fileRead = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

Local $fileWrite = FileOpen($sNewFileName, 2) ; Open to write

; Read in lines of text until the EOF is reached
While 1
    Local $line = FileReadLine($fileRead)
    If @error = -1 Then ExitLoop
    Local $Flag = 0 ; Initialize Flag

    ; Check if each number of each line in $sDeleteLines exists in file line, $line.
    For $j = 0 To UBound($aLinesEx) - 1
        $aMatchLineToDel = StringSplit(StringStripWS($aLinesEx[$j], 7), " ", 2)

        For $k = 0 To UBound($aMatchLineToDel) - 1 ; Check all individual numbers of a particular line in $sDeleteLines.
            If StringInStr($line, $aMatchLineToDel[$k]) <> 0 Then
                $Flag = 1
            Else
                $Flag = 0
                ExitLoop ; Test next line in $sDeleteLines.
            EndIf
        Next
        If $Flag = 1 Then ; All numbers match for this particular line in $sDeleteLines.. So, exit For-Next loop and do not write to file
            ExitLoop ; Test next line in file.
        EndIf
    Next
    If $Flag = 0 Then
        FileWriteLine($fileWrite, $line); File handle used.
    EndIf
WEnd

FileClose($fileRead)
FileClose($fileWrite)

ShellExecute($sNewFileName)

If you check your input file with Windows Explorer how large is it?

How much free memory does your PC have?

What operating system do you use? 32 or 64 bit?

Do you compile your script to use 64 bit?

Edited by water


Another question: are the numbers in each row and each line always sorted?

Br,

UEZ

Edited by UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ


Malkey,

The code runs, but it is slow. My PHP script is a little faster, which is strange.

water,

If you check your input file with Windows Explorer how large is it? 211 MB (221,384,704 bytes)

How much free memory does your PC have? Windows XP, 2 GHz CPU, 3 GB RAM

What operating system do you use? 32 or 64 bit? 32 bit

Do you compile your script to use 64 bit? 32 bit

UEZ,

Yes!!


Read,

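Run the following after changing the file name to the file that contains all the number strings.

Local $mst_fl = "put your fully qualified file name here"
Local $h_mstfl = FileOpen($mst_fl, 0) ; 0 = read mode
If $h_mstfl = -1 Then
    ConsoleWrite("Unable to open " & $mst_fl & @LF)
    Exit
EndIf
; Flag 3 = 1 (treat @CRLF as a single delimiter) + 2 (no count in element 0)
Local $a_10 = StringSplit(FileRead($h_mstfl), @CRLF, 3)
FileClose($h_mstfl)
ConsoleWrite(UBound($a_10) & @LF) ; number of lines in the file

Watch the console output and tell me what number is displayed.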

kylomas

Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


The script is not working?

Local $mst_fl = "DataFile.txt"

I also tried: c:itDataFile.txt

Edit

"Error allocating memory" opens many, many popup windows :)

Edit

Do you want to know how many lines my document has?

Lines: 11651612


Edited by Read

I modified my post because I got the "Error allocating memory" as well. Replacing _FileReadToArray with FileRead plus StringSplit works.

If your file uses @CRLF line endings, you have to replace the line

$aDataFile = StringSplit($sDataFile, @LF)
with
$aDataFile = StringSplit($sDataFile, @CRLF, 1)
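
To illustrate why the flag matters, here is a small sketch with a made-up two-line sample string (not the real data file). With flag 1, StringSplit treats the whole delimiter string as one separator. Without it, every character of the delimiter string splits on its own.

```autoit
; Sample text with Windows line endings (assumed data, for illustration only)
Local $sText = "01 02 03" & @CRLF & "04 05 06"

; Splitting on @LF alone leaves a trailing @CR on every line except the last
Local $aByLf = StringSplit($sText, @LF)
ConsoleWrite(StringLen($aByLf[1]) & @LF) ; 9: the 8 visible characters plus @CR

; Flag 1 uses the entire @CRLF string as the delimiter
Local $aByCrLf = StringSplit($sText, @CRLF, 1)
ConsoleWrite(StringLen($aByCrLf[1]) & @LF) ; 8

; Without flag 1, @CR and @LF each act as a delimiter,
; which produces an empty element between the two lines
Local $aNoFlag = StringSplit($sText, @CRLF)
ConsoleWrite($aNoFlag[0] & @LF) ; 3
```

Another option is to call StringStripCR on the text first and then split on @LF, although on a file this size that creates one more large copy of the string in memory.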


Read,

Do you have procexp.exe (Process Explorer) installed? If so, go to View|System Information|Memory and see what your ACTUAL available memory is.

If not, it can be downloaded from Microsoft (do a Google search for it).

Also, how did you run the script that I gave you?

kylomas


@kylomas

I'm not 100% sure, but it seems to be a bug in AutoIt. If you use my code in post #42, you won't get the memory error.

I tested my original script with a 200 MB file and got the memory error on Windows 7 64 bit. After I changed the script to what you see now, everything works just fine.


Could you please try my script in post #42 again?

Make sure that "DataFile.txt" is replaced by the filename of your 200MB data file.
