File Detect Type


Trong
I ran into a problem when opening image files downloaded from the internet for editing: the file extension does not always match the actual file format.

For example, MyPic.png may actually be a JPEG file.

So I wrote this function to detect the correct extension for a file from its header.

If you have a better solution or function, please share:

;~ #include <File.au3>
;~ Global $sPathToFiles = 'C:\_Website_\random-img\img\horizontal'
;~ Local $aArray = _FileListToArrayRec($sPathToFiles, "*.*", $FLTAR_FILES, $FLTAR_RECUR, 0, $FLTAR_FULLPATH)

;~ Local $sFile, $orgTYPE, $curTYPE
;~ For $i = 1 To UBound($aArray) - 1
;~  $sFile = $aArray[$i]
;~  $orgTYPE = _FileDetectType($sFile)
;~  $curTYPE = __GetEXT($sFile)
;~  If $curTYPE <> $orgTYPE Then
;~      ConsoleWrite('! [' & $i & '] ' & $sFile & ' - [' & $curTYPE & ' > ' & $orgTYPE & ']' & @CRLF)
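;~      ; Replace only the trailing extension, not every occurrence of it in the path
;~      FileMove($sFile, StringRegExpReplace($sFile, '\Q' & $curTYPE & '\E$', $orgTYPE), 1)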
;~  Else
;~      ConsoleWrite('- [' & $i & '] ' & $sFile & ' - [' & $curTYPE & ' > ' & $orgTYPE & ']' & @CRLF)
;~  EndIf
;~ Next

Func _FileDetectType($sFile)
    Local $iExt = __GetEXT($sFile)
    Local $sExt = __ExtHeaderProcessing($sFile, 16)
    If $sExt = '' Then $sExt = __ExtHeaderProcessing($sFile, 256)
    $sExt = StringLeft($sExt, 16)
    Local $2String = StringLeft($sExt, 2)
    If StringInStr($sExt, 'Exif') Or StringInStr($sExt, 'JFIF') Or StringInStr($sExt, 'Adobe_d') Then
        Return '.jpg'
    ElseIf StringInStr($sExt, 'avif') Then
        Return '.avif'
    ElseIf StringInStr($sExt, 'WEBP') Then
        Return '.webp'
    ElseIf StringInStr($sExt, 'GIF') Then
        Return '.gif'
    ElseIf $2String == 'BM' Then
        Return '.bmp'
    ElseIf StringInStr($sExt, '_II') Then
        Return '.eps'
    ElseIf StringInStr($sExt, 'PDF') Then
        Return '.pdf'
    ElseIf StringInStr($sExt, 'PNG') Then
        Return '.png'
    ElseIf StringInStr($sExt, 'II') Then
        Return '.tif'
    ElseIf StringInStr($sExt, '8BPS') Then
        Return '.psd'
    ElseIf StringInStr($sExt, 'L_F') Or $sExt == 'L' Then
        Return '.lnk'
    ElseIf StringInStr($sExt, 'ITSF') Then
        Return '.chm'
    ElseIf StringInStr($sExt, 'MSCF') Then
        Return '.cab'
    ElseIf StringInStr($sExt, 'ADBE') Then
        Return '.icc'
    ElseIf StringInStr($sExt, 'SQLite_f') Then
        Return '.db'
    ElseIf $sExt == 'MZ' Or $sExt == 'MZx' Or $sExt == 'MZP' Then
        If StringInStr($iExt, 'dll') Then Return '.dll'
        If StringInStr($iExt, 'tlb') Then Return '.tlb'
        Return '.exe'
    ElseIf StringInStr($sExt, 'PA30') Or StringInStr($sExt, 'TPA30') Or StringInStr($sExt, 'F_B') Or $sExt == 't_B' Or $sExt == 's_B' Or $sExt == 'z_B' Or $sExt == 'B_V' Or $sExt == '00_h' Or $sExt == 'f_h' Or $sExt == 'v' Or $sExt == 'h' Or $sExt == 'B' Then
        Return '.ico'
    Else
        Return $iExt
    EndIf
EndFunc   ;==>_FileDetectType

Func __ExtHeaderProcessing($sFile, $rCount = Default)
    Local $hOpen = FileOpen($sFile, 16) ; 16 = binary mode
    If $hOpen = -1 Then Return '' ; file could not be opened
    Local $Header = FileRead($hOpen, $rCount)
    FileClose($hOpen)
    Local $RegExNonStandard = "(?i)([^a-z0-9-_])", $RegExNoUnicode = "(*UCP)\x{2019}"
    Local $sExt = BinaryToString($Header)
    $sExt = StringRegExpReplace($sExt, $RegExNonStandard, "_")
    $sExt = StringRegExpReplace($sExt, $RegExNoUnicode, "_")
    While StringInStr($sExt, '__')
        $sExt = StringReplace($sExt, '__', '_')
    WEnd
    If StringRight($sExt, 1) == '_' Then $sExt = StringTrimRight($sExt, 1)
    If StringLeft($sExt, 1) == '_' Then $sExt = StringTrimLeft($sExt, 1)
    Return $sExt
EndFunc   ;==>__ExtHeaderProcessing

Func __GetEXT($sFile)
    Local $iExt = StringRegExpReplace($sFile, "^.*\.", "")
    If ($iExt = '') Then $iExt = StringMid($sFile, StringInStr($sFile, ".", 2, -1))
    If StringRight($iExt, 1) == '.' Then $iExt = StringTrimRight($iExt, 1)
    If StringLeft($iExt, 1) == '.' Then $iExt = StringTrimLeft($iExt, 1)
    Return '.' & $iExt
EndFunc   ;==>__GetEXT
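
A possible alternative, as a rough sketch only: instead of cleaning the header into text and searching for substrings, the raw header bytes can be compared against well-known signatures ("magic numbers") at fixed offsets. The names _MagicByteExt and _HexHeaderToExt are hypothetical helpers, not part of the script above, and only a handful of formats are covered.

```autoit
; Sketch only: hypothetical helpers, not part of the original script.
; Reads the first 12 bytes and matches them as an uppercase hex string.
Func _MagicByteExt($sFile)
    Local $hFile = FileOpen($sFile, 16) ; 16 = binary mode
    If $hFile = -1 Then Return SetError(1, 0, '')
    Local $bHeader = FileRead($hFile, 12)
    FileClose($hFile)
    Return _HexHeaderToExt(Hex($bHeader))
EndFunc   ;==>_MagicByteExt

; Maps a hex header string to an extension; returns '' when nothing matches.
Func _HexHeaderToExt($sHex)
    If StringLeft($sHex, 6) = 'FFD8FF' Then Return '.jpg' ; JPEG start-of-image marker
    If StringLeft($sHex, 16) = '89504E470D0A1A0A' Then Return '.png' ; PNG signature
    If StringLeft($sHex, 8) = '47494638' Then Return '.gif' ; "GIF8"
    If StringLeft($sHex, 4) = '424D' Then Return '.bmp' ; "BM"
    ; WebP: "RIFF" at byte 0 and "WEBP" at byte 8 (hex position 17)
    If StringLeft($sHex, 8) = '52494646' And StringMid($sHex, 17, 8) = '57454250' Then Return '.webp'
    If StringLeft($sHex, 8) = '25504446' Then Return '.pdf' ; "%PDF"
    Return ''
EndFunc   ;==>_HexHeaderToExt
```

Because each signature is checked at a fixed position, a stray "II" or "GIF" somewhere else in the first bytes should not trigger a false match. An empty return value means "unknown", so the caller can fall back to the existing extension.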

 

Edited by Trong

Regards,
 


I'm using IrfanView as a viewer, which just tells me that the file has been renamed with the correct extension.
Since you did it yourself, congratulations.

P.S. In your script, rename iFile to sFile.

Edited by ioa747

I know that I know nothing


You can use TrIDLib. It can identify many file types based on their signatures. Here is a full list with definitions.

#include-once
#include <Array.au3>

Global Const $TRID_GET_RES_NUM      = 1     ; Get the number of results
Global Const $TRID_GET_RES_FILETYPE = 2     ; Filetype descriptions
Global Const $TRID_GET_RES_FILEEXT  = 3     ; Filetype extension
Global Const $TRID_GET_RES_POINTS   = 4     ; Matching points
Global Const $TRID_GET_VER          = 1001  ; TrIDLib version
Global Const $TRID_GET_DEFSNUM      = 1004  ; Filetypes definitions loaded

$aInfo = TrIDLib_AnalyzeFile('<FilePath>')
_ArrayDisplay($aInfo)

Func TrIDLib_AnalyzeFile($sFile, $DefsPack = '', $bVerbose = False, $LibPath = 'TrIDLib.dll')
    Local $aRet, $hTrIDLib, $iTotal = 0
    $hTrIDLib = DllOpen($LibPath)
    If $hTrIDLib = -1 Then Return False
    If $bVerbose Then
        $aRet = DllCall($hTrIDLib, 'int', 'TrID_GetInfo', 'int', $TRID_GET_VER, 'int', 0, 'str', 0)
        ConsoleWrite(Round($aRet[0] / 100, 2) & @CRLF)
    EndIf
    $aRet = DllCall($hTrIDLib, 'int', 'TrID_LoadDefsPack', 'str', $DefsPack)
    If Not $aRet[0] Then
        DllClose($hTrIDLib)
        Return Null
    EndIf
    If Not FileExists($sFile) Then
        DllClose($hTrIDLib) ; release the DLL handle before bailing out
        Return Null
    EndIf
    $aRet = DllCall($hTrIDLib, 'int', 'TrID_SubmitFileA', 'str', $sFile)
    If Not $aRet[0] Then
        DllClose($hTrIDLib)
        Return Null
    EndIf
    $aRet = DllCall($hTrIDLib, 'int', 'TrID_Analyze')
    If Not $aRet[0] Then
        DllClose($hTrIDLib)
        Return Null
    EndIf
    $aRet = DllCall($hTrIDLib, 'int', 'TrID_GetInfo', 'int', $TRID_GET_RES_NUM, 'int', 0, 'str', 9)
    If $aRet[0] < 1 Then
        DllClose($hTrIDLib)
        Return Null
    EndIf
    Local $aMatches[$aRet[0] + 1][4]
    $aMatches[0][0] = $aRet[0]
    For $Index = 1 To $aRet[0]
        $aRet = DllCall($hTrIDLib, 'int', 'TrID_GetInfo', 'int', $TRID_GET_RES_FILETYPE, 'int', $Index, 'str', 0)
        $aMatches[$Index][0] = $aRet[3]
        $aRet = DllCall($hTrIDLib, 'int', 'TrID_GetInfo', 'int', $TRID_GET_RES_FILEEXT, 'int', $Index, 'str', 0)
        $aMatches[$Index][1] = $aRet[3]
        $aRet = DllCall($hTrIDLib, 'int', 'TrID_GetInfo', 'int', $TRID_GET_RES_POINTS, 'int', $Index, 'str', 0)
        $aMatches[$Index][2] = $aRet[0]
        $iTotal += $aRet[0]
    Next
    If $iTotal > 0 Then
        For $Index = 1 To $aMatches[0][0]
            $aMatches[$Index][3] = Round($aMatches[$Index][2] * 100 / $iTotal, 2)
        Next
    EndIf
    DllClose($hTrIDLib)
    Return $aMatches
EndFunc

Here you can download the required dll.
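
To tie this back to the original question, here is a minimal sketch of how the array returned by TrIDLib_AnalyzeFile could be reduced to a single extension by taking the match with the most points. _TrID_BestExt is a hypothetical helper, and the handling of "/" is an assumption that alternative extensions may be reported in one string such as JPG/JPEG.

```autoit
; Sketch only: hypothetical helper built on the array layout used above,
; where [n][1] = extension, [n][2] = matching points and [0][0] = match count.
Func _TrID_BestExt($aMatches)
    If Not IsArray($aMatches) Then Return SetError(1, 0, '') ; Null/False: analysis failed
    Local $iBest = 0, $iBestPoints = -1
    For $i = 1 To $aMatches[0][0]
        If $aMatches[$i][2] > $iBestPoints Then
            $iBestPoints = $aMatches[$i][2]
            $iBest = $i
        EndIf
    Next
    If $iBest = 0 Then Return SetError(2, 0, '') ; no matches reported
    Local $sExt = StringLower($aMatches[$iBest][1])
    ; Assumption: alternatives, if any, are separated by "/"; keep the first one
    Local $iSlash = StringInStr($sExt, '/')
    If $iSlash Then $sExt = StringLeft($sExt, $iSlash - 1)
    Return '.' & $sExt
EndFunc   ;==>_TrID_BestExt
```

Something like _TrID_BestExt(TrIDLib_AnalyzeFile($sFile)) could then be compared with the file's current extension before renaming.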

Edited by Andreik

When the words fail... music speaks.


  • Developers
10 minutes ago, locvantienls2 said:

How can I get in touch with you? (translated from Vietnamese)

 

This is an English forum, so please use a translator!

 
Live for the present,
Dream of the future,
Learn from the past.
  :)


On 9/3/2024 at 5:50 PM, Trong said:

I ran into a problem when opening image files downloaded from the internet for editing: the file extension does not always match the actual file format.

For example, MyPic.png may actually be a JPEG file.

So I wrote this function to detect the correct extension for a file from its header.

If you have a better solution or function, please share.


 

 
