protfromkpax Posted January 8, 2023 (edited January 9, 2023)
Hello, is it possible to use AutoIt to determine whether a file is in PDF or PDF/A (PDF/A-1, -2 or -3) format?
Danp2 Posted January 8, 2023
Have you reviewed this thread?
Jos (Developers) Posted January 8, 2023 (edited January 8, 2023)
English please!!! Moved to the appropriate AutoIt General Help and Support forum, as the Developer General Discussion forum very clearly states: "General development and scripting discussions. Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums." Moderation Team
protfromkpax (Author) Posted January 9, 2023
12 hours ago, Danp2 said: "Have you reviewed this thread?"
Yes, but that thread does not cover PDF/A...
Danp2 Posted January 9, 2023
I assume you tried running the code. What were the results?
protfromkpax (Author) Posted January 9, 2023 (edited January 9, 2023)
1 hour ago, Danp2 said: "I assume you tried running the code. What were the results?"
The code checks whether the file is in PDF format (the version, such as 1.5 or 1.7, is at the beginning of the PDF file), but it cannot tell PDF and PDF/A apart. I ran it and the result was "PDF-File detected" 🙂
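The header check described above can be sketched roughly as follows. This is only an illustration, not the code from the linked thread: `_PdfHeaderVersion` and `example.pdf` are hypothetical names, and the sketch assumes the `%PDF-x.y` marker sits at the very first byte of the file.

```autoit
; Sketch: read the "%PDF-x.y" marker from the first bytes of a file.
; _PdfHeaderVersion is a hypothetical helper name, not part of any UDF.
Func _PdfHeaderVersion($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary mode
    If $hFile = -1 Then Return SetError(1, 0, "") ; file could not be opened
    Local $sHead = BinaryToString(FileRead($hFile, 8)) ; first 8 bytes as text
    FileClose($hFile)
    ; Capture the version digits that follow "%PDF-", e.g. "1.7"
    Local $aMatch = StringRegExp($sHead, "^%PDF-(\d\.\d)", 1)
    If @error Then Return SetError(2, 0, "") ; no PDF header at the start
    Return $aMatch[0]
EndFunc

Local $sVersion = _PdfHeaderVersion("example.pdf") ; hypothetical file name
If @error Then
    ConsoleWrite("No PDF header found (or file unreadable)" & @CRLF)
Else
    ConsoleWrite("PDF version " & $sVersion & @CRLF)
EndIf
```

As noted in the post, this only identifies the PDF version; it says nothing about PDF/A.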
protfromkpax (Author) Posted January 9, 2023
When the original file comes from Word, I can recognize PDF/A from the readable parts of the PDF file. When it contains graphics, there are "Chinese characters" inside the PDF...
rsn Posted January 9, 2023
Validating that a document is PDF/A is kind of a black magic test. There are test suites available that check the contents of a PDF against the standard, but the standard is kind of nebulous in its definitions. The most reliable (IMHO) is from VeraPDF (https://verapdf.org/software/). It's Java-based but comes with a decent command line tool. The command line: verapdf.bat --format text "c:\path\to\yourfile.PDF" will simplify the output to pass/fail.
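If the command line works on the target machine, calling it from AutoIt could look something like the sketch below. The install path, the file path and the helper name `_RunVeraPdf` are assumptions for illustration; only the `--format text` option comes from the post above. The sketch returns the raw output rather than parsing it, since the exact wording of the pass/fail line should be checked against a real run first.

```autoit
#include <AutoItConstants.au3>

; Sketch: run the VeraPDF batch file and return everything it prints.
; _RunVeraPdf is a hypothetical helper name.
Func _RunVeraPdf($sVeraPdfBat, $sPdfPath)
    ; cmd /c needs an extra pair of outer quotes when the command contains quoted paths
    Local $sCmd = @ComSpec & ' /c ""' & $sVeraPdfBat & '" --format text "' & $sPdfPath & '""'
    ; $STDERR_MERGED captures STDOUT and STDERR through one handle
    Local $iPid = Run($sCmd, "", @SW_HIDE, $STDERR_MERGED)
    If $iPid = 0 Then Return SetError(1, 0, "") ; process could not be started
    Local $sOutput = ""
    While 1
        $sOutput &= StdoutRead($iPid)
        If @error Then ExitLoop ; stream closed, the process has finished
        Sleep(10)
    WEnd
    Return $sOutput
EndFunc

; Both paths are assumptions; adjust them to the local machine.
Local $sResult = _RunVeraPdf("C:\verapdf\verapdf.bat", "c:\path\to\yourfile.PDF")
If @error Then
    ConsoleWrite("Could not start VeraPDF" & @CRLF)
Else
    ConsoleWrite($sResult & @CRLF)
EndIf
```

Merging STDERR into the captured output means that a message such as "access denied" would show up in the returned text instead of being lost.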
protfromkpax (Author) Posted January 10, 2023
Thanks, the GUI works, but the command prompt says "access denied"... I'm writing software for a specific professional group of users to sign a set of PDF files. Due to a change in the law we have to use PDF/A, so I need a check. Since my software runs on many other PCs, I'm looking for a solution where I can write some code in AutoIt and send it as an update... I can see that it probably won't be easy 🙂
rsn Posted January 10, 2023
Not sure why you'd get an access denied in any of it. The applet isn't really "installed," just kind of copied to your user profile. Maybe a Java issue? The VeraPDF test suite is actually open source (GPL3/MPL2), so if you had the time and talent (unlike me!) you could compile your own version of the test or convert it to your language of choice. See https://github.com/verapdf. Now that I think about it, since it's so liberally licensed, you might even be able to bundle it with your app, as long as some form of Java interpreter is present on the PC as well (a custom/mini build of OpenJDK would work to get around Oracle's fees for business/enterprise use).
bdr529 Posted January 10, 2023 (Solution)
#include <String.au3>

MsgBox(0, "", check_pdfa("AutoIt_Featured_640x480.pdf"))

; Returns the PDF version string, followed by " PDF/A-<part><conformance>" when the metadata declares it.
; @error: 0 = PDF/A declared, 1 = no usable PDF/A declaration, 2 = declared as "1u" (PDF/A-1 has no level u).
Func check_pdfa($file_init_pdf)
    Local $fileopen = FileOpen($file_init_pdf, 16) ; 16 = binary mode
    Local $fileread = BinaryToString(FileRead($fileopen))
    FileClose($fileopen)
    Local $versione_pdf = StringMid($fileread, 2, 7) ; e.g. "PDF-1.7" taken from "%PDF-1.7"
    ; Attribute form with single quotes: pdfaid:part='1'
    Local $_StringBetween_part = _StringBetween($fileread, "pdfaid:part='", "'")
    Local $_StringBetween_conformance = _StringBetween($fileread, "pdfaid:conformance='", "'")
    ; Attribute form with double quotes: pdfaid:part="1"
    If Not IsArray($_StringBetween_part) Or Not IsArray($_StringBetween_conformance) Then
        $_StringBetween_part = _StringBetween($fileread, 'pdfaid:part="', '"')
        $_StringBetween_conformance = _StringBetween($fileread, 'pdfaid:conformance="', '"')
    EndIf
    ; Element form: <pdfaid:part>1</pdfaid:part>
    If Not IsArray($_StringBetween_part) Or Not IsArray($_StringBetween_conformance) Then
        $_StringBetween_part = _StringBetween($fileread, "pdfaid:part>", "<")
        $_StringBetween_conformance = _StringBetween($fileread, "pdfaid:conformance>", "<")
    EndIf
    ; Accept parts 1-3 and levels a/b/u (AutoIt's = compares strings case-insensitively)
    If IsArray($_StringBetween_part) And IsArray($_StringBetween_conformance) And ($_StringBetween_part[0] = "1" Or $_StringBetween_part[0] = "2" Or $_StringBetween_part[0] = "3") And _
            ($_StringBetween_conformance[0] = "a" Or $_StringBetween_conformance[0] = "b" Or $_StringBetween_conformance[0] = "u") Then
        If $_StringBetween_part[0] & $_StringBetween_conformance[0] <> "1u" Then
            Return SetError(0, 0, $versione_pdf & " PDF/A-" & $_StringBetween_part[0] & $_StringBetween_conformance[0])
        Else
            Return SetError(2, 0, $versione_pdf)
        EndIf
    Else
        Return SetError(1, 0, $versione_pdf)
    EndIf
EndFunc

Attachment: AutoIt_Featured_640x480.pdf
protfromkpax (Author) Posted January 11, 2023
You are a great magician! This is exactly what I was looking for. I only understand a little of the code, but I will learn all of it. Thank you so much, I want to dance with joy! (Now many nights with your code await me 🙂)
bdr529 Posted January 11, 2023
I'm the one who should thank the AutoIt community.
rsn Posted January 12, 2023
@bdr529 Until I read your code, I never thought to read the metadata. I opened a PDF in HxD and there it is: the version of the PDF and the level of conformance. I didn't realize that some of the metadata is excluded from the viewable properties. Great work!
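For anyone who wants to see the same metadata lookup in compact form, here is a rough sketch using StringRegExp. It is not bdr529's code: `_PdfaClaim` and `example.pdf` are hypothetical names, and the patterns are an assumption covering the attribute form (`pdfaid:part="1"` or `pdfaid:part='1'`) and the element form (`<pdfaid:part>1</pdfaid:part>`). It assumes the metadata is stored as readable text in the file, as seen in the hex editor.

```autoit
; Sketch: read the PDF/A part and conformance level that a file declares.
; _PdfaClaim is a hypothetical helper name.
Func _PdfaClaim($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary mode
    If $hFile = -1 Then Return SetError(1, 0, "") ; file could not be opened
    Local $sData = BinaryToString(FileRead($hFile))
    FileClose($hFile)

    ; \x22 is a double quote and \x27 a single quote inside the pattern
    Local $aPart = StringRegExp($sData, "pdfaid:part(?:=[\x22\x27]|>)(\d)", 1)
    If @error Then Return SetError(2, 0, "") ; no PDF/A part declared
    Local $aConf = StringRegExp($sData, "pdfaid:conformance(?:=[\x22\x27]|>)([A-Za-z])", 1)
    If @error Then Return SetError(3, 0, "") ; part found, but no conformance level
    Return "PDF/A-" & $aPart[0] & StringUpper($aConf[0])
EndFunc

Local $sClaim = _PdfaClaim("example.pdf") ; hypothetical file name
Local $iErr = @error ; copy @error right away, the next function call resets it
If $iErr Then
    ConsoleWrite("No PDF/A declaration found (error " & $iErr & ")" & @CRLF)
Else
    ConsoleWrite("File declares " & $sClaim & @CRLF)
EndIf
```

Keep in mind that this only reports what the file claims about itself. It does not validate that the content actually conforms, which is what a tool like VeraPDF is for.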