ghatturk Posted September 13, 2021

Hello Friends, I am working on a project to extract data from PDF files using PDF-XChange Editor. Is there a library for extracting data and tables from a PDF? Thank you for any help! Ravi
seadoggie01 Posted September 13, 2021

I'm not familiar with that program. However, if you can get access to the text, my solution has been to write a regular expression that extracts the data into a 1-D array. Of course, this requires the data to be in a standard format, which is not always the case. Once you have the 1-D array, you can convert it into a 2-D array, which is nearly a table.
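For illustration, here is a minimal Python sketch of the regex-to-1-D-to-2-D idea described above. It is not seadoggie01's code. The `ID=`/`Qty=` text format and the helper name `to_rows` are hypothetical, chosen only to make the two steps visible.

```python
import re

# Hypothetical text, standing in for whatever the PDF tool exports.
text = "ID=101 Qty=4 ID=102 Qty=7 ID=103 Qty=1"

# Step 1: one regular expression pulls every value into a flat (1-D) list.
flat = re.findall(r"(?:ID|Qty)=(\d+)", text)


def to_rows(values, width):
    """Regroup a flat list into rows of `width` items (a 2-D list)."""
    if len(values) % width != 0:
        # A mismatch means the text was not in the expected standard format.
        raise ValueError("value count is not a multiple of the row width")
    return [values[i:i + width] for i in range(0, len(values), width)]


# Step 2: two values per record, so each row is [ID, Qty].
table = to_rows(flat, 2)
print(table)
```

The length check is the important part: if the text deviates from the standard format, the regrouping fails loudly instead of silently shifting columns.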
ghatturk (Author) Posted September 13, 2021 (edited)

[AB_012]{Temp} This is a sample text for standard form.
⌈ Table Comes here with info
Data1 This is first explanation for Data1
Info1 This is the explanation for Info1
⌋(CD_345, CD_678)
------------------------------------------------------------

The above shows the most standard form of the data, which repeats throughout the PDF with different numbers. As you can see, there are two special characters, ⌈ and ⌋, and I am not sure how to use them as delimiters. Also, when I extract to Excel, the critical data sometimes comes through without spaces, as "Thisisasampletextforstandardform."

PS: I was trying to work with Excel in case this is not possible with the PDF directly.

Thank you for your first input, seadoggie01.
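As a rough sketch of how the ⌈ and ⌋ characters could serve as delimiters, the Python below matches one record of the sample form with a single regular expression. It assumes the text has already been extracted from the PDF and that every record follows the pattern `[id]{tag} text ⌈ table ⌋(refs)`. The field names (`id`, `tag`, `text`, `table`, `refs`) and the function `parse_records` are made up for this example.

```python
import re

OPEN, CLOSE = "\u2308", "\u230b"  # the two special characters, ⌈ and ⌋

# Assumed record layout: [id]{tag} free text ⌈ table body ⌋(ref, ref, ...)
RECORD = re.compile(
    r"\[(?P<id>[^\]]+)\]\{(?P<tag>[^}]+)\}\s*"
    rf"(?P<text>[^{OPEN}]*?)\s*{OPEN}(?P<table>[^{CLOSE}]*){CLOSE}"
    r"\s*\((?P<refs>[^)]*)\)"
)


def parse_records(text):
    """Return one dict per record found in the extracted text."""
    records = []
    for m in RECORD.finditer(text):
        records.append({
            "id": m["id"],
            "tag": m["tag"],
            "text": m["text"].strip(),
            "table": m["table"].strip(),  # raw table body, not split into rows
            "refs": [r.strip() for r in m["refs"].split(",") if r.strip()],
        })
    return records


# The sample from the post, flattened to one line as an extractor might emit it.
sample = (
    "[AB_012]{Temp} This is a sample text for standard form. "
    "\u2308 Table Comes here with info "
    "Data1 This is first explanation for Data1 "
    "Info1 This is the explanation for Info1 "
    "\u230b(CD_345, CD_678)"
)

records = parse_records(sample)
print(records[0]["id"], records[0]["refs"])
```

The table body is left as one string on purpose. Splitting it into rows would need either preserved line breaks or a known list of row labels such as `Data1` and `Info1`, and the sample does not say which is available. The sketch also does nothing about the missing-spaces problem, which arises during extraction rather than parsing.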
Musashi Posted September 13, 2021

55 minutes ago, ghatturk said: "PS: I was trying to play with excel if not possible with PDF."

To me, this implies that you can choose the file format in which the data is available. Could you please describe in more detail where the data you want to extract comes from?