Posted (edited)

Hi!

Is there a way to convert .doc files to .txt files?

The solution must not use MS Word, because Word uses a lot of RAM and CPU when there are many files to convert.

Thanks for your help.

My goal is to compute statistics about punctuation, words, and paragraphs in a given book.

Edited by supergg02
Posted

A Google search came up with this, among many others.


Time you enjoyed wasting is not wasted time ......T.S. Elliot
Suspense is worse than disappointment................Robert Burns
God help the man who won't help himself, because no-one else will...........My Grandmother

Posted

As I understand it, if you are in Word and just want to save as a text file, Word will warn you that "you will lose formatting",

thus your goal is defeated

8)

No, the doc files are generated automatically by separate OCR software, and I am looking for a way to convert them without opening them in Word or any other program.

Posted

The solution that I gave you can be used from the command line and can therefore easily be scripted.
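To make the scripting idea concrete, here is a rough AutoIt sketch of a batch conversion. It is only an illustration under assumptions that are not from this thread: it assumes the converter is Antiword, that `antiword.exe` is on the PATH and already set up, and that the documents sit in a hypothetical folder `C:\Docs`. It relies on Antiword printing the plain text to standard output, which is then redirected to a file.

```autoit
; Sketch: convert every .doc in a folder to .txt without starting Word.
; Assumptions: antiword.exe is on the PATH; C:\Docs is a hypothetical folder.
Local $sFolder = "C:\Docs"

Local $hSearch = FileFindFirstFile($sFolder & "\*.doc")
If $hSearch = -1 Then
    ConsoleWrite("No .doc files found in " & $sFolder & @CRLF)
    Exit
EndIf

While 1
    Local $sName = FileFindNextFile($hSearch)
    If @error Then ExitLoop ; no more files

    ; The *.doc wildcard can also match .docx on Windows, so skip those.
    If StringRight($sName, 4) <> ".doc" Then ContinueLoop

    Local $sDoc = $sFolder & "\" & $sName
    Local $sTxt = StringTrimRight($sDoc, 4) & ".txt" ; swap ".doc" for ".txt"

    ; Antiword writes the text to stdout; cmd.exe handles the redirection.
    RunWait(@ComSpec & ' /c antiword "' & $sDoc & '" > "' & $sTxt & '"', "", @SW_HIDE)
WEnd

FileClose($hSearch)
```

Each file is converted by a short-lived console process, so Word is never loaded. Another command-line converter could be swapped in by changing the command string.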



Posted

I think you're missing the point... doc files include formatting

I tried

FileCopy("C:\Questions.doc", "C:\Questions.txt")

and the txt file is gibberish... because of the doc formatting

8)


Posted

I found an old mail about this subject:

>wvWare might help you out. It's a library (the one used in Abiword)

>and a set of command-line tools for reading and converting MS Word

>documents. The URL is http://wvware.sourceforge.net/ . Good luck.

HTH


Posted

If the doc is run through a converter, the formatting is stripped and only the text remains. What you tried just copied the file under a new name; it did not convert anything.



Posted (edited)

BigDod - I downloaded AntiWord for DOS and tested it on 3 doc files. It works perfectly! Good call. (P.S. They "don't do Windows", but they do have a precompiled version for Windows if you want to spend a lot more time on it...)

This is a sample of the output. It would let you count words, punctuation, paragraphs, etc.:

expires and you still need to use Outlook Web Access, refresh your browser

and log on again.

Supported browsers and operating systems

You can use Outlook Web Access with Microsoft Internet Explorer or Netscape

Navigator Web browsers from many UNIX, Apple Macintosh, or Microsoft

Windows-based computers. To use the complete set of features available with
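For the counting itself, something along these lines might do as a starting point. It is a sketch, not a tested tool: the file name `C:\Docs\book.txt` is hypothetical, a "word" is taken to be a run of word characters (optionally joined by an apostrophe or hyphen), and a "paragraph" is assumed to be a block of text separated from the next by at least one blank line. It uses the fact that `StringRegExpReplace` reports the number of replacements in `@extended`.

```autoit
; Sketch: count words, punctuation marks and paragraphs in converted text.
; Assumption: C:\Docs\book.txt is a hypothetical converter output file.
Local $sText = FileRead("C:\Docs\book.txt")
If @error Then
    ConsoleWrite("Could not read the file" & @CRLF)
    Exit
EndIf

; StringRegExpReplace sets @extended to the number of replacements made,
; which gives a match count without building an array.
StringRegExpReplace($sText, "\w+(?:['-]\w+)*", "")
Local $iWords = @extended

StringRegExpReplace($sText, "[[:punct:]]", "")
Local $iPunct = @extended

; Trim leading/trailing whitespace (flag 3), then count blank-line separators.
Local $sTrimmed = StringStripWS($sText, 3)
Local $iParas = 0
If $sTrimmed <> "" Then
    StringRegExpReplace($sTrimmed, "\R\s*\R", "")
    $iParas = @extended + 1 ; n separators means n + 1 paragraphs
EndIf

ConsoleWrite("Words: " & $iWords & @CRLF)
ConsoleWrite("Punctuation marks: " & $iPunct & @CRLF)
ConsoleWrite("Paragraphs: " & $iParas & @CRLF)
```

The patterns are deliberately simple; what counts as a word or a paragraph in OCR output may need tuning for the actual book.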

Edited by jefhal
...by the way, it's pronounced: "JIF"... Bob Berry --- inventor of the GIF format
Posted (edited)

thx.... i understand that

8)

Edited by Valuater

