To use the contents of a Word document (“.doc” or “.docx” extension) in a concordancer, you must first convert it to, or save it as, a plain text file (“.txt” extension). Two ways of doing this are outlined below.
Method 1 (recommended)
- open the document in Word,
- do a “select all” (ctrl+A),
- “copy” (ctrl+C),
- open Notepad (found in Start > All Programs > Accessories),
- “paste” (ctrl+V) the content into Notepad,
- save the file (Notepad saves it with a “.txt” extension by default).
Method 2
- open the document in Word,
- do a “Save as” in Word (go to File > Save as),
- select “Save as type” (see image) as “plain text”,
- click “Save”,
- when the dialogue box appears (for non-English OSs), check “allow character substitution” and then click “OK”.
This can be tedious, however, if you have many files to convert. There are freeware programs that can automate the task, but please be careful: some of the programs available may be malicious, that is, adware, malware or spyware.
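If you would rather not install a third-party converter, a short script is another option. The sketch below is one possible approach, not a polished tool. It assumes Python 3 and relies on the fact that a “.docx” file is a zip archive whose main text is stored in word/document.xml. It does not read the older “.doc” format, and it ignores headers, footers, footnotes, tabs and manual line breaks. The function names are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the paragraph text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(docx_path) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's visible text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def convert_file(docx_path, txt_path):
    """Write the extracted text to txt_path as UTF-8 (hypothetical helper)."""
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(docx_to_text(docx_path))
```

Calling convert_file("essay.docx", "essay.txt") (file names invented here) would write the paragraphs of essay.docx to essay.txt. Table cells come out as separate lines, so check the result before loading it into a concordancer.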
So many thank you’s. This doesn’t help if you have received a .docx file but you don’t have Word on your computer. “Open Word” God must hate me. Everywhere I look on the internet gives that as the first step. […]
If you need to open a .docx file, try OpenOffice. It will let you open .docx files. Good luck.
I would like to have some pseudo formatting in the text file, like empty lines between paragraphs, underlines as a second line of dashes, and a character in front of each line of a list. Does anybody know a tool that does this? Example:
This is an underlined Header
----------------------------
See this list:
* entry 1
* entry 2
* entry 3
You are talking about ‘regex’, or ‘regular expressions’: patterns that can match the characters pertaining to the layout and formatting of text. Do a search for these terms and you will find your answer. It can be done in Microsoft Word, but it takes a bit of getting used to.
Use regex or a find and replace to do this.
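For anyone who prefers a script to find and replace, the pseudo formatting asked about above can be sketched in a few lines of Python. This is only an illustration: it assumes you already know which lines are headings, list items or ordinary paragraphs, and the labels "heading", "item" and "para" are invented for the example.

```python
def render_plain(blocks):
    """Render (kind, text) pairs as plain text with pseudo formatting.

    kind is one of "heading", "item" or "para" (labels chosen for this sketch).
    """
    lines = []
    previous_kind = None
    for kind, text in blocks:
        # Blank line between blocks, except between consecutive list items.
        if previous_kind is not None and not (kind == "item" and previous_kind == "item"):
            lines.append("")
        if kind == "heading":
            lines.append(text)
            lines.append("-" * len(text))  # underline as a second line of dashes
        elif kind == "item":
            lines.append("* " + text)  # marker character in front of list lines
        else:
            lines.append(text)
        previous_kind = kind
    return "\n".join(lines)
```

The underline is made exactly as long as the heading, and list items stay together without blank lines between them.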
I worked as a proposal desktop publisher for years, through all the different versions of Word. Lots of version compatibility problems! It seems that, as soon as I learn one version and all the tricks about how to handle it, here comes another updated Word. Sigh.
wow!
just kidding.
Now, for real, how do you convert 50 Word documents into .txt files at once?
You can try something like SpiceLogic. It is the only one I have tried that worked, but that was a while back. The document format has also changed, so it may not have kept pace.
Otherwise do a “doc-to-txt” search on your favourite search engine to find the latest. Good luck.
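If you are comfortable running a script, the "many files at once" part is a simple loop. The Python 3 sketch below is an assumption-laden outline rather than a finished converter: extract_text stands for whatever function you trust to turn one document into a string, and both it and batch_convert are names invented for this example.

```python
from pathlib import Path


def batch_convert(folder, extract_text, pattern="*.docx"):
    """Write a .txt file next to every document in folder that matches pattern.

    extract_text is any callable that takes a path and returns the document's
    text as a string; it is a placeholder for the converter of your choice.
    """
    written = []
    for doc_path in sorted(Path(folder).glob(pattern)):
        # Word creates lock files starting with "~$" while a document is open.
        if doc_path.name.startswith("~$"):
            continue
        txt_path = doc_path.with_suffix(".txt")
        txt_path.write_text(extract_text(doc_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Each output file keeps the name of its source document, so report.docx becomes report.txt in the same folder. An existing report.txt would be overwritten, so work on a copy of the folder.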
The results of method 1 and method 2 are not the same. Only method 1 gives true plain text (try saving a table with method 2 to see the difference).