How to convert Word Document files into plain-text files

To use the contents of a Word document (“.doc” or “.docx” extension) in a concordancer, you must first convert it to, or save it as, a plain-text file (“.txt” extension). I outline two ways of doing this below.

Method 1 (recommended)

  1. open the document in Word,
  2. do a “select all” (ctrl+A),
  3. “copy” (ctrl+C),
  4. open Notepad (found in Start > All Programs > Accessories),
  5. “paste” (ctrl+V) the content into Notepad,
  6. save the file as a “.txt” file.

Method 2

  1. open the document in Word,
  2. do a “Save as” in Word (go to File > Save as),
  3. set “Save as type” (see image) to “plain text”,
  4. click “Save”,
  5. when the dialogue box appears (on non-English OSs), check “allow character substitution” and then click “OK”.

Either method can be tedious, however, if you have many files to convert. There are freeware programs that can automate this task, but please be careful: some of the programs available may be malicious, that is, adware, malware or spyware.
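
If you would rather not install a converter, batch conversion can also be scripted. The following is only a rough sketch in Python, under several assumptions: it handles “.docx” files only (not the older “.doc” format), it reads just the main body text (headers, footnotes and text boxes are not handled), and it writes the output as UTF-8, so your concordancer's encoding setting may need to match. The function names docx_to_text and convert_folder are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by Word inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the body text of a .docx file, one paragraph per line."""
    # A .docx file is a zip archive; the main text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        pieces = []
        for node in para.iter():
            if node.tag == W_NS + "t" and node.text:
                pieces.append(node.text)   # a run of ordinary text
            elif node.tag == W_NS + "tab":
                pieces.append("\t")        # a tab character
            elif node.tag in (W_NS + "br", W_NS + "cr"):
                pieces.append("\n")        # a manual line break
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)


def convert_folder(folder):
    """Write a UTF-8 .txt file next to every .docx file in `folder`."""
    written = []
    for docx_path in sorted(Path(folder).glob("*.docx")):
        if docx_path.name.startswith("~$"):
            continue  # skip the temporary lock files Word creates
        txt_path = docx_path.with_suffix(".txt")
        txt_path.write_text(docx_to_text(docx_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling convert_folder on a folder of your own (for example convert_folder("C:/my_corpus"), a hypothetical path) should leave a “.txt” file beside each “.docx” file. Check a few of the results against the originals before trusting the rest.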


Spicelogic – doc to txt converter

This is a document-to-plain-text converter called Spicelogic. I am not sure how secure it is, but it should be OK because it is from http://www.download.com.

Word Count

Found this nice little interactive Flash interface called Word Count. Try it out. It makes one think about words in a new and different way.

Corpus Tools page

I have now added a corpus tools page listing the concordancers I have tried. I do not want to review them, as the choice is a personal one. However, from time to time I will post here about things I have found interesting while using particular programs.

“Range check error” in Paul Nation’s Range program

“Range check error”.

This is the message I get every time I try to run the Range program using basewrd#.txt files I have created myself. The text files seem to have been saved properly, and the original ones work fine, so there shouldn’t be a problem.

Has anyone else had trouble with this? And does anyone have a solution?