How to convert Word Document files into plain-text files

To use the contents of a Word document (“.doc” or “.docx” extension) in a concordancer, you must first convert it to, or save it as, a plain-text file (“.txt” extension). I outline two ways to do this below.

Method 1 (recommended)

  1. open the document in Word,
  2. do a “select all” (ctrl+A),
  3. “copy” (ctrl+C),
  4. open Notepad (found in Start > All Programs > Accessories),
  5. “paste” (ctrl+V) the content into Notepad,
  6. save the file (Notepad uses the “.txt” extension by default).

Method 2

  1. open the document in Word,
  2. do a “Save as” in Word (go to File > Save as),
  3. select “Save as type” (see image) as “plain text”,
  4. click “Save”,
  5. when the dialogue box appears (for non-English OSs), check “allow character substitution” and then click “OK”.

Doing this by hand can be tedious, however, if you have many files to convert. There are freeware programs that can automate the task, but please be careful: some of the programs available may be malicious, that is, adware, malware or spyware.
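If you would rather not install a third-party converter, a short script is one alternative. The following is a minimal sketch in Python, not a full converter. It relies on the fact that a “.docx” file is a ZIP archive whose main text is stored in `word/document.xml`. The function names `docx_to_text` and `convert_folder` are my own, the UTF-8 output encoding is an assumption, and the sketch ignores tabs, manual line breaks, headers, footers and footnotes. It does not handle the older binary “.doc” format.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the paragraphs of a .docx file as plain text, one per line."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # The visible text of a paragraph is spread over one or more <w:t> elements.
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)


def convert_folder(folder):
    """Write a UTF-8 .txt file next to every .docx file in `folder` (hypothetical helper)."""
    written = []
    for docx_path in sorted(Path(folder).glob("*.docx")):
        txt_path = docx_path.with_suffix(".txt")
        txt_path.write_text(docx_to_text(docx_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling `convert_folder("my_corpus")`, where `my_corpus` is a placeholder folder name, would leave the Word files untouched and create one “.txt” file per “.docx” file. This approach does not need Word to be installed.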


39 thoughts on “How to convert Word Document files into plain-text files”

  1. So many thank you’s. This doesn’t help if you have received a .docx file but you don’t have Word on your computer. “Open Word” God must hate me. Everywhere I look on the internet gives that as the first step. […]


  2. I would like to have some pseudo formatting in the text file, like empty lines between paragraphs, underlines as a second line with dashes, and a character in front of the lines of a list. Does anybody know a tool that does this? Example:

    This is an underlined Header

    See this list:

    * entry 1
    * entry 2
    * entry 3


    • You are talking about ‘regex’, or ‘regular expressions’. These are characters pertaining to the layout and formatting of texts. Do a search for these terms and you will find your answer. It can be done in Microsoft Word, but it takes a bit of getting used to.
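      For illustration, the layout asked for above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it assumes the text has already been split into blocks labelled “heading”, “para” or “item” (hypothetical labels; how to obtain them from a Word file is not shown), and the function name `render_plain` is made up for this example.

```python
def render_plain(blocks):
    """Render (kind, text) pairs as plain text with pseudo formatting.

    kind is "heading", "item" or anything else for an ordinary paragraph
    (hypothetical labels chosen for this sketch).
    """
    out = []
    previous = None
    for kind, text in blocks:
        # Empty line between blocks, except between consecutive list items.
        if previous is not None and not (kind == "item" and previous == "item"):
            out.append("")
        if kind == "heading":
            out.append(text)
            out.append("-" * len(text))  # dashes on a second line act as the underline
        elif kind == "item":
            out.append("* " + text)  # a character in front of each list line
        else:
            out.append(text)
        previous = kind
    return "\n".join(out)


example = [
    ("heading", "This is an underlined Header"),
    ("para", "See this list:"),
    ("item", "entry 1"),
    ("item", "entry 2"),
    ("item", "entry 3"),
]
print(render_plain(example))
```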


  3. I worked as a proposal desktop publisher for years, through all the different versions of Word. Lots of version compatibility problems! It seems that, as soon as I learn one version and all the tricks about how to handle it, here comes another updated Word. Sigh.


    • You can try something like SpiceLogic. It is the only one I have tried that worked, but that was a while back. The document format has also changed, so it may not have kept pace.

      Otherwise do a “doc-to-txt” search on your favourite search engine to find the latest. Good luck.


  4. The results of method 1 and method 2 are not the same. Only method 1 gives true plain text (try saving a table with method 2 to see the difference).

