Video: How to Merge Text (.Txt) Files in Command Prompt (Windows)

To think I wrote a Perl program to do this, only to find I can do it in DOS!
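
For readers who would rather script the merge than type it at the prompt, here is a minimal Python sketch of the same task. It is not the method shown in the video; the function name `merge_text_files`, the output name `merged.txt` and the UTF-8 default are assumptions made for illustration.

```python
from pathlib import Path


def merge_text_files(folder, output_name="merged.txt", encoding="utf-8"):
    """Concatenate every .txt file in `folder` into a single output file.

    All names and defaults here are illustrative assumptions.
    """
    folder = Path(folder)
    output_path = folder / output_name

    # Collect the sources before the output file is opened, sorted for a
    # predictable order, and skip the output file itself so that a second
    # run does not fold the previous result back in.
    sources = sorted(p for p in folder.glob("*.txt") if p.name != output_name)

    with output_path.open("w", encoding=encoding) as out:
        for source in sources:
            text = source.read_text(encoding=encoding)
            out.write(text)
            # Keep the last line of one file from running into the first
            # line of the next.
            if text and not text.endswith("\n"):
                out.write("\n")
    return output_path
```

Calling `merge_text_files("my_corpus")` (a hypothetical folder name) would write `my_corpus/merged.txt`. Files that are not UTF-8 would need a different `encoding` argument.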

Pascual Pérez-Paredes academic blog

Pascual Pérez-Paredes, Applied linguistics, Corpus linguistics, Corpora
SACODEYL, learner language, European projects, Research

Source: perezparedes.blogspot.com.es

See on Scoop.it: Applied linguistics and knowledge engineering


Online Word Document cleaner to plain-text

Here is a nice, quick and simple Word “cleaner” script, with source code, by Jonathan Hedley. It is straightforward and does what is necessary, but it becomes troublesome if one has hundreds or even thousands of documents to convert. I am still looking for a solution to that.

How to convert Word Document files into plain-text files

To use the contents of a Word document (“.doc” or “.docx” extension) in a concordancer, the document must be converted to, or saved as, a plain-text file (“.txt” extension). Two ways of doing this are outlined below.

Method 1 (recommended)

  1. open the document in Word,
  2. do a “select all” (Ctrl+A),
  3. “copy” (Ctrl+C),
  4. open Notepad (found in Start > All Programs > Accessories),
  5. “paste” (Ctrl+V) the content into Notepad,
  6. save the file (Notepad saves it with a “.txt” extension by default).

Method 2

  1. open the document in Word,
  2. do a “Save as” in Word (go to File > Save as),
  3. select “plain text” as the “Save as type”,
  4. click “Save”,
  5. when the dialogue box appears (for non-English OSs), check “allow character substitution” and then click “OK”.

This can be tedious, however, if you have many files to convert. There are freeware programs that can automate this task, but please be careful: some of the programs available may be malicious (adware, malware or spyware).
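
One way to avoid third-party programs altogether is a short script. The following is a minimal Python sketch, using only the standard library, that converts every “.docx” file in a folder to a “.txt” file. It relies on the fact that a “.docx” file is a zip archive whose main text is stored in `word/document.xml`. The function names `docx_to_text` and `convert_folder` are assumptions made for illustration. The sketch does not handle the older binary “.doc” format, and it ignores headers, footers, footnotes and the finer points of text boxes.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the body text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        pieces = []
        for run in para.iter(W_NS + "r"):
            # Only the direct children of a run are inspected, so tab-stop
            # definitions in the paragraph properties are not mistaken
            # for tab characters.
            for node in run:
                if node.tag == W_NS + "t":
                    pieces.append(node.text or "")
                elif node.tag == W_NS + "tab":
                    pieces.append("\t")
                elif node.tag == W_NS + "br":
                    pieces.append("\n")
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)


def convert_folder(folder, encoding="utf-8"):
    """Write a .txt file next to every .docx file in `folder`."""
    written = []
    for docx_path in sorted(Path(folder).glob("*.docx")):
        # Skip the "~$..." lock files Word creates while a document is open.
        if docx_path.name.startswith("~$"):
            continue
        txt_path = docx_path.with_suffix(".txt")
        txt_path.write_text(docx_to_text(docx_path), encoding=encoding)
        written.append(txt_path)
    return written
```

Calling `convert_folder("my_documents")` (a hypothetical folder name) would leave the Word files untouched and return the list of text files written. The output is UTF-8 by default, so the concordancer's encoding setting may need to match.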
