Making a text file manually

To make a text file, open a text editor (e.g., Notepad on Windows or TextEdit on Mac).

Manually type or paste some text into the main window and save.

Saving from a word processor like Microsoft Word may not give “clean” output; that is, some of the letters or punctuation (curly quotes, for example) may not come out as plain text. This may affect word counts and/or the searchability of the resulting text.
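If the text has already passed through a word processor, a short script can tidy it up before saving. The following is only a minimal sketch in Python: the replacement table, the function names and the file name sample.txt are my own assumptions for illustration, and the table is by no means a complete list of the characters a word processor may insert.

```python
from pathlib import Path

# Typographic characters that word processors often insert, mapped to
# plain-text equivalents. Illustrative only, not an exhaustive table.
REPLACEMENTS = {
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}


def clean_text(text: str) -> str:
    """Replace common word-processor characters with plain equivalents."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


def save_plain_text(text: str, path: str) -> None:
    """Clean the text and write it to a UTF-8 encoded text file."""
    Path(path).write_text(clean_text(text), encoding="utf-8")


if __name__ == "__main__":
    sample = "\u201cIt\u2019s a corpus,\u201d she said\u2026"
    print(clean_text(sample))
    save_plain_text(sample, "sample.txt")  # hypothetical file name
```

Whether to flatten these characters depends on the study. If punctuation itself is being counted, it may be better to keep the original characters and simply save as UTF-8.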

JACET 8000 Level Marker has moved

The JACET 8000 Level Marker has moved to:
http://www.tcp-ip.or.jp/~shim/J8LevelMarker/j8lm.cgi

The errors I highlighted from two years ago still remain, however.

Developing Linguistic Corpora

Here is a nice guide for corpus linguistics entitled Developing Linguistic Corpora at:
http://ahds.ac.uk/creating/guides/linguistic-corpora/index.htm

Contributors to this online text are John Sinclair, Geoffrey Leech, Lou Burnard, Paul Thompson and Martin Wynne (editor).

This text was not written with the English teacher in mind, and some technical points may be difficult to grasp.

Online concordancing

If you require just a simple concordancer, go to Concordancer at http://ec.hku.hk/vocabulary/concordancer.htm. It gives only counts and highlights of the search term, with no KWIC (keyword in context) display.
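For readers unfamiliar with KWIC, the idea is to line up every hit of the search term with a fixed amount of context on either side. Here is a rough sketch in Python of what such a display involves. It is my own illustration, not how the Concordancer page or any other tool works; the function name kwic and the width parameter are assumptions.

```python
import re


def kwic(text: str, term: str, width: int = 30) -> list[str]:
    """Return one line per hit, with the search term aligned in its context."""
    # Collapse line breaks and runs of spaces so each hit prints on one line.
    text = " ".join(text.split())
    # \b keeps "saw" from matching inside "seesaw"; matching ignores case.
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


if __name__ == "__main__":
    sample = "I saw the saw on the bench. Nobody saw me take it."
    hits = kwic(sample, "saw", width=15)
    print(f"{len(hits)} hits")
    for line in hits:
        print(line)
```

The context here is measured in characters rather than words, which keeps the keyword column straight but can cut words in half at the edges.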

ConcApp 5

The concordancer software ConcApp (version 5) has now become purchaseware. It is an easy-to-use program recommended for those learning to use corpora and concordancers. Its best feature, though, is that it can be used not only for English and French texts but also for Japanese (tested), Chinese, Thai and Russian.

While it is purchaseware, it is reasonably priced (USD20 as of posting) and well worth it if you need concordancing in languages other than English.

The WordSmith Tools mailing list

If you use WordSmith Tools and you haven’t already become a member of Mike Scott’s WordSmith Tools mailing list, then you should. Questions about the program and its use can be posted to this open forum. More information can be found at http://groups.google.com/group/WordSmithTools/.

Just The Word

I found this online concordancer called JustTheWord. It looks like it was made for teachers to search for (Japanese) student errors.

Errors in JACET 8000 Level Marker

I have been using the JACET 8000 Level Marker (new link here) page for my research. It is a great tool, and I would like to thank its creator, Shinichi Shimizu. However, when I checked the output I found a number of errors. Below is a list of words that have been given the wrong level. The digit to the left of the word (in red) is what the output gives. The digit to the right (in green) is what the level should be:

1 march 2
1 lower 3
1 saw 3
1 means 4
1 thanks 4
1 leading 4
1 finding 4
1 thinking 4
1 china 4
1 sin 4
1 colored 4
1 saying 5
1 forward 5
1 basin 5
1 doing 5
1 making 5
1 controlled 5
1 kin 6
1 mar 8
2 audience 1
2 preferred 6
2 flatter 7
3 interpret 2
3 clothes 4
3 handicapped 6
4 including 3
4 upward 7
5 boom 4
5 ethics 6
5 summons 6
6 constructive 5
6 ragged 7
6 robin 7
7 chatter 6
8 coastline 7
9 alight 8

I checked this against the March 2003 data as well as the publication “Daigaku Eigo Kyoiku Gakkai Kihongo ni Motozuku JACET 8000 Eitango” (ISBN 9784342788734). This information was correct at the time of posting.
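A check like this can be automated once both lists are in machine-readable form. The sketch below, in Python, assumes that the tool output and the reference list have each been loaded into a dictionary mapping a word to its level. That data layout and the function name find_mismatches are my own assumptions, and the sample rows are simply copied from the list above.

```python
def find_mismatches(
    tool_output: dict[str, int],
    reference: dict[str, int],
) -> list[tuple[str, int, int]]:
    """Return (word, tool_level, reference_level) for every disagreement."""
    mismatches = []
    for word, tool_level in tool_output.items():
        ref_level = reference.get(word)
        # Words missing from the reference are skipped rather than guessed.
        if ref_level is not None and ref_level != tool_level:
            mismatches.append((word, tool_level, ref_level))
    # Sort by the level the tool gave, then alphabetically by word.
    return sorted(mismatches, key=lambda m: (m[1], m[0]))


if __name__ == "__main__":
    # A few rows copied from the list above: word -> level.
    tool_output = {"march": 1, "audience": 2, "interpret": 3, "alight": 9}
    reference = {"march": 2, "audience": 1, "interpret": 2, "alight": 8}
    for word, got, expected in find_mismatches(tool_output, reference):
        print(f"{got} {word} {expected}")
```

A plain word-to-level lookup like this cannot tell apart homographs that belong to different levels, so any word it flags still needs checking by hand.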

[update] Checking through the words, I found that some of them were difficult to classify. For example, ‘saw’ could be the past tense of the verb ‘see’, in which case it would be placed in JACET 1. But as a noun – a tool to cut things in half – it would be in JACET 3. Others were clearly errors. ‘Audience’, ‘interpret’, ‘including’, ‘boom’, ‘constructive’, ‘chatter’ and ‘alight’ were numbering errors at the cut-off point between levels. I cannot explain the rest of the errors.

KH Coder

KH Coder is a corpus linguistic tool for Japanese texts. Follow this link to learn more about Japanese corpus linguistics.

A WordSmith Tools discussion list

Mike Scott, the creator of WordSmith Tools, has a mailing list by that name. The archive is open to the public, but to post one must become a member of the list, pending approval.