Getting Started with 读者 (DuZhe)
读者 (DuZhe) has reached a point where it is fairly usable, but it’s probably a bit difficult to use without a decent guide, so I am writing one to help the early adopters. The analyzer comes with three basic options for text analysis: Read, Analyze, and Segment.
Let’s get started by inserting some Chinese text to analyze. If you don’t have any readily available, check out 盗墓笔记, and copy the first chapter here.
Segment
This option is very straightforward. It simply returns the input text with spaces between the words, so the example above comes back as the same chapter with each word set apart. It won’t display any definitions or save word lists. It’s only for segmenting the text.
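To give a feel for what segmenting involves, here is a minimal sketch using greedy forward maximum matching against a word list. This is only an illustration and not necessarily how 读者 segments text; the `segment` function, the `max_word_len` parameter, and the tiny `VOCAB` set are all made up for the example.

```python
def segment(text, vocabulary, max_word_len=4):
    """Split text into words by taking the longest vocabulary match at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink down to a single character.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop cannot stall.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical mini-vocabulary, for illustration only.
VOCAB = {"我", "喜欢", "学习", "中文"}

print(" ".join(segment("我喜欢学习中文", VOCAB)))  # prints: 我 喜欢 学习 中文
```

A real segmenter needs a far larger dictionary and some way to resolve ambiguous splits, which this greedy approach does not attempt.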
Analyze
This option is a whole lot more interesting than the previous one. It judges the perceived difficulty of the text in relation to a typical HSK text of an equivalent level. I’m still ironing out the kinks in the formula to calculate the difficulty level, so I’m not ready to disclose the details, but stay tuned. When you get to the analysis page, you’ll be greeted with an overall difficulty score.
Below the difficulty score, you can see an analysis of the word frequencies. Next is the breakdown of the HSK words in the text, both new and old. The respective scores represent how well this text matches up with the difficulty of the HSK.
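As a rough idea of what a frequency analysis and HSK breakdown look like, the sketch below counts segmented words and groups the counts by HSK level. It is not the difficulty formula, which is not disclosed here; the `hsk_breakdown` function and the `HSK_LEVELS` table, including the level numbers in it, are invented for illustration.

```python
from collections import Counter

# Hypothetical word-to-level table; the levels are placeholders, not real HSK data.
HSK_LEVELS = {"我": 1, "喜欢": 1, "学习": 1, "中文": 3}


def hsk_breakdown(words, hsk_levels):
    """Return (word frequencies, occurrences per HSK level) for a segmented text."""
    frequencies = Counter(words)
    per_level = Counter()
    for word, count in frequencies.items():
        # Words missing from the table are grouped under the key None.
        per_level[hsk_levels.get(word)] += count
    return frequencies, per_level


words = ["我", "喜欢", "中文", "我", "盗墓"]
frequencies, per_level = hsk_breakdown(words, HSK_LEVELS)
print(frequencies.most_common(1))  # prints: [('我', 2)]
print(per_level[1], per_level[3], per_level[None])  # prints: 3 1 1
```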
Finally, on this page, you can see a list of all the words in the text. The table of words is paginated and can be filtered and searched.
Each row is selectable and the selected rows can be downloaded as a CSV file.
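If you want to picture what ends up in the download, here is a small sketch that writes a selection of rows to CSV. The column names (`word`, `pinyin`, `definition`) and the `rows_to_csv` helper are assumptions for the example; the actual file produced by 读者 may use different columns.

```python
import csv
import io


def rows_to_csv(rows, selected_indices):
    """Serialize only the selected rows to CSV text, with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed column layout, for illustration only.
    writer.writerow(["word", "pinyin", "definition"])
    for index in selected_indices:
        writer.writerow(rows[index])
    return buffer.getvalue()


rows = [
    ("喜欢", "xǐhuan", "to like"),
    ("学习", "xuéxí", "to study"),
    ("中文", "Zhōngwén", "Chinese language"),
]

# Export the first and third rows, as if they had been selected in the table.
print(rows_to_csv(rows, [0, 2]))
```

When saving the result to a file for a spreadsheet program, writing it as UTF-8 keeps the Chinese characters intact.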
Read
This is where the real fun starts. The Read section lets you read the submitted text with the help of a mouseover dictionary. The words are already segmented, so there is no need to highlight or select words; just mouse over them.
A cool feature is that the definition is always displayed at the bottom in a single row. If a single row is not enough to show the whole definition, arrows appear on the left of the definition to indicate that it can be expanded by clicking the word.
Clicking the word will also bring up a menu that lets you add the word to a word list or flag it as inaccurate. At the moment flagging words won’t do anything because I disabled it until I can get a better editor in place.
The words will be added to a list that can be viewed from the menu accessible from the button next to the definition.
This will pull up a table of all the words added to the list. The words can be removed by hitting the ‘x’ button in each respective row. The whole word list can be downloaded by clicking the down arrow.
That is pretty much all there is to it at the moment. The functionality, overall, is very straightforward.
Let me know how it works out for you and leave a comment! I’m looking forward to any suggestions and bug reports.
-
http://buchmann.info Peter Buchmann
-
http://www.aaginskiy.com Artem Aginskiy