Fun with LXXM-Corpus
The Blog of Nathan D. Smith

Once I have a text available for natural language processing, there are a few basic tasks I like to perform to kick the tires. First, I like to run the collocations method of NLTK, which gives common word pairs from the text. For the LXXM, here are the results:

If you disregard the stop words, you can get a decent idea of the fundamental thematic content of the text.

Now for the silliness, using the n-gran random text generator:

ἐν ἀρχῇ ὁδοῦ πόλεως ἐπ' ὀνόμασιν φυλῶν τοῦ Ισραηλ παρώξυναν οὐκ ἐμνήσθησαν διαθήκης ἀδελφῶν καὶ ἐξαποστελῶ πῦρ ἐπὶ Μωαβ ἐν τῷ ἐξαγαγεῖν σε τὸν ἱματισμόν

Date: 2014-06-13 20:56