Topic Analysis on the New Testament

I have been experimenting recently with latent Dirichlet allocation (LDA) for automatically determining the topics in documents. This is a popular technique, although it works better for some kinds of document than for others. Above is a topic matrix for the Greek New Testament, produced from the stemmed 1904 Nestle text after removing 47 common words and specifying 14 as the number of topics in advance. The size of the coloured dots in the matrix shows the degree to which a given topic can be found in a given book. The topics (and the most important words associated with them) are:
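For readers who want to try something similar, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the code behind the matrix above: the sampler, the function name `lda_gibbs`, the `alpha` and `beta` values, and the toy token lists are all assumptions made for illustration. A real run would use the stemmed text of each book, the full list of common words, and 14 topics.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, stopwords=frozenset(), iterations=200,
              alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns (doc_topic, topic_word_counts):
    doc_topic[d][k] is the estimated share of topic k in document d, and
    topic_word_counts[k] counts the words currently assigned to topic k.
    """
    rng = random.Random(seed)
    # Remove common words before analysis.
    docs = [[w for w in doc if w not in stopwords] for doc in docs]
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [Counter() for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Start by assigning every token to a random topic.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][w] += 1
            topic_totals[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take the token out of the counts ...
                k = assignments[d][i]
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][w] -= 1
                topic_totals[k] -= 1
                # ... and resample its topic given all other assignments.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][w] + beta)
                    / (topic_totals[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][w] += 1
                topic_totals[k] += 1

    # Smoothed share of each topic in each document (rows sum to 1).
    doc_topic = [
        [(c + alpha) / (len(doc) + alpha * num_topics) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    return doc_topic, topic_word_counts


# Invented toy token lists, NOT the real text of these books.
toy_books = {
    "Revelation": ["ἄγγελος", "καί", "θρόνος", "ἄγγελος", "καί", "θρόνος"],
    "1 John": ["ἀγάπη", "καί", "θεός", "ἀγάπη", "ἀγαπάω", "θεός"],
}
toy_stopwords = frozenset({"καί"})

doc_topic, topic_words = lda_gibbs(
    list(toy_books.values()), num_topics=2, stopwords=toy_stopwords)

for name, shares in zip(toy_books, doc_topic):
    print(name, [round(s, 2) for s in shares])
for k, counts in enumerate(topic_words):
    print("topic", k, [w for w, c in counts.most_common(3) if c > 0])
```

The per-document shares in `doc_topic` correspond to the sizes of the dots in a topic matrix, and the most common words in each entry of `topic_words` correspond to the words listed for each topic. An established library implementation would be a more sensible choice for a corpus of any real size.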

A better set of topics can probably be obtained with a bit more experimentation. Alternatively, here (as a simpler form of analysis) are the relative frequencies of some Greek words or sets of words, scaled to the range 0 to 1 for each word set (with the bar chart showing the total number of words in each New Testament book). Not surprisingly, angels appear more frequently in Revelation than anywhere else, while love is particularly frequent in 1 John:
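The calculation behind such a chart can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the chart: the function name and the toy token lists are invented, and "scaled to the range 0 to 1" is interpreted here as dividing each word set's relative frequencies by the largest value for that word set, so that the book with the highest relative frequency gets 1 and a book with no occurrences gets 0.

```python
def scaled_relative_frequencies(books, word_sets):
    """Relative frequency of each word set per book, scaled to 0..1.

    books maps a book name to its list of (stemmed) tokens.
    word_sets maps a label to the set of words counted under that label.
    """
    result = {}
    for label, words in word_sets.items():
        # Occurrences of the word set divided by the length of the book.
        relative = {
            book: (sum(1 for t in tokens if t in words) / len(tokens)
                   if tokens else 0.0)
            for book, tokens in books.items()
        }
        # Scale so the largest relative frequency for this word set is 1.
        peak = max(relative.values(), default=0.0)
        result[label] = {
            book: (freq / peak if peak else 0.0)
            for book, freq in relative.items()
        }
    return result


# Invented toy token lists, NOT the real text of these books.
books = {
    "Revelation": ["ἄγγελος", "ἄγγελος", "θρόνος", "ἀγάπη"],
    "1 John": ["ἀγάπη", "ἀγαπάω", "ἀγάπη", "θεός"],
    "Mark": ["ἄγγελος", "θεός"] + ["λέγω"] * 6,
}
word_sets = {
    "angel": {"ἄγγελος"},
    "love": {"ἀγάπη", "ἀγαπάω"},
}

scaled = scaled_relative_frequencies(books, word_sets)
for label, per_book in scaled.items():
    print(label, {book: round(v, 2) for book, v in per_book.items()})
```

Dividing by the length of each book before scaling matters because the books differ greatly in size; raw counts would mostly reflect how long each book is.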

