I have been experimenting recently with latent Dirichlet allocation (LDA) for automatic determination of topics in documents. This is a popular technique, although it works better for some kinds of document than for others. Above is a topic matrix for the Greek New Testament (using the stemmed 1904 Nestle text, removing 47 common words before analysis, and specifying 14 as the number of topics in advance). The size of each coloured dot in the matrix shows the degree to which a given topic can be found in a given book. The topics (and the most important words associated with them) are:
- 1. theos, pas, Christos, mē, kurios – a general topic about God and Christ
- 2. megas, gē, theos, aggelos, horaō – an apocalyptic topic, found especially in Revelation
- 3. gunē, anēr, sōma, laleō, kosmos – advice about men and women, found especially in the Pauline Epistles
- 4. ginomai, theos, laos, hēmera, horaō – a general topic
- 5. paschō, psuchē, chronos, makarios, peirasmos – a topic associated especially with James and 1 Peter
- 6. hamartia, haima, pistis, diathēkē, prospherō – sin and sacrifice, associated especially with Hebrews
- 7. pantote, paraklēsis, kauchaomai, prosōpon, chairō – a topic associated with the Pauline Epistles
- 8. nomos, hamartia, pistis, dikaiosunē, ethnos – sin and the Law, a topic associated especially with John, James, and the Pauline Epistles
- 9. agapētos, ouranos, tēreō, epignōsis, krisis – heaven and judgement, a topic associated especially with 2 Peter, Colossians, and Jude
- 10. Iēsous, pisteuō, patēr, kosmos, theos – a general topic about faith in Jesus
- 11. Paulos, pas, anēr, Ioudaios, theos – a topic about Paul, found especially in Acts and Philemon
- 12. ergon, pistis, kalos, idios, pas – work and faith, a topic associated especially with the Pastoral Epistles
- 13. basileia, huios, aphiēmi, proserchomai, archomai – Jesus talking about the Kingdom in the Synoptic Gospels
- 14. legō, mē, erchomai, horaō, Iēsous – a general topic
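For anyone who wants to try something similar, here is a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling in plain Python. It is not the code behind the matrix above. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, the iteration count, and the tiny token lists are all assumptions made for illustration. A real analysis would pass one list of stemmed tokens per book, the 47 stop words, and `num_topics=14`, and would probably use an established library instead.

```python
import random


def fit_lda(docs, num_topics, stopwords=frozenset(), alpha=0.1, beta=0.01,
            iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenised (already stemmed) docs.

    Returns (theta, top_words): theta[d][k] is the share of topic k in
    document d, and top_words[k] lists up to five heavy words for topic k.
    """
    rng = random.Random(seed)
    # Remove common words before analysis.
    docs = [[w for w in doc if w not in stopwords] for doc in docs]
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Take the token out of the counts, then resample its topic.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic shares per document: the quantity a dot size could encode.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in row]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size),
                                  key=lambda v: -topic_word[k][v])[:5]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Toy input: invented token lists, NOT the real text of any book.
toy_books = {
    "Book A": "aggelos megas gē aggelos horaō theos megas gē".split(),
    "Book B": "nomos hamartia pistis nomos dikaiosunē theos pistis".split(),
    "Book C": "aggelos nomos megas hamartia theos".split(),
}
theta, top_words = fit_lda(list(toy_books.values()), num_topics=2,
                           stopwords={"theos"})
for name, row in zip(toy_books, theta):
    print(name, [round(p, 2) for p in row])
```

Each row of `theta` sums to 1, so it can be read directly as the mix of topics in one book. Different seeds and different values of `alpha` and `beta` will give different topics, which is part of why the choice of settings takes some experimentation.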
A better set of topics can probably be obtained with a bit more experimentation. Alternatively, here (as a simpler form of analysis) are the relative frequencies of some Greek words or sets of words, scaled to the range 0 to 1 for each word set (with the bar chart showing the total number of words in each New Testament book). Not surprisingly, angels appear more frequently in Revelation than anywhere else, while love is particularly frequent in 1 John:
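The scaling can be sketched in a few lines of Python. This is an assumption about the method, not a description of how the chart was produced: it takes "scaled to the range 0 to 1" to mean dividing each book's relative frequency by the largest value found in any book, so that the book where the word set is most frequent scores exactly 1. The function name `scaled_relative_frequencies` and the token lists are made up for illustration.

```python
def scaled_relative_frequencies(book_tokens, word_set):
    """Relative frequency of a word set in each book, divided by the peak value.

    book_tokens maps a book name to its list of (stemmed) tokens.
    """
    relative = {}
    for book, tokens in book_tokens.items():
        hits = sum(1 for t in tokens if t in word_set)
        # Divide by book length so long books are not favoured.
        relative[book] = hits / len(tokens) if tokens else 0.0
    peak = max(relative.values(), default=0.0)
    return {book: (f / peak if peak else 0.0) for book, f in relative.items()}


# Toy input: invented token lists, NOT the real text of any book.
toy_books = {
    "Book A": ["aggelos", "legō", "aggelos", "theos"],
    "Book B": ["agapē", "theos", "agapaō", "agapē"],
    "Book C": ["aggelos", "theos", "legō", "erchomai"],
}
scaled = scaled_relative_frequencies(toy_books, {"aggelos"})
print(scaled)
```

Min-max scaling (subtracting the smallest value before dividing) would be another reasonable reading. The two agree whenever at least one book does not contain the word set at all.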