This content originally appeared on phpied.com and was authored by Stoyan
In an earlier post, I talked about the letter frequency in English as presented in Peter Norvig's research. And then I thought... what about my own mother tongue?
So I got a corpus of 5000 books (832,260 words), a mix of Bulgarian authors and translations, and counted the letter frequency. Here's the result in CSV format: letters.csv
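A count like this can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the numbers above: the function names, the choice to ignore everything outside the 30-letter modern Bulgarian alphabet, and the `letter,percent` CSV layout are all assumptions made for the example.

```python
from collections import Counter

# The 30 letters of the modern Bulgarian alphabet, lowercase.
BULGARIAN_ALPHABET = "абвгдежзийклмнопрстуфхцчшщъьюя"


def letter_frequency(text: str) -> dict[str, float]:
    """Return each letter's share (in percent) of all Bulgarian letters in text.

    Assumption: case is folded, and digits, punctuation, whitespace and
    non-Bulgarian letters are skipped entirely.
    """
    counts = Counter(ch for ch in text.lower() if ch in BULGARIAN_ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return {letter: 0.0 for letter in BULGARIAN_ALPHABET}
    return {letter: 100 * counts[letter] / total for letter in BULGARIAN_ALPHABET}


def sorted_by_frequency(freq: dict[str, float]) -> list[tuple[str, float]]:
    """Return (letter, percent) pairs, most frequent first."""
    return sorted(freq.items(), key=lambda pair: pair[1], reverse=True)


def to_csv(freq: dict[str, float]) -> str:
    """Serialize in alphabetical order; the column names are hypothetical."""
    rows = ["letter,percent"]
    rows += [f"{letter},{percent:.2f}" for letter, percent in freq.items()]
    return "\n".join(rows)
```

To run it over a whole corpus, one option is to read each book as UTF-8, concatenate the text (or sum the per-book counters), and pass the result to `letter_frequency`. The two graphs then correspond to the dictionary as returned (alphabetical order) and to `sorted_by_frequency` (ordered by frequency).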
Here are the results (in alphabetical order) in a graph:
And another graph, with data sorted by the frequency of letters:
ChatGPT gives a startlingly different result (o is the winner at ~9.1% and a is third at 7.5%), which makes me like my own letter count research even more.
Stoyan | Sciencx (2024-11-01T06:18:52+00:00) Letter frequency in the Bulgarian language. Retrieved from https://www.scien.cx/2024/11/01/letter-frequency-in-the-bulgarian-language/