How to read this graph?
Pick a language on the left axis, say Hindi, and look along its row. The largest circles appear under Urdu, Marathi, and Bangla, so:
"The texture of Hindi is similar to Marathi, Urdu, and Bangla"
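The same row-reading can be sketched in code. This is only an illustration: the numbers below are made up (they are NOT from the paper), the helper name `most_confused_with` is hypothetical, and it assumes each row holds the true language and each column holds what the AI guessed.

```python
# Toy confusion matrix with made-up counts (NOT the paper's data).
# Assumption: rows = true language, columns = language the AI guessed.
LANGS = ["Hindi", "Urdu", "Marathi", "Bangla", "Tamizh"]
CONFUSION = [
    [70, 12, 9, 7, 2],   # true Hindi
    [10, 80, 4, 4, 2],   # true Urdu
    [8, 3, 82, 5, 2],    # true Marathi
    [5, 3, 4, 86, 2],    # true Bangla
    [1, 1, 2, 2, 94],    # true Tamizh
]


def most_confused_with(matrix, labels, language, k=3):
    """Return the k languages with the biggest 'circles' in one row."""
    row = matrix[labels.index(language)]
    # Skip the diagonal cell: that is the language being recognised correctly.
    others = [(labels[j], count) for j, count in enumerate(row)
              if labels[j] != language]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in others[:k]]


print(most_confused_with(CONFUSION, LANGS, "Hindi"))
```

With these toy numbers, the Hindi row picks out Urdu, Marathi, and Bangla, which mirrors how the graph is read by eye.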
Note that Tamizh is often confused with Malayalam, and Malayalam with Tamizh. This makes sense: both are Dravidian languages, and Malayalam was the most recent language to split from old Tamizh.
Notice how the AI confuses Malayalam with Sanskrit, but does not do the same for Tamizh.
The link between Malayalam and Sanskrit has been noted a number of times in the literature. Eighty percent of Malayalam vocabulary comes from Sanskrit (Kerala Sahitya Akademi, keralasahityaakademi.org, accessed June 2014). [Cited from paper]
The poor AI also finds it hard to distinguish Gujarati, Hindi, and Marathi. That is no surprise, because they all come from the Western branches of Vedic Sanskrit: Maharashtri and Shauraseni Prakrit.
Notice that the AI also confuses Kashmiri (Koshur) and English! According to the paper, Kashmiri has intermingled with English because Kashmir is an international tourist destination. Kashmiri people also frequently use English as a second language.
Axomiya, Odia, and Bangla are all children of Magadhi Prakrit, so they share a lot of similarities with each other.
The similarity between English and Konkani can be attributed to the fact that Konkani is mainly spoken in Goa, which has been in contact with foreign lands since early times, through either colonization or trade. Besides, Goa is also a tourist destination, like Kashmir.
The similarity between Odia and Kannada might be due to the fact that Kannada is a Dravidian language and ancient Odisha included a large Dravidian-speaking region. Though the political boundaries of Odisha have shrunk in modern times, Odia still carries a strong Dravidian influence.
Read the paper here: hindawi.com
If you like IIP's work, make sure you follow us for complex research made easy to digest through intuitive visualizations.
picture of Wall-e to illustrate an AI confused by Indian languages
Notice that the confusion isn't bi-directional - the pesky AI confuses Telugu for Bangla but it does not confuse Bangla for Telugu.
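A minimal sketch of how one could spot this kind of one-way confusion. The counts are made up for illustration (NOT the paper's data), the helper `one_way_confusions` and its `threshold` are hypothetical, and rows are assumed to be the true language while columns are the AI's guess.

```python
# Toy confusion matrix with made-up counts (NOT the paper's data).
# Assumption: rows = true language, columns = language the AI guessed.
LANGS = ["Telugu", "Bangla", "Tamizh"]
CONFUSION = [
    [80, 15, 5],   # true Telugu: often guessed as Bangla
    [1, 95, 4],    # true Bangla: almost never guessed as Telugu
    [3, 2, 95],    # true Tamizh
]


def one_way_confusions(matrix, labels, threshold=10):
    """List (a, b) pairs where a is mistaken for b, but b is not mistaken for a.

    `threshold` is an arbitrary cut-off for what counts as 'confused'.
    """
    pairs = []
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if i == j:
                continue  # the diagonal is correct recognition, not confusion
            if matrix[i][j] >= threshold and matrix[j][i] < threshold:
                pairs.append((a, b))
    return pairs


print(one_way_confusions(CONFUSION, LANGS))
```

In graph terms, this compares the circle at (row a, column b) with its mirror image at (row b, column a): a big circle on one side and a tiny one on the other means the confusion only runs one way.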
Some confusions, like Axomiya with Punjabi and Gujarati with Bangla, are a mystery. We don't know why the AI finds these largely unrelated languages similar.
Only time will tell.
Also, if it isn't apparent:
Bigger, darker circles = the AI has more trouble telling the two languages apart
Smaller, lighter circles = the AI has little or no trouble telling them apart
What I personally find the most interesting is that Urdu is confused more with Sanskrit than Hindi is. I wonder why.