AN ALGORITHM able to decipher 'dead languages' could throw light on one of Spain's biggest linguistic mysteries: where the Basque tongue, euskera, comes from.
Spanish, and all of Spain's regional languages except euskera, have their roots in Latin and are known as the 'romance languages', along with, for example, French, Italian, Portuguese and Romanian.
Other groups include Germanic, which includes the Scandinavian tongues, and Celtic, which embraces Gàidhlig (Scottish Gaelic), Irish Gaelic and Cornish.
But euskera, said to be incredibly hard – in fact, nearly impossible – for non-native speakers to learn to a level of effective communication on any subject, appears to have no known roots; no other language on Earth has been found to be related to it.
Another, older linguistic 'mystery' is that of íbero, or the Iberian language – the indigenous tongue spoken by some of modern-day Spain's earliest human inhabitants, which stretched as far as southern France in one direction and inland Andalucía in the other.
Its native speakers would have been alive between about the seventh and first centuries BCE, or around 2,020 to 2,600 years ago, and the language was most in use before the Migration Era, thought to have begun in the late fourth century CE (AD).
Iberian is thought to have died out in the first 200 years of the first millennium CE, as the spread of the Roman Empire into what is now mainland Spain and Portugal saw Latin become the most-used tongue.
It is classed as a 'Paleohispanic language'; euskera is the only one of these still spoken, and has no links to any other tongue in current use.
Speakers make up just under three in 10 inhabitants of the Spanish Basque territories – the Basque Country's three provinces, and neighbouring Navarra – and three former provinces in France, just over the border; a total of around 751,500 all told, or roughly equivalent to the population of Valencia city, and of whom over 90% are on the 'Spanish side'.
If, as some linguistic experts suspect, euskera is derived from the original Iberian tongue, this would make it the oldest language in Spain in modern use.
Researchers from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) have developed a programme which, using only a few thousand words of a given language, can point towards its possible roots.
According to Professor Regina Barzilay of the MIT team, it works by comparing texts from a corpus of modern and ancient languages, drawing on existing knowledge of linguistic history.
Language evolution has largely been predictable, Professor Barzilay explains: a language rarely gains or loses a complete sound, and when a sound does change, it tends to be replaced by a similar one. A 'p' in the 'main' tongue might become a 'b' in an offshoot language, for example, but would probably not become a 'k', which is articulated quite differently.
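To make the 'p' to 'b' versus 'p' to 'k' point concrete, here is a minimal Python sketch of a word-distance measure in which swapping similar sounds costs less than swapping dissimilar ones. It is not the MIT team's method: the feature table, the cost weights (0.7 and 0.3) and the function names are all assumptions chosen purely for illustration, and each letter is treated as one sound.

```python
# Illustrative only: a toy feature table, not the MIT team's model.
# Each consonant is described by (place of articulation, voiced?).
FEATURES = {
    "p": ("labial", False),
    "b": ("labial", True),
    "t": ("dental", False),
    "d": ("dental", True),
    "k": ("velar", False),
    "g": ("velar", True),
}


def substitution_cost(a, b):
    """Cost of replacing sound a with sound b (0.0 means identical)."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # sounds outside the toy table: full-price substitution
    cost = 0.0
    if fa[0] != fb[0]:
        cost += 0.7  # assumed weight: different place of articulation
    if fa[1] != fb[1]:
        cost += 0.3  # assumed weight: voiced versus unvoiced
    return cost


def weighted_distance(source, target):
    """Edit distance in which similar sounds are cheaper to swap."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = float(i)  # delete every sound of source
    for j in range(1, cols):
        dist[0][j] = float(j)  # insert every sound of target
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,  # deletion
                dist[i][j - 1] + 1.0,  # insertion
                dist[i - 1][j - 1]
                + substitution_cost(source[i - 1], target[j - 1]),
            )
    return dist[-1][-1]
```

With these assumed weights, `weighted_distance("pata", "bata")` comes out smaller than `weighted_distance("pata", "kata")`, so a word differing only by 'p' and 'b' would rank as a likelier relative than one differing by 'p' and 'k'. The made-up words "pata", "bata" and "kata" are placeholders, not real vocabulary from any language discussed here.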
Working with PhD student Jiaming Luo, she devised an algorithm which detects minute changes and similarities in pronunciation to form a logical rule-base by 'chopping up' words in an ancient language.
Last year, they wrote a paper after deciphering the dead Ugaritic tongue – a Semitic language which had been extinct since the 12th century BCE but was discovered by archaeologists in what is now the city of Ras Shamra in Syria – and also the so-called Linear B written language system, used in Mycenaean Greece during the end of the Bronze Age, from around 1600 to 1100 BCE.
Read full story at thinkSPAIN.com