Yeah, thank you very much for that. So the project is a presidential initiative at Stanford called Silicon that just got off the ground. And in many ways it is the culmination of the exact type of work that, you know, I've been working on for about 20 years, and my co-PIs as well. Chinese was really one of the top three, one of the first digitally disadvantaged languages, alongside, say, Arabic and Hindi perhaps, in terms of really large world languages that were systematically blocked from technologies, until they weren't. But then it's very new waters, because it's not history anymore, it's the present and the future. I mean, by most estimates, there are 7,000-plus living languages, not including historic scripts. And the Unicode Consortium (Unicode is of course the gateway to text in digital spaces, and the consortium is this non-profit, international standards-setting organization for character encoding) categorizes 90-plus percent of them as digitally disadvantaged.
So I mean, we're somewhere in the order of magnitude of 6,000, 6,500 languages that to one degree or another are either completely barred from digital space, not even encoded, or you could post a blog about it, but there's no user interface. Forget autocomplete, machine translation; they're nowhere near large language models or AI. And yes, it's a dance, because technology is both the accelerant of a lot of these pathways, but also something that can be leveraged, and of course it is the pathway into culture and arts and the global economy and so forth at the same time. So it's a double-edged sword, and we're finding ways to navigate that. But I can give you an example. In our pilot phase we've launched a practitioner program and a student internship program now, and then we're going to be ramping things up as we move along. But our view is, I don't work on character encoding. I don't design fonts, I don't design keyboards personally. But we are tapped into and connected with these amazing ecologies of practitioners and experts and language experts who do.
And we're trying to identify and connect and support and augment their work, and in some cases help them overcome certain kinds of blockages, so that the work they're in many cases already doing goes from taking 15 years to 10 or five. Because the honest fact of the matter is, as one of my colleagues put it, and put it very bluntly, if no one does anything, language extinction will take care of this issue on its own.
And so there is a group that we're working with, I'll give you an example. There is a large language community predominantly in the area of Iran, a Kurdish minority, the Lak. And by and large, the majority of letters that they use in their writing system are identical to those of their neighbors. And so it was already in the set, but we are collaborating with this amazing type designer, this Iranian American type designer, who is developing just a small, what they call a glyph extension. It's basically just a few more letters or symbols or glyphs with which we can augment the existing font and, more or less, voila, suddenly a fuller-scale participation in the digital space by speakers of Lak is possible. Because you can't include what you can't see. If you don't have a font for something, there's no inclusion.
We are working with another colleague who's doing something comparable, developing a font that works with this consortium of ethnic groups predominantly in Ghana. And we're talking about a community of maybe three, four, five million people. He's a Ghanaian type designer, and he's working on one of the first digital fonts that would help span these ethnic groups and the language groups. And a font, it's serious work, but at the same time you could think of certain kinds of font as like an MA thesis. It's something that takes multiple years, but with concerted effort and with free time it's not something that takes 20 years or 10 years. It's something that is very achievable and within range. But here's the problem, and then I'll mute myself. The problem is that everything we've talked about, Chinese especially, comes from an era, the '80s and the '90s, of software internationalization where the big goal was market share.
I mean, we're talking about the Chinese market, we're talking about the Indian market, we're talking about the Middle Eastern market. So there was a ton of interest in overcoming these forms of exclusion. But what's been left behind in its wake is predominantly these 6,000-plus languages, to be honest, for whom no one is coming urgently.