‘An unusual hobby’: How Central Asian Wikipedians are closing the local-language knowledge gap in the age of AI
In April 2025, the volunteer editors and administrators behind the Central Asian editions of Wikipedia held their first ever gathering IRL (in real life). Founded along linguistic rather than nation-state lines, these “free encyclopedias” offer reliably sourced information to millions of readers across the region, all while competing with the behemoth that is Russian Wikipedia for page views. In information environments shaped by state censorship, Russian media, and shifting language preferences, this is no small feat — and the Wikipedians who spoke to The Beet seemed clear-eyed about the fact that more challenges lie ahead. Drawing on his reporting from the inaugural Central Asian WikiCon in Tashkent, freelance journalist Dénes Jäger explores the debates about language, trust, and technology shaping Wikipedia’s development in Central Asia (and beyond) in the age of AI.
This story first appeared in The Beet, a monthly email dispatch from Meduza covering Central and Eastern Europe, the Caucasus, and Central Asia.
If you’ve ever asked yourself who created the Wikipedia article about Donald Trump in Uzbek, you can thank the user Umarxon III. Ten years ago, he took it upon himself to make information about the then-presidential candidate accessible in his native language. It was one of his first articles — and more than 2,000 followed. His own Wikipedia user page now resembles the chest of a highly decorated Soviet general from World War II, adorned with rows of medals and badges commemorating his achievements on the platform.
Umarxon III was among the veterans when around 100 Wikipedians gathered in Tashkent in April 2025 for Central Asian WikiCon. There, members of the Uzbek, Kazakh, Kyrgyz, and Tajik-language open encyclopedias met for the first time to discuss how to attract more volunteers and improve access to information in their mother tongues.
“Editing an encyclopedia is an unusual hobby and is not going to appeal to the majority of humankind. That’s a fact we should accept,” said Asaf Bartov during his presentation. Bartov was one of a select few Wikimedia Foundation staff members who spoke at the convention, sharing strategies and lessons from other parts of the world.
The crowd in Tashkent was gender-balanced and mostly made up of young people. Some of the Wikipedians wore traditional headwear, while others sported Wikipedia-themed socks. According to organizers, both veteran volunteers and newcomers were invited so that the gathering would reflect the recent growth in the number of active members.
User Nataev falls in the former group — he’s been actively editing Wikipedia articles since 2009, mostly for the Uzbek-language edition, and has witnessed the development of all the regional encyclopedias. “A conference like this was unthinkable just five years ago,” he told The Beet. “But now there’s momentum among all the communities that I’m very happy to see.”
‘I’ll try to translate’
Wikipedia editions are organized by language, not nation-state. If a language gathers a critical mass of dedicated editors, they can start their own digital encyclopedia on the platform. Most attendees at Central Asian WikiCon were users of the Turkic-language Uzbek, Kyrgyz, and Kazakh editions or of the Tajik edition, which is written in a modern variant of Persian that uses the Cyrillic script. Speakers of smaller languages also came to Tashkent to share their experiences.
The editors of the Wikipedia in Karakalpak, a Turkic language spoken in an autonomous region on the southern shore of the disappearing Aral Sea in northwestern Uzbekistan, discussed their Instagram engagement strategy. Another Wikipedian gave a presentation on a successful article-writing contest for the Bashkir-language edition, organized by a group of retired teachers from a small village in Russia’s Bashkortostan.
Multilingualism was omnipresent at the conference. During panel discussions, speakers effortlessly switched between Kyrgyz and Russian; participants asked questions in English, and the response came in Uzbek. The simultaneous translators, who covered the program in English and Russian, sometimes audibly sighed into their microphones: “The participant is speaking Kazakh right now, but I’ll try to translate.”
Nataev himself embodies this polyglot reality of Central Asia. Born into Kyrgyzstan’s Uzbek-speaking minority, he is also fluent in Kyrgyz, Russian, English, and Turkish. “I started writing for Wikipedia when I was an exchange student in the U.S. and I couldn’t find information about my hometown in English,” Nataev recalled. He stuck with it, eventually creating the Uzbek-language user group. He also helped organize Central Asian WikiCon.
“Wikipedia faces two challenges in the region,” Nataev explained. “Information about Central Asia is very limited in English or German, for example, and at the same time, there aren’t enough articles in regional languages.”
The vast difference in knowledge production is clear from the number of articles available in the various Wikipedia editions. This is also where the enduring role of the Russian language in Central Asia comes into play. With more than two million articles, Russian Wikipedia is one of the largest in the world. In contrast, the most popular Central Asian editions range from about 75,000 to 300,000 articles. (Turkmen Wikipedia has fewer than 7,000.)
Russian still holds the status of a second official language in Kazakhstan and Kyrgyzstan, whereas in Tajikistan it is considered the “language of interethnic communication.” A 2023 study by the European Neighborhood Council found that even three decades after the Soviet Union’s dissolution, Russian-language media continue to play a huge role in Central Asian people’s everyday lives — albeit to varying degrees.
In Kazakhstan, the only Central Asian country that still has a significant Russian minority, 75 percent of respondents said they read and watch news in Russian, while 54 percent preferred Kazakh-language media. By comparison, the preference for local-language news was much stronger in Kyrgyzstan (79 percent) and Uzbekistan (88 percent). The study also found that, across Central Asia, Russian-language media were more popular among urban residents and those with higher levels of education.
While current Wikimedia Statistics only provide data on click rates for Kyrgyzstan, there is reason to believe that many of Russian Wikipedia’s page views originate in Central Asia. As of December 2022, it was the most popular edition in Kazakhstan, Kyrgyzstan, Tajikistan, and Turkmenistan. The most recent Wikimedia Statistics show that Russian Wikipedia averaged five million monthly page views from Kyrgyzstan so far this year, compared to about 1.37 million for Kyrgyz Wikipedia.
‘These processes take time’
In Tashkent, discussions about the Wikipedia community’s use of the Russian language quickly led to the subject of decolonization. During one session, Kazakh Wikipedia users asked why the Russian-language edition still displays colonial toponyms by default, such as Alma-Ata instead of Almaty for Kazakhstan’s largest city.
Since the Wikimedia Foundation does not engage in direct discussions about content, board member Victoria Doronina, who is from Belarus, did not have a definitive answer. “These processes take time,” she said, suggesting that not enough Russian-language resources reflect the name change at present.
During a roundtable, Doronina proposed “decolonizing” the use of Russian in the context of Wikipedia, noting that the platform hosts a Russian-language edition, not an encyclopedia under the control of the Russian government. (The Kremlin does, however, have its own “Wikipedia alternative,” which complies with Russian laws and censorship.)
Doronina also made the case for Russian as a lingua franca for Central Asian Wikipedians. “Since many people in the region speak Russian, we can use it to foster cooperation between the different local Wikipedia editions,” she suggested. However, her pitch to create a Russian-language channel for the regional community was met with resistance.
Kyrgyz Wikipedian Mamatkazy bristled at being asked to use Russian rather than English or a mix of Central Asian languages for group communication. “If we hadn’t focused so much on the Russian language, maybe we would speak our local languages better,” he said, visibly annoyed.
Another Wikipedian from Kyrgyzstan, Aida, told the group that working on the Kyrgyz edition helped rekindle her passion for the language. However, she also expressed regret that she remains more proficient in Russian.
Still, attendees offered ample anecdotal evidence of Russian Wikipedia’s persistent popularity across Central Asia. “In Kazakhstan, the Russian-language Wikipedia is still more popular than the Kazakh edition. It is especially difficult for students to find reliable sources in our mother tongue,” said Amangeldi, a Kazakh Wikipedia administrator who has contributed to the platform for more than a decade.
During one session, Batyrbek, another Kazakh Wikipedia contributor, said he had the impression that information in regional languages is generally considered less trustworthy than information from Russian sources. “Even when people try to find information in Kazakh using a search engine, the results often appear in Russian,” he pointed out.
A Tajik Wikipedia administrator joked that his own father prefers searching for information in Russian, despite his son’s contributions of thousands of detailed encyclopedia articles in their mother tongue.
‘Numbers were the only thing on their minds’
Wikipedia’s Central Asian editions aren’t the only ones facing questions about their credibility. Last year, Elon Musk attracted a great deal of attention when he condemned Wikipedia as “an extension of legacy media propaganda.” He also publicly called for defunding the platform, which relies on donations and grants, until it “restored balance.” However, achieving balance is precisely the aim of the crowdsourced encyclopedia, which has numerous content policies in place to that end.
In addition to Wikipedia’s three core principles — “neutral point of view, verifiability, and no original research” — each edition can also institute its own set of policies. For example, Kazakh Wikipedia users can only create new articles after completing at least 100 smaller edits to existing articles, such as extending sections, adding sources, or improving layouts. Administrators, elected after regularly contributing to a given encyclopedia, usually vet the removal of longer paragraphs or the creation of new articles. Articles can also be protected from modifications, allowing only experienced users to edit them or participate in discussions.
According to Mamatkazy, many people simply do not understand how Wikipedia works. “They think anyone can just add whatever they want,” he said, adding that he finds it ironic that the same people likely trust generative AI software like ChatGPT without question.
A relatively new contributor to the Kyrgyz-language edition, Mamatkazy has focused his efforts on fundraising and organizing events for Wikipedians. “I tried to write articles, but that’s not my strong suit,” he explained, laughing.
Nevertheless, the lack of adequate information resources in Kyrgyz remains his main motivator. “A few years ago, I created an educational podcast about sex, gender, and queerness in Kyrgyz because there was simply no information about these topics available in my native language,” he said. Through his work on the Kyrgyz Wikipedia, he aims to support coverage of similarly underrepresented topics.
But not all Wikipedians join the platform for purely altruistic reasons. Uzbek Wikipedia, for example, saw an influx of volunteers after the government became involved. In 2022, the country’s Youth Affairs Agency launched WikiStipendiya, a project aimed at increasing the number of Wikipedia articles in Uzbek by organizing “edit-a-thons” and awarding scholarships totaling 320 million soums (about $30,000 at the time).
Wikipedian Nataev was involved from the beginning. “During our first meeting, a government official told us that they wanted to reach one million articles. At the time, we only had around 100,000,” he recalled. “We tried to convince them that this was unrealistic, but numbers were the only thing on their minds.”
Contrary to the advice of the experienced Wikipedians, the Uzbek government awarded prize money to those who created the most new entries, and this pure focus on quantity had its downside. “To churn out article after article, people used machine translations. As a result, we ended up with thousands of poor-quality articles — some unreadable or containing misinformation,” Nataev sighed. “Take, for example, the article about judoka Diyora Keldiyorova, who won a gold medal at the 2024 Paris Olympics. The machine-translated article absurdly stated that she had fought in World War II.”
This flood of low-grade articles damaged Wikipedia’s reputation for trustworthiness in Uzbekistan, Nataev said. To this day, he and other dedicated volunteers are working to repair the damage to their project.
Bridging the AI gap
In all likelihood, the pitfalls of using generative AI and machine translation will continue to plague Central Asian Wikipedia editions for years to come. Large language models (LLMs) require massive amounts of data to function effectively. Machine translation works relatively well for languages like English or Russian, but it struggles with smaller languages, including those in Central Asia, due to a scarcity of high-quality data.
A recent white paper by the Stanford Institute for Human-Centered AI found that speakers of “low-resource languages” are being overlooked in the ongoing AI revolution. They are left with unreliable, biased models that fail to reflect sociocultural contexts, leaving them more vulnerable to misinformation.
Uzbek has historically been written in Arabic, Cyrillic, and modern Latin scripts, further complicating matters. Since gaining independence in 1991, Uzbekistan has adopted an erratic language policy, resulting in several revised alphabets and the coexistence of Cyrillic and Latin writing systems. This creates a significant obstacle for LLMs that rely on streamlined data.
While several Central Asian countries are home to tech hubs, Kazakhstan has shown the most ambition in the field of AI development. President Kassym-Jomart Tokayev himself has praised KazLLM, the first homegrown large language model developed at Nazarbayev University in Astana.
Kazakhstan’s government has also taken steps to encourage more Kazakh-language content on Wikipedia. According to Amangeldi, there has recently been a surge in new Wiki clubs at the Nazarbayev Intellectual Schools, a network of elite state schools, where students are encouraged to create new articles. However, most of the resulting entries failed to meet the standards for well-researched encyclopedic articles.
“We saw that many of the new articles were unreadable, so we quickly restricted the right to create new articles to experienced users,” Amangeldi recalled. This approach slowed growth but ensured the encyclopedia’s quality, he added.
Quality assurance is crucial for Central Asian Wikipedians. Crowdsourced encyclopedias in national languages are important spaces where volunteers can learn and express themselves in their native tongues. Moreover, these efforts contribute to a growing regional trend toward normalizing the use of local languages after nearly a century of Russian-language dominance in science and media.
As Asaf Bartov from Wikimedia pointed out, contributing to Wikipedia might seem like an “unusual hobby,” but in the context of Central Asia, it helps bridge a crucial knowledge gap, especially as language sits at the forefront of a technological leap.
Story by Dénes Jäger for The Beet
Edited by Eilish Hart