The code also shows how Yandex can combine data from multiple services. McCrea says in one complex process, an adult’s search data may be pulled from the Yandex search tool, AppMetrica, and the company’s taxi app to predict whether they have children in their household. Some of the code categorizes whether children may be over or under 13. (Yandex’s Cherevko says people can order taxis with children’s seats, which is a sign they may be “interested in specific content that might be interesting for someone with a child.”)
One element within the Crypta code indicates just how all of this data can be pulled together. A user interface acts as a profile of a person: It shows their marital status, their predicted income, whether they have children, and three interests, which include broad topics such as appliances, food, clothes, and rest. Cherevko says this is an “internal Yandex tool” where employees can see how Crypta’s algorithms classify them, and they can only access their own information. “We have not encountered any incidents related to access abuse,” he says.
Yandex is going through a breakup. In November 2022, the company’s Netherlands-based parent organization, Yandex NV, announced that it would separate itself from the Russian business, following Russia’s invasion of Ukraine. Internationally, the company, which will change its name, is planning to develop self-driving technologies and cloud computing, while divesting itself of search, advertising, and other services in Russia. Various Russian businessmen have been linked to the potential sale. (At the end of July, Yandex NV said it plans to propose its restructuring to shareholders later this year.)
While the uncoupling is being worked out, Russia has been trying to consolidate its control of the internet and increase censorship. A slew of new laws requires more companies and government services in the country to use home-grown tech. This week, for instance, data regulators in Finland and Norway blocked Yandex’s international taxi app from sending data back to Russia because of a new law, which comes into force in September, that will allow the Federal Security Service (FSB) access to taxi data.
These nationalization efforts, coupled with the planned ownership change at Yandex, are creating concerns that the Kremlin may soon be able to use data gathered by the company. Stanislav Shakirov, the CTO of Russian digital rights group Roskomsvoboda and founder of tech development organization Privacy Accelerator, says that, historically, Yandex has tried to resist government demands for data and has proved better than other firms. (In June, it was fined 2 million rubles ($24,000) for not handing data to Russian security services.) However, Shakirov says he thinks things are changing. “I am inclined to believe that Yandex will be attempted to be nationalized and, as a consequence, management and policy will change,” Shakirov says. “And as a consequence, user data will be under much greater threat than it is now.”
Bakunov, the former Yandex engineer, who reviewed some of McCrea’s findings at WIRED’s request, says he is scared by the potential for the misuse of data going forward. He says it looks like Russia is a “new generation” of a “failed state,” highlighting how it may use technology. “Yandex here is the big part of these technologies,” he says. “When we built this company, many years ago, nobody thought that.” The company’s head of privacy, Cherevko, says that within the restructuring process, “control of the company will remain in the hands of management,” and that its management makes decisions based on its “core principles.”
But the leaked code shows, in one small instance, that Yandex may already share limited information with one Russian government-linked company. Within Crypta are five “matchers” that sync fingerprinting events with telecoms firms, including the state-backed Rostelecom. McCrea says this indicates that the fingerprinting events could be accessible to parts of the Russian state. “The shocking thing is that it exists,” McCrea says. “There’s nothing terribly shocking within it.” (Cherevko says the tool is used to improve the quality and accuracy of advertising and to identify scammers attempting to commit fraud.)
Overall, McCrea says that whatever happens with the company, there are lessons about collecting too much data and what can happen to it over time when circumstances change. “Nothing stays harmless forever,” she says.