Is China’s surveillance system a media bogeyman, a technological revolution, or a grim dystopia that awaits the world? American journalist Geoffrey Cain set out to find out. He went straight to Xinjiang, a region infamous for its “sanatoriums” for Uyghurs and, at the same time, China’s capital of AI research. After his trip, Cain wrote a book whose Russian translation, “The State of the Strict Regime” (“Государство строгого режима”), was published by Individuum. We publish, with minor abridgments, the chapter “Deep Neural Network.”
In dealing with foreign partners, Chinese state-supported corporations practised what in business is called “forced technology transfer.” To access China’s closed market, foreign companies typically had to strike deals with Chinese partners. One informal requirement was the transfer to Chinese firms of so-called sensitive technologies — semiconductors, medical equipment and oil-and-gas gear.
Under World Trade Organization rules, such a demand is illegal; yet American companies, albeit reluctantly, disclosed trade secrets in the hope of gaining access to 1.4 billion potential customers in China.
As China began collecting data on its citizens by tracking their use of apps and services like WeChat, the prospect of leading the expanding and lucrative AI industry attracted many local tech start-ups. Chinese AI researchers, increasingly numerous, closely watched the breakthroughs taking place in the United States, the world’s AI leader. Chinese firms hoped to unlock AI secrets by recruiting talented Chinese developers who had studied abroad and worked at Microsoft and Amazon, luring them home with high salaries and appeals to patriotism. By the early 2010s, Chinese programmers were closing in on creating a deep neural network, the holy grail of a surveillance state: a system capable of learning and identifying patterns across millions of images and data points.
For many years AI researchers relied on the so‑called rule‑based programming approach. They wired a program into a computer to recognize a cat, telling it: “Look for a circle with two triangles on top.” This approach made sense because computers lacked the processing power for more. Yet it also constrained AI: not all images of cats are a perfect circle with two triangles on top, and not all circles with triangles on top are cats.
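The brittleness of such a hand-written rule can be illustrated with a small Python sketch. It is not taken from the book: the `Shape` class, the `looks_like_cat` function and the three sample "images" are hypothetical, invented here only to show how a rule of the form "a circle with two triangles on top" misses some cats and accepts some non-cats.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shape:
    kind: str             # e.g. "circle", "oval", "triangle"
    on_top: bool = False  # True if the shape sits on top of the main outline


def looks_like_cat(shapes: List[Shape]) -> bool:
    """Hand-written rule: exactly one circle with exactly two triangles on top."""
    circles = [s for s in shapes if s.kind == "circle"]
    top_triangles = [s for s in shapes if s.kind == "triangle" and s.on_top]
    return len(circles) == 1 and len(top_triangles) == 2


# Hypothetical, hand-made "images" described as lists of shapes.
frontal_cat = [Shape("circle"), Shape("triangle", True), Shape("triangle", True)]
side_view_cat = [Shape("oval"), Shape("triangle", True)]  # a cat seen in profile
fox_mask = [Shape("circle"), Shape("triangle", True), Shape("triangle", True)]

print(looks_like_cat(frontal_cat))    # True: the rule fits this picture
print(looks_like_cat(side_view_cat))  # False: a real cat the rule misses
print(looks_like_cat(fox_mask))       # True: not a cat, but the rule fires
```

Every new viewpoint or look-alike would need yet another hand-written rule, which is the constraint the paragraph above describes.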
A more modern technology, the deep neural network, offered a number of advantages. Operators no longer needed to do the monotonous work of manually categorising images and data and then writing rules for the AI system. Instead, the software scanned vast amounts of information and learned to connect disparate data on its own. It could then refine its algorithm to solve the task for which it was created. The less operators controlled and constrained the software, the more applications of AI emerged for companies. Deep neural networks learned to operate unmanned vehicles, help doctors make diagnoses, and flag credit-card fraud.
Until 2012, the idea that a deep neural network could influence the market was regarded as nonsense. No matter how hard Microsoft Research Asia and new start-ups tried, their efforts bore little fruit. In 2012, AI developers in China and Silicon Valley told me that creating a neural network would be a golden opportunity for Microsoft. In May 2011 Microsoft acquired Skype, the popular worldwide calling and video conferencing service, in what was then the industry’s largest deal. If Skype or Microsoft Windows could recognise voices and faces, it would be a breakthrough. It would lay the groundwork for real-time translation features and cybersecurity systems based on facial recognition.
In 2011, in Beijing, I met a group of young Chinese researchers who worked themselves to the bone trying to solve a whole set of thorny questions. The main ones were: “How can a computer system learn to ‘see’ and ‘perceive’ a human? How can it hear and recognise a person’s voice? Can AI learn to talk?”
“Now is the right moment,” one of them told me after work at dinner. “The Internet and social media can serve as data sources for AI. We can gather data on clicks, purchases and people’s preferences.”
He said that in 2005 fewer than 10% of China’s population were online, but they quickly became the world’s most active users of social networks, mobile apps and mobile payments. In 2011 almost 40% of the population, about 513 million people, had an Internet connection of their own. All these users left information about their purchases and online actions that could be used to teach neural networks to solve a wide range of tasks, including monitoring users.
In the same year, two junior researchers working with the world‑renowned AI researcher Geoffrey Hinton, a professor of computer science at the University of Toronto and a Google employee, made a breakthrough in hardware. The researchers realised they could use graphics processors (GPUs) — devices that accelerate graphics in computer games — to speed up the processing of data by deep neural networks. AI developers could use GPU‑oriented methods of rendering shapes and images on screen to train neural networks to search for patterns.
Previously, building a neural network was prohibitively expensive. But the cost of the key equipment on which the software runs fell thanks to GPUs. For many years they grew cheaper even as memory and computing power increased.
With improvements in hardware and the growth of data sets, the moment arrived for a deep neural network that could process this data.
By trial and error, the Microsoft team under Dr. Sun Jian found a solution: increase the number of “layers” in the neural network, allowing the AI to keep updating its knowledge as information passes through. The layers of a neural network resemble clusters of neurons that receive data, process it, and then pass it on to later layers for further processing, so that with each layer the AI learns more about the subject under analysis.
In theory, more layers meant better thinking; in practice, it proved harder. One problem was that signals faded a little more with each layer they passed through, which made it hard for the Microsoft researchers to train the system.
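A toy Python sketch can show why signals fade with depth. It rests on assumptions the text does not make: each "layer" is a single unit with a sigmoid activation and a weight of 1.0, and the "signal" is the derivative of the output with respect to the input, which is what training relies on. It is not the Microsoft team's network, and the function names are hypothetical. Since the sigmoid's slope never exceeds 0.25, each extra layer multiplies the signal by at most a quarter.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth: int, weight: float = 1.0, x: float = 0.5) -> float:
    """Derivative of the output with respect to the input for `depth`
    stacked one-unit sigmoid layers (a deliberately simplified model)."""
    grad = 1.0
    activation = x
    for _ in range(depth):
        z = weight * activation
        activation = sigmoid(z)
        # Chain rule: sigmoid'(z) = s * (1 - s), which is at most 0.25.
        grad *= weight * activation * (1.0 - activation)
    return grad


# Depths 8 and 30 echo the layer counts mentioned in the article.
for depth in (1, 8, 30):
    print(depth, gradient_through_chain(depth))
```

Under these assumptions the value for thirty layers is many orders of magnitude smaller than the value for eight, so the earliest layers receive almost no usable training signal.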
In 2012, the system learned to recognise images using eight neural layers; by 2014, it used thirty. By increasing the number of layers, the researchers achieved a breakthrough in the computer’s ability to recognise objects in videos and images. “We did not even believe that this single idea could be so important,” said Dr. Sun.
China’s tech ecosystem began to attract venture capitalists, who turned their attention away from the traditional finance and tech hubs in Silicon Valley and New York. They sought to act quickly in two sectors where there was immense potential for a surveillance ecosystem: facial recognition and speech recognition.
The first major investment went to facial recognition technology.
In 2013, Sinovation Ventures, the AI-focused venture firm founded by Kai-Fu Lee, backed the fledgling facial-recognition platform Megvii (MegaVision). The amount was not disclosed. Then SenseTime, Megvii’s Hong Kong-based rival founded in 2014, released the first algorithm capable, under certain conditions, of identifying people more accurately than the human eye, and claimed to have outperformed Facebook. This marked a milestone in the AI industry.
According to Yan Fan, head of SenseTime’s development department and a former Microsoft employee, public-safety applications proved to be a profitable market.
“There is high, competitive demand driven by ‘smart city’ systems and video surveillance,” he told Forbes Asia.
But facial-recognition software needed the most advanced semiconductors. Where would they come from?
SenseTime and other AI companies in China turned to American firms for semiconductors. Their US counterparts, it turned out, were interested in Chinese software for mobile apps and policing systems. The American telecoms company Qualcomm struck a deal with Megvii: in exchange for semiconductors, Qualcomm gained the right to use Megvii’s AI software in its devices.
“In China there is explosive demand,” noted Li Xu, co‑founder and CEO of SenseTime, at a tech conference in June 2016 during a joint appearance with Jeff Herbold, Nvidia’s Vice President for Ventures Development.
Within seven or eight years of its founding in 1993, Nvidia had become a leading GPU maker. It was now poised to reap the profits of the coming AI boom.
Shortly after, Nvidia began striking high-profile deals with Chinese facial-recognition firms. World-class supercomputers built with GPUs from Nvidia and its main rival Intel were installed at the Cloud Computing Center in Urumqi, which opened in 2016, and were used for surveillance. In one day these computers processed more surveillance footage than a person could watch in a year.
“In China I see cameras on every streetlight,” Herbold observed. “It seems that almost everything is monitored. But the problem is the video ends up in a control room where some guy waits for something to happen. Isn’t all this supposed to be automated?”
Li Xu acknowledged the Chinese government’s interest in public safety, as well as the fact that “the existing surveillance system was severely hampered by the lack of an intelligent control mechanism, particularly in processing video.”
He proposed an alternative path.
Li Xu knew that Nvidia’s chip technology, borrowed from similar graphics-processing approaches, played a “fundamental” role in his work, and that, to sustain facial recognition, 14,000 such chips were deployed across servers in Asia.
“I feel we are in for a long collaboration,” Nvidia’s Herbold told him at a business conference. He may not have intended it, but his words sounded ominous. By 2015 all the components of the surveillance ecosystem had fallen into place: software learned to recognise faces, scan text messages and emails, and identify patterns in written language and people’s interactions.
Now investors began pouring money into the next key element: software capable of understanding and processing human voice.
In the late 1990s, the promising young researcher Liu Qingfeng left an internship at Microsoft Research Asia and devoted himself to his own startup, iFlyTek, aiming to develop advanced voice-recognition technology.
“I told him he was a talented young researcher, but China was far behind the American giants of the speech-recognition industry, such as Nuance, and there would be fewer consumers of this technology in China,” Kai-fu Lee wrote. “Credit to Liu: he ignored my advice and threw himself into iFlyTek.”
In 2010 iFlyTek established a laboratory in Xinjiang devoted to developing speech-recognition technology for translating Uyghur into Mandarin. Soon that technology would be used for surveillance and monitoring of Uyghur populations. By 2016 iFlyTek was supplying Kashgar with some 25 systems of “voice fingerprints” that created unique vocal signatures aiding identification and tracking.
“All these companies were coming to Xinjiang before my eyes,” recalls Irfan. “I saw their hardware, their software.” Dozens of Uyghurs who fled Xinjiang after 2014 recalled spotting these companies’ logos on equipment. The presence of these firms in Xinjiang is reflected in government tenders circulating on the internet, in official corporate reports, human-rights assessments, American sanctions documents, and in reports in Chinese state media. “But many did not find this dangerous. The mood was: ‘We are simply fighting crime’,” Irfan notes.
Between 2010 and 2015, Huawei, China’s national tech symbol, finally entered the Xinjiang market, developing cloud services in collaboration with local police. Huawei (roughly translated: “China Holds Out Hope”) was founded by former military engineer Ren Zhengfei with startup capital of $3,000. In the 1980s the company began developing telephone switches by copying foreign models. As one of the early advocates of a government-driven acceleration of technology, Huawei became known in China and abroad for its surveillance hardware and networking equipment, and expanded its footprint in the smartphone market.
Ren Zhengfei, described by former colleagues as a partly mysterious figure who spoke in parables about streams and mountain summits, harboured grand plans for global expansion. These could be realised only if Western democracies could be persuaded that Huawei was not tied to the Chinese Communist Party and would not use its technology for spying. At the same time Huawei’s leadership sought to sell networking equipment to Xinjiang authorities, viewing “public safety” as a profitable business.
“In 2015 we were at a team-building event,” William Plummer, a former American diplomat who served as Huawei’s vice president for external relations in Washington, told me. “Someone showed a slide with the heading: ‘What is Huawei about?’ The first point read: ‘For internal Huawei — a Chinese company supporting the Chinese Communist Party.’ Then came: ‘For abroad — an independent company following internationally recognised corporate practice.’”
Essentially they meant that China’s rules must be followed at home, while foreign countries’ rules apply abroad. But to include this in a presentation… even this slide was compromising.
By 2015, the final element of the surveillance ecosystem was within reach: video-surveillance cameras cheap enough to deploy on an industrial scale. Hikvision, the world’s largest supplier of surveillance cameras, moved into Xinjiang. It supplied millions of cameras that allowed authorities to monitor the population. The cameras were so advanced that they could identify people from fifteen kilometres away, and they used AI software from iFlyTek, SenseTime and others to analyse faces and voices.
“Skynet,” the radical and all‑encompassing state surveillance system, conceived a decade earlier, could now become a reality.
This technological leap, in a sense a sinister synthesis of elements that culminated in a state built on AI, did not go unnoticed. By 2010, the United States, China’s leading antagonist on the international stage, had begun to grow wary.
American policymakers suspected Huawei and its partners were a cover for the People’s Liberation Army to exploit backdoors in hardware and software for cyber‑espionage purposes.
“With a high degree of confidence, we can state that the growing role of multinational companies and foreign persons in the information technology and services supply chains in the United States creates a threat of continuous covert sabotage,” read documents from the U.S. National Security Agency (NSA). Those records, dating from 2010, were later disclosed by whistleblower Edward Snowden.
Moreover, Snowden’s leaks showed that the NSA tracked twenty Chinese hacking groups attempting to breach U.S. government networks, as well as Google and other company systems. The NSA also hoped to reach the undersea cables laid by Huawei to monitor communications of targets it considered high‑priority, in Cuba, Iran, Afghanistan and Pakistan.
The NSA infiltrated Huawei’s headquarters, monitored the communications of its top executives, and carried out an operation code-named Shotgiant, aimed at identifying links between Huawei and the People’s Liberation Army. The NSA then attempted to use Huawei technologies sold to other countries and organisations to penetrate Huawei servers and telephone networks and launch cyberattacks on those countries. All this was made possible by extensive hacking capabilities and by backdoors, created in collaboration with American telecom companies, that let mass surveillance of foreign nationals bypass standard technological barriers.
Representatives of the United States and China talked of a cold war, notwithstanding their trade and tech ties. In 2012, a Congressional committee released the results of a yearlong investigation, stating it had obtained documents from former Huawei employees suggesting the company provided its services to China’s cyber‑warfare arm.
The U.S. authorities began focusing on Ren Zhengfei’s daughter, Meng Wanzhou, widely known as “Cathy.” As Huawei’s socialite, Cathy hosted business events that included Q&As with Alan Greenspan and other guests. The FBI and the Department of Homeland Security tracked Cathy’s business activity and Huawei. They suspected Cathy supervised a front company in Iran called Skycom, which violated U.S. sanctions by doing business with Iranian telecommunications companies.
“We provided the U.S. government with information about Huawei in Iran, about Skycom, and that it was an independent company, even though she sat on the board for two years — we supplied these assurances because that is what we were told, and it seemed better that way. But it was all a lie. Skycom employees in Iran walked around with Huawei business cards,” Plummer later recalled.
Plummer said that in 2013 he was contacted by Huawei’s senior management. Homeland Security agents subjected Meng to extra screening, delaying her before boarding a flight at John F. Kennedy Airport as she returned from one of her glamorous business events.
“They held her computer, tablet and both phones for four hours,” Plummer recounted. “The authorities copied everything.” Meng was released. Huawei’s leadership prepared for a looming legal battle, shutting the Skycom office in Tehran and distancing themselves from Skycom.
But while the United States branded Huawei and China as enablers of backdoors for state-backed hacking, the NSA was caught installing its own backdoors in American network products supplied to China.
Der Spiegel, the German news magazine, obtained a fifty-page catalog created by the NSA’s Advanced Network Technology Division, which monitored the most secure networks. The NSA gained access to shipments of American Cisco network equipment destined for China and secretly installed surveillance devices. Cisco later said it did not know that its own government had hacked it.
Another NSA product, HALLUXWATER, turned out to be a backdoor implant. It hacked Huawei firewalls, enabling the NSA to implant malware and control device memory.
“There is nothing unexpected about this kind of controlling behaviour by the United States,” Ren Zhengfei told reporters in London. “But now it has been proven.”
With geopolitics in play, China’s government showed its hand. The country holds about 93% of the world’s rare earth reserves used in batteries and displays in iPhones and televisions, including lithium and cobalt. In September 2010, a Chinese fishing trawler collided with two Japanese coast guard vessels near the disputed Senkaku Islands. Japan arrested the trawler captain for allegedly violating its fishing rights and control of the region, an area China also claims.
China struck back by blocking rare earth exports to Japan, putting at risk production of the highly popular Toyota Prius, which relies on rare earths for its engine. A little over two weeks later, Japan freed the crew members without charging them.
“China and Japan are important neighbours with important responsibilities in the international community,” Japanese Prime Minister Naoto Kan said in New York at the United Nations, attempting to calm nerves.
But as the new cold war intensified, it became apparent that the United States and China differed in their technological strategies.
China sought to steal American technologies, including trade secrets and intellectual property, to hand them to private Chinese firms seeking to outpace Silicon Valley.
The United States, in turn, aimed to infiltrate Huawei and other Chinese firms. Its aim wasn’t simply to steal Chinese tech and transfer it to private companies like Amazon and Google, but to collect information about potential links to military structures and about threats that China and its companies posed to U.S. national security.
During a June 2014 reporting trip through Beijing, Shanghai and the tech hub of Shenzhen, I sensed the rise of a self-proclaimed, almost tangible nationalism, especially among Chinese youth.
Talk of a new cold war seemed to trigger a sense of unease, reinforced by state propaganda.
One local tech executive tried to explain China’s newfound swagger. “You know Google left China,” he told me with pride. After four years on the Chinese market, Google closed its Chinese search site in 2010 amid a clash over hacks and censorship. “But that doesn’t matter,” he explained. “We have our own search engine, Baidu. Now we have our own companies. The world is changing, and I hope Silicon Valley and the NSA won’t dominate forever.”
“But don’t you think that if China is to reach the level of Silicon Valley, it will have to open up the Internet?” I asked. “Researchers need to access the information necessary to create quality tech.”
“That doesn’t matter either,” he replied. “In China our technologies are tied to the future of our country. We do not have the same explicit separation of powers as in the United States. Our only goal is to make China great. We want to be on equal terms with the Americans, so that no one looks down on us again.”
Translated from English by Dmitry Vinogradov. Published from: Geoffrey Cain. The State of the Strict Regime: Inside China’s Digital Dystopia. Moscow: Individuum, 2023.
