Man Made Intelligence Is Destroying The Web


When users logged onto HBO Max at the end of May, they were greeted by an unusual phenomenon. Traditionally, when accessing the site, HBO would prompt individuals to verify their humanity through the familiar "I am not a robot" checkbox or by selecting specific squares containing stoplights from image grids. 


However, this time was different. Users were now confronted with a complex series of puzzles instead. These perplexing tasks varied from counting dots on dice images to listening to brief audio clips and identifying the one with a recurring sound pattern. These unconventional challenges, supposedly designed to confirm that users were indeed human, were not exclusive to HBO. Across various platforms, users encountered increasingly impossible puzzles, such as discerning nonexistent objects like a horse made entirely of clouds. 


The motivation behind these new requirements can be attributed to advancements in AI technology. With tech companies training their bots using previous versions of captchas, these programs have become incredibly adept at overcoming standard challenges. Consequently, humans now find themselves having to exert more effort to prove their humanness in order to access online services. However, enigmatic captchas only scratch the surface of how AI is revolutionizing the inner workings of the internet. 


Since the introduction of ChatGPT last year, tech companies have rushed to integrate this AI technology into their platforms, often restructuring their established core products to do so. The ease of generating seemingly authoritative text and visuals with a simple click threatens to undermine the internet's delicate structures, turning the online experience into a bewildering maze. As the allure of AI grips the web, researchers have discovered its potential to exacerbate critical concerns on the internet, such as misinformation and privacy violations. Moreover, it has added to the existing frustration of mundane tasks like spam deletion and logging into websites. 


Christian Selig, the creator of Apollo, a popular Reddit app, emphasized that he does not foresee AI leading to the downfall of society but acknowledges its capacity to significantly impact the internet. As things currently stand, AI has already transformed the internet into a nightmarish realm.


Internet Disruption


Reddit, the internet's longstanding unofficial front page, owes much of its longevity to the dedicated volunteers who moderate its diverse communities. According to estimates, these Reddit moderators contribute approximately $3.4 million worth of unpaid work annually. To carry out their duties effectively, they rely on tools like Apollo, an app that has provided advanced moderation capabilities for nearly a decade. However, in a surprising turn of events in June, users were met with an unexpected announcement: Apollo was shutting down. As the company sought to capitalize on the burgeoning field of AI, third-party apps became casualties of this pursuit.


Apollo and similar interfaces depend on access to Reddit's application programming interface (API), a software that facilitates data exchange between apps. Previously, Reddit permitted free data scraping, believing that more tools would attract more users and facilitate the platform's growth. However, AI companies have now begun leveraging Reddit's extensive repository of online human interaction to train their models. Eager to capitalize on this newfound interest, Reddit implemented costly pricing for access to their data. Consequently, Apollo and other apps were left with no choice but to shut down, precipitating a month-long period of protest and unrest among the Reddit community. Despite alienating the very communities it depends on, the company remained steadfast in its refusal to accommodate them.


According to a report by Europol, an astonishing 90% of internet content is projected to be AI-generated in the near future. This surge in AI-generated content, fueled by data-scraping practices that undermine the reliability of once-trusted websites, has inundated the web with an abundance of questionable material. Martijn Pieters, a software engineer based in Cambridge, recently witnessed the decline of Stack Overflow, a premier online platform for technical questions and answers. Having contributed to and moderated the platform for over a decade, Pieters observed a sudden nosedive in its quality. This deterioration came in the wake of Prosus, the company behind Stack Overflow, allowing AI-generated answers and implementing charges for AI firms to access their data. In response, prominent moderators staged a strike, arguing that the subpar AI-generated content contradicted the site's core purpose of offering high-quality question-and-answer content.


NewsGuard, a company specializing in tracking misinformation and evaluating the credibility of information websites, has identified nearly 350 online news outlets that are predominantly generated by AI with minimal human oversight. Examples include Biz Breaking News and Market News Reports, which churn out generic articles across various topics such as politics, technology, economics, and travel. Many of these articles are replete with unverified claims, conspiracy theories, and hoaxes. When NewsGuard tested ChatGPT, an AI model, to assess its propensity for disseminating false narratives, it failed every single time in spreading accurate information. Gordon Crovitz, NewsGuard's co-CEO, stressed that unless AI models are refined and safeguarded against manipulation, they will become the most significant source of persuasive misinformation on an unprecedented scale in internet history. Europol's report echoes these concerns, predicting that an astounding 90% of internet content will soon be AI-generated.


While AI-generated news websites currently lack substantial audiences, their rapid growth foreshadows the potential for AI-generated content to distort information on social media platforms. Filippo Menczer, a computer science professor and the director of Indiana University's Observatory on Social Media, has already uncovered networks of bots flooding social media sites like X (formerly Twitter) and Facebook with copious amounts of content generated by ChatGPT. Although AI bots currently exhibit discernible characteristics, experts suggest that they will soon become more adept at mimicking human behavior, evading detection systems—developed by Menczer and social networks alike.


Aside from combating malicious actors on user-run sites and social media platforms, another critical concern arises: the erosion of reliable information verification through search engines. Microsoft and Google are poised to prioritize bot-generated summaries over traditional search-result links. These summaries, however, are ill-equipped to discern fact from fiction. When we conduct searches on Google, we not only receive answers but also gain contextual understanding within the broader internet landscape. We meticulously filter these results, ultimately selecting sources we consider trustworthy. In contrast, a chatbot-powered search engine truncates this experience, stripping away context such as website addresses. Consequently, it can "parrot" plagiarized answers, which, as described by NewsGuard's Crovitz, sound "authoritative and well-written" but are "completely false."


Synthetic content has inundated e-commerce platforms such as Amazon and Etsy. A mere fortnight prior to the scheduled publication of a technical textbook by Christopher Cowell, a curriculum engineer hailing from Portland, Oregon, he stumbled upon a newly listed book on Amazon bearing the exact same title. It soon dawned on Cowell that the book had been generated by artificial intelligence (AI), with the publisher likely obtaining the title from Amazon's prerelease list and inputting it into software like ChatGPT. Likewise, on Etsy—a platform renowned for its assortment of handcrafted, artisanal products—AI-generated artworks, mugs, and books have permeated the marketplace.


Consequently, discerning authenticity from fabrication in the online realm is rapidly becoming an arduous task. While misinformation has long plagued the internet, AI is poised to escalate our preexisting predicaments to unprecedented levels.


A fraudulent extravaganza


In the short term, the rise of AI will present a range of significant security and privacy challenges. Online scams, which have been steadily on the rise since November, will become more difficult to detect due to the AI's ability to customize them for each individual target. Recent research by John Licato, a computer science professor at the University of South Florida, has demonstrated the capability to engineer scams specifically tailored to an individual's preferences and behavior, even with minimal information obtained from public websites and social media profiles. 


One significant indicator of high-risk phishing scams, where attackers impersonate trusted entities like banks to gather sensitive information, is the presence of typos or subpar graphics. However, in an AI-powered fraudulent network, these red flags will no longer exist. Hackers are leveraging free text-to-image and text generators, such as ChatGPT, to turn them into powerful spam engines. This generative AI technology could potentially be used to incorporate a person's profile picture into a brand's personalized email campaign or create a video message from a politician with an artificially modified voice, focusing on subjects that the recipient is interested in. 


As AI continues to shape the internet, it will increasingly seem to be designed and controlled by machines. This shift is already underway. Darktrace, a cybersecurity firm, has reported a staggering 135% rise in malicious cyber campaigns since the beginning of 2023. Criminals are increasingly employing bots to compose phishing emails that are longer, error-free, and less likely to trigger spam filters.


In the near future, hackers may require less effort to obtain sensitive information. Currently, hackers often employ various indirect methods such as hidden trackers on websites and purchasing compromised data sets from the dark web to spy on individuals. However, security researchers have discovered that AI bots embedded within apps and devices could potentially steal confidential information for hackers. As OpenAI and Google AI models actively scour the web, hackers can obscure malicious codes within websites and instruct the bots to execute them without any human intervention.


For instance, imagine using Microsoft Edge, a browser integrated with the Bing AI chatbot. Since the chatbot constantly scans the webpages visited, it can inadvertently pick up concealed malicious code while browsing a website. This code could prompt the Bing AI chatbot to impersonate a Microsoft employee, entice the user with a new offer of free Microsoft Office usage, and illicitly request their credit card details. Florian Tramèr, an assistant professor of computer science at ETH Zürich, finds these "prompt injection" attacks concerning, especially with the increasing integration of AI smart assistants into various apps like email clients, browsers, and office software. Consequently, these assistants have easy access to personal data.


Tramèr states, "Something like a smart AI assistant that manages your email, calendar, purchases, etc., is just not viable at the moment because of these risks."



Deceased connectivity

As AI continues to disrupt community-led initiatives such as Wikipedia and Reddit, the internet is increasingly being perceived as engineered and controlled by machines. According to Toby Walsh, an artificial intelligence professor at the University of New South Wales, this trend risks breaking the familiar web landscape. Moreover, it poses challenges for AI developers because the proliferation of AI-generated content diminishes the availability of original data crucial for improving their models—Microsoft, Google, and other tech companies will face this predicament.


Walsh emphasizes that the effectiveness of AI today stems from human effort and ingenuity. However, he warns that if the next generation of generative AI relies solely on output from its predecessor, the quality will significantly decline. A study conducted by the University of Oxford in May underscored this concern, revealing that training AI on data generated by other AI systems leads to degradation and eventual collapse. Consequently, the online information landscape will suffer a decline in quality.


Drawing on the analogy of the "dead internet" theory, Licato, a professor at the University of South Florida, likens the current state of the web to a scenario where highly frequented platforms like Reddit become inundated with bot-written articles and comments. In response, companies will deploy counter-bots to sift through and filter automated content. The theory predicts that, eventually, the majority of content creation and consumption on the internet will no longer be a product of human endeavor.


Licato acknowledges the challenging nature of envisioning such a scenario but believes it is increasingly probable given the current trajectory. Personally, I find myself in agreement. Recently, the online spaces I used to frequent have been overwhelmed either by AI-generated content and identities or by the relentless race to keep up with their AI-driven competitors, resulting in the compromise of core services. If this trend persists, the internet will undergo irreversible changes.


Note: Shubham Agarwal, a freelance technology journalist from Ahmedabad, India, has contributed to prominent publications such as Wired, The Verge, and Fast Company.