A new dimension of the AI revolution: protecting personal data with self-hosted AI

25 March 2024 - Reading time: 5 minutes

In recent years, the world has witnessed a remarkable surge in the development and application of Artificial Intelligence (AI) technologies. Rapid advances in Large Language Models (LLMs), which enable machines to process and generate human-like language with unprecedented accuracy, have transformed the way we communicate, work, and interact with each other. From virtual assistants to language translation services, email filters, and "AI-enhanced" social media platforms, AI now touches almost every aspect of modern life. As we happily embark on this journey into a world where AI is increasingly intertwined with our daily lives, we must also confront the challenges that come with it.

As we embrace the benefits of this technological revolution, it’s crucial that we also examine the elephant in the room: corporate dominance. The behemoths of the AI market (companies like OpenAI, Meta, Google, and Microsoft) are at the forefront of innovation, offering a wide range of AI-driven services that have now become essential to our daily lives. AI is "bolted on" to everything those large players offer (for our benefit, of course). As we surrender our precious personal data to these corporations, maybe we should also ask ourselves: what’s in it for them?

These companies collect and store vast amounts of data from users’ queries and metadata, often containing sensitive information. The problem is that these corporations do not provide transparency into how this data is processed and protected. They have neither the incentive nor the obligation to prioritise our privacy over their profits. Remember rule number one: if you are not paying for the product, you are the product. But would anything change if we paid for the service? Would we get more privacy or more security? Doubt it.

Fun fact: OpenAI, for instance, was initially established as a non-profit organisation with the noble goal of advancing artificial intelligence research. However, it has since transformed into a "very-much-for-profit" company valued at about 80 billion USD at the time of writing. This shift in focus has substantially changed how the company behaves: monetising every single service has become the ultimate goal. And we, the users, now accustomed to AI-based services, have no choice but to pay someone to collect more of our data and make even more money from it. It's a perfect business model indeed!

It is well known that a lack of accountability in the IT world has already led to multiple high-profile data breaches, exposing millions of users’ personal information to cybercriminals. The consequences are far-reaching, from compromised identities and financial security to the erosion of trust in these corporations. Is there any reason to believe that the story will be different in the case of the large providers of AI services? Doubt it.

As we move forward in this era of rapid AI development, we must prioritise data privacy and security. We must think twice before sharing our most intimate thoughts or personal details with AI assistants like ChatGPT, Bard, or anything similar. Instead, let us look forward to a future where we can run our own AI systems at home or in the office, reaping the benefits of this technology while maintaining control over our digital lives.

One of the key hurdles in developing self-hosted AI solutions has been the need to compress large language models into smaller ones that can run on consumer-grade hardware. One widely used technique, known as quantization, maps a model's high-precision numerical weights to lower-precision values (for example, 16-bit floating-point numbers to 8-bit or 4-bit integers), effectively “shrinking” the model while trying to preserve as much of its functionality as possible.
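To make the idea of quantization more concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme (symmetric 8-bit quantization with a single scale factor for the whole list of weights); the function names and the sample weights are made up for illustration, and real tools use more elaborate per-block schemes.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one shared scale.

    Illustrative symmetric scheme only; not how any specific tool does it.
    """
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, chosen only for the demonstration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of two or four, at the cost of a small rounding error. That trade-off between size and accuracy is what makes it feasible to fit a model into the memory of a home PC.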

This technology has already made good progress, enabling the development of small language models that run reasonably quickly even on consumer-grade hardware using open-source solutions. It is now also possible to train and deploy small LLMs on a relatively modest home PC with an Nvidia or AMD Radeon graphics card. Some solutions even run entirely in your web browser! This is opening a whole new world of possibilities for self-hosted AI.

What does this mean for us? It means that we can harness the power of AI without sacrificing our most valuable asset: our data. We can unlock the full potential of AI while maintaining control over our digital lives. Data privacy and security must remain the priority, but we no longer have to give them up to use AI. Search the Internet: multiple self-hosted solutions are already available!

As the era of self-hosted AI dawns, we stand at the threshold of a transformative moment in the history of technology. By embracing this approach, we can reclaim our data sovereignty, liberate ourselves from corporate control, and unlock the truly boundless potential of Artificial Intelligence. So, what’s holding you back? Just give it a try!

[The article is also published on LinkedIn]
