Introduction to Generative AI
Generative artificial intelligence has seen an incredible surge in popularity in 2022. Big Think has called it ‘the technology of the year’, and judging from the amount of attention and VC support generative AI startups have been gaining this year, this claim is more than justified. Moreover, tech experts say that over the next few years the development of generative AI will not slow down but accelerate, expanding into more and more fields.
In this post, written in collaboration with Serokell’s AI developers, I’ll take a closer look at what generative AI is and how it works, as well as outline common use cases and prospects for the future.
What is generative AI?
Before we go any further, I’d like to bore you with a simple definition of what generative AI is. Feel free to skip this section if you’re already familiar with it.
Generative AI is like a genie you wish you had in your pocket. It’s a type of artificial intelligence that is able to create new data: text, audio, video, and images. Everyone has days when they’re simply not in the mood to write another email, article, or line of code. Generative AI is there to support your creative process.
How does it work?
Generative AI usually uses unsupervised or semi-supervised learning to process large amounts of data and generate original outputs. For example, if you want your AI to be able to paint like Van Gogh, you need to feed it as many paintings by this artist as possible. The neural network at the base of generative AI learns the characteristic traits or features of the artist’s style and then applies them on command. The same process applies to models that write texts and even books, create interior and fashion designs, non-existent landscapes, music, and more.
Generative AI normally uses GANs or transformers to achieve results.
GANs (generative adversarial networks) consist of two parts: generative and discriminative.
The generative neural network is able to create outputs on request. It has been exposed to the necessary data and has learned certain patterns. However, to improve, it needs help from the discriminative NN.
The second element of the model (the discriminative NN) tries to distinguish between the real-world data and the ‘fake’ data generated by the model. Every time the first model succeeds in fooling the second one, it gets rewarded. This is why this algorithm is also often called an adversarial model. This mechanism allows the model to improve without human input.
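The interplay between the two networks can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not a production model: the ‘generator’ is a single linear function, the ‘discriminator’ is a one-input logistic classifier, the real data is drawn from a normal distribution centered at 4, and the gradients are written out by hand. The names (train_toy_gan, sigmoid), the learning rate, and the choice of generator loss are all hypothetical and picked only for readability.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: G(z) = a * z + b
    w, c = 0.1, 0.0  # discriminator: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # a sample of 'real-world' data (assumed)
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        x_fake = a * z + b            # the generator's 'fake' sample

        # Discriminator step: minimize -log D(real) - log(1 - D(fake)),
        # i.e. learn to tell real samples from generated ones.
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(fake), so the generator is
        # 'rewarded' whenever the discriminator mistakes fake for real.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (d_fake - 1.0) * w   # gradient of the loss w.r.t. x_fake
        a -= lr * grad_x * z
        b -= lr * grad_x

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: G(z) = {a:.2f} * z + {b:.2f}")
```

Neither update needs a human-provided label beyond ‘this sample is real’ and ‘this sample was generated’, which is the point made above. Real GANs replace both functions with deep neural networks and are known to be sensitive to training settings, so a toy like this should not be expected to converge reliably.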
If you would like to learn in more detail how GANs work, you can watch this video.
Transformers are another technique that demonstrates impressive results in generating data.
Transformers use a sequence of data rather than individual data points when transforming the input into the output, and that makes them much more efficient at processing data where context matters. Transformers are often used to translate or generate texts since texts are more than just words chunked together. Moreover, transformers are useful for building foundation models. Engineers use them when working on algorithms that turn a natural language request into a command, for example, to generate an image or text based on a user’s description.
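To make ‘using the whole sequence’ a little more concrete, here is a minimal sketch of scaled dot-product self-attention, the operation at the core of transformers. It is simplified on purpose: real transformers first pass each input through learned query, key, and value projections and use several attention heads, whereas this sketch assumes the raw input vectors play all three roles. The function names and the example vectors are made up for illustration.

```python
import math


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Mix every vector in the sequence with all the others.

    Simplifying assumption: queries, keys, and values are the input
    vectors themselves, with no learned projection matrices.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score how relevant every position is to the current one.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted average of the sequence.
        out = [
            sum(wt * v[j] for wt, v in zip(weights, vectors))
            for j in range(d)
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # Three toy 'word' vectors; the values are arbitrary.
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    outputs, weights = self_attention(sequence)
    for row in weights:
        print([round(wt, 2) for wt in row])
```

Each row of weights shows how much one position draws on every other position, which is how context from the whole sequence ends up in each output.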
In this video, you can find out more about how transformers are used in generative AI.
A brief history of generative artificial intelligence
Since the early 2010s, artificial intelligence as a field has been going through a period of active growth and development. Journalists across the globe hurried to spread panic: now that AI can learn by itself, it will steal our jobs and drive the world into economic collapse, the singularity is near, beware.
However, soon after that most people realized that the exciting prospect of being dominated by machines was rather unrealistic. Not because AI had proved itself to be a ‘good guy’ and followed all of Asimov’s laws of robotics. The problem was that AI turned out to be mostly stupid (just like us 🙃). So you still have to go to work every day.
Sure, our AI systems can analyze lots of data, make calculations very fast, detect faces in metro stations, and then report to the government. However, all of these are examples of narrow artificial intelligence, when the algorithm is good at performing just one skill. I’m sure you will agree: one can hardly call this true intelligence that can compete with humans in creativity.
However, this year everything might suddenly change. Yes, I know that many people have said it before, mostly AI startup founders who want to sell you their AI services. I’m aware that ‘revolution’, ‘AI’, and ‘innovation’ have become buzzwords that automatically make your brain go numb as soon as you hear them. But just let me explain.
The latest projects in the field of generative AI have shown that we have finally learned to make something incredible. Last year, GPT-3 was the obvious leader in content generation. This year, GPT-3 is still strong: after all, it is able to generate text and code from prompts and natural language commands. However, everybody was blown away by a new project, MidJourney, which doesn’t just generate something but creates digital art that actually makes sense.
And MidJourney isn’t the only project deserving attention. There are hundreds of startups that are using the capabilities of generative AI to automate creative work and promise to revolutionize the field.
VCs have also demonstrated a particular interest in generative artificial intelligence startups this year. Experts say that their interest is motivated by the latest improvements in this area and the real benefits that generative AI can bring across multiple industries.
Future of generative AI
AI that is able to create images, videos, and texts is today often used by designers, artists, and other creatives. However, generative AI is much more practical than you might think. For example, one of the most widely known projects in generative AI is Grammarly, which helps anyone who writes on a computer, be it for study, work, or personal use, to write in English more efficiently and with fewer mistakes.
Investors who are supporting generative AI today are excited to see how it can be used in biotech to help discover new drugs. In fact, according to Gartner, 50% of all drug discovery in 2025 will be done with the help of generative artificial intelligence. Pharmaceuticals and medtech are just some of the industries that will greatly benefit from generative AI. Marketing is another field that experts believe will be revolutionized by generative AI: by 2025, 30% of outbound marketing messages from large organizations will be created artificially.
Generative AI and no code
Generative AI is important not only by itself but also because it brings us one step closer to a world where we can communicate with computers in natural language rather than in a programming language. With the help of generative AI, models become multimodal, which means they are able to process several modalities at a time, such as text and images. This expands their areas of application and makes them more versatile.
The simplification and democratization of human-machine interaction also positively influences the quality of the models themselves, since more people, including experts, are involved in their training. That means that generative models are much more than just fun or crazy art that you can generate when you have nothing better to do. In fact, generative AI might be that next step in the evolution of AI that we have all been waiting for.
If you want to learn more about what technological trends to expect in 2023, read our recent post.
Bonus: what will artificial intelligence of the future look like?
I’ve asked several text-to-image models to generate art that would show what the AI of the future will look like. Caution: the results may surprise you!
MidJourney is an image generation tool released by a research lab of the same name.
All image generation in MidJourney currently happens through interactions with their Discord bot or web app (only for subscribers). The free plan gives you 25 credits that you can use to generate publicly visible images. After you purchase a subscription, you get more credits and other benefits.
There is no API to use MidJourney in your application at the moment.
Pictures generated by MidJourney demonstrate an upcoming unity of humans and machines as the future of artificial intelligence. And what do you think? Will the AI machines of the future look like cyborgs? 🤔
DALL-E’s take on the subject is artistic and definitely futuristic, but much less conventionally aesthetic than MidJourney’s. This is mostly due to the different datasets used in training the models.
ruDALL-E is a project created by Sber that works similarly to DALL-E but is entirely open-source.
ruDALL-E’s version is much less artistic and more psychedelic. You can clearly see that it looks like a merge of different photos scraped from the web rather than a standalone masterpiece.
I also decided to explore some of the lesser-known projects in the field of generative AI.
One of them is NightCafe. This web app can take a text prompt that you provide and create an AI dream inspired by the keywords that you used.
Hotpot is another project that I decided to explore. This app mostly helps people edit photos, for example, by using AI to automatically color old photos or remove objects and backgrounds. However, their AI has also managed to generate an image that shows a somewhat scary and suspenseful future of artificial intelligence.
Which art did you like the most and why? What do you think about the future of generative AI? Let us know on Twitter!