You might have heard of checkpoints in the context of machine learning, especially in generative AI image creation. What is a Stable Diffusion checkpoint? Is it a model or is it something different?
What are checkpoints
Checkpoints and models are fundamental concepts in machine learning that are related but distinct. It can get a bit confusing because the terms are commonly used interchangeably.
A model is a complex algorithm trained to make predictions based on input data. The process of model training is where the model learns patterns and information from a given training dataset.
A checkpoint is a snapshot taken during training that captures the state of a model at a specific stage in the training process. In other words, a checkpoint is a type of AI model. There are other types of Stable Diffusion models, such as LoRAs, LoCONs, LoHAs, and LECOs, but we will only be looking at checkpoints today.
Think of checkpoints as save points in a video game, allowing you to capture the state of your model at specific intervals during training. When you use a checkpoint, you are able to generate images using the concepts and knowledge it has learnt up to the checkpoint.
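To make the save-point idea concrete, here is a minimal sketch of a training loop that takes a snapshot at regular intervals. Everything in it is a toy assumption for illustration: the "model" is a single number, and `train_with_checkpoints` and `save_every` are made-up names, not part of Stable Diffusion or any real library.

```python
import copy


def train_with_checkpoints(steps, save_every):
    """Toy training loop that snapshots the model state at fixed intervals."""
    # Hypothetical "model": a single weight that moves towards a target value.
    state = {"step": 0, "weight": 0.0}
    target = 1.0
    checkpoints = {}

    for step in range(1, steps + 1):
        # One "training" update: move the weight 10% closer to the target.
        state["weight"] += 0.1 * (target - state["weight"])
        state["step"] = step

        if step % save_every == 0:
            # Deep copy so that later updates do not alter the saved snapshot.
            checkpoints[step] = copy.deepcopy(state)

    return state, checkpoints


final_state, checkpoints = train_with_checkpoints(steps=30, save_every=10)
for step, snapshot in sorted(checkpoints.items()):
    print(step, round(snapshot["weight"], 3))
```

Each saved snapshot only knows what the model had learnt up to that step, which is why an earlier checkpoint and a later one from the same training run can produce different results.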
Types of checkpoints
If you know me well enough, you’re most likely aware of how fussy I am about organising things. I group my Stable Diffusion checkpoints based on the output they are able to produce. There are several ways to group checkpoints, including trained knowledge, image output, and realism.
I don’t group my checkpoints based on trained knowledge, but it is useful to know how they are trained to understand what they are capable of. If you are wary of using checkpoints trained on copyrighted material, knowing how they are created is key.
You can also group checkpoints by the level of realism they can achieve. Realism here generally refers to the proportions. I find this a good starting point for identifying the best checkpoint to use. Of course, some checkpoints are capable of multiple levels of realism.
However, I don’t group my checkpoints by realism. I prefer to sort my checkpoints based on their image output capability. There are checkpoints that can achieve different types of output, but I find that the better checkpoints are generally specialised for a particular type of usage instead of being able to produce different looks.
Note that, other than the categorisation by type, these groupings can be subjective; they are just a general way to make the checkpoints easier to organise. You’ll find that many Stable Diffusion checkpoints fall under a few categories in actual usage.
All these different categorisations can be a little confusing, which is why I created my Stable Diffusion checkpoint databases to help me track what the checkpoints are capable of.
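As a rough idea of what tracking checkpoints this way could look like, here is a minimal sketch of a catalogue that records the three groupings (trained knowledge, image output, and realism) for each checkpoint. It is not my actual database: the `CheckpointEntry` fields, the `find_by_output` helper, and the checkpoint names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointEntry:
    name: str        # hypothetical checkpoint name
    training: str    # "trained" or "merged"
    outputs: tuple   # image output types, e.g. ("anime",)
    realism: tuple   # realism levels, e.g. ("2D", "2.5D")


def find_by_output(entries, output):
    """Return the names of entries that list the given output type."""
    return [entry.name for entry in entries if output in entry.outputs]


# Made-up entries; a checkpoint may sit in more than one category.
catalogue = [
    CheckpointEntry("ExamplePhotoMix", "merged", ("photorealistic",), ("realistic",)),
    CheckpointEntry("ExampleAnime", "trained", ("anime",), ("2D", "2.5D")),
    CheckpointEntry(
        "ExampleAllRounder",
        "merged",
        ("anime", "digital painting"),
        ("2.5D", "semi-realistic"),
    ),
]

print(find_by_output(catalogue, "anime"))
```

Storing the output types and realism levels as tuples lets one checkpoint appear under several categories, which matches how checkpoints behave in actual usage.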
Checkpoint types – trained knowledge
One way of grouping Stable Diffusion checkpoints is based on how they are trained.
Models like SD 1.4 and SD 1.5 are base models trained on a large dataset. They were released by CompVis and Runway respectively, with Stability AI supporting the training. Model creators can create similar base models by training a new model with their own dataset. These are referred to as trained checkpoints.
You can also fine-tune a model by using a base model as a starting point and training it on your own dataset. This base model can be the SD 1.4 or SD 1.5 checkpoint, or another checkpoint. Fine-tuning is done to adapt an existing model for a specific task or dataset, such as a particular art style, person or character.
Both base models and fine-tuned models are referred to as trained checkpoints.
Checkpoints can also be combined to blend the trained knowledge together, either to improve the quality or to mix different art styles together. These are called merged checkpoints, often denoted with a “Mix” in the checkpoint’s name.
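One common way to merge checkpoints is a weighted average of their weights, and the sketch below shows that idea on toy data. Treat it as an assumption rather than how any particular "Mix" checkpoint was made: real checkpoints hold large tensors rather than short lists, and `merge_checkpoints`, `alpha`, and the layer names are made up for illustration.

```python
def merge_checkpoints(weights_a, weights_b, alpha=0.5):
    """Blend two checkpoints as (1 - alpha) * A + alpha * B, weight by weight."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("Checkpoints must share the same layer names")

    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"Shape mismatch in layer {name!r}")
        merged[name] = [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# Hypothetical two-layer "checkpoints" with made-up values.
anime_style = {"layer1": [0.0, 2.0], "layer2": [4.0]}
painterly_style = {"layer1": [1.0, 0.0], "layer2": [8.0]}

# alpha=0.25 keeps 75% of the first checkpoint and takes 25% of the second.
mix = merge_checkpoints(anime_style, painterly_style, alpha=0.25)
print(mix)
```

The `alpha` value controls how much each checkpoint contributes, which is how a merge can lean towards one art style while borrowing from another.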
Checkpoint types – image output
The main way I group my Stable Diffusion checkpoints is by the type of output they are able to generate.
So, let’s look at the types of images you can generate. These are some of the broad looks people create:
- Photorealistic – hyperrealistic images that resemble photographs
- Digital painting – concept or fantasy art images that mimic realism with artistic expression
- Render – 3D-rendered image style
- Anime – anime style with exaggerated proportions
- Illustration – distinct brush strokes, including line art and sketches
Photorealism is an art style that tries to mimic realism in paintings. Photorealistic checkpoints are capable of generating hyperrealistic images that look like photographs. Do not confuse the photorealistic style with the amount of realism it generates.
Digital painting checkpoints
Digital painting checkpoints generate images with a realistic look, but the texture is less realistic than that of photorealistic checkpoints. They balance detail with artistic interpretation, allowing for greater stylistic flexibility, such as visible brush strokes or a more painterly quality, depending on the training data and model design.
The images they create are reminiscent of digital and traditional artwork. I use these checkpoints if I want a concept art or digital art look.
Render checkpoints are often trained with 3D-rendered images and mimic rendering styles, such as Disney’s Pixar style. These checkpoints produce images with render-like qualities. The images created have realistic lighting, but often with the textures and details of 3D models.
A popular look is the 3D Niji style from Midjourney. You can find Stable Diffusion checkpoints trained on 3D Niji images.
Anime checkpoints generate images with the distinctive anime style, including exaggerated proportions, expressions, and hair colours and styles. I generally group checkpoints for manga and anime fan art here, unless the lines are so loose that they fall under illustration checkpoints instead.
The use of generative AI to create anime-style images is immensely popular and a major driving force in the development of AI image generation. Thus, you’ll find many anime checkpoints covering different anime styles.
I prefer to group comic checkpoints here as well, unless their level of realism is high enough to warrant grouping them under digital painting checkpoints.
Illustration checkpoints produce images with distinctive brush strokes. These could range from wet to dry media, including oil painting, watercolour, line art, and sketches. The checkpoints are trained to mimic the brush strokes of the particular medium.
General purpose checkpoints
Some checkpoints are trained to produce different image styles. These are referred to as general purpose checkpoints. They are the Swiss Army knives of checkpoints, letting you create a variety of styles without having to swap checkpoints.
Checkpoint types – realism
When I look at realism, I consider both the human proportions and how three-dimensional the images look. This is more subjective than the image output because you can often alter the level of realism through prompting.
Nevertheless, I still note the level of realism to help me track what the checkpoints can achieve, using these categories:
- Realistic – realistic proportions
- Semi-realistic – 3D look with almost realistic proportions
- 2.8D – between 2.5D and 3D look
- 2.5D – non-flat shading
- 2D – flat-shading
Realistic checkpoints generate people with life-like proportions and details. These include both photorealistic and digital painting checkpoints, which aim to replicate the look of real-world or high-fidelity art.
Semi-realistic checkpoints create characters with a three-dimensional look, but the proportions are not quite life-like. These are often render checkpoints, or anime or comic-style checkpoints with somewhat fantastical proportions.
2.8D checkpoints straddle the 2.5D and 3D looks, with more realism than 2.5D but not quite a 3D level of realism. 2.8D is not an actual technical style, and I did not use this category initially. However, the number of checkpoints targeting this specific look has led me to add it as a distinct category of its own.
These checkpoints are often anime or digital painting checkpoints with a very stylised look.
2.5D checkpoints have more realistic shading to give the subjects more depth and definition compared to the 2D look. Like 2.8D checkpoints, these are commonly anime or digital painting checkpoints with a stylised look.
2D checkpoints have the flat-shaded look of the traditional anime style. Most anime checkpoints can produce the 2D look. However, this art style extends beyond anime to include any sort of two-dimensional artistic style.
Other types of categories
I focus mainly on portraits, so I only look at these few features when grouping checkpoints. There are other checkpoints that specialise in generating environments, icons, logos, or backgrounds.
Since I rarely generate these kinds of images, I won’t talk much about them for now.
Choosing Stable Diffusion checkpoints
How do you know which one is the best one? It depends on the type of images you are looking to generate and your preferred workflow.
Stay tuned for guides on choosing checkpoints and my review of my favourite checkpoints.