Nvidia shrinks AI image generation method to the size of a WhatsApp message
Perfusion, Nvidia's solution for high storage demands of AI image generation
Nvidia researchers have developed a new AI image generation technique that enables highly customized text-to-image models with minimal storage requirements.
According to a paper published on arXiv, the proposed method, called "Perfusion," can add new visual concepts to existing models, using only 100KB of parameters per concept.
Source: Nvidia Research
As the paper's authors describe, Perfusion works by "making small updates to the internal representation of the text-to-image model."
More specifically, it makes carefully calculated changes to the part of the model that connects textual descriptions to the generated visual features. Applying small parametric edits to the cross-attention layer allows Perfusion to modify the way textual input is converted into images.
Perfusion therefore does not retrain the text-to-image model from scratch. Instead, it slightly tweaks the mathematical transformations that turn text into images, which lets it customize the model to generate new visual concepts without requiring much computing power or a full retraining run.
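To make the idea concrete, here is a minimal sketch in PyTorch of what a small parametric edit to a cross-attention projection could look like. It illustrates the general rank-one-editing idea rather than Nvidia's released implementation; the class name, the choice to edit a single linear projection, and the layer size are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RankOneEditedProjection(nn.Module):
    """Wraps a frozen projection W so its output becomes x @ (W + u v^T)^T + b.

    Only the two small vectors u and v are trained and stored per concept;
    the base model's weights stay frozen and untouched.
    """

    def __init__(self, base_linear: nn.Linear):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False                  # keep the original model frozen
        out_dim, in_dim = base_linear.weight.shape
        self.u = nn.Parameter(torch.zeros(out_dim))  # learned per concept
        self.v = nn.Parameter(torch.zeros(in_dim))   # learned per concept

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        edit = torch.outer(self.u, self.v)           # rank-1 correction to W
        return self.base(x) + F.linear(x, edit)


# Example: wrap one hypothetical cross-attention key projection of width 768.
proj = RankOneEditedProjection(nn.Linear(768, 768))
tokens = torch.randn(1, 77, 768)     # a batch of text embeddings (assumed shape)
out = proj(tokens)                   # same shape as the base layer's output
```

Because only `u` and `v` are trainable, personalizing a concept in this sketch amounts to optimizing a few thousand numbers while the rest of the network stays fixed.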
The Perfusion method requires only 100KB
Perfusion achieves these results with two to five orders of magnitude fewer parameters than competing techniques.
While other methods can require hundreds of megabytes to gigabytes of storage per concept, Perfusion requires only 100KB, comparable to a small image, text, or WhatsApp message.
This drastic reduction could make it more feasible to deploy highly customized AI art models.
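As a rough sanity check on those figures, the back-of-envelope calculation below shows how per-concept rank-one edits land near 100KB while a full checkpoint occupies gigabytes. The layer width, number of edited layers, and base-model size are assumptions for illustration, not the paper's exact configuration.

```python
# Back-of-envelope storage comparison with assumed numbers.
BYTES_PER_FLOAT32 = 4

dim = 768                 # assumed width of each edited cross-attention projection
num_edited_layers = 16    # assumed number of layers receiving a rank-1 edit

# Full fine-tune: every parameter of a ~1B-parameter model must be stored again.
full_checkpoint_bytes = 1_000_000_000 * BYTES_PER_FLOAT32

# Rank-1 edit: only one u vector and one v vector per edited layer.
concept_bytes = 2 * dim * num_edited_layers * BYTES_PER_FLOAT32

print(f"Full checkpoint:   ~{full_checkpoint_bytes / 1e9:.1f} GB")
print(f"Per-concept edits: ~{concept_bytes / 1e3:.0f} KB")   # roughly 100KB
```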
According to co-author Gal Chechik,
"Infusion not only enables more accurate personalization at a fraction of the model size, but also enables the use of more complex cues and the incorporation of individually learned concepts at inference time."
The method can combine the individually learned concepts "teddy bear" and "teapot" to generate creative images such as "a teddy bear sailing in a teapot."
Source: Nvidia Research
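The snippet below sketches how two such separately learned concept files might be merged at inference time. The file layout, layer names, and the strategy of summing rank-one corrections are assumptions made for illustration; this is not Nvidia's published API.

```python
import torch

def combined_edit(concepts: list, layer_name: str) -> torch.Tensor:
    """Sum the rank-1 corrections u v^T that each concept contributes to one layer."""
    return sum(torch.outer(u, v) for u, v in (c[layer_name] for c in concepts))

def apply_concepts(base_weights: dict, concepts: list) -> dict:
    """Return edited copies of the frozen weights with every concept merged in."""
    return {name: w + combined_edit(concepts, name)
            for name, w in base_weights.items()}

if __name__ == "__main__":
    # Tiny synthetic demo with made-up dimensions in place of real concept files.
    dim = 8
    base = {"unet.attn2.to_k": torch.zeros(dim, dim)}          # hypothetical layer name
    teddy_bear = {"unet.attn2.to_k": (torch.randn(dim), torch.randn(dim))}
    teapot = {"unet.attn2.to_k": (torch.randn(dim), torch.randn(dim))}

    # Both concepts now live in one set of weights, ready for a prompt like
    # "a teddy bear sailing in a teapot".
    edited = apply_concepts(base, [teddy_bear, teapot])
    print(edited["unet.attn2.to_k"].shape)
```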
Possibility of efficient personalization
Perfusion's ability to personalize AI models using only about 100KB per concept opens up a wide range of potential applications.
This approach paves the way for individuals to easily customize text-to-image models with new objects, scenes, or styles without the cost of full retraining. Because each concept adds only about 100KB of parameters, models customized with Perfusion could even be deployed on consumer devices, enabling on-device image generation.
One of the most compelling aspects of this technology is the potential it offers for sharing and collaboration around AI models. Users could share their personalized concepts as small additional files instead of exchanging bulky model checkpoints.
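As a sketch of what that sharing could look like, the example below serializes one concept's learned vectors to a standalone file and reports its size. The tensor shapes, layer names, and file format are assumptions, not Nvidia's actual release.

```python
import os
import torch

# Per-layer (u, v) vectors learned for one concept; 16 layers of width 768
# are assumed here purely for illustration.
concept = {f"cross_attn_{i}.to_k": (torch.randn(768), torch.randn(768))
           for i in range(16)}

torch.save(concept, "my_concept.pt")
size_kb = os.path.getsize("my_concept.pt") / 1024
print(f"Shareable concept file: ~{size_kb:.0f} KB")  # far smaller than a full checkpoint
```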
In terms of distribution, models tailored to specific organizations can be more easily disseminated or deployed at the edge. As the practice of text-to-image generation continues to become more mainstream, the ability to achieve such dramatic size reductions without sacrificing functionality will be critical.
It's worth noting, however, that Perfusion primarily provides model personalization rather than full generative capabilities itself.
Limitations and release plans
While promising, the technique does have some limitations. The authors point out that key choices made during training can sometimes cause a concept to over-generalize, and more research is still needed to seamlessly combine multiple personalized concepts into a single image.
The authors note that Perfusion's code will be made available on their project page, signaling an intention to release the method publicly, possibly pending peer review and formal publication. For now, however, the work has appeared only on arXiv, a platform where researchers can post papers before formal peer review and publication in journals or conferences, so the exact details of public availability remain unclear.
While Perfusion's code has not yet been released, the authors' stated plans mean that such highly efficient, personalized AI systems could, in due course, find their way into the hands of developers, companies, and creators.
As AI art platforms such as Midjourney, DALL-E 2, and Stable Diffusion continue to grow, techniques that give users greater control could be critical for real-world deployment. With efficiency improvements like Perfusion, Nvidia appears determined to maintain its edge in a rapidly evolving field.