5 min read
我更新了这个工具帮你出好图!不来看看吗?

前言
最近做了(更新)了两个小工具,都是开源的,分享给大家。 一个是 ComfyUI 的 LLM 插件的更新,结合 oneapi,几乎可以调用任意大模型,帮你补全提示词,反推提示词等等,简直是给 ComfyUI 工作流插上了量子发动机!
另一个是可以将日常的闪念,同步到 Obsidian 笔记的插件,这样 Obsidian 作为个人知识管理中心的地位又巩固了很多~😂
今天先介绍 ComfyUI-LLM 插件的更新以及介绍如何应用到 ComfyUI工作流中,比较基础,但是实用!
ComfyUI 大模型插件
项目地址
https://github.com/leoleelxh/ComfyUI-LLMs
项目介绍
其实这个项目我已经发布比较久了,最近做了一次大更新。
- 重新组织了配置方式
- 融合视觉模型方案。可以在一个节点调用多个视觉模型
- 支持最新的 openai 视觉模型
- 视觉模型列表:
- Openai
- Google gemini
- Ali
- Glm 其实其他符合 openai 规范的模型都可以接入,不过需要跟 oneapi 结合才有这种效果。
视觉识别
支持各种视觉模型。
通过这个图片反推,输出图片。
注意:这里内置的图像识别提示词,只强调光影(也就是影调),大家可以自行修改,如可以增加人物的动作、表情等。
Please provide a detailed description of this image, including:
- The main subject (s) and their appearance
- The setting and environment
- Colors, lighting, and visual elements
- Any notable details or unique features
- The overall mood and atmosphere
Describe as if you are explaining the image to someone who cannot see it.
文本聊天
可以做的事情也很多,主要还是用于提示语补全语整理。 支持的模型,按需要进行配置,我是通过 oneapi 统一接入的,很方便。
提示语补全
输入:
中国美女,生活照,半身照
输出:
圣诞节快到了,穿上圣诞节衣服吧~🥱
中国美女,半身照,圣诞节性感服装
提示词处理
通常我们出了补全提示词,还需要对提示语进行整理。这时候我们也可以用到这个节点,只要在提示语基础上再串联一个节点即可。
前面的文章中,我就用到了这个处理过程。可以回顾看看:
再例如,还是上面的案例,假设我要将反推出来的影调类似的图,转为组图,那么可以再后面串联一个 chat 节点,用于处理提示词,这里简单写一写,中文也是没问题的:
将画面的主体,修改为多角度拼图描述语,左中右几张图,分别对应人物/主体的多个动作表情。注意需要保持主体的一致性。使用英文输出。
Gpt 处理得还是很好的:
The visual composition features a young woman consistently as the main subject across three images, showcasing her from multiple angles and with varying expressions and gestures.
**Left Image:**
The young woman is captured in a relaxed, slightly angled pose, her body turned to the left while her face remains facing forward. She exhibits a serene smile, her eyes softly glancing to the side. Her medium-length black hair with soft waves frames her face elegantly. She’s seen gracefully touching the lace collar of her soft pink blouse with her right hand, highlighting the intricate lace detailing. The lighting gently illuminates her features, emphasizing the sophisticated and delicate design of her attire.
**Center Image:**
In a straightforward, frontal view, the young woman maintains direct eye contact with the camera, her expression poised and confident. Her delicate earrings are more prominently visible, adding to her subtle elegance. The soft pink blouse with intricate lace around the neckline is clearly showcased. Her gentle, natural makeup enhances her fair complexion, with a touch of blush and pink lipstick complementing her overall look. The neutral beige background creates a serene backdrop, keeping the focus firmly on her composed demeanor.
**Right Image:**
The young woman is angled to the right, with her face turned slightly toward the camera. She appears thoughtful and reflective, her eyes cast downward, and a serene smile barely touching her lips. Her left hand is delicately placed on her chin, showcasing the lace detail on her sleeve. The soft waves of her black hair cascade gently over her shoulders. The neutral beige background continues to complement her soft, elegant attire and the overall warm, inviting atmosphere of the scene.
In all three images, the subject remains consistent, exuding grace and poise through different expressions and subtle gestures. Each angle and pose highlights different aspects of her elegant and serene appearance.
如果人肉敲,估计写断手了~😂
再试试圣诞装的女孩和其他。
是不是鲜活很多了~哈哈😄
其实以上这些应用,也不新鲜了,各位有经验的大佬全当看看工具吧~。还在入门的小伙伴,真的可以把 llm 工具融入到工作流中,这样可以方便很多~👍。
关于模型选择和价格
最后,模型这么多,怎么选择呢?还有大家还是会关注价格,毕竟线上的 API 是需要付费的。 这里我建议大家用阿里的和 glm 、gemini 的,尤其是 gemini,有很多免费额度,我用了这么久,就没付过钱😂。死贵的是 openai,但是效果也最好。当然,Claude 就更厉害了,更贵,效果更好!
本地运行当然也可以用 ollama 方案或者 caption 2 之类的,但是机器性能堪忧啊,需要不断的清理显存才行,还不如线上的方便。
更多 AI 辅助设计和设计灵感趋势,请关注公众号(设计小站):sjxz 00。