stable-diffusion-videos: an open-source solution for generating videos from images

微wx笑 2023-06-15 [Artificial Intelligence]


Installation:
Usage:

Making videos
Making music videos
Using the user interface

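stable_diffusion_videos is an open-source project for generating videos with the Stable Diffusion model.

Its main functions are:

1. Generate a sequence of images for the given prompts with the Stable Diffusion model

2. Apply post-processing to the images in the sequence, such as frame interpolation and smoothing

3. Render the processed image sequence into a video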

GitHub:

https://github.com/ivu4e/stable-diffusion-videos

The project uses the Stable Diffusion v1.4 model, which can generate images at 512x512 resolution. It then applies various techniques to turn the image sequence into a smooth video.
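The post does not say which techniques produce the smooth transition between images. One common way to move gradually between two latent vectors is spherical linear interpolation (slerp). The sketch below illustrates that idea only; it is an assumption, not the project's confirmed implementation, and the function name `slerp`, its parameters, and the use of plain Python lists are hypothetical.

```python
import math


def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherically interpolate between two non-zero vectors of equal length.

    t=0 returns v0, t=1 returns v1. Hypothetical helper for illustration.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # Cosine of the angle between the two vectors.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    if abs(dot) > dot_threshold:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


# Three evenly spaced steps between two toy "latents".
frames = [slerp(t, [1.0, 0.0], [0.0, 1.0]) for t in (0.0, 0.5, 1.0)]
```

Each intermediate vector would then be decoded into one frame, so more steps give a slower, smoother transition.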

Installation:

Python 3.8 is required.

Download the Python 3.8 installer from the download page on the official Python website: https://www.python.org/downloads/release/python-380/
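Before installing, it can help to confirm which interpreter is actually on the path. The following is a minimal sketch; the helper name `is_supported` and the constant `REQUIRED` are made up for illustration, and the exact-match check on 3.8 simply mirrors the requirement stated above.

```python
import sys

REQUIRED = (3, 8)  # version named in this post


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the major.minor version matches the required one."""
    return tuple(version_info[:2]) == required


if not is_supported():
    print(
        "Python %d.%d is required, found %d.%d"
        % (REQUIRED + tuple(sys.version_info[:2]))
    )
```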

pip install stable_diffusion_videos

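From within mainland China the command above can be slow, so installing through a mirror is recommended:

pip install -i https://mirrors.aliyun.com/pypi/simple/ stable_diffusion_videos

Note that one of the dependencies, basicsr, may repeatedly fail to install and then has to be installed separately: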

git clone  
cd BasicSR
pip install -r requirements.txt  -i https://mirrors.aliyun.com/pypi/simple/
python setup.py develop

Reference: https://github.com/XPixelGroup/BasicSR

https://blog.csdn.net/hadoopdevelop/article/details/127815761

While installing stable_diffusion_videos you may run into the following error:

 Requirements should be satisfied by a PEP 517 installer.
 If you are using pip, you can try `pip install --use-pep517`
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
 
× Encountered error while generating package metadata.
╰─> See above for output.
 
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
 
In other words, the error is raised by a subprocess while the package metadata is being generated. It is a problem with the package being installed, not with pip itself.

In that case, append the --use-pep517 flag to the install command:

pip install -i https://mirrors.aliyun.com/pypi/simple/ stable_diffusion_videos --use-pep517

Usage:

See the example scripts in the examples folder 👀

Making videos

Note: on the Apple M1 architecture, use torch.float32 instead, because torch.float16 is not available on MPS.
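The note above can be turned into a small selection rule. This is a sketch only: `choose_device_and_dtype` is a hypothetical helper, and the CPU fallback to float32 is an assumption that the post does not state.

```python
def choose_device_and_dtype(cuda_available, mps_available):
    """Pick a device name and a dtype name following the note above.

    Returns plain strings so the rule can be checked without PyTorch installed.
    """
    if cuda_available:
        return "cuda", "float16"
    if mps_available:
        # Apple Silicon: float16 is reported as unavailable on MPS.
        return "mps", "float32"
    # Assumption: fall back to full precision on CPU.
    return "cpu", "float32"


# Possible use with PyTorch (left as comments so the sketch runs without it):
# import torch
# device, dtype_name = choose_device_and_dtype(
#     torch.cuda.is_available(), torch.backends.mps.is_available()
# )
# torch_dtype = getattr(torch, dtype_name)
```

The resulting `torch_dtype` and `device` would replace the hard-coded `torch.float16` and `"cuda"` in the examples of this post.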

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch
 
pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")
 
video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    name='animals_test',        # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
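The comments on `height` and `width` give a sizing rule: multiples of 64 above 512, multiples of 8 below 512. A quick check of a size against that rule could look like the sketch below; `is_valid_dimension` is a hypothetical helper, not part of the library.

```python
def is_valid_dimension(size):
    """Check a height or width against the rule in the example's comments."""
    if size <= 0:
        return False
    if size > 512:
        return size % 64 == 0
    # 512 itself is a multiple of both 8 and 64, so it passes here.
    return size % 8 == 0


for candidate in (384, 512, 520, 576):
    print(candidate, is_valid_dimension(candidate))
```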

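Making music videos

New! Music can be added to the video by providing the path to an audio file. The audio informs the rate of interpolation, so the video moves to the beat 🎶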

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch
 
pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")
 
# Seconds in the song.
audio_offsets = [146, 148]  # [Start, end]
fps = 30  # Use lower values for testing (5 or 10), higher values for better quality (30 or 60)
 
# Convert seconds to frames
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]
 
video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    guidance_scale=7.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
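The seconds-to-frames conversion in the example is easy to verify in isolation. The sketch below wraps the same list comprehension in a function; the name `steps_per_segment` and the three-offset input are made up for illustration.

```python
def steps_per_segment(audio_offsets, fps):
    """Frames per transition: (end - start) seconds times frames per second."""
    return [
        (end - start) * fps
        for start, end in zip(audio_offsets, audio_offsets[1:])
    ]


# The example above: one 2-second segment at 30 fps.
print(steps_per_segment([146, 148], 30))

# A hypothetical three-offset case gives two transitions.
print(steps_per_segment([146, 148, 152], 30))
```

Each pair of consecutive offsets yields one entry, so the list of offsets is one longer than the list of step counts.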

Using the user interface

from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
import torch
 
pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")
 
interface = Interface(pipeline)
interface.launch()