Forking the Hugging Face diffusers GitHub repository to debug FLUX.1 [schnell] on Google Colab Pro

1. Overview

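This post is a follow-up to this blog post.

I tried the method from that post as a way to run models such as FLUX.1 [schnell] while modifying the diffusers scripts in a familiar editor rather than the Google Colab editor. With that method, you edit the scripts in any editor on a local PC and run them on Google Colab. For example, you can use Emacs on Windows to edit scripts in a Google Drive mounted as the G: drive.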

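Today, however, I tried to edit scripts that had been git cloned into Google Drive from a different PC than the one used in that post, and git did not appear to work correctly.

So I created a fork of diffusers on GitHub and decided to commit the script changes I make while investigating there. This post is a memo of that setup.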

2. Sample Google Colab page for trying debug execution

I prepared a sample Google Colab page for trying debug execution at this link.

I forked the Hugging Face diffusers GitHub repository and added a breakpoint for debugging at the top of FluxTransformer2DModel.forward. The breakpoint is set by adding the line below. A branch named investigate_flux.1_schnell contains only this change.

import ipdb; ipdb.set_trace()
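
Because forward is called once per denoising step, an unconditional set_trace() stops on every step. If stopping only once is enough, the call can be guarded. The following is a minimal sketch, not code from the fork: CallGate and forward_stub are hypothetical names introduced only for illustration, and the stub returns a string where the real code would call ipdb.set_trace().

```python
class CallGate:
    """Hypothetical helper: returns True only on the `on_call`-th invocation (1-based)."""

    def __init__(self, on_call=1):
        self.on_call = on_call
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.calls == self.on_call


# Module-level gate, so the call count survives across forward() calls.
gate = CallGate(on_call=1)


def forward_stub(step):
    """Stand-in for FluxTransformer2DModel.forward, used only to show the control flow."""
    if gate():
        # In the fork this would be: import ipdb; ipdb.set_trace()
        return "break"
    return "run"


# Four calls, mimicking four denoising steps: only the first one would stop.
print([forward_stub(step) for step in range(4)])
# → ['break', 'run', 'run', 'run']
```

Setting on_call to another value would stop on a later step instead of the first one.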

If you run the code cells on the linked page in order, running the last code cell starts a debug session like the log below.

> /content/diffusers/src/diffusers/models/transformers/transformer_flux.py(680)forward()
    679 
--> 680         if joint_attention_kwargs is not None:
    681             joint_attention_kwargs = joint_attention_kwargs.copy()

ipdb> c
> /content/diffusers/src/diffusers/models/transformers/transformer_flux.py(680)forward()
    679 
--> 680         if joint_attention_kwargs is not None:
    681             joint_attention_kwargs = joint_attention_kwargs.copy()

ipdb> c
> /content/diffusers/src/diffusers/models/transformers/transformer_flux.py(680)forward()
    679 
--> 680         if joint_attention_kwargs is not None:
    681             joint_attention_kwargs = joint_attention_kwargs.copy()

ipdb> c
> /content/diffusers/src/diffusers/models/transformers/transformer_flux.py(680)forward()
    679 
--> 680         if joint_attention_kwargs is not None:
    681             joint_attention_kwargs = joint_attention_kwargs.copy()

ipdb> u
> /usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py(1786)_call_impl()
   1785                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1786             return forward_call(*args, **kwargs)
   1787 

ipdb> u
> /usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py(1775)_wrapped_call_impl()
   1774         else:
-> 1775             return self._call_impl(*args, **kwargs)
   1776 

ipdb> u
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(944)__call__()
    943                 with self.transformer.cache_context("cond"):
--> 944                     noise_pred = self.transformer(
    945                         hidden_states=latents,

ipdb> until 1004
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(1004)__call__()
   1003         else:
-> 1004             latents = self._unpack_latents(latents, height, width, self.vae_scale_factor)
   1005             latents = (latents / self.vae.config.scaling_factor) + self.vae.config.shift_factor

ipdb> p latents.shape
torch.Size([1, 4096, 64])
ipdb> n
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(1005)__call__()
   1004             latents = self._unpack_latents(latents, height, width, self.vae_scale_factor)
-> 1005             latents = (latents / self.vae.config.scaling_factor) + self.vae.config.shift_factor
   1006             image = self.vae.decode(latents, return_dict=False)[0]

ipdb> pp latents.shape
torch.Size([1, 16, 128, 128])
ipdb> n
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(1006)__call__()
   1005             latents = (latents / self.vae.config.scaling_factor) + self.vae.config.shift_factor
-> 1006             image = self.vae.decode(latents, return_dict=False)[0]
   1007             image = self.image_processor.postprocess(image, output_type=output_type)

ipdb> p latents.shape
torch.Size([1, 16, 128, 128])
ipdb> n
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(1007)__call__()
   1006             image = self.vae.decode(latents, return_dict=False)[0]
-> 1007             image = self.image_processor.postprocess(image, output_type=output_type)
   1008 

ipdb> whatis image
<class 'torch.Tensor'>
ipdb> p image.shape
torch.Size([1, 3, 1024, 1024])
ipdb> n
> /content/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py(1010)__call__()
   1009         # Offload all models
-> 1010         self.maybe_free_model_hooks()
   1011 

ipdb> pp image
[<PIL.Image.Image image mode=RGB size=1024x1024 at 0x7BC4AE734C80>]
ipdb> c
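
The shapes printed in the log can be checked with a little arithmetic. The sketch below is my own reading of those shapes, not the library's code: it assumes a VAE scale factor of 8, 16 latent channels, and 2x2 patch packing, and the function name packed_latent_shape is hypothetical.

```python
def packed_latent_shape(height, width, vae_scale_factor=8, latent_channels=16, batch_size=1):
    """Return (packed, unpacked) latent shapes for an image of the given size.

    Assumptions (inferred from the log, not taken from the diffusers source):
    the VAE downsamples by `vae_scale_factor`, and each 2x2 latent patch is
    flattened into one token with 4x the channels.
    """
    latent_h = 2 * (height // (vae_scale_factor * 2))  # 1024 -> 128
    latent_w = 2 * (width // (vae_scale_factor * 2))   # 1024 -> 128
    num_patches = (latent_h // 2) * (latent_w // 2)    # 64 * 64 = 4096
    packed_channels = latent_channels * 2 * 2          # 16 * 4 = 64
    packed = (batch_size, num_patches, packed_channels)
    unpacked = (batch_size, latent_channels, latent_h, latent_w)
    return packed, unpacked


packed, unpacked = packed_latent_shape(1024, 1024)
print(packed)    # → (1, 4096, 64)
print(unpacked)  # → (1, 16, 128, 128)
```

Under these assumptions, the values match the log: [1, 4096, 64] before _unpack_latents, [1, 16, 128, 128] after it, and 128 x 8 = 1024 pixels per side after decoding.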

3. Notes on the fork setup

These are notes for my own reference.

3.1. Forking on GitHub

I clicked the Fork button at the top right of the Hugging Face diffusers GitHub repository to create the fork.

3.2. Cloning the fork and running commands

I ran the following commands in order in Git Bash on Windows 11.

3.2.1. git clone

$ git clone git@github.com:fukagai-takuya/diffusers.git
$ cd diffusers/

3.2.2. Preparing to bring upstream diffusers changes into the fork

$ git remote add upstream https://github.com/huggingface/diffusers.git
$ git remote -v
origin  git@github.com:fukagai-takuya/diffusers.git (fetch)
origin  git@github.com:fukagai-takuya/diffusers.git (push)
upstream        https://github.com/huggingface/diffusers.git (fetch)
upstream        https://github.com/huggingface/diffusers.git (push)

3.2.3. Bringing updates from the diffusers main branch into the fork's main branch

Before trying diffusers scripts, run the commands below to bring the latest upstream updates into the fork's main branch.

$ git fetch upstream
$ git checkout main
$ git merge upstream/main
$ git push origin main

3.2.4. Creating a test branch that only sets the breakpoint

The command below creates the new branch investigate_flux.1_schnell.

$ git checkout -b investigate_flux.1_schnell
Switched to a new branch 'investigate_flux.1_schnell'

I edited src/diffusers/models/transformers/transformer_flux.py in an editor and added the line that sets the breakpoint, as shown below.

$ git diff
diff --git a/src/diffusers/models/transformers/transformer_flux.py b/src/diffusers/models/transformers/transformer_flux.py
index 16c526f43..95a468200 100644
--- a/src/diffusers/models/transformers/transformer_flux.py
+++ b/src/diffusers/models/transformers/transformer_flux.py
@@ -675,6 +675,8 @@ class FluxTransformer2DModel(
             If `return_dict` is True, an [`~models.transformer_2d.Transformer2DModelOutput`] is returned, otherwise a
             `tuple` where the first element is the sample tensor.
         """
+        import ipdb; ipdb.set_trace()
+
         if joint_attention_kwargs is not None:
             joint_attention_kwargs = joint_attention_kwargs.copy()
             lora_scale = joint_attention_kwargs.pop("scale", 1.0)

I pushed the modified branch investigate_flux.1_schnell to the fork with the commands below.

$ git add .
$ git commit -m "Added ipdb.set_trace() at FluxTransformer2DModel.forward"
$ git push origin investigate_flux.1_schnell
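
To use the pushed branch on Colab, the notebook has to clone it and install diffusers from that checkout. The sketch below shows one way this could look and may differ from the cells in the actual sample page. The HTTPS form of the fork URL and the /content/diffusers destination (taken from the paths in the debugger log) are assumptions, as are the helper names setup_commands and run_setup.

```python
import subprocess

# Assumptions: HTTPS form of the fork URL, and the checkout path seen in the log.
FORK_URL = "https://github.com/fukagai-takuya/diffusers.git"
BRANCH = "investigate_flux.1_schnell"
CHECKOUT_DIR = "/content/diffusers"


def setup_commands(url=FORK_URL, branch=BRANCH, dest=CHECKOUT_DIR):
    """Build the commands that fetch the branch and install it in editable mode."""
    return [
        # Clone only the branch that contains the breakpoint.
        ["git", "clone", "--branch", branch, "--single-branch", url, dest],
        # Editable install, so tracebacks and ipdb point at files under `dest`.
        ["pip", "install", "-e", dest],
        # ipdb is needed for the set_trace() line added in the branch.
        ["pip", "install", "ipdb"],
    ]


def run_setup(dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in setup_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands, nothing is cloned or installed.
run_setup(dry_run=True)
```

Calling run_setup(dry_run=False) in a Colab cell would execute the three commands in order and stop at the first one that fails.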
