voipnuggets · akashjss · Feb 27, 2025 · Feb 27, 2025 · Feb 27, 2025 · Feb 28, 2025
diff --git a/README.md b/README.md
@@ -1,17 +1,21 @@
-# Flux Generator: macOS MLX-Powered Image & Music Generation with Open WebUI compatable API for Image generation
+# Flux Generator: macOS MLX-Powered Image, Music & Video Generation with Open WebUI compatible API for Image generation
 
 ## Features
 
 - Text-to-image generation
-- Text-to-music generation (NEW!)
+- Text-to-music generation
+- Text-to-video generation (NEW!)
 - Multiple model options:
   - Image Generation:
     - black-forest-labs Flux schnell/dev
     - stabilityai sdxl-turbo/stable-diffusion-2-1
   - Music Generation:
     - facebook/musicgen-medium
+  - Video Generation:
+    - Wan-AI/Wan2.1-T2V-1.3B
 - Customizable image size and generation parameters
 - Advanced music generation controls
+- Video generation with adjustable frames, resolution, and parameters
 - Memory usage reporting
 - API compatibility for Image generation for third-party UIs like Open WebUI
 - Unified server for both UI and API
@@ -22,26 +26,37 @@
 This repository utilizes the MLX framework, designed specifically for Apple Silicon, to provide optimized performance for:
 - Black Forest Flux and Stable Diffusion image generation
 - Facebook's MusicGen audio generation
+- Wan-AI's text-to-video generation
 
 MLX leverages the unified memory architecture of Apple's M-series chips, enabling faster and more efficient computations.
 
 ### Why MLX?
 
 * **Performance:** Experience significant speed improvements compared to other frameworks on Apple Silicon.
-* **Local Execution:** Run Stable Diffusion models locally on your Mac, ensuring data privacy and enabling offline use.
+* **Local Execution:** Run models locally on your Mac, ensuring data privacy and enabling offline use.
 * **Fine-Tuning:** MLX provides a great environment for fine-tuning models on apple silicon.
+* **Memory Efficiency:** Optimized memory management for large models and video generation.
 
 For more examples of what MLX can do, check out the official mlx-examples repository: [https://github.com/ml-explore/mlx-examples](https://github.com/ml-explore/mlx-examples)
 
-This repository is designed to give apple silicon users a fast and easy way to generate images locally.
+This repository is designed to give Apple Silicon users a fast and easy way to generate content locally.
 
 ## UI Screenshots:
 ![Flux image generation Tab in UI](flux_app_ui_imagegen.jpg)
 ![Flux music generation Tab in UI](flux_app_ui_musicgen.jpg)
+![Flux video generation Tab in UI](flux_app_ui_videogen.jpg)
 
+## System Requirements
 
-## Example Generation
+- macOS with Apple Silicon (M1/M2/M3)
+- Python 3.10+ (tested with Python 3.11)
+- MLX framework
+- 32GB RAM recommended (especially for video generation)
+- Additional libraries for audio and video processing
 
+## Example Generations
+
+### Image Generation
 Here's an example image generated using the Flux model:
 
 ![Moonset over ocean](generated_moonset.png)
@@ -53,6 +68,19 @@ Parameters:
 - Steps: 2
 - CFG Scale: 4.0
 
+### Video Generation
+Recommended parameters for video generation:
+- Number of frames: 16-32
+- Resolution: 256x256 to 512x512
+- Steps: 50 (recommended)
+- Guidance Scale: 7.5 (adjustable)
+- FPS: 8 (default)
+
+Example prompts for video generation:
+- "A blooming flower timelapse, highly detailed"
+- "Waves crashing on a beach at sunset"
+- "Space journey through colorful nebulas"
+
 ## Requirements
 
 - macOS with Apple Silicon (M1/M2/M3)
@@ -179,9 +207,11 @@ Once the server is running (either via `run_flux.sh` or manually):
 2. Choose your desired generation mode:
    - 🖼️ Image Generation: Enter a prompt, select a model and click generate
    - 🎵 Music Generation: Enter a music description and adjust parameters
+   - 🎥 Video Generation: Enter a video description and adjust parameters
 3. On first use, models will be downloaded:
    - Image models: approximately 30 GB
    - MusicGen model: approximately 3.5 GB
+   - Video model: approximately 5.68 GB
 4. Download progress will be visible in the terminal
 5. Once downloaded, generation will begin
 
@@ -201,9 +231,19 @@ The music generation interface provides several parameters to control the output
 - **Top K**: Controls diversity of the output (50-500)
 - **Guidance Scale**: Controls how closely to follow the prompt (1.0-10.0)
 
+### Video Generation Parameters
+
+The video generation interface provides several parameters to control the output:
+
+- **Number of frames**: Controls the length of the generated video (16-32)
+- **Resolution**: Controls the resolution of the generated video (256x256 to 512x512)
+- **Steps**: Controls the number of steps for video generation (50 recommended)
+- **Guidance Scale**: Controls how closely to follow the prompt (7.5 recommended, adjustable)
+- **FPS**: Controls the frames per second of the generated video (default: 8)
+
 ## API Integration
 
-The application provides an API that can be used with third-party UIs like Open WebUI.
+The application provides an image generation API that can be used with third-party UIs like Open WebUI.
 Check this tutorial for Open WebUI integration instructions:
 [Tutorial](https://voipnuggets.com/2025/02/18/flux-generator-local-image-generation-on-apple-silicon-with-open-webui-integration-using-flux-llm/
 )
@@ -345,13 +385,17 @@ The Flux server requires model files to be downloaded before use. You can downlo
 
    # Download MusicGen model
    huggingface-cli download facebook/musicgen-medium
+
+   # Download Video Generation model
+   huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B
    ```
 
 3. Using the command-line interface:
-   Note: Each Flux model is approximately 24GB in size, the SD models are bigger. The download includes:
-   - Model weights (flux1-{model}.safetensors)
-   - Autoencoder (ae.safetensors)
-   - Text encoders and tokenizers
+   Note: Approximate total model file sizes:
+   - Each Flux model: ~32GB
+   - Each Stable Diffusion model: ~34GB
+   - MusicGen model: ~11GB
+   - Video model: ~22GB
 
    # Download command for all models
    ```bash
@@ -360,6 +404,7 @@ The Flux server requires model files to be downloaded before use. You can downlo
    huggingface-cli download stabilityai/stable-diffusion-2-1-base
    huggingface-cli download stabilityai/sdxl-turbo
    huggingface-cli download facebook/musicgen-medium
+   huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B
    ```
 
 Model Repos:
@@ -368,6 +413,7 @@ https://huggingface.co/black-forest-labs/FLUX.1-dev
 https://huggingface.co/stabilityai/stable-diffusion-2-1-base
 https://huggingface.co/stabilityai/sdxl-turbo
 https://huggingface.co/facebook/musicgen-medium
+https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B
 
 Model files are stored in the HuggingFace cache directory (`~/.cache/huggingface/hub/`).
 

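The video parameters this diff adds to the README (16-32 frames, 256x256 to 512x512, 50 steps, guidance scale 7.5, 8 FPS) can be sanity-checked with a small helper. The sketch below is illustrative only: `VideoParams` and its fields are hypothetical names, not part of the Flux Generator code, and the ranges are simply the ones quoted in the README. It also makes the resulting clip length explicit as frames divided by FPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoParams:
    """Hypothetical container for the README's video settings (not the app's API)."""

    num_frames: int = 16
    width: int = 256
    height: int = 256
    steps: int = 50
    guidance_scale: float = 7.5
    fps: int = 8

    def __post_init__(self) -> None:
        # Ranges below are the ones quoted in the README, nothing more.
        if not 16 <= self.num_frames <= 32:
            raise ValueError("num_frames should be between 16 and 32")
        for side in (self.width, self.height):
            if not 256 <= side <= 512:
                raise ValueError("width and height should be between 256 and 512")
        if self.steps <= 0 or self.fps <= 0:
            raise ValueError("steps and fps must be positive")

    @property
    def duration_seconds(self) -> float:
        """Clip length implied by the frame count and frame rate."""
        return self.num_frames / self.fps


if __name__ == "__main__":
    for frames in (16, 32):
        params = VideoParams(num_frames=frames)
        print(f"{frames} frames at {params.fps} FPS = {params.duration_seconds:.1f} s")
```

At 8 FPS, the documented frame range works out to clips of 2 to 4 seconds.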
diff --git a/build.py b/build.py
@@ -0,0 +1,116 @@
+"""
+PyInstaller build script for Flux Generator
+
+Prerequisites:
+> python3 -m venv venv
+> source venv/bin/activate (or source venv/bin/activate.fish for fish shell)
+> python3 -m pip install -r requirements.txt
+> python3 -m pip install pyinstaller
+> python3 build.py
+
+Platform specific libraries that MIGHT be needed:
+MacOS:
+- brew install portaudio
+"""
+
+import os
+import platform
+import sys
+import PyInstaller.__main__
+
+def build(signing_key=None):
+    app_name = 'Flux\\ Generator'
+    compile(signing_key)
+
+    macos = platform.system() == 'Darwin'
+    if macos and signing_key:
+        # Codesign
+        os.system(
+            f'codesign --deep --force --verbose --sign "{signing_key}" dist/{app_name}.app --options runtime')
+
+        zip_name = zip()
+
+        if signing_key:
+            keychain_profile = signing_key.split('(')[0].strip()
+            # Notarize
+            os.system(f'xcrun notarytool submit --wait --keychain-profile "{keychain_profile}" --verbose dist/{zip_name}')
+            input(f'Check whether notarization was successful using \n\t xcrun notarytool history --keychain-profile {keychain_profile}.\nYou can check debug logs using \n\t xcrun notarytool log --keychain-profile "{keychain_profile}" <run-id>')
+
+            # Staple
+            os.system(f'xcrun stapler staple dist/{app_name}.app')
+
+            # Zip the signed, stapled file
+            zip_name = zip()
+
+def compile(signing_key=None):
+    # Path to main application script
+    app_script = 'flux_app.py'
+
+    # Common PyInstaller options
+    pyinstaller_options = [
+        '--clean',
+        '--noconfirm',
+
+        # --- Basics --- #
+        '--name=Flux Generator',
+        '--icon=resources/icon.png',  # Generated by create_icon.py
+        '--windowed',  # GUI application
+
+        # Where to find necessary packages (assumes the venv uses Python 3.11; adjust the path otherwise)
+        '--paths=./venv/lib/python3.11/site-packages',
+
+        # Required imports
+        '--hidden-import=mlx',
+        '--hidden-import=gradio',
+        '--hidden-import=fastapi',
+        '--hidden-import=transformers',
+        '--hidden-import=huggingface_hub',
+        '--hidden-import=soundfile',
+        '--hidden-import=scipy',
+        '--hidden-import=tqdm',
+
+        # Static files and resources
+        '--add-data=musicgen:musicgen',
+        '--add-data=resources/*:resources',
+
+        # Main script
+        app_script
+    ]
+
+    # Platform-specific options
+    if platform.system() == 'Darwin':  # MacOS
+        if signing_key:
+            pyinstaller_options.extend([
+                f'--codesign-identity={signing_key}'
+            ])
+
+    # Run PyInstaller
+    PyInstaller.__main__.run(pyinstaller_options)
+    print('Done. Check dist/ for executables.')
+
+def zip():
+    # Zip the app
+    print('Zipping the executables')
+    app_name = 'Flux\\ Generator'
+    zip_name = 'Flux-Generator'
+
+    if platform.system() != 'Darwin':
+        # Only the macOS (ditto) path is implemented, so there is no command to run elsewhere
+        print('Zipping is only implemented for macOS; skipping.')
+        return None
+
+    arch = 'M-Series' if platform.processor() == 'arm' else 'Intel'
+    zip_name = f'{zip_name}-MacOS-{arch}.zip'
+    # Special zip command for macos to keep the complex directory metadata intact
+    os.system(f'cd dist/; ditto -c -k --sequesterRsrc --keepParent {app_name}.app {zip_name}')
+    return zip_name
+
+if __name__ == '__main__':
+    apple_code_signing_key = None
+    if len(sys.argv) > 1:
+        apple_code_signing_key = sys.argv[1]  # python3 build.py "Developer ID Application: ... (...)"
+        print("apple_code_signing_key: ", apple_code_signing_key)
+    elif len(sys.argv) == 1 and platform.system() == 'Darwin':
+        input("Are you sure you don't want to sign your code? ")
+
+    build(apple_code_signing_key) 
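`build.py` assembles its `codesign` and `ditto` calls as shell strings, which is why `app_name` has to carry an escaped space. As a sketch of an alternative (not what the script does), the same flags can be passed to `subprocess.run` as argument lists, so the bundle name needs no escaping and a failing command raises an exception instead of being silently ignored. The helper names below are hypothetical; the flags are copied from the script.

```python
import subprocess
from pathlib import Path

APP_NAME = "Flux Generator"  # plain name, no shell escaping needed


def codesign_command(signing_key: str, dist_dir: str = "dist") -> list:
    """Build the codesign argument list using the flags from build.py."""
    app_path = Path(dist_dir) / f"{APP_NAME}.app"
    return [
        "codesign", "--deep", "--force", "--verbose",
        "--options", "runtime",
        "--sign", signing_key,
        str(app_path),
    ]


def ditto_zip_command(zip_name: str) -> list:
    """Build the ditto argument list; meant to be run with cwd set to dist/."""
    return [
        "ditto", "-c", "-k", "--sequesterRsrc", "--keepParent",
        f"{APP_NAME}.app", zip_name,
    ]


def run_checked(command: list, cwd: str = None) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Printing only, so the sketch can be tried on any platform.
    print(codesign_command("Developer ID Application: Example (TEAMID)"))
    print(ditto_zip_command("Flux-Generator-MacOS-M-Series.zip"))
```

On a Mac, `run_checked(ditto_zip_command(name), cwd="dist")` would take the place of the `cd dist/; ditto ...` string.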
diff --git a/create_icns.sh b/create_icns.sh
@@ -0,0 +1,25 @@
+#!/bin/bash
+
+# Create iconset directory
+mkdir -p resources/icon.iconset
+
+# Copy PNG files to iconset with correct names
+cp resources/icon_16x16.png resources/icon.iconset/icon_16x16.png
+cp resources/icon_32x32.png resources/icon.iconset/icon_32x32.png
+cp resources/icon_32x32.png resources/icon.iconset/icon_16x16@2x.png
+cp resources/icon_64x64.png resources/icon.iconset/icon_32x32@2x.png
+cp resources/icon_128x128.png resources/icon.iconset/icon_128x128.png
+cp resources/icon_256x256.png resources/icon.iconset/icon_256x256.png
+cp resources/icon_256x256.png resources/icon.iconset/icon_128x128@2x.png
+cp resources/icon_512x512.png resources/icon.iconset/icon_512x512.png
+cp resources/icon_512x512.png resources/icon.iconset/icon_256x256@2x.png
+cp resources/icon_1024x1024.png resources/icon.iconset/icon_512x512@2x.png
+
+# Create icns file
+iconutil -c icns resources/icon.iconset
+
+# Clean up temporary files
+rm -rf resources/icon.iconset
+rm resources/icon_*.png
+
+echo "Created resources/icon.icns" 
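The `cp` lines in `create_icns.sh` follow a regular pattern: every base size gets a 1x file from the PNG of the same size and an `@2x` file from the PNG of twice that size. A minimal Python sketch of that mapping, using a hypothetical helper name, could look like this:

```python
BASE_SIZES = (16, 32, 128, 256, 512)  # the 1x sizes used in create_icns.sh


def iconset_copies(base_sizes=BASE_SIZES):
    """Return (source PNG, iconset file name) pairs matching create_icns.sh."""
    pairs = []
    for size in base_sizes:
        # 1x entry: same-size source.
        pairs.append((f"icon_{size}x{size}.png", f"icon_{size}x{size}.png"))
        # @2x entry: source with twice the pixel dimensions.
        doubled = size * 2
        pairs.append((f"icon_{doubled}x{doubled}.png", f"icon_{size}x{size}@2x.png"))
    return pairs


if __name__ == "__main__":
    for source, target in iconset_copies():
        print(f"resources/{source} -> resources/icon.iconset/{target}")
```

The sources this mapping needs (16, 32, 64, 128, 256, 512 and 1024 pixels) are exactly the sizes `create_icon.py` writes, so that script has to run first.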
diff --git a/create_icon.py b/create_icon.py
@@ -0,0 +1,60 @@
+from PIL import Image, ImageDraw
+import os
+
+def create_icon(size=1024):
+    # Create a new image with a fully transparent background (alpha = 0)
+    icon = Image.new('RGBA', (size, size), (255, 255, 255, 0))
+    draw = ImageDraw.Draw(icon)
+
+    # Colors
+    primary_color = (64, 156, 255)    # Blue for water/flux
+    accent_color = (255, 128, 64)     # Orange for music/energy
+
+    # Calculate dimensions
+    padding = size // 8
+    center = size // 2
+    radius = (size - 2 * padding) // 2
+
+    # Draw the main circular gradient (representing flux/flow)
+    for r in range(radius, 0, -1):
+        alpha = int(255 * (r / radius))
+        color = (*primary_color, alpha)
+        draw.ellipse(
+            [center - r, center - r, center + r, center + r],
+            fill=color
+        )
+
+    # Draw the music wave pattern
+    wave_height = size // 4
+    wave_width = size // 16
+    wave_y = center + radius // 2
+
+    # Draw three wave bars with varying heights
+    heights = [0.8, 1.0, 0.6]  # Relative heights for visual interest
+    for i, height in enumerate(heights):
+        x = center + (i - 1) * (wave_width * 2)
+        h = int(wave_height * height)
+        draw.rounded_rectangle(
+            [x - wave_width//2, wave_y - h//2,
+             x + wave_width//2, wave_y + h//2],
+            radius=wave_width//4,
+            fill=(*accent_color, 200)
+        )
+
+    # Save in different sizes for macOS
+    # (exist_ok=True makes this a no-op when resources/ already exists)
+    os.makedirs('resources', exist_ok=True)
+
+    # Save the main icon
+    icon.save('resources/icon.png')
+
+    # Create .icns compatible sizes
+    sizes = [16, 32, 64, 128, 256, 512, 1024]
+    for s in sizes:
+        resized = icon.resize((s, s), Image.Resampling.LANCZOS)
+        resized.save(f'resources/icon_{s}x{s}.png')
+
+    print("Icon files generated in resources/ directory")
+
+if __name__ == '__main__':
+    create_icon()
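The gradient loop in `create_icon()` draws concentric ellipses from the outer radius inward, each with alpha `int(255 * (r / radius))`. Assuming Pillow's default behaviour of replacing pixels (rather than blending) when drawing directly on an RGBA image, each smaller ellipse overwrites the middle of the previous one, so the disc ends up opaque at the rim and nearly transparent at the center. The arithmetic can be checked without Pillow; `icon_radius` and `ring_alpha` are hypothetical helpers, not part of the script:

```python
def icon_radius(size: int = 1024) -> int:
    """Radius of the gradient disc, as computed in create_icon()."""
    padding = size // 8
    return (size - 2 * padding) // 2


def ring_alpha(r: int, radius: int) -> int:
    """Alpha assigned to the ellipse of radius r in the gradient loop."""
    return int(255 * (r / radius))


if __name__ == "__main__":
    radius = icon_radius(1024)
    for r in (radius, radius // 2, 1):
        print(f"r={r}: alpha={ring_alpha(r, radius)}")
```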