A streamlined, Mac-native port of HunyuanVideo, Tencent's text-to-video AI model, optimized specifically for Apple Silicon. Generate high-quality videos from text descriptions with native performance on your M1/M2/M3 Mac.
This project makes HunyuanVideo accessible on Apple Silicon Macs by:
- Leveraging native Metal acceleration through MLX
- Optimizing memory usage for Mac hardware
- Simplifying the setup process
- Providing Mac-specific performance tuning
# One-line setup
./install_mlx.sh
# Download model weights (requires Hugging Face token)
python download_weights.py
# Copy environment example and configure
cp .env.example .env
# Add your token: HF_TOKEN=your_token_here
Before running `download_weights.py`, make sure you have:
- A Hugging Face account and access token (get it from https://huggingface.co/settings/tokens)
- Created a `.env` file with your token and memory settings
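The prerequisite check above can be sketched in Python. This is a minimal illustration, not the logic of `download_weights.py` itself: `load_env_file` and `require_hf_token` are hypothetical helpers, and the parser assumes plain `KEY=VALUE` lines like the `HF_TOKEN=your_token_here` example.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_hf_token(path=".env"):
    """Return the token from the environment or the .env file, else raise."""
    token = os.environ.get("HF_TOKEN") or load_env_file(path).get("HF_TOKEN")
    if not token or token == "your_token_here":
        raise RuntimeError("HF_TOKEN is missing; add it to .env before downloading weights")
    return token
```

Calling `require_hf_token()` before starting the download fails fast when the placeholder value was never replaced.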
The system now uses an optimized generation approach with consistent fp16 precision:
# Memory-efficient settings (recommended)
python sample_video_mps.py \
--mmgp-mode \
--mmgp-config configs/mmgp_mlx.json \
--video-size 544 960 \
--prompt "your prompt here" \
--video-length 13 \
--precision fp16 \
--vae-precision fp16 \
--text-encoder-precision fp16 \
--text-encoder-precision-2 fp16
# After stable generation, try:
python sample_video_mps.py \
--mmgp-mode \
--mmgp-config configs/mmgp_mlx.json \
--video-size 720 720 \
--prompt "your prompt here" \
--video-length 25 \
--precision fp16 \
--vae-precision fp16 \
--text-encoder-precision fp16 \
--text-encoder-precision-2 fp16
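For scripted or repeated runs, the same invocation can be assembled in Python. The following is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper that is not part of this repository, and it only uses the flags shown in the commands above.

```python
import shlex


def build_command(prompt, video_size, video_length, precision="fp16"):
    """Build the sample_video_mps.py argument list (hypothetical helper).

    video_size is passed through in the same order as on the command line,
    e.g. (544, 960) for "--video-size 544 960".
    """
    first, second = video_size
    return [
        "python", "sample_video_mps.py",
        "--mmgp-mode",
        "--mmgp-config", "configs/mmgp_mlx.json",
        "--video-size", str(first), str(second),
        "--prompt", prompt,
        "--video-length", str(video_length),
        "--precision", precision,
        "--vae-precision", precision,
        "--text-encoder-precision", precision,
        "--text-encoder-precision-2", precision,
    ]


cmd = build_command("your prompt here", (544, 960), 13)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The list can be handed to `subprocess.run(cmd, check=True)`. Keeping a single `precision` argument mirrors the advice to use the same precision for every component.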
The model has specific requirements for video length:
- Must satisfy: (video_length - 1) % 4 == 0
- Valid lengths are: 5, 9, 13, 17, 21, 25, etc.
- Single frame generation (video_length=1) is also supported
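The length rule above is easy to check before launching a long run. This is a small sketch with hypothetical helper names (`is_valid_video_length`, `nearest_valid_length`); it encodes only the `(video_length - 1) % 4 == 0` constraint stated above.

```python
def is_valid_video_length(video_length: int) -> bool:
    """True for 1, 5, 9, 13, ... per the (video_length - 1) % 4 == 0 rule."""
    return video_length >= 1 and (video_length - 1) % 4 == 0


def nearest_valid_length(requested: int) -> int:
    """Round down to the closest valid length, never below 1."""
    if requested < 1:
        return 1
    return requested - (requested - 1) % 4


print([n for n in range(1, 26) if is_valid_video_length(n)])
# prints [1, 5, 9, 13, 17, 21, 25]
print(nearest_valid_length(24))  # prints 21
```

Rounding down rather than up is a deliberate choice in this sketch, since shorter videos use less memory.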
The optimized generation system provides:
- Consistent fp16 precision across all components
- Aggressive memory cleanup between operations
- Conservative memory watermark settings
- VAE tiling support
- Forced autocast for mixed precision
Start with smaller resolutions and gradually increase based on stability:
RAM | Initial Resolution | Max Resolution | Video Length |
---|---|---|---|
32GB | 544x960 (540p) | 720x720 | 13 frames |
64GB | 544x960 (540p) | 720x1280 | 21 frames |
64GB+ | 720x720 | 720x1280 | 25 frames |
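The table can be expressed as a lookup for use in scripts. This is a sketch, not project code: `recommend_settings` is a hypothetical helper, and it assumes that the first row covers 32GB up to (but not including) 64GB, the second row covers exactly 64GB, and "64GB+" means more than 64GB. Sizes are kept in the order they appear in the table.

```python
from typing import NamedTuple, Tuple


class Recommendation(NamedTuple):
    initial_size: Tuple[int, int]
    max_size: Tuple[int, int]
    video_length: int


def recommend_settings(ram_gb: int) -> Recommendation:
    """Map installed RAM to the table rows (row boundaries are assumptions)."""
    if ram_gb < 32:
        raise ValueError("32GB RAM is the stated minimum")
    if ram_gb < 64:
        return Recommendation((544, 960), (720, 720), 13)
    if ram_gb == 64:
        return Recommendation((544, 960), (720, 1280), 21)
    return Recommendation((720, 720), (720, 1280), 25)


print(recommend_settings(32))
```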
# Conservative settings (recommended for all configurations)
PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.3
PYTORCH_MPS_LOW_WATERMARK_RATIO=0.2
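If you launch generation from your own Python script instead of relying on `.env`, the same values can be set programmatically. This is a minimal sketch: `apply_conservative_watermarks` is a hypothetical helper, and it assumes the variables must be in place before PyTorch initializes its MPS allocator, so call it before importing `torch`.

```python
import os

WATERMARK_KEYS = (
    "PYTORCH_MPS_HIGH_WATERMARK_RATIO",
    "PYTORCH_MPS_LOW_WATERMARK_RATIO",
)


def apply_conservative_watermarks(high="0.3", low="0.2"):
    """Set the MPS watermark ratios unless the user already exported them."""
    os.environ.setdefault(WATERMARK_KEYS[0], high)
    os.environ.setdefault(WATERMARK_KEYS[1], low)
    return {key: os.environ[key] for key in WATERMARK_KEYS}


apply_conservative_watermarks()
# import torch  # import only after the variables are set
```

Using `setdefault` means values already exported in the shell take precedence over the defaults in this sketch.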
For optimal performance:
- Start with 540p resolution (544x960)
- Use shorter video length (13 frames)
- Close other applications
- Monitor resources with `python monitor_resources.py`
- Clear Python environment between runs
- Gradually increase settings once stable
The included Gradio interface provides:
- Memory-optimized default settings
- Resolution recommendations
- Real-time feedback
- Detailed error messages with optimization suggestions
Run the web interface:
python gradio_server.py
- Mac-Native Performance: Built specifically for Apple Silicon using Metal acceleration
- Enhanced Memory Optimization:
  - Consistent fp16 precision across components
  - Aggressive memory cleanup
  - Conservative watermark settings
  - Forced autocast for mixed precision
  - VAE tiling support
- Multiple Resolutions: Support for various video sizes and aspect ratios
- Real-time Monitoring: Built-in resource monitoring
- Easy Setup: Streamlined installation process
- Start with 544x960 resolution
- Use fp16 precision for memory efficiency
- Keep the conservative high watermark ratio (PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.3)
- Memory-efficient processing
- Apple Silicon Mac (M1/M2/M3)
- macOS 12.3 or later
- 32GB RAM minimum
- Python 3.10 or later
If you encounter memory issues:
- Ensure conservative memory settings are used:
PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.3 PYTORCH_MPS_LOW_WATERMARK_RATIO=0.2
- Start with 544x960 resolution
- Use 13 frames video length
- Monitor memory with `python monitor_resources.py`
- Clear Python environment between runs
- Close other applications
If you encounter precision errors:
- Ensure all precision flags are set to fp16:
`--precision fp16 --vae-precision fp16 --text-encoder-precision fp16 --text-encoder-precision-2 fp16`
- Use valid video lengths: 5, 9, 13, 17, 21, 25, etc.
- Clear Python environment between runs
Licensed under the Tencent Hunyuan Community License - see LICENSE.txt for details.
- Original HunyuanVideo by Tencent
- MLX framework by Apple