HivisionIDPhotos: AI-Powered ID Photo Generator
A lightweight, offline-capable AI tool for automatic ID photo generation with portrait matting, background replacement, and layout photo creation.
- Step 1
Project Overview
HivisionIDPhotos is an intelligent algorithm system for producing ID photos. It uses AI models for face detection, portrait matting, and automatic photo adjustments to generate standard ID photos from user images.
Key capabilities:
- Lightweight matting (purely offline, CPU-only inference)
- Generates standard ID photos and layout photos in various sizes
- Supports pure offline or edge-cloud inference
- Multiple matting model options for different use cases
- FastAPI and Gradio interfaces for API and web usage
- Step 2
Environment Requirements
Ensure your system meets the following requirements before installation:
- Python: >= 3.7 (primarily tested on Python 3.10)
- Operating System: Linux, Windows, or macOS
- Memory: At least 4GB RAM (16GB+ recommended for beast mode)
- Disk Space: ~500MB for the project and model weights
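The version requirement above can also be checked from inside Python, which is handy in setup scripts. This is a minimal sketch; the helper name `meets_minimum` is hypothetical and not part of the project.

```python
import sys

MIN_VERSION = (3, 7)  # minimum version stated in the requirements above


def meets_minimum(version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    ok = meets_minimum(sys.version_info)
    status = "OK" if ok else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```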
    python --version  # Should be 3.7 or higher
- Step 3
Clone the Repository
Clone the HivisionIDPhotos repository from GitHub:
    git clone https://github.com/Zeyi-Lin/HivisionIDPhotos.git
    cd HivisionIDPhotos
- Step 4
Set Up Python Virtual Environment
Create and activate a Python virtual environment. Using conda or venv is recommended to isolate dependencies.
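To confirm that an isolated environment is actually active before installing anything, a check along these lines may help. It is a sketch under two assumptions: a venv makes `sys.prefix` differ from `sys.base_prefix`, and an activated conda environment sets `CONDA_DEFAULT_ENV`. The function name is hypothetical.

```python
import os
import sys


def active_environment(prefix, base_prefix, environ):
    """Describe the active isolated environment, or return None if there is none."""
    if prefix != base_prefix:
        # venv and virtualenv redirect sys.prefix away from the base interpreter
        return f"venv at {prefix}"
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda env '{conda_env}'"
    return None


if __name__ == "__main__":
    env = active_environment(sys.prefix, sys.base_prefix, os.environ)
    print(env or "No virtual environment detected")
```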
    # Using conda (recommended)
    conda create -n hivision python=3.10
    conda activate hivision

    # Or using venv
    python -m venv hivision_env
    source hivision_env/bin/activate  # On Linux/macOS
    # hivision_env\Scripts\activate  # On Windows (no "source" needed)
- Step 5
Install Dependencies
Install the core dependencies and application dependencies. The project is split into base requirements (core functionality) and app requirements (Gradio/FastAPI for web interface).
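After installation, a quick import check can reveal packages that failed to install. The module names below are an assumption drawn from the technology stack listed later in this guide, not from the requirements files themselves, so treat `requirements.txt` and `requirements-app.txt` as the authoritative list.

```python
from importlib.util import find_spec

# Assumed import names (see the technology stack table); adjust as needed.
ASSUMED_MODULES = ["onnxruntime", "cv2", "numpy", "PIL", "fastapi", "gradio"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    print("All assumed modules found" if not missing else f"Missing: {missing}")
```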
    pip install -r requirements.txt
    pip install -r requirements-app.txt
- Step 6
Download Model Weights
The project requires pre-trained model weights for matting and face detection. You can either download them using a script or manually.
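Whichever method you use, it can be useful to verify which matting weights ended up in place. The sketch below assumes it runs from the repository root and uses the file names from the manual download table in this guide; `present_weights` is a hypothetical helper.

```python
from pathlib import Path

WEIGHTS_DIR = Path("hivision/creator/weights")
MATTING_WEIGHTS = [
    "modnet_photographic_portrait_matting.onnx",
    "hivision_modnet.onnx",
    "rmbg-1.4.onnx",
    "birefnet-v1-lite.onnx",
]


def present_weights(weights_dir, filenames=MATTING_WEIGHTS):
    """Return the weight file names that exist in `weights_dir`."""
    weights_dir = Path(weights_dir)
    return [name for name in filenames if (weights_dir / name).is_file()]


if __name__ == "__main__":
    found = present_weights(WEIGHTS_DIR)
    if found:
        print(f"Found: {found}")
    else:
        print("No matting weights found; at least one is required.")
```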
    # Method 1: Use the download script
    python scripts/download_model.py --models all

    # Method 2: Download manually and place in hivision/creator/weights/
- Step 7
Manual Model Download (Optional)
If the download script does not work, download these models manually and save them to hivision/creator/weights/:
| Model File | Size | Source | Description |
|---|---|---|---|
| modnet_photographic_portrait_matting.onnx | 24.7MB | MODNet | Official matting weights |
| hivision_modnet.onnx | 24.7MB | Release | Improved matting for color backgrounds |
| rmbg-1.4.onnx | 176.2MB | BRIA AI | High-quality matting |
| birefnet-v1-lite.onnx | 224MB | BiRefNet | Best-quality matting, GPU required |
⚠ Heads up: At least one matting model is required to run the project. HIVISION_MODNET is the default and recommended for CPU-only inference.
- Step 8
Face Detection Model Setup (Optional)
By default, the project uses MTCNN for face detection (offline, fast, works on CPU). You can also use RetinaFace (offline, higher accuracy, slower on CPU) or Face++ (online API, highest accuracy).
RetinaFace Setup (Offline, High Accuracy, Moderate CPU Speed):
    # Download RetinaFace weights and place in hivision/creator/retinaface/weights/
    curl -L https://github.com/Zeyi-Lin/HivisionIDPhotos/releases/download/pretrained-model/retinaface-resnet50.onnx \
      -o hivision/creator/retinaface/weights/retinaface-resnet50.onnx
- Step 9
GPU Acceleration Setup (Optional)
For NVIDIA GPU acceleration with the birefnet-v1-lite model (~16GB VRAM recommended), install CUDA-enabled libraries:
For CUDA 12.x and cuDNN 8:
    pip install onnxruntime-gpu==1.18.0
    pip install torch --index-url https://download.pytorch.org/whl/cu121
⚠ Heads up: CUDA installations are backward compatible. If you have CUDA 12.6 but the available torch build is for 12.4, you can still install the 12.4 version.
- Step 10
Run Gradio Demo
Launch the interactive web interface for generating ID photos. This is the simplest way to use the tool.
    python app.py
- Step 11
Using the Gradio Interface
After running app.py, open http://127.0.0.1:7860 in your browser. The interface allows you to:
- Upload a photo
- Choose output size (standard sizes for various countries)
- Select matting model
- Choose background color
- Apply beauty effects
- Generate layout photos (6-inch, A4, etc.)
- Enable face alignment and rotation correction
⚠ Heads up: If you see an error about missing models, ensure you downloaded at least one matting model and placed it in the `hivision/creator/weights/` directory.
- Step 12
Python Inference CLI
Use the command-line interface for batch processing or scripting:
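For batch processing, the ID photo command from this section can be wrapped in a small loop. The sketch below only uses the `-i`, `-o`, `--height`, and `--width` flags that appear in this guide; the helper names and the `output` folder are hypothetical, and it assumes it is run from the repository root.

```python
import subprocess
import sys
from pathlib import Path


def build_idphoto_command(image, output, height=413, width=295):
    """Build the argument list for one inference.py ID photo run."""
    return [
        sys.executable, "inference.py",
        "-i", str(image),
        "-o", str(output),
        "--height", str(height),
        "--width", str(width),
    ]


def batch_commands(input_dir, output_dir, pattern="*.jpg"):
    """Yield one command per matching image, writing a PNG with the same stem."""
    for image in sorted(Path(input_dir).glob(pattern)):
        yield build_idphoto_command(image, Path(output_dir) / f"{image.stem}.png")


if __name__ == "__main__":
    Path("output").mkdir(exist_ok=True)
    for command in batch_commands("demo/images", "output"):
        subprocess.run(command, check=True)  # stop at the first failing image
```

Building the argument list separately from running it keeps the loop easy to dry-run: print the commands first, then execute them once they look right.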
1. Create ID Photo:
    python inference.py \
      -i demo/images/test0.jpg \
      -o ./output.png \
      --height 413 \
      --width 295
- Step 13
CLI Inference Options
2. Portrait Matting Only (extract person from background):
    python inference.py -t human_matting \
      -i demo/images/test0.jpg \
      -o ./matting.png \
      --matting_model hivision_modnet
- Step 14
Add Background to Transparent Image
3. Add Background Color to Transparent PNG:
    python inference.py -t add_background \
      -i ./output.png \
      -o ./output_colored.jpg \
      -c 4f83ce \
      -k 30 \
      -r 1
- Step 15
Generate Layout Photo
4. Create Six-Inch Layout Photo (for ID card printing):
    python inference.py -t generate_layout_photos \
      -i ./output_colored.jpg \
      -o ./layout.jpg \
      --height 413 \
      --width 295 \
      -k 200
- Step 16
Deploy FastAPI Backend
Run the project as an API service for programmatic access:
    python deploy_api.py
- Step 17
API Service Features
The FastAPI backend provides RESTful endpoints for:
- ID photo generation
- Portrait matting
- Background color addition
- Layout photo creation
- Batch processing support
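The same request as the curl example in this step can be sent from Python. This is a sketch, assuming the endpoint URL and form field names shown in that curl example and the third-party `requests` package; the function names are hypothetical, and the response format is not described in this guide, so handling it is left to the caller.

```python
def build_idphoto_form(height=413, width=295,
                       matting_model="hivision_modnet",
                       background_color="4f83ce"):
    """Return the non-file form fields, as strings, for the ID photo request."""
    return {
        "height": str(height),
        "width": str(width),
        "matting_model": matting_model,
        "background_color": background_color,
    }


def request_idphoto(image_path, url="http://localhost:8080/api/v1/idphoto", **fields):
    """POST the image and form fields as multipart/form-data."""
    import requests  # imported lazily so the form builder works without it

    with open(image_path, "rb") as image_file:
        response = requests.post(
            url,
            files={"image": image_file},
            data=build_idphoto_form(**fields),
            timeout=60,
        )
    response.raise_for_status()  # surface HTTP errors instead of ignoring them
    return response
```

A call such as `request_idphoto("path/to/photo.jpg", height=413, width=295)` would then mirror the curl invocation.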
CURL Request Example:
    curl -X POST "http://localhost:8080/api/v1/idphoto" \
      -F "image=@path/to/photo.jpg" \
      -F "height=413" \
      -F "width=295" \
      -F "matting_model=hivision_modnet" \
      -F "background_color=4f83ce"
- Step 18
Docker Deployment
Deploy the application using Docker for consistent environments across systems.
    # Pull the official image
    docker pull linzeyi/hivision_idphotos

    # Or build from local Dockerfile (after placing model weights)
    docker build -t linzeyi/hivision_idphotos .
- Step 19
Run Docker Containers
Run Both Simultaneously:
    docker compose up -d
- Step 20
Environment Variables
The project supports several configuration options via environment variables:
| Variable | Type | Description |
|---|---|---|
| FACE_PLUS_API_KEY | Optional | Face++ API key for online face detection |
| FACE_PLUS_API_SECRET | Optional | Face++ API secret |
| RUN_MODE | Optional | Set to beast for faster inference (models stay in memory) |

    docker run -d -p 7860:7860 \
      -e FACE_PLUS_API_KEY=your_key \
      -e FACE_PLUS_API_SECRET=your_secret \
      -e RUN_MODE=beast \
      linzeyi/hivision_idphotos
- Step 21
Performance Reference
Benchmark results on Mac M1 Max 64GB (CPU only, non-GPU acceleration):
| Model Combination | Memory Usage | Inference Time (512x715) | Inference Time (764x1146) |
|---|---|---|---|
| MODNet + MTCNN | 410MB | 0.207s | 0.246s |
| MODNet + RetinaFace | 405MB | 0.571s | 0.971s |
| BiRefNet-lite + RetinaFace | 6.20GB | 7.063s | 7.128s |
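To put these latencies in perspective, the short calculation below converts them into sequential throughput and a slowdown factor relative to the fastest combination. The numbers are copied from the table; the function names are hypothetical, and the throughput figure assumes photos are processed one at a time with no other overhead.

```python
# Inference times in seconds from the benchmark table,
# as (512x715 input, 764x1146 input).
BENCHMARKS = {
    "MODNet + MTCNN": (0.207, 0.246),
    "MODNet + RetinaFace": (0.571, 0.971),
    "BiRefNet-lite + RetinaFace": (7.063, 7.128),
}


def photos_per_minute(seconds_per_photo):
    """Convert a per-photo latency into sequential photos per minute."""
    return 60.0 / seconds_per_photo


def slowdown(combo, baseline="MODNet + MTCNN", column=0):
    """Return how many times slower `combo` is than `baseline` for one input size."""
    return BENCHMARKS[combo][column] / BENCHMARKS[baseline][column]


if __name__ == "__main__":
    for name, (small, _large) in BENCHMARKS.items():
        print(f"{name}: {photos_per_minute(small):.0f} photos/min, "
              f"{slowdown(name):.1f}x the baseline latency")
```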
⚠ Heads up: BiRefNet-lite requires significant memory (~6GB) and works best with GPU acceleration.
- Step 22
Technology Stack
Key technologies and frameworks used:
| Category | Technology | Purpose |
|---|---|---|
| Language | Python 3.7+ | Core implementation |
| Deep Learning | ONNX Runtime | Model inference |
| Image Processing | OpenCV | Image manipulation |
| Matting Models | MODNet, RMBG-1.4, BiRefNet | Portrait extraction |
| Face Detection | MTCNN, RetinaFace, Face++ | Face localization |
| Web Framework | FastAPI | REST API backend |
| UI Framework | Gradio | Web interface |
| Containerization | Docker | Deployment |
| Machine Learning | NumPy, PIL | Numerical operations |
- Step 23
Model Architecture Details
The matting models use deep neural networks:
MODNet (Hivision Variant):
- Lightweight architecture optimized for real-time performance
- Runs efficiently on CPU
- Good balance of speed and quality
RMBG-1.4 (BRIA AI):
- Vision Transformer (ViT)-based architecture
- Higher quality matting
- Slower inference (177MB model)
BiRefNet V1-lite:
- Bidirectional refinement network
- State-of-the-art matting quality
- Requires GPU for practical inference speed
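The trade-offs above can be condensed into a simple selection rule. The following is an illustrative sketch, not project code: `suggest_matting_model` is a hypothetical helper, and the returned names are assumed to match the weight file names without the `.onnx` extension (only `hivision_modnet` is confirmed by the CLI examples in this guide).

```python
def suggest_matting_model(has_gpu, prefer_quality=False):
    """Suggest a matting model name from the trade-offs described above."""
    if has_gpu and prefer_quality:
        return "birefnet-v1-lite"  # best quality; GPU needed for practical speed
    if prefer_quality:
        return "rmbg-1.4"          # higher quality, slower inference
    return "hivision_modnet"       # lightweight default, efficient on CPU
```

Compare the suggestion against the names the codebase actually registers before passing it to `--matting_model`.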
    # View available models in the codebase
    from hivision.creator.choose_handler import HUMAN_MATTING_MODELS
    print(HUMAN_MATTING_MODELS)
- Step 24
Troubleshooting
Issue 3: CUDA/GPU not working
Solution: Verify cuDNN is installed or try CPU-only mode with onnxruntime instead of onnxruntime-gpu.
    python app.py --port 8080 --host 0.0.0.0
- Step 25
Advanced Customization
2. Modify preset colors: Edit demo/assets/color_list_EN.csv (name, hex)
3. Add custom watermark fonts: Place font files in hivision/plugin/font/ and update hivision/plugin/watermark.py.
    Standard,413,295
    One inch,567,413
    Two inches,626,413
- Step 26
Community Projects and Extensions
Several community-built extensions exist:
- HivisionIDPhotos-ComfyUI: ComfyUI workflow for ID photo processing
- HivisionIDPhotos-cpp: C++ version for better performance
- HivisionIDPhotos-windows-GUI: Windows desktop app
- HivisionIDPhotos-wechat-weapp: WeChat mini program
    # Explore community projects at:
    # https://github.com/Zeyi-Lin/HivisionIDPhotos