# Latest-dependency CosyVoice stack for the Ubuntu 26.04 Quadro M4000 CUDA lane.
#
# Keep the stack loose so pip can take current compatible releases, but cap
# transformers below 4.53.0. Matrix verification in the Arc B50 and M4000
# workspaces showed:
# - transformers 4.52.4: generated the requested text correctly
# - transformers 4.53.0 and newer tested through 4.57.6: GPU run completed,
#   but generated semantically broken speech unrelated to the input text
huggingface_hub>=0.30.0
conformer>=0.3.2
diffusers>=0.29.0
gradio>=5.4.0
gdown>=5.1.0
hydra-core>=1.3.2
HyperPyYAML>=1.2.3
inflect>=7.3.1
librosa>=0.10.2
lightning>=2.2.4
matplotlib>=3.7.5
modelscope>=1.20.0
networkx>=3.1
numpy>=1.26.4
omegaconf>=2.3.0
onnx>=1.16.0
onnxruntime>=1.18.0
openai-whisper>=20250625
pyarrow>=18.1.0
pyworld>=0.3.4
rich>=13.7.1
soundfile>=0.12.1
torchcodec>=0.13,<0.14
transformers>=4.51.3,<4.53
x-transformers>=2.11.24
wetext>=0.0.4
wget>=3.2