Using NEU's TurboVNC

NEU's TurboVNC:

https://ood.explorer.northeastern.edu/pun/sys/dashboard/batch_connect/sys/desktop-native-courses/session_contexts/new

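When creating the session, pick the GPU version and choose a suitable number of cores and amount of memory. I won't go into detail on that.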

On inspection, this VNC desktop runs Rocky Linux 9.3 with Python 3.9.18 and CUDA version 12.3, but almost none of the deep learning frameworks are installed.

It has gcc 11.5, wget and curl, and git 2.39.3.
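The inspection above can be reproduced with a handful of commands. This is only a sketch: `report_tool` is a helper name made up for illustration, and the output depends on the node your session lands on. Note that the CUDA version printed by `nvidia-smi` is the highest version the driver supports, not an installed toolkit.

```shell
# Print the first line of `<tool> --version`, or a note if the tool is absent.
# report_tool is a hypothetical helper name, not part of any package.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
    else
        printf '%s: not found\n' "$1"
    fi
}

# OS release string (Rocky Linux exposes it in /etc/os-release)
if [ -r /etc/os-release ]; then
    grep '^PRETTY_NAME=' /etc/os-release || true
fi

for tool in python3 gcc git wget curl; do
    report_tool "$tool"
done

# nvidia-smi shows the driver version and the highest CUDA version it supports
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi failed to query the GPU"
else
    echo "nvidia-smi: not found (is this a GPU session?)"
fi
```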

First, install Miniconda

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/miniconda3
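Optionally, the download can be checked before the installer is run, since Anaconda publishes SHA-256 digests for its installers. A minimal sketch, assuming `sha256sum` is available on the node; `verify_sha256` is a made-up helper name, and the expected digest has to be copied from the Miniconda download page (it is deliberately not filled in here).

```shell
# Compare a file's SHA-256 digest with an expected value.
# verify_sha256 is a hypothetical helper name; the expected digest must come
# from the official download page, it is not given in these notes.
verify_sha256() {
    actual="$(sha256sum "$1" | cut -d ' ' -f 1)"
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)"
        return 1
    fi
}

# Usage (replace <expected-sha256> with the published digest):
# verify_sha256 Miniconda3-latest-Linux-x86_64.sh <expected-sha256>
```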

Then

# Initialize conda for the shell
~/miniconda3/bin/conda init
# Reload bash
source ~/.bashrc
# Check that the installation succeeded
conda --version

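At this point you can see that the Python bundled with conda is 3.13.11. For stability, I use 3.11 instead.

However, creating the environment directly fails with an error: CondaToSNonInteractiveError

After looking it up, the cause is that the Terms of Service for the default channels have to be accepted first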

conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/msys2
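The three commands differ only in the channel name, so they can be folded into a loop. A sketch, assuming `conda` is already on the PATH after the init step; `accept_default_tos` is a made-up function name, while the `conda tos accept` flags are the ones used above.

```shell
# Accept the Terms of Service for each default Anaconda channel in turn.
# accept_default_tos is a hypothetical helper name; it stops at the first
# channel for which the conda command fails.
accept_default_tos() {
    for name in main r msys2; do
        conda tos accept --override-channels \
            --channel "https://repo.anaconda.com/pkgs/$name" || return 1
    done
}

# Run it once conda is available:
# accept_default_tos
```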

Check the configured channels

conda config --show channels

Then create the 3.11 environment

conda create -n llm-py311 python=3.11 -y
conda activate llm-py311
# Verify
python --version
which python

Install the required packages

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -y

Or use pip

pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
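Whichever route you take, it is worth confirming that the installed build actually sees the GPU. A sketch, assuming the `llm-py311` environment is active; `check_torch` is a hypothetical helper name. It prints a message and returns non-zero when Python or PyTorch is missing.

```shell
# Report the PyTorch version, the CUDA version it was built against,
# and whether a GPU is visible. check_torch is a hypothetical helper name.
check_torch() {
    if ! command -v python >/dev/null 2>&1; then
        echo "python: not found (activate the conda environment first)"
        return 1
    fi
    python - <<'EOF'
import importlib.util
import sys

if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
    sys.exit(1)

import torch

print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device 0:", torch.cuda.get_device_name(0))
EOF
}

check_torch || echo "PyTorch check did not pass"
```

The cu121 build is fine on a driver that reports CUDA 12.3, because the PyTorch wheels ship their own CUDA runtime and only need a driver that is new enough.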

Then install Hugging Face and related libraries

pip install transformers==4.40.0 datasets==2.18.0 accelerate==0.28.0
pip install evaluate==0.4.1 peft==0.10.0 trl==0.8.6
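pip install bitsandbytes==0.43.0 xformers==0.0.26
# flash-attn imports torch while building, so build isolation has to be turned off
pip install flash-attn==2.6.3 --no-build-isolation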
pip install numpy pandas matplotlib scikit-learn scipy seaborn

Install monitoring and tooling

pip install tensorboard==2.16.0 wandb==0.17.9 tqdm
pip install safetensors==0.4.5 protobuf==5.28.1 huggingface-hub==0.25.2
pip install nltk==3.9.1 sentencepiece==0.2.0 tokenizers==0.19.1
pip install ipython jupyter jupyterlab notebook
pip install black flake8 pylint isort
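With this many pinned versions, a quick consistency check after installing helps catch conflicts early. A sketch, assuming the environment is active; `show_versions` is a made-up helper built on `pip show`, and the package list passed to it is just a sample of the pins above.

```shell
# Print the installed version of each named package, or "not installed".
# show_versions is a hypothetical helper name built on `pip show`.
show_versions() {
    for pkg in "$@"; do
        ver="$(python -m pip show "$pkg" 2>/dev/null | sed -n 's/^Version: //p' || true)"
        printf '%s %s\n' "$pkg" "${ver:-not installed}"
    done
}

show_versions torch transformers datasets accelerate peft trl tokenizers

# pip check reports installed packages whose declared requirements are not met
python -m pip check || echo "pip check reported problems (or pip is unavailable)"
```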

Configure Jupyter

python -m ipykernel install --user --name llm --display-name "LLM Training (PyTorch 2.3)"
jupyter notebook --generate-config
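To confirm that the kernel was registered, you can look for it in the output of `jupyter kernelspec list`. A sketch; `kernel_registered` is a hypothetical helper name, and it assumes the usual listing layout of an indented kernel name followed by its path. The kernel name here is `llm`, which is separate from the conda environment name `llm-py311`.

```shell
# Succeed if a kernel with the given name shows up in `jupyter kernelspec list`.
# kernel_registered is a hypothetical helper name.
kernel_registered() {
    jupyter kernelspec list 2>/dev/null | grep -q "^ *$1 "
}

if kernel_registered llm; then
    echo "kernel llm is registered"
else
    echo "kernel llm not found"
fi
```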

2025-12-24
