bash scripts/train/dummy_run.sh #5

Open
KingBoyAndGirl opened this issue Apr 24, 2024 · 0 comments

An error occurred when running the following command under v6.1:
bash scripts/train/dummy_run.sh

bash scripts/train/dummy_run.sh 
Current working directory: /code/VisualRWKV/VisualRWKV-v6/v6.1
INFO:pytorch_lightning.utilities.rank_zero:########## work in progress ##########
[2024-04-24 06:32:01,729] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  NVIDIA Inference is only supported on Ampere and newer architectures
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.2
 [WARNING]  using untested triton version (2.2.0), only 1.0.0 is known to be compatible
INFO:pytorch_lightning.utilities.rank_zero:
############################################################################
#
# RWKV-5 BF16 on 1x1 GPU, bsz 1x1x8=8, deepspeed_stage_2
#
# Data = dummy_data/dummy.json (json), ProjDir = out/dummy
#
# Epoch = 0 to 0 (will continue afterwards), save every 1 epoch
#
# Each "epoch" = 1000 steps, 8000 samples, 2048000 tokens
#
# Model = 6 n_layer, 512 n_embd, 256 ctx_len
#
# Adam = lr 1e-05 to 1e-05, warmup 0 steps, beta (0.9, 0.99), eps 1e-08
#
# Found torch 2.2.2+cu121, recommend 1.13.1+cu117 or newer
# Found deepspeed 0.14.2, recommend 0.7.0 (faster than newer versions)
# Found pytorch_lightning 1.9.4, recommend 1.9.5
#
############################################################################

INFO:pytorch_lightning.utilities.rank_zero:{'load_model': '', 'model_path': None, 'wandb': '', 'proj_dir': 'out/dummy', 'random_seed': -1, 'data_file': 'dummy_data/dummy.json', 'data_type': 'json', 'vocab_size': 65536, 'ctx_len': 256, 'epoch_steps': 1000, 'epoch_count': 1, 'epoch_begin': 0, 'epoch_save': 1, 'micro_bsz': 8, 'n_layer': 6, 'n_embd': 512, 'dim_att': 512, 'dim_ffn': 1792, 'pre_ffn': 0, 'head_size_a': 64, 'head_size_divisor': 8, 'lr_init': 1e-05, 'lr_final': 1e-05, 'warmup_steps': 0, 'beta1': 0.9, 'beta2': 0.99, 'adam_eps': 1e-08, 'grad_cp': 0, 'dropout': 0, 'weight_decay': 0, 'weight_decay_final': -1, 'ds_bucket_mb': 200, 'vision_tower_name': 'dummy', 'image_folder': 'dummy_data/images/', 'grid_size': -1, 'detail': 'low', 'freeze_rwkv': 0, 'freeze_proj': 0, 'image_position': 'first', 'logger': False, 'enable_checkpointing': False, 'default_root_dir': None, 'gradient_clip_val': 1.0, 'gradient_clip_algorithm': None, 'num_nodes': 1, 'num_processes': None, 'devices': '1', 'gpus': None, 'auto_select_gpus': None, 'tpu_cores': None, 'ipus': None, 'enable_progress_bar': False, 'overfit_batches': 0.0, 'track_grad_norm': -1, 'check_val_every_n_epoch': 100000000000000000000, 'fast_dev_run': False, 'accumulate_grad_batches': 16, 'max_epochs': 1, 'min_epochs': None, 'max_steps': -1, 'min_steps': None, 'max_time': None, 'limit_train_batches': None, 'limit_val_batches': None, 'limit_test_batches': None, 'limit_predict_batches': None, 'val_check_interval': None, 'log_every_n_steps': 100000000000000000000, 'accelerator': 'gpu', 'strategy': 'deepspeed_stage_2', 'sync_batchnorm': False, 'precision': 'bf16', 'enable_model_summary': True, 'num_sanity_val_steps': 0, 'resume_from_checkpoint': None, 'profiler': None, 'benchmark': None, 'reload_dataloaders_every_n_epochs': 0, 'auto_lr_find': False, 'replace_sampler_ddp': False, 'detect_anomaly': False, 'auto_scale_batch_size': False, 'plugins': None, 'amp_backend': None, 'amp_level': None, 'move_metrics_to_cpu': False, 'multiple_trainloader_mode': 'max_size_cycle', 'inference_mode': True, 'my_timestamp': '2024-04-24-06-32-02', 'betas': (0.9, 0.99), 'real_bsz': 8, 'run_name': '65536 ctx256 L6 D512'}

args.vision_tower_name:dummy
Traceback (most recent call last):
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/dummy/resolve/main/preprocessor_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1403, in hf_hub_download
    raise head_call_error
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1261, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1674, in get_hf_file_metadata
    r = _request_wrapper(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 369, in _request_wrapper
    response = _request_wrapper(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 393, in _request_wrapper
    hf_raise_for_status(response)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-6628a764-2c4260cf0dbed452255dbfd8;4333d138-2758-4327-9e40-e4e0067a7854)

Repository Not Found for url: https://huggingface.co/dummy/resolve/main/preprocessor_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/code/VisualRWKV/VisualRWKV-v6/v6.1/train.py", line 178, in <module>
    args.image_processor = AutoImageProcessor.from_pretrained(args.vision_tower_name)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py", line 360, in from_pretrained
    config_dict, _ = ImageProcessingMixin.get_image_processor_dict(pretrained_model_name_or_path, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 334, in get_image_processor_dict
    resolved_image_processor_file = cached_file(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/utils/hub.py", line 421, in cached_file
    raise EnvironmentError(
OSError: dummy is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
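
For reference, the traceback points at train.py line 178, where args.vision_tower_name is passed straight to AutoImageProcessor.from_pretrained, and the placeholder name "dummy" from the script is neither a local folder nor a Hugging Face repo id. A minimal sketch of that call, assuming transformers is installed; the repo id openai/clip-vit-large-patch14-336 is used here only as an example of a valid identifier, not necessarily the vision tower VisualRWKV expects:

from transformers import AutoImageProcessor

# "dummy" is not a local directory and not a model id on the Hub,
# so this reproduces the OSError shown in the traceback above.
try:
    AutoImageProcessor.from_pretrained("dummy")
except OSError as err:
    print(err)

# A real image-processor repo id resolves normally (example id only):
processor = AutoImageProcessor.from_pretrained("openai/clip-vit-large-patch14-336")
print(type(processor).__name__)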
