bash scripts/train/dummy_run.sh #5

Open
KingBoyAndGirl opened this issue Apr 24, 2024 · 0 comments

An error occurred when running the following command under v6.1:
bash scripts/train/dummy_run.sh

bash scripts/train/dummy_run.sh 
Current working directory: /code/VisualRWKV/VisualRWKV-v6/v6.1
INFO:pytorch_lightning.utilities.rank_zero:########## work in progress ##########
[2024-04-24 06:32:01,729] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  NVIDIA Inference is only supported on Ampere and newer architectures
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.2
 [WARNING]  using untested triton version (2.2.0), only 1.0.0 is known to be compatible
INFO:pytorch_lightning.utilities.rank_zero:
############################################################################
#
# RWKV-5 BF16 on 1x1 GPU, bsz 1x1x8=8, deepspeed_stage_2
#
# Data = dummy_data/dummy.json (json), ProjDir = out/dummy
#
# Epoch = 0 to 0 (will continue afterwards), save every 1 epoch
#
# Each "epoch" = 1000 steps, 8000 samples, 2048000 tokens
#
# Model = 6 n_layer, 512 n_embd, 256 ctx_len
#
# Adam = lr 1e-05 to 1e-05, warmup 0 steps, beta (0.9, 0.99), eps 1e-08
#
# Found torch 2.2.2+cu121, recommend 1.13.1+cu117 or newer
# Found deepspeed 0.14.2, recommend 0.7.0 (faster than newer versions)
# Found pytorch_lightning 1.9.4, recommend 1.9.5
#
############################################################################

INFO:pytorch_lightning.utilities.rank_zero:{'load_model': '', 'model_path': None, 'wandb': '', 'proj_dir': 'out/dummy', 'random_seed': -1, 'data_file': 'dummy_data/dummy.json', 'data_type': 'json', 'vocab_size': 65536, 'ctx_len': 256, 'epoch_steps': 1000, 'epoch_count': 1, 'epoch_begin': 0, 'epoch_save': 1, 'micro_bsz': 8, 'n_layer': 6, 'n_embd': 512, 'dim_att': 512, 'dim_ffn': 1792, 'pre_ffn': 0, 'head_size_a': 64, 'head_size_divisor': 8, 'lr_init': 1e-05, 'lr_final': 1e-05, 'warmup_steps': 0, 'beta1': 0.9, 'beta2': 0.99, 'adam_eps': 1e-08, 'grad_cp': 0, 'dropout': 0, 'weight_decay': 0, 'weight_decay_final': -1, 'ds_bucket_mb': 200, 'vision_tower_name': 'dummy', 'image_folder': 'dummy_data/images/', 'grid_size': -1, 'detail': 'low', 'freeze_rwkv': 0, 'freeze_proj': 0, 'image_position': 'first', 'logger': False, 'enable_checkpointing': False, 'default_root_dir': None, 'gradient_clip_val': 1.0, 'gradient_clip_algorithm': None, 'num_nodes': 1, 'num_processes': None, 'devices': '1', 'gpus': None, 'auto_select_gpus': None, 'tpu_cores': None, 'ipus': None, 'enable_progress_bar': False, 'overfit_batches': 0.0, 'track_grad_norm': -1, 'check_val_every_n_epoch': 100000000000000000000, 'fast_dev_run': False, 'accumulate_grad_batches': 16, 'max_epochs': 1, 'min_epochs': None, 'max_steps': -1, 'min_steps': None, 'max_time': None, 'limit_train_batches': None, 'limit_val_batches': None, 'limit_test_batches': None, 'limit_predict_batches': None, 'val_check_interval': None, 'log_every_n_steps': 100000000000000000000, 'accelerator': 'gpu', 'strategy': 'deepspeed_stage_2', 'sync_batchnorm': False, 'precision': 'bf16', 'enable_model_summary': True, 'num_sanity_val_steps': 0, 'resume_from_checkpoint': None, 'profiler': None, 'benchmark': None, 'reload_dataloaders_every_n_epochs': 0, 'auto_lr_find': False, 'replace_sampler_ddp': False, 'detect_anomaly': False, 'auto_scale_batch_size': False, 'plugins': None, 'amp_backend': None, 'amp_level': None, 'move_metrics_to_cpu': False, 'multiple_trainloader_mode': 'max_size_cycle', 'inference_mode': True, 'my_timestamp': '2024-04-24-06-32-02', 'betas': (0.9, 0.99), 'real_bsz': 8, 'run_name': '65536 ctx256 L6 D512'}

args.vision_tower_name:dummy
Traceback (most recent call last):
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/dummy/resolve/main/preprocessor_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1403, in hf_hub_download
    raise head_call_error
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1261, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1674, in get_hf_file_metadata
    r = _request_wrapper(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 369, in _request_wrapper
    response = _request_wrapper(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 393, in _request_wrapper
    hf_raise_for_status(response)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-6628a764-2c4260cf0dbed452255dbfd8;4333d138-2758-4327-9e40-e4e0067a7854)

Repository Not Found for url: https://huggingface.co/dummy/resolve/main/preprocessor_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/code/VisualRWKV/VisualRWKV-v6/v6.1/train.py", line 178, in <module>
    args.image_processor = AutoImageProcessor.from_pretrained(args.vision_tower_name)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py", line 360, in from_pretrained
    config_dict, _ = ImageProcessingMixin.get_image_processor_dict(pretrained_model_name_or_path, **kwargs)
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 334, in get_image_processor_dict
    resolved_image_processor_file = cached_file(
  File "/opt/conda/envs/VisualRWKV/lib/python3.10/site-packages/transformers/utils/hub.py", line 421, in cached_file
    raise EnvironmentError(
OSError: dummy is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
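
For reference, the traceback points at train.py line 178, where args.vision_tower_name is passed straight to AutoImageProcessor.from_pretrained, and the placeholder name "dummy" from the script is neither a local folder nor a Hugging Face repo id. A minimal sketch of that call, assuming transformers is installed; the repo id openai/clip-vit-large-patch14-336 is used here only as an example of a valid identifier, not necessarily the vision tower VisualRWKV expects:

from transformers import AutoImageProcessor

# "dummy" is not a local directory and not a model id on the Hub,
# so this reproduces the OSError shown in the traceback above.
try:
    AutoImageProcessor.from_pretrained("dummy")
except OSError as err:
    print(err)

# A real image-processor repo id resolves normally (example id only):
processor = AutoImageProcessor.from_pretrained("openai/clip-vit-large-patch14-336")
print(type(processor).__name__)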
