Thesis/log/testrun_5cafe61a

2023-06-24 04:11:17 +00:00
2023-05-17 06:48:57,396 - testrun_5cafe61a - [INFO] - {'dataset': 'icews14', 'name': 'testrun_5cafe61a', 'gpu': '1', 'train_strategy': 'one_to_n', 'opt': 'adam', 'neg_num': 1000, 'batch_size': 128, 'l2': 0.0, 'lr': 0.0001, 'max_epochs': 500, 'num_workers': 0, 'seed': 42, 'restore': False, 'lbl_smooth': 0.1, 'embed_dim': 400, 'ent_vec_dim': 400, 'rel_vec_dim': 400, 'bias': False, 'form': 'plain', 'k_w': 10, 'k_h': 20, 'num_filt': 96, 'ker_sz': 9, 'perm': 1, 'hid_drop': 0.5, 'feat_drop': 0.2, 'inp_drop': 0.2, 'drop_path': 0.0, 'drop': 0.0, 'in_channels': 1, 'out_channels': 32, 'filt_h': 1, 'filt_w': 9, 'image_h': 128, 'image_w': 128, 'patch_size': 8, 'mixer_dim': 256, 'expansion_factor': 4, 'expansion_factor_token': 0.5, 'mixer_depth': 16, 'mixer_dropout': 0.2, 'log_dir': './log/', 'config_dir': './config/', 'test_only': False, 'grid_search': True}
2023-05-17 06:49:44,802 - concurrent.futures - [ERROR] - exception calling callback for <Future at 0x7efb51b74160 state=finished raised BrokenProcessPool>
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
call_item = call_queue.get(block=True, timeout=timeout)
File "/opt/conda/envs/kgs2s/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/storage.py", line 222, in _load_from_bytes
return torch.load(io.BytesIO(b))
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/serialization.py", line 930, in _legacy_load
result = unpickler.load()
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/serialization.py", line 876, in persistent_load
wrap_storage=restore_location(obj, location),
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/serialization.py", line 175, in default_restore_location
result = fn(storage, location)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/torch/serialization.py", line 155, in _cuda_deserialize
return torch._UntypedStorage(obj.nbytes(), device=torch.device(location))
RuntimeError: CUDA out of memory. Tried to allocate 678.00 MiB (GPU 0; 31.72 GiB total capacity; 0 bytes already allocated; 593.94 MiB free; 0 bytes reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/externals/loky/_base.py", line 26, in _invoke_callbacks
callback(self)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/parallel.py", line 385, in __call__
self.parallel.dispatch_next()
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/parallel.py", line 834, in dispatch_next
if not self.dispatch_one_batch(self._original_iterator):
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch
self._dispatch(tasks)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/parallel.py", line 819, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 556, in apply_async
future = self._workers.submit(SafeFunction(func))
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/externals/loky/reusable_executor.py", line 176, in submit
return super().submit(fn, *args, **kwargs)
File "/opt/conda/envs/kgs2s/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 1129, in submit
raise self._flags.broken
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.