Frequently Asked Questions

This page lists common problems reported by users and their solutions. If you run into a frequent issue and know how to solve it, feel free to add it to the list.

Installation

KeyError: “xxx: ‘yyy is not in the zzz registry’”

A module is added to a registry only when the file that defines it is imported, so you need to import that file somewhere. More details can be found at KeyError: “MaskRCNN: ‘RefineRoIHead is not in the models registry’”.
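The behaviour can be illustrated with a toy registry. This is only a sketch: the `Registry` class below is a simplified stand-in written for this example, not the real registry implementation, and `define_refine_roi_head` is a hypothetical helper that plays the role of "importing the file that defines the module".

```python
class Registry:
    """Toy stand-in for a registry (assumption: simplified for illustration)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        # The decorator body runs when the class definition is executed,
        # i.e. when the file containing the class is imported.
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def get(self, key):
        if key not in self._modules:
            raise KeyError(f"{key} is not in the {self.name} registry")
        return self._modules[key]


MODELS = Registry("models")


def define_refine_roi_head():
    # Hypothetical helper: stands in for importing the defining file.
    @MODELS.register_module()
    class RefineRoIHead:
        pass


try:
    MODELS.get("RefineRoIHead")
    first_lookup = "found"
except KeyError as err:
    # The defining code has not run yet, so the lookup fails.
    first_lookup = str(err)

define_refine_roi_head()  # "import" the file
second_lookup = MODELS.get("RefineRoIHead")  # now succeeds
print(first_lookup)
print(second_lookup.__name__)
```

In a real project the fix is the same idea: make sure the file containing the decorated class is imported before the config is built, for example from the package's `__init__.py`.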
“No module named ‘mmcv.ops’”; “No module named ‘mmcv._ext’”
1. Check your OS, Python, and torch (+CUDA) versions, e.g. Python 3.10 and torch 2.8.0+cu128.
  1. If you have upgraded or changed your PyTorch version since installing onedl-mmcv, you need to install onedl-mmcv again.
2. Check if your combination has a prebuilt wheel.
  1. Go to our package index and check whether your CUDA + torch version is in the list.
  2. Click your version to open the onedl-mmcv index.
  3. Check whether your Python version and OS are supported.
3. If there is a prebuilt wheel:
  1. Uninstall the existing onedl-mmcv: pip uninstall onedl-mmcv.
  2. Install while forcing a binary wheel: mim install onedl-mmcv --only-binary=onedl-mmcv.
  3. If this fails, rerun the command with the -v flag and check the logs. Is pip looking at the right index? If not, recheck your PyTorch + CUDA version. Your manylinux tag may also differ from the one the wheel was built for (use pip debug --python-version 3.10 --verbose | grep manylinux to check).
4. If there is not a prebuilt wheel, you need to build from source:
  1. Uninstall existing mmcv in the environment using pip uninstall mmcv
  2. Install onedl-mmcv following the installation instructions or Build MMCV from source.
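For step 1, the versions can be collected with a short script. This is a minimal sketch: the function name `describe_environment` is made up for this example, and the script only reads `torch.__version__` and `torch.version.cuda`, so it reports less than a full environment dump.

```python
import platform
import sys


def describe_environment():
    """Collect the versions that decide which prebuilt wheel matches."""
    info = {
        "os": platform.platform(),
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing or its shared libraries failed to load.
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Compare the printed torch and CUDA versions with the entries in the package index before choosing between a prebuilt wheel and a source build.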
“invalid device function” or “no kernel image is available for execution”
1. Check the CUDA compute capability of your GPU.
2. Run python mmdet/utils/collect_env.py to check whether PyTorch, torchvision, and MMCV are built for the correct GPU architecture. You may need to set TORCH_CUDA_ARCH_LIST and reinstall MMCV. This compatibility issue can occur with old GPUs, e.g., Tesla K80 (3.7) on Colab.
3. Check whether the running environment is the same as the one in which mmcv/mmdet was compiled. For example, you may have compiled mmcv with CUDA 10.0 but be running it in a CUDA 9.0 environment.
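The compute capability from step 1 can also be read from PyTorch directly. The sketch below is an assumption-laden shortcut rather than a replacement for collect_env.py: `gpu_arch_report` is a name chosen for this example, and it only inspects device 0.

```python
def gpu_arch_report():
    """Return the capability of GPU 0 and the architectures torch was built for."""
    try:
        import torch
    except (ImportError, OSError):
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(0)
    return {
        "device": torch.cuda.get_device_name(0),
        "capability": f"{major}.{minor}",
        # Architectures the installed PyTorch binary was compiled for.
        "torch_arch_list": torch.cuda.get_arch_list(),
    }


if __name__ == "__main__":
    print(gpu_arch_report())
```

If the reported capability is not covered by the architectures in the list, that mismatch is a likely cause of the error, and the capability value is what you would pass in TORCH_CUDA_ARCH_LIST when rebuilding.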
“undefined symbol” or “cannot open xxx.so”
1. If those symbols are CUDA/C++ symbols (e.g., libcudart.so or GLIBCXX), check whether the CUDA/GCC runtimes are the same as those used for compiling mmcv.
2. If those symbols are PyTorch symbols (e.g., symbols containing caffe, aten, and TH), check whether the PyTorch version is the same as that used for compiling mmcv.
3. Run python mmdet/utils/collect_env.py to check whether PyTorch, torchvision, and MMCV were built in and are running in the same environment.
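A quick way to compare build-time and run-time versions is sketched below. It assumes that your installed mmcv exposes `get_compiling_cuda_version` and `get_compiler_version` in `mmcv.ops`, as upstream MMCV does; if the import fails, the corresponding entries simply stay `None`. The function name `build_vs_runtime` is made up for this example.

```python
def build_vs_runtime():
    """Report the CUDA version torch runs with and the one mmcv was built with."""
    report = {
        "torch_runtime_cuda": None,
        "mmcv_compiled_cuda": None,
        "mmcv_compiler": None,
    }
    try:
        import torch
        report["torch_runtime_cuda"] = torch.version.cuda
    except (ImportError, OSError):
        pass
    try:
        # Assumption: these helpers exist in your mmcv build.
        from mmcv.ops import get_compiler_version, get_compiling_cuda_version
        report["mmcv_compiled_cuda"] = get_compiling_cuda_version()
        report["mmcv_compiler"] = get_compiler_version()
    except (ImportError, OSError):
        # An ImportError here is often the "undefined symbol" error itself.
        pass
    return report


if __name__ == "__main__":
    print(build_vs_runtime())
```

A difference between the CUDA version torch reports and the one mmcv was compiled with points to cases 1 and 2 above.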
“RuntimeError: CUDA error: invalid configuration argument”

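This error may occur when the GPU cannot support the kernel launch configuration that mmcv was compiled with, which is more likely on older or low-end GPUs. Try decreasing the value of THREADS_PER_BLOCK and recompiling mmcv.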
“RuntimeError: nms is not compiled with GPU support”

This error occurs because your CUDA environment is not installed correctly. Try re-installing your CUDA environment, then delete the build/ folder before recompiling mmcv.
“Segmentation fault”
1. Check your GCC version and use GCC >= 5.4. A segmentation fault is usually caused by an incompatibility between PyTorch and the environment (e.g., GCC < 4.9 for PyTorch). We also recommend avoiding GCC 5.5, because many users have reported that GCC 5.5 causes “segmentation fault” and that simply switching to GCC 5.4 solves the problem.
2. Check whether PyTorch is correctly installed and can use CUDA ops, e.g., run the following command in your terminal and see whether it outputs the expected result:
```
python -c 'import torch; print(torch.cuda.is_available())'
```
3. If PyTorch is correctly installed, check whether MMCV is correctly installed. If it is, the following command runs without error:
```
python -c 'import mmcv; import mmcv.ops'
```
4. If MMCV and PyTorch are correctly installed, you can use ipdb to set breakpoints or add print statements to find which part leads to the segmentation fault.
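For step 4, one further option (a suggestion of this example, not something the steps above require) is Python's standard-library faulthandler module, which prints the Python traceback of every thread when the interpreter receives a fatal signal such as SIGSEGV. A minimal sketch:

```python
import faulthandler
import sys

# Dump Python tracebacks to stderr if the process crashes with a
# fatal signal (e.g. a segmentation fault in a compiled extension).
faulthandler.enable(file=sys.stderr, all_threads=True)

# ... run the code under suspicion below this line, for example the
# import or the op call that crashes ...
print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without editing code by running the script with python -X faulthandler.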
“libtorch_cuda_cu.so: cannot open shared object file”

onedl-mmcv depends on this shared object, but it cannot be found. Check whether the object exists in ~/miniconda3/envs/{environment-name}/lib/python3.7/site-packages/torch/lib, or try re-installing PyTorch.
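Because the exact path depends on your environment and Python version, it can be easier to let Python locate the directory. The following is a sketch under that assumption: `find_torch_libs` is a helper name invented for this example, and it looks up the torch package without importing it, so it still works when loading torch itself fails.

```python
import importlib.util
from pathlib import Path


def find_torch_libs(pattern="libtorch_cuda*"):
    """List shared objects matching `pattern` in the installed torch/lib directory."""
    spec = importlib.util.find_spec("torch")  # locates torch without importing it
    if spec is None or not spec.submodule_search_locations:
        return []  # torch is not installed in this environment
    lib_dir = Path(list(spec.submodule_search_locations)[0]) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(str(path) for path in lib_dir.glob(pattern))


if __name__ == "__main__":
    print(find_torch_libs())
```

An empty result, or a list without the file named in the error, suggests that the installed PyTorch build does not match the one mmcv was compiled against.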
Compatibility issue between MMCV and MMDetection; “ConvWS is already registered in conv layer”

Please install the correct version of MMCV for the version of your MMDetection following the installation instruction.

Usage

“RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one”
1. This error indicates that your module has parameters that were not used in producing the loss. It may be caused by running different branches of your code in DDP mode. More details at Expected to have finished reduction in the prior iteration before starting a new one.
2. You can set find_unused_parameters = True in the config to solve the problem, or find the unused parameters manually.
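Finding the unused parameters manually can be done by checking which trainable parameters have no gradient after a single backward pass. The helper below is a sketch: `unused_parameter_names` is a name invented for this example, and it assumes gradients were cleared to `None` before the backward pass and that the model is not yet wrapped in DDP. The demo uses stand-in objects so that it runs without PyTorch; with a real model you would pass `model.named_parameters()` after calling `loss.backward()`.

```python
from types import SimpleNamespace


def unused_parameter_names(named_parameters):
    """Return names of trainable parameters that received no gradient."""
    return [
        name
        for name, param in named_parameters
        if param.requires_grad and param.grad is None
    ]


# Stand-ins exposing the same two attributes as torch parameters.
fake_params = [
    ("backbone.weight", SimpleNamespace(requires_grad=True, grad=0.1)),
    ("unused_branch.weight", SimpleNamespace(requires_grad=True, grad=None)),
    ("frozen.weight", SimpleNamespace(requires_grad=False, grad=None)),
]
unused = unused_parameter_names(fake_params)
print(unused)
```

Frozen parameters are skipped on purpose, since DDP does not expect gradients for them.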
“RuntimeError: Trying to backward through the graph a second time”

GradientCumulativeOptimizerHook and OptimizerHook are both set, which causes loss.backward() to be called twice, so a RuntimeError is raised. Use only one of them. More details at Trying to backward through the graph a second time.