Benchmarking results when training jointly on the Nr3D and Sr3D datasets. A dash (-) marks a value that is not reported:

Method            Nr3D                                         Sr3D
                  Overall  Easy  Hard  View-dep.  View-indep.  Overall  Easy  Hard  View-dep.  View-indep.
ReferIt3D         35.6     43.6  27.9  32.5       37.1         40.8     44.7  31.5  39.2       40.8
Text-Guided-GNNs  37.3     44.2  30.6  35.8       38.0         45.0     48.5  36.9  45.8       45.0
InstanceRefer     38.8     46.0  31.8  34.5       41.9         48.0     51.1  40.5  45.4       48.1
3DRefTransformer  39.0     46.4  32.0  34.7       41.2         47.0     50.7  38.3  44.3       47.1
3DVG-Transformer  40.8     48.5  34.8  34.8       43.7         51.4     54.2  44.9  44.6       51.7
FFL-3DOG          41.7     48.2  35.0  37.1       44.7         -        -     -     -          -
TransRefer3D      42.1     48.5  36.0  36.5       44.9         57.4     60.5  50.2  49.9       57.7
LanguageRefer     43.9     51.0  36.6  41.7       45.0         56.0     58.9  49.3  49.2       56.3
SAT               49.2     56.3  42.4  46.9       50.4         57.9     61.2  50.0  49.2       58.3
3D-SPS            51.5     58.1  45.1  48.0       53.2         62.6     56.2  65.4  49.2       63.2
LAR               48.9     56.1  41.8  46.7       50.2         59.35    63.0  51.2  50.0       59.1
MVT               55.1     61.3  49.1  54.3       55.4         64.5     66.9  58.8  58.4       64.7
MVT+CoT3DRef      57.0     63.2  49.7  54.6       57.2         75.4     79.6  65.3  64.9       75.9
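
To make the comparison between the last two rows concrete, the short Python sketch below computes the absolute gain of MVT+CoT3DRef over MVT in the Overall column of each dataset. The four numbers are copied from the table; the dictionary layout and the helper name absolute_gain are illustrative assumptions, and the scores are assumed to be percentages, so the gains are in percentage points.

```python
# Overall scores copied from the MVT and MVT+CoT3DRef rows of the table.
# The data layout and helper name are hypothetical, chosen for illustration.
overall = {
    "MVT": {"Nr3D": 55.1, "Sr3D": 64.5},
    "MVT+CoT3DRef": {"Nr3D": 57.0, "Sr3D": 75.4},
}


def absolute_gain(baseline, improved, dataset):
    """Return the Overall-score difference, rounded to one decimal place."""
    return round(overall[improved][dataset] - overall[baseline][dataset], 1)


gains = {d: absolute_gain("MVT", "MVT+CoT3DRef", d) for d in ("Nr3D", "Sr3D")}
print(gains)  # {'Nr3D': 1.9, 'Sr3D': 10.9}
```

Under these assumptions the gain is 1.9 points on Nr3D and 10.9 points on Sr3D.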