
I am aware that YOLO (v1-v5) is a family of real-time object detection models with reasonably good prediction performance, and that UNet and its variants are efficient semantic segmentation models that are also fast and predict well.

I cannot find any resources comparing the inference speed of these two approaches. It seems to me that semantic segmentation (classifying every pixel in an image) is clearly a harder problem than object detection (drawing bounding boxes around objects in the image).
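In the absence of a published head-to-head comparison, a rough timing sketch like the one below is what I have in mind, assuming PyTorch with a YOLOv5s loaded via torch.hub and a UNet from the third-party segmentation_models_pytorch package as stand-ins; the specific model choices and input size are my own assumptions, not a canonical benchmark:

```python
# Rough wall-clock comparison of forward-pass latency for a YOLOv5 detector
# and a UNet segmenter at the same input resolution. Requires torch plus the
# third-party packages pulled in by ultralytics/yolov5 (via torch.hub) and
# segmentation_models_pytorch; weights are downloaded on first use.
import time
import torch
import segmentation_models_pytorch as smp

def time_forward(model, x, warmup=5, runs=50):
    """Average forward-pass time in milliseconds."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):      # warm up caches / kernel selection
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1000.0

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1, 3, 640, 640, device=device)   # same input size for both

# YOLOv5 small variant (detection) via torch.hub.
yolo = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True).to(device)

# UNet with a ResNet-34 encoder (semantic segmentation), binary output here.
unet = smp.Unet(encoder_name="resnet34", in_channels=3, classes=2).to(device)

print(f"YOLOv5s : {time_forward(yolo, x):.1f} ms / image")
print(f"UNet-r34: {time_forward(unet, x):.1f} ms / image")
```

Of course the numbers depend heavily on backbone, input resolution, and hardware, which is partly why I am asking whether a more principled comparison or explanation exists.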

Does anyone have good resources for this comparison? Or a good explanation of why one is computationally more demanding than the other?

JStrahl

0 Answers