YOLOv5 Controversy — Is YOLOv5 Real?

Aug 15, 2020

Apr 23rd, 2020 — YOLOv4 was released…

…June 10th 2020, YOLOv5 was also released.

Marvelous, ain’t it… how fast we are progressing in our research and technology? I mean, we got the next generation of the popular object detection framework so soon after its predecessor was released. Is YOLOv5 really here or is it a ruse? We’ll investigate the evidence as objectively as possible right now in this article, so stay tuned.

Source : https://github.com/ultralytics/yolov5

For those who don’t know what YOLO is, it is a real-time object detection framework and stands for You Only Look Once, meaning the image is passed only once through the FCNN, or fully convolutional neural network. I will not go into the technical details of how YOLO works, as I already have 2 videos on my YouTube channel explaining everything from YOLOv1, originated by Joseph Redmon et al., all the way to the YOLOv4 upgrade by Bochkovskiy et al.

For those of you who are interested in my course, there will be a link in the description where you can enroll in the full YOLOv4 course when it gets released. We cover the implementation of YOLOv4, training and inference, as well as building cross-platform object detection apps using PyQt. Click HERE

Part 1 — What has Occurred.

Okay, so back to YOLOv5. Glenn Jocher, the founder and CEO of Ultralytics, released an open-source implementation of YOLOv5 on GitHub [https://github.com/ultralytics/yolov5], which is said to be the state of the art among all known YOLO implementations, according to the Ultralytics GitHub page.

Source : https://github.com/ultralytics/yolov5

Their results show it outperforming EfficientDet, which is Google’s open-source object detection framework. What I find strange is that they do not explicitly show a comparison with YOLOv4. Even so, YOLOv5 is said to achieve fast detection at 140 FPS running on a Tesla P100, compared to YOLOv4, which benchmarked at a measly 50 FPS. These figures come from an article published on the Roboflow blog titled YOLOv5 is Here: State-of-the-Art Object Detection at 140 FPS by Joseph Nelson and Jacob Solawetz [https://blog.roboflow.ai/yolov5-is-here/].

Source: https://blog.roboflow.ai/yolov5-is-here/

Furthermore, they mentioned that “YOLOv5 is small at only 27 Megabytes”. What? That is ridiculously small compared to the 244 megabytes of YOLOv4 with the Darknet architecture… Whaaat. That’s nearly 90 percent smaller than YOLOv4. That’s craaazzy.
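As a quick sanity check on that percentage, here is a minimal sketch of the arithmetic. It uses only the two file sizes quoted above (27 MB and 244 MB); the function name size_reduction_percent is my own and is there purely for illustration.

```python
def size_reduction_percent(small_mb: float, large_mb: float) -> float:
    """Return how much smaller `small_mb` is than `large_mb`, in percent."""
    return (large_mb - small_mb) / large_mb * 100.0


# File sizes as quoted in this article, in megabytes.
YOLOV5_MB = 27.0
YOLOV4_MB = 244.0

reduction = size_reduction_percent(YOLOV5_MB, YOLOV4_MB)
print(f"{reduction:.1f}% smaller")
```

That works out to roughly 89 percent, so the reduction in file size itself checks out. Whether the two files are a fair pair to compare is a separate question.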

In terms of accuracy, “YOLOv5 performs on par with YOLOv4.”

So essentially, the claims are that YOLOv5 is extremely fast and light in terms of model size, but only on par with the YOLOv4 benchmark in terms of accuracy.

Just food for thought: if PlayStation or Xbox released a new console that had the same graphics performance and maybe faster load times, but in a smaller package, would that make it a next-gen console, or just a lightweight version of the current-gen console like the PS4 Slim or Xbox One S? Let me know in the comments what you think.

Part 2 — Questions

So some further questions crossed my mind. Can you claim or name a technology, even an open-source one, as your own even though you were not the original creator? Eeeh… I’m not sure, this one is debatable. Does using the exact same framework and just modifying it a bit give you the right to brand it as your own with an incremented version number, in this case YOLO version 5? Well, I guess this depends on the original creator or creators of the framework. You may or may not have heard that the original creator, Joseph Redmon, tweeted in February 2020 that he would step down from his research on his brainchild YOLO due to the societal impact his work was having. He stated:

“I loved the work but the military applications and privacy concerns eventually became impossible to ignore.”

Redmon had created 3 iterations of YOLO in partnership with Ali Farhadi.

Then, in April 2020, YOLOv4 appeared, written not by any of the original authors but by Bochkovskiy et al. The paper was published and peer reviewed, the code was uploaded to the AlexeyAB/darknet repo on GitHub, and everything seemed fine. The technological upgrade was great and well received in the computer vision community. So does this mean that if Bochkovskiy et al. did it, then anyone else can take the YOLO framework, make some improvements and increment the version number? Well, that’s exactly what happened.

Glenn Jocher, you know, the founder and CEO of Ultralytics, dropped YOLOv5 like a bomb. BOOM. So you must still be wondering… okay Ritz… tell us now… is YOLOv5 legit, or is it a ruse or a lie? Okay, okay, okay, I know you want the answer, but hold on a bit, let’s first examine the evidence.

Part 3 — The Evidence

Let’s get the first thing out of the way. At the time of this investigation, Ultralytics has not published a peer-reviewed paper on YOLOv5. That already undermines the merit of their claims. I get that writing a paper takes time and that Ultralytics is a business and not a research group. However, how do you trust their implementation if the paper has not been published? When YOLOv4 was released, Bochkovskiy et al. published the paper along with their impressive results.

Part 3.1 — Community’s Reaction

Secondly, to determine the legitimacy of YOLOv5, we look to the community and how this “Next Gen” model has been received, including their analysis and evaluations. If we go over to Google, type in “YOLOv5 issues” and scroll down to a source like Kaggle, here is what we find.

Source : https://www.kaggle.com/c/global-wheat-detection/discussion/158371

We can see a comment by Mr. Hurtik who states:

“YOLOv5 is just renamed YOLOv3, The graph seems nice but it is misleading.” He then leads us to a link to the YOLOv4 author AlexeyAB’s repo.

On this GitHub issue [https://github.com/AlexeyAB/darknet/issues/5920], Daniel Barry posted two source links claiming that YOLOv5 is here: the Ultralytics YOLOv5 repo and the Roboflow blog post that I mentioned earlier. A third link points to a discussion on Y Combinator’s Hacker News based on the 2 aforementioned sources.

If we delve into the community discussion [https://news.ycombinator.com/item?id=23478151], we can see that a lot of people are doubting the YOLOv5 model and even calling it bullsh*t [Uhm, Ritz… Language!]. There are statements that YOLOv5 was not tested against YOLOv4 under the same conditions. In other words, we weren’t comparing apples with apples.

Source : https://news.ycombinator.com/item?id=23478151

A user going by the alias Anthiras questioned the YOLOv5 article from Roboflow, as it seems highly unlikely that a 90% smaller model would provide similar accuracy, and noted that the YOLOv5 repo itself shows performance comparable to YOLOv4.

There is also speculation from Joshvm stating:

“I don’t think YOLOv5 is semantically very informative. But by the way, if you read the issues from a while back you’ll see that AlexeyAB’s fork basically scooped them, hence the version bump. Ultralytics probably would have called this YOLOv4 otherwise. This repo has been in the works for a while.”

Source : https://news.ycombinator.com/item?id=23478151

Part 3.2 — Bochkovskiy’s Evaluation

Back in the AlexeyAB GitHub discussion, we see Alexey’s comments stating that the roboflow.ai blog has invalid comparison results. He goes on to explain that latency shouldn’t be measured with a batch size of 32 but rather with a batch size of 1. Latency is the time of a complete data processing cycle, so it cannot be less than the time to process a whole batch, which can take up to 1 second depending on the batch size. The higher the batch size, the higher the latency.

Source: https://github.com/AlexeyAB/darknet/issues/5920
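To make Alexey’s point about batch size concrete, here is a minimal sketch of how throughput (FPS) and per-image latency pull apart when images are processed in batches. The timing numbers are made-up placeholders chosen for easy arithmetic, not measurements of YOLOv4 or YOLOv5, and the helper name throughput_and_latency is my own.

```python
from typing import Tuple


def throughput_and_latency(batch_size: int, batch_time_s: float) -> Tuple[float, float]:
    """Return (frames per second, latency in milliseconds) for one batch.

    An image's result is only available once its whole batch has finished,
    so the latency is the full batch time, not batch_time_s / batch_size.
    """
    fps = batch_size / batch_time_s
    latency_ms = batch_time_s * 1000.0
    return fps, latency_ms


# Hypothetical timings (assumptions for illustration only, not benchmarks).
batched_fps, batched_latency = throughput_and_latency(batch_size=32, batch_time_s=0.25)
single_fps, single_latency = throughput_and_latency(batch_size=1, batch_time_s=0.02)

print(f"batch=32: {batched_fps:.0f} FPS, {batched_latency:.0f} ms latency")
print(f"batch=1:  {single_fps:.0f} FPS, {single_latency:.0f} ms latency")
```

With these placeholder numbers, the batched run reports the higher FPS (128 versus 50), yet each image waits 250 ms for its result instead of 20 ms. That is why an FPS figure measured at batch 32 cannot be compared fairly with one measured at batch 1.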

As for the claim that YOLOv5 is small (27 MB), Alexey goes on to destroy it in a statement, and I quote:

“They compared size of models of small ultralytics-YOLOv5-version YOLOv5s (27 MB) with very low accuracy 26–36% AP on Microsoft COCO with big YOLOv4 (245 MB) with very high accuracy 41–43% AP on Microsoft COCO”

As for the claim that YOLOv5 is fast at 140 FPS, I quote Alexey again:

“They compared speed of very small and much less accurate version of ultralytics-YOLOv5 with very accurate and big YOLOv4. They did not provide the most critical details for comparison: what exactly YOLOv5 version was used s,l,x,… what training and testing resolutions were used, and what test batch was used for both YOLOv4 vs ultralytics-YOLOv5. They did not test it on the generally accepted Microsoft COCO dataset, with exactly the same settings, and they did not test it on the Microsoft COCO CodaLab-evaluation server, to reduce the likelihood of manipulation.”

So as you can see, this is not really looking good for Ultralytics and their so-called YOLOv5 implementation. But thus far we have been looking at only one side of the coin. Let’s get some insight into the responses from Glenn Jocher of Ultralytics as well as from the guys at Roboflow.ai.

Part 4 — The Response

So, we’ve heard the roars from the community and you must be thinking to yourself… Okay Ritz, so these guys seem a bit sketchy and YOLOv5 cannot be trusted… wait wait wait, hold up. Let’s first get the rebuttal from the defendant before drawing our conclusions.

So Glenn Jocher from Ultralytics wrote a mini essay in response to questions about his YOLOv5 release and naming [https://github.com/ultralytics/yolov5/issues/2]. The crux of his response is that they intend to write a paper to showcase these results and training methodologies, but they are extremely limited on resources and need to maintain a balance to keep their business afloat. Okay, I get that. These are tough times and some companies do push beta products out. I get that…

Source: https://github.com/ultralytics/yolov5/issues/2

He goes on to say that their models, referring to their YOLOv5 implementation, are neither static nor complete at this time. I’m fine with this also. However, the claims that his YOLOv5 implementation was better than YOLOv4 should not have been made, and he should make it clear that this project is under development not only in the text of his repository [which he has done], but also in his comparisons, evaluations, code, etc. That way there is no confusion: a 5 at the end of YOLOv does not mean that it’s better than its predecessor.

Which brings me to my next point, a very, very important one regarding the naming convention. Glenn states that YOLOv5 is an internal designation for this work and that the name employed here is not a concern for them… Hmmm, okay, using YOLOv5 as an internal name is fine, right? Internally you can call it whatever you want: Project XYZ, YOLO, KOLO, POLO, ZOLO. But the minute you publish the project, and thus the name, it should be intuitive, it should be practical and it should not deceive people into thinking you have the state-of-the-art YOLO model just because you have incremented the number to v5.

As for Roboflow, while they are not to be blamed for promoting YOLOv5, they were naive in believing what was published to be true. My recommendation is for them to do their due diligence next time by properly evaluating the models and by comparing apples with apples. They have since released a new article on their blog titled Responding to the Controversy about YOLOv5 [https://blog.roboflow.ai/yolov4-versus-yolov5/]. It is quite a lengthy read, in which they acknowledge their mistake and do an in-depth comparison between YOLOv4 and YOLOv5, the results of which match what was discussed earlier in this article.

I do however suggest they take down their article “YOLOv5 is Here: State-of-the-Art Object Detection at 140 FPS” or at least change the title to reflect the true nature of the model implementation from Ultralytics.

Conclusion

Wow, that was quite a ride! So based on the facts, we know that YOLOv4 is still the state of the art in terms of the YOLO evolution. While there’s nothing wrong with building upon other people’s work (given the correct permission, of course), the use of YOLOv5 as their model name has been frowned upon in the computer vision community.

Source : Karol Majek YouTube Channel

Though I would like to know your thoughts on this subject. Was it right or wrong for Ultralytics to name their model YOLOv5? Do you think this was just a sham to get people to notice their company (i.e. free marketing)? Or do you think they were just oblivious to the whole thing and didn’t think the name would be such a big deal?

Let me know in the comments down below.

[UPDATE] Alexey has posted an updated comparison of YOLOv3 vs YOLOv4 vs YOLOv5 in this thread [https://github.com/AlexeyAB/darknet/issues/5920#issuecomment-642213028].