I read two articles by the same guy where he uses the whole dataset for hyperparameter optimisation with cross-validation (CV) and then evaluates the model with the best hyperparameters using leave-one-out on the same dataset.
This seems fishy to me. As far as I know, if he tunes the model on the whole dataset and then evaluates on the same dataset, the hyperparameters were chosen with every test point already in view, so he will be overfitting and get an overly optimistic result.
However, this guy managed to publish two articles with this same methodology (one in a Q1 journal), so I'm wondering if I'm the one missing something.
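
To make the concern concrete, here is a minimal sketch of the procedure as I understand it, next to the nested cross-validation I would have expected instead. Everything in it is my own assumption and not taken from the articles: scikit-learn, an `SVC` model, the parameter grid, the 5-fold inner CV, and the synthetic data with pure-noise labels are only there for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, LeaveOneOut, cross_val_score
from sklearn.svm import SVC

# Hypothetical data: 40 samples, 200 noise features, balanced labels
# that carry no signal, so an honest accuracy estimate should sit near 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))
y = np.array([0, 1] * 20)

# Hypothetical search space and inner CV scheme (my assumptions).
param_grid = {"C": [0.01, 0.1, 1, 10], "gamma": [1e-4, 1e-3, 1e-2]}
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Procedure from the articles, as I read it:
# 1) tune hyperparameters with CV on the WHOLE dataset,
# 2) evaluate the tuned model with leave-one-out on the SAME dataset.
search = GridSearchCV(SVC(), param_grid, cv=inner_cv).fit(X, y)
tuned_model = SVC(**search.best_params_)
tuned_then_loo = cross_val_score(tuned_model, X, y, cv=LeaveOneOut()).mean()

# Nested alternative: the grid search is rerun inside every
# leave-one-out training fold, so the held-out sample never
# influences which hyperparameters are chosen.
nested_model = GridSearchCV(SVC(), param_grid, cv=inner_cv)
nested_loo = cross_val_score(nested_model, X, y, cv=LeaveOneOut()).mean()

print(f"tune on all data, then LOO: {tuned_then_loo:.3f}")
print(f"nested LOO:                 {nested_loo:.3f}")
```

The size of the gap between the two numbers presumably depends on the sample size, the number of features, and how large the search space is, so a single run like this does not prove anything either way. It only shows where the information could leak.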