Since SHAP gives you an explanation of an individual sample (it is a local explainer), your explanations are local (they apply to a particular instance).
You are simply comparing two different instances and getting different results. This is normal and can happen within the train set, within the test set, or between them. It also doesn't mean your train/test split is bad; the split could be perfectly fine.
In the end, SHAP is meant to help you understand how the model behaves on a particular instance, so it should be applied to the instances you are interested in understanding. You could also try to use SHAP values to find differences between train and test, but since they are local explanations you might not get very far; one way to aggregate them is sketched below.
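A minimal sketch of that aggregation: compute the local SHAP values on both sets and summarize them per feature (mean absolute SHAP value), so the two sets can be compared at all. The dataset and model here are synthetic placeholders, not from your question.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Placeholder data and model, only to make the snippet runnable.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

explainer = shap.Explainer(model)

# Local explanations: one SHAP value per sample and per feature.
shap_train = explainer(X_train).values
shap_test = explainer(X_test).values

# Global view built from the local values: mean |SHAP| per feature on each set.
importance = pd.DataFrame({
    "train": np.abs(shap_train).mean(axis=0),
    "test": np.abs(shap_test).mean(axis=0),
}, index=X_train.columns)
print(importance.sort_values("train", ascending=False))
```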
[Update]
"Trustworthy" is an unspecific term; I guess what you mean is that the model's performance will decay.
Let's look at the problem from a distribution perspective instead of a local one. Having distinct SHAP distributions between train and test means the model behaves differently on the two sets. This does not necessarily mean it is going to fail, since a distribution shift can be benign (it can even improve performance). One way to quantify such a difference is sketched below.
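A sketch of one way to quantify "distinct SHAP distributions": a two-sample Kolmogorov-Smirnov test per feature, reusing `shap_train` and `shap_test` from the snippet above. The choice of test is my assumption, not the only option.

```python
from scipy.stats import ks_2samp

for i, feature in enumerate(X_train.columns):
    stat, p_value = ks_2samp(shap_train[:, i], shap_test[:, i])
    # A small p-value suggests the feature's contribution pattern differs
    # between train and test; by itself it says nothing about whether
    # performance will actually drop.
    print(f"{feature}: KS statistic={stat:.3f}, p-value={p_value:.3f}")
```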
I wouldn't draw conclusions about the quality of the predictions from the feature importances alone.
In the past I have done some research on this topic, in case you are interested.