What I am doing right now is training a DDPM on low-resolution satellite images. My problem lies with the dataset. The dataset consists of 10-band images of the same patch of land, but with variation in tree sizes (±20%) and variation in some physical properties that affect the colors of some of the bands. This is a sample of the bands used:
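For reference, a minimal sketch of how I handle the 10-band patches before training: each band is normalized independently, since the bands have very different value ranges. The statistics and shapes here are illustrative, not my actual preprocessing pipeline:

```python
import numpy as np

def normalize_bands(patch, band_means, band_stds):
    """Normalize each band of a (bands, H, W) patch to ~zero mean, unit std.

    Broadcasting the (bands,) statistics over the spatial axes keeps the
    bands independent, which matters when bands have different dynamic ranges.
    """
    return (patch - band_means[:, None, None]) / band_stds[:, None, None]

# Toy 10-band, 64x64 patch (made-up data, just to show the shapes involved)
rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 1.0, size=(10, 64, 64)).astype(np.float32)

means = patch.mean(axis=(1, 2))
stds = patch.std(axis=(1, 2))
out = normalize_bands(patch, means, stds)
```

Per-band statistics (rather than one global mean/std) are what I'd expect to matter here, given that the physical-property variation only affects some bands.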
I used NVIDIA's DDPM code. The model is starting to pick up patterns, but it's not enough:
Both of these images were generated by the model (all 10 bands).
After 20 or 30 epochs, the loss I am getting is around 0.1–0.2.
After ~100 epochs it starts to converge toward 0.05–0.02.
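To make the loss numbers above concrete: they come from the standard epsilon-prediction MSE objective that DDPM training uses. A minimal numpy sketch of that objective (the model below is a zero-predicting stand-in, not the actual NVIDIA network, and the schedule is the usual linear beta schedule):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule and cumulative alpha products, as in the original DDPM
betas = np.linspace(1e-4, 0.02, 1000)
alphas_bar = np.cumprod(1.0 - betas)

def ddpm_loss(x0, model):
    """One training-step loss: noise a clean sample, ask the model for the noise."""
    t = rng.integers(0, len(alphas_bar))               # random timestep
    eps = rng.standard_normal(x0.shape)                # true injected noise
    a = alphas_bar[t]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps     # forward diffusion to step t
    eps_hat = model(x_t, t)                            # model's noise prediction
    return np.mean((eps - eps_hat) ** 2)               # epsilon-prediction MSE

# A model that predicts no noise at all scores ~1.0, since E[eps^2] = 1;
# my 0.05-0.02 is measured against that kind of baseline.
x0 = rng.standard_normal((10, 64, 64))
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = ddpm_loss(x0, zero_model)
```

So a converged loss of 0.05–0.02 means the network explains most, but not all, of the injected noise variance.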
That's why I'd like suggestions on things to try to improve the model.