Training Loss is higher than Testing Loss, any idea why?

I can think of two main causes for this behaviour:

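  • no data shuffling: if the data is ordered and split without shuffling, the train and test sets can end up with different distributions
  • heavy class imbalance: the test set may be dominated by the majority class, which is usually easier to predict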
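
To check whether either cause applies, it can help to compare the class counts of your current split against a shuffled, stratified one. Below is a minimal sketch using only the standard library; `stratified_shuffle_split` is a hypothetical helper written for this illustration (not a library function), and the toy labels are made up.

```python
import random
from collections import Counter, defaultdict


def stratified_shuffle_split(labels, test_fraction=0.2, seed=0):
    """Hypothetical helper: shuffle within each class, then take the same
    fraction of every class for the test set. Returns (train_idx, test_idx)."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    # Shuffle again so the classes are not grouped together in the output.
    rng.shuffle(train_idx)
    rng.shuffle(test_idx)
    return train_idx, test_idx


# Toy, made-up labels sorted by class: 90 samples of class 0, then 10 of class 1.
labels = [0] * 90 + [1] * 10

# Naive split without shuffling: the last 20% of the data becomes the test set.
naive_train_counts = Counter(labels[:80])
naive_test_counts = Counter(labels[80:])

train_idx, test_idx = stratified_shuffle_split(labels, test_fraction=0.2, seed=0)
strat_train_counts = Counter(labels[i] for i in train_idx)
strat_test_counts = Counter(labels[i] for i in test_idx)

print("naive train:", dict(naive_train_counts))
print("naive test:", dict(naive_test_counts))
print("stratified train:", dict(strat_train_counts))
print("stratified test:", dict(strat_test_counts))
```

In this toy case the naive split trains on class 0 only while half of its test set is class 1, whereas the stratified split keeps the 9:1 ratio in both sets. If your own train and test counts differ as sharply as the naive ones do, the loss values on the two sets are not really comparable.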