I was able to implement softmax like this, by transposing the matrix of exponentials, dividing by the row sums, and transposing the result back.
Is this how it should be done?
num = torch.exp(outputs)
total = torch.sum(num, dim=1)               # row sums, shape (N,)
transpose = torch.transpose(num, 0, 1)      # shape (C, N), so division broadcasts over rows
result_t = torch.div(transpose, total)
result = torch.transpose(result_t, 0, 1)    # transpose back to (N, C)
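For comparison, here is a sketch of how the transposes can be avoided entirely by passing `keepdim=True` to the sum, so the division broadcasts row-wise on its own (the `outputs` tensor here is a made-up example batch):

```python
import torch

outputs = torch.randn(4, 3)  # hypothetical batch: 4 samples, 3 classes

num = torch.exp(outputs)
# keepdim=True leaves total with shape (4, 1), so num / total
# broadcasts across each row without any transposing.
total = torch.sum(num, dim=1, keepdim=True)
result = num / total

# Sanity check against the built-in (which is also numerically stabler,
# since it subtracts the row max before exponentiating).
assert torch.allclose(result, torch.softmax(outputs, dim=1))
```

Each row of `result` sums to 1, and `torch.softmax(outputs, dim=1)` gives the same values in one call.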