MSE (mean squared error) is the average squared difference between the predicted values and the actual values. It summarizes the overall accuracy of a model in a single number. However, it is sensitive to outliers: because errors are squared, a few large errors can disproportionately inflate the MSE.
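
As a minimal sketch of the definition, the snippet below computes MSE by hand and shows how a single large error dominates the result. The function name `mse` and the toy values are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data, not taken from any real dataset.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred_even = [2.0, 6.0, 6.0, 10.0]     # every prediction is off by 1
y_pred_outlier = [3.0, 5.0, 7.0, 19.0]  # three perfect predictions, one off by 10

mse_even = mse(y_true, y_pred_even)        # (1 + 1 + 1 + 1) / 4 = 1.0
mse_outlier = mse(y_true, y_pred_outlier)  # (0 + 0 + 0 + 100) / 4 = 25.0

print(mse_even, mse_outlier)
```

The second set of predictions is exact on three of four points, yet its MSE is 25 times larger because the single error of 10 is squared.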



R^2 compares the squared errors of the model (SSE, the sum of squared errors) against the squared errors of the simplest possible model, one that always predicts the average of the response (SST, the total sum of squares): R^2 = 1 - SSE/SST. Since SSE and SST are on the same scale, R^2 can help you determine whether transforming your target is helping you obtain better predictions. The closer R^2 is to 1, the more of the variance in the target vector is explained by the features.
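
The comparison against the mean-only baseline can be sketched as follows. This assumes the usual definition R^2 = 1 - SSE/SST; the function name `r2_score` and the toy values are illustrative.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SSE / SST, where SST uses the mean of y_true as the baseline."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # model errors
    sst = sum((t - mean_y) ** 2 for t in y_true)             # baseline errors
    if sst == 0:
        raise ValueError("R^2 is undefined when the target is constant")
    return 1.0 - sse / sst


# Hypothetical toy data.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 6.0, 6.0, 10.0]
y_baseline = [6.0, 6.0, 6.0, 6.0]  # always predict the mean of y_true

r2_model = r2_score(y_true, y_pred)         # SSE = 4, SST = 20
r2_baseline = r2_score(y_true, y_baseline)  # SSE equals SST

print(r2_model, r2_baseline)
```

Predicting the mean gives an R^2 of 0 by construction, so any positive value means the model beats that baseline.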
<aside> 💡 Please remember that linear transformations of the target, such as min-max scaling or standardization, generally do not change the performance of a regressor, since they only shift and rescale the target. Non-linear transformations, such as the square root, the cube root, the logarithm, exponentiation, and their combinations, should instead change the performance of your regression model on the evaluation metric (hopefully for the better, if you choose the right transformation).
</aside>
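
A small experiment can illustrate the note above. This is a sketch under stated assumptions: it uses a one-feature ordinary least squares fit written by hand, scores every variant on the original target scale after back-transforming the predictions, and uses made-up data that grows roughly exponentially. All function names are hypothetical, and the result is shown for ordinary least squares only.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true, y_pred):
    """R^2 = 1 - SSE / SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


def r2_with_transform(x, y, forward, inverse):
    """Fit on forward(y), map predictions back with inverse, score on the original scale."""
    slope, intercept = fit_line(x, [forward(v) for v in y])
    y_pred = [inverse(slope * xi + intercept) for xi in x]
    return r2_score(y, y_pred)


# Hypothetical data: the target grows roughly exponentially with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.7, 7.4, 20.1, 54.6, 148.4, 403.4]
y_min, y_max = min(y), max(y)

# No transformation.
r2_raw = r2_with_transform(x, y, lambda v: v, lambda v: v)
# Linear transformation (min-max scaling of the target).
r2_minmax = r2_with_transform(
    x, y,
    lambda v: (v - y_min) / (y_max - y_min),
    lambda v: v * (y_max - y_min) + y_min,
)
# Non-linear transformation (logarithm of the target).
r2_log = r2_with_transform(x, y, math.log, math.exp)

print(f"raw: {r2_raw:.4f}  min-max: {r2_minmax:.4f}  log: {r2_log:.4f}")
```

On this data the raw and min-max variants agree up to floating-point error, while the log transform changes the score (here, for the better, because the data was built to be exponential).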
MSE is a great instrument for comparing regression models applied to the same problem. Large prediction errors are heavily penalized because the errors are squared. Often RMSE is preferred.
Comparability:
In fact, taking the square root of MSE (which gives RMSE) brings the value back to the original scale of your target, so it becomes easier to see at a glance whether your model is doing a good job or not.
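
To make the point about scale concrete, here is a minimal sketch with a hypothetical target measured in thousands of dollars. The function names and values are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error, in squared units of the target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical house prices in thousands of dollars.
y_true = [200.0, 250.0, 300.0, 350.0]
y_pred = [210.0, 240.0, 310.0, 340.0]  # every prediction is off by 10

mse_value = mse(y_true, y_pred)    # 100.0, in "thousands of dollars squared"
rmse_value = rmse(y_true, y_pred)  # 10.0, in thousands of dollars

print(mse_value, rmse_value)
```

An RMSE of 10 reads directly as "typically off by about 10 thousand dollars", whereas an MSE of 100 has no such direct interpretation.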
In addition, if you are considering the same regression model across different data problems (for instance, across various datasets or data competitions), R^2 is better: on any given dataset it is perfectly (negatively) correlated with MSE, and its values typically range between 0 and 1 (it drops below 0 only when the model does worse than predicting the mean), making all comparisons easier.
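
The following sketch illustrates why R^2 travels better across datasets than MSE. It assumes two hypothetical datasets that differ only in the units of the target, so the quality of the fit is identical while the error magnitudes are not.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """R^2 = 1 - SSE / SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


# Hypothetical dataset A.
y_true_a = [3.0, 5.0, 7.0, 9.0]
y_pred_a = [2.0, 6.0, 6.0, 10.0]

# Hypothetical dataset B: same pattern, target expressed in units 1000 times smaller.
y_true_b = [1000.0 * v for v in y_true_a]
y_pred_b = [1000.0 * v for v in y_pred_a]

mse_a, mse_b = mse(y_true_a, y_pred_a), mse(y_true_b, y_pred_b)
r2_a, r2_b = r2_score(y_true_a, y_pred_a), r2_score(y_true_b, y_pred_b)

print(f"MSE: {mse_a} vs {mse_b}")
print(f"R^2: {r2_a:.3f} vs {r2_b:.3f}")
```

The MSE values differ by a factor of one million and cannot be compared directly, while R^2 is the same for both datasets because it divides the model's squared error by the squared error of the mean-only baseline.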