Allow generating scatter plots/confusion matrices

_From_ @borchero:

This is still a very open idea: when comparing pipeline outputs, we are often interested in how model predictions/scores change. To this end, we often generate scatter plots/confusion matrices.

Potentially, we could support this to some extent via `diffly`?
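
To make the idea a bit more concrete, here is a rough sketch of the confusion-matrix case in plain Python. It does not use any `diffly` API; the function name `prediction_confusion`, the dict-based input format, and the example data are all hypothetical and only meant to illustrate what a comparison of predictions across two pipeline runs could compute.

```python
from collections import Counter


def prediction_confusion(left, right):
    """Count (left_label, right_label) pairs for keys present in both outputs.

    `left` and `right` map a primary key to a predicted label. This input
    format is an assumption for illustration, not an existing interface.
    """
    shared_keys = left.keys() & right.keys()
    return Counter((left[key], right[key]) for key in shared_keys)


# Hypothetical predictions from two pipeline runs, keyed by row id.
old = {1: "cat", 2: "dog", 3: "cat", 4: "dog"}
new = {1: "cat", 2: "cat", 3: "cat", 5: "dog"}

counts = prediction_confusion(old, new)

# Arrange the counts as a matrix: rows are old labels, columns are new labels.
labels = sorted({label for pair in counts for label in pair})
matrix = [[counts[(a, b)] for b in labels] for a in labels]

for label, row in zip(labels, matrix):
    print(label, row)
```

Rows that exist in only one of the two outputs (ids 4 and 5 here) are ignored in this sketch; whether they should be reported separately is an open design question. The scatter-plot case would presumably work the same way, joining on the primary key and plotting the old score against the new one.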