Hi Mark,
The number of components you select is a personal choice. As a rule of thumb, you can choose as many components as you have EEG channels, as @richard has previously pointed out (1) (2). However, note that you need enough data points to be able to compute that many components. As pointed out on EEGLAB's website:
"ICA works best when given a large amount of basically similar and mostly clean data. When the number of channels (N) is large (>>32), then a considerable amount of data may be required to find N components."
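To make "a considerable amount of data" a bit more concrete, here is a minimal sketch of a sanity check. It assumes the heuristic that, as far as I remember, the EEGLAB documentation also mentions: you want more than k * N^2 samples per channel for N channels. The function names and the default k = 20 are my own assumptions for illustration, not something taken from EEGLAB or MNE-Python, and the right k reportedly grows with the channel count.

```python
import math


def min_samples_for_ica(n_channels: int, k: float = 20.0) -> int:
    """Rough lower bound on samples per channel: k * n_channels**2.

    k is an assumed multiplier (hypothetical default), not a fixed rule.
    """
    if n_channels <= 0:
        raise ValueError("n_channels must be positive")
    return math.ceil(k * n_channels ** 2)


def enough_data(n_samples: int, n_channels: int, k: float = 20.0) -> bool:
    """Return True if the recording meets the k * N^2 heuristic."""
    return n_samples >= min_samples_for_ica(n_channels, k)


# Example: 64 channels, 10 minutes recorded at 250 Hz.
n_samples = 10 * 60 * 250                     # 150000 samples
print(min_samples_for_ica(64))                # 81920
print(enough_data(n_samples, n_channels=64))  # True
```

If the check fails, the usual options are to record more data or to ask for fewer components than channels.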
I agree with you that MNE-Python uses only explained variance as a measure of model fit. For other methods you will need to provide the code yourself.
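For reference, this is roughly what the explained-variance criterion amounts to. As far as I know, passing a float between 0 and 1 as `n_components` to `mne.preprocessing.ICA` selects the smallest number of PCA components whose cumulative explained variance reaches that value. The sketch below is not MNE-Python's code: the function name is hypothetical, it takes a plain list of per-component variances, and whether the comparison at the threshold is strict is an assumption on my part.

```python
from itertools import accumulate


def n_components_for_variance(variances, threshold):
    """Smallest number of components whose cumulative share of the
    total variance reaches `threshold` (a float strictly between 0 and 1).

    `variances` are per-component variances, e.g. PCA eigenvalues.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    ordered = sorted(variances, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    for count, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Toy example: four components with variances 5, 3, 1 and 1.
print(n_components_for_variance([5, 3, 1, 1], 0.85))  # 3
```

Any other criterion would follow the same pattern: compute your own score per candidate number of components and pass the resulting integer as `n_components`.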
Best,
Konstantinos