It’s not a trivial question to answer succinctly, and I really suggest you spend some time reading through some of the many references on the topic before you delve into this. I highly recommend starting with the references listed at the end of this post, several of which are written in a didactic manner and cover a lot of ground.
Regarding your question, I tend to look at it this way: because it is a linear method, stimulus reconstruction (aka the backward TRF) will be able to reconstruct the stimulus only if that stimulus evokes reliable, repeatable responses in the EEG. So what you really need to ask yourself is whether your biophysiological process of interest (in your case “imagined beats in music”) triggers reliable responses.
I leave it up to you to answer this, but you might be aware that the TRF method has been successfully applied to a range of stimulus features, from low-level features like the speech envelope, spectrogram or sound intensity, to more complex representations of speech like phonemes or transition probabilities between phonemes (see the references below).
Also note that the linearity hypothesis is reasonable (especially for EEG, which has high spatial smearing) and has given interesting results (as you can see from the list of references), but we also know that this assumption is not strictly met: the brain is known to respond in a non-linear way, especially to complex stimuli.
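To make the point about reliable responses a bit more concrete, here is a minimal sketch of a backward model on simulated data. Everything in it is an assumption made for illustration: the function names (`lagged_design`, `fit_decoder`, `reconstruction_score`, `simulate_eeg`), the Hann-shaped response kernel, the 16-sample lag window, the noise level and the ridge parameter are all hypothetical, and this is plain NumPy ridge regression rather than the API of any TRF toolbox. It compares a case where the EEG contains a stimulus-locked linear response with a case where it contains only noise.

```python
import numpy as np

def lagged_design(eeg, lags):
    """Stack time-shifted copies of all channels: (samples, channels * lags)."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, -lag, axis=0)  # row t now holds eeg[t + lag]
        if lag > 0:
            shifted[-lag:] = 0.0              # discard samples that wrapped around
        X[:, i * n_channels:(i + 1) * n_channels] = shifted
    return X

def fit_decoder(eeg, stim, lags, alpha=1.0):
    """Ridge-regularised linear decoder mapping lagged EEG to the stimulus."""
    X = lagged_design(eeg, lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + alpha * np.eye(XtX.shape[0]), X.T @ stim)

def reconstruction_score(eeg, stim, lags, alpha=1.0, split=0.8):
    """Train on the first part of the data, return Pearson r on the rest."""
    n_train = int(split * len(stim))
    w = fit_decoder(eeg[:n_train], stim[:n_train], lags, alpha)
    pred = lagged_design(eeg[n_train:], lags) @ w
    return np.corrcoef(pred, stim[n_train:])[0, 1]

def simulate_eeg(stim, kernel, n_channels, evoked_gain, rng):
    """Toy EEG: stimulus convolved with a kernel, plus unit-variance noise."""
    evoked = np.convolve(stim, kernel)[:len(stim)]    # causal linear response
    channel_gains = rng.uniform(0.5, 1.5, n_channels)  # hypothetical topography
    noise = rng.standard_normal((len(stim), n_channels))
    return evoked_gain * evoked[:, None] * channel_gains + noise

rng = np.random.default_rng(0)
n_samples, n_channels = 20000, 8
lags = range(16)  # the decoder looks at EEG from 0 to 15 samples after the stimulus

# Hypothetical slowly varying stimulus feature (e.g. an envelope-like signal).
stim = np.convolve(rng.standard_normal(n_samples), np.ones(8) / 8, mode="same")
kernel = np.hanning(16)  # assumed shape of the evoked response

# Case 1: the stimulus evokes a reliable, stimulus-locked response.
r_reliable = reconstruction_score(
    simulate_eeg(stim, kernel, n_channels, 1.0, rng), stim, lags)
# Case 2: no stimulus-locked response at all, only noise.
r_unreliable = reconstruction_score(
    simulate_eeg(stim, kernel, n_channels, 0.0, rng), stim, lags)

print(f"reliable response:   r = {r_reliable:.2f}")
print(f"no locked response:  r = {r_unreliable:.2f}")
```

In this toy setting the held-out reconstruction correlation should be clearly above zero in the first case and close to zero in the second. Real data will sit somewhere in between, and the regularisation parameter would normally be chosen by cross-validation rather than fixed as it is here.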
- Brodbeck, C., & Simon, J. Z. (2020). Continuous speech processing. Current Opinion in Physiology, 18, 25–31. doi:10.1016/j.cophys.2020.07.014
- Daube, C., Ince, R. A. A., & Gross, J. (2019). Simple Acoustic Features Can Explain Phoneme-Based Predictions of Cortical Responses to Speech. Current Biology, 29(12), 1924–1937.e9. doi:10.1016/j.cub.2019.04.067
- Drennan, D. P., & Lalor, E. C. (2019). Cortical Tracking of Complex Sound Envelopes: Modeling the Changes in Response with Intensity. Eneuro, 6(3), ENEURO.0082-19.2019. doi:10.1523/ENEURO.0082-19.2019
- Di Liberto, G. M., O’Sullivan, J. A., & Lalor, E. C. (2015). Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing. Current Biology, 25(19), 2457–2465. doi:10.1016/j.cub.2015.08.030
- Di Liberto, G. M., Wong, D., Melnik, G. A., & de Cheveigné, A. (2019). Low-frequency cortical responses to natural speech reflect probabilistic phonotactics. NeuroImage, 196, 237–247. doi:10.1016/j.neuroimage.2019.04.037
- Holdgraf, C. R., Rieger, J. W., Micheli, C., Martin, S., Knight, R. T., & Theunissen, F. E. (2017). Encoding and Decoding Models in Cognitive Electrophysiology. Frontiers in Systems Neuroscience, 11. doi:10.3389/fnsys.2017.00061
- Mesgarani, N., & Chang, E. F. (2012). Selective cortical representation of attended speaker in multi-talker speech perception. Nature, 485(7397), 233–236. doi:10.1038/nature11020
- Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic Feature Encoding in Human Superior Temporal Gyrus. Science (New York, N.Y.), 343(6174), 1006–1010. doi:10.1126/science.1245994
- O’Sullivan, J. A., Power, A. J., Mesgarani, N., Rajaram, S., Foxe, J. J., Shinn-Cunningham, B. G., … Lalor, E. C. (2015). Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG. Cerebral Cortex, 25(7), 1697–1706. doi:10.1093/cercor/bht355
- Wong, D. D. E., Fuglsang, S. A., Hjortkjær, J., Ceolini, E., Slaney, M., & de Cheveigné, A. (2018). A Comparison of Regularization Methods in Forward and Backward Models for Auditory Attention Decoding. Frontiers in Neuroscience, 12. doi:10.3389/fnins.2018.00531