Why should it be reflected in the MRS at all? Isn’t the agreement here only syntactic?
The shape of the ERG parse also differs because the _together_p_state EP is not the same category as _junto_a, so it is not connected to the rest of the structure in the same way. Here is the MRS from a similar ERG parse for "They play music united", which is not a very fluent translation, but its structure may be enlightening:
This looks very similar to the Spanish one except for three things (the last two are unrelated to the current question and I think they reflect bugs in the grammar):
The ARG1 of _junto_a uses a u variable instead of an i one for the unlinked argument.
_hacer_v has an ARG3 with the value x9, but x9 does not appear anywhere else. If it is an unfilled argument, I would expect an i9 instead.
The (L)TOP is directly linked to the label of the highest EP in the Spanish MRS, whereas the ERG more conventionally goes through a qeq: h0 qeq h1.
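To make these three differences concrete, here is a rough sketch of how one might check for them mechanically. It does not use any real MRS library; the EP list, the predicate name `_música_n`, the variable `u11`, the handle names, and both helper functions are made up for illustration and are only loosely modelled on the parse discussed here, not copied from it.

```python
from collections import Counter

# Hypothetical, hand-written EPs (predicate, role -> variable).
# This is NOT the actual grammar output, only an illustration.
eps = [
    ("pron",      {"ARG0": "x3"}),
    ("_hacer_v",  {"ARG0": "e2", "ARG1": "x3", "ARG2": "x8", "ARG3": "x9"}),
    ("_música_n", {"ARG0": "x8"}),
    ("_junto_a",  {"ARG0": "e10", "ARG1": "u11"}),
]

def unlinked_arguments(eps):
    """Return (pred, role, var) for non-ARG0, non-handle arguments
    whose variable occurs nowhere else in the EP list."""
    counts = Counter(var for _, args in eps for var in args.values())
    found = []
    for pred, args in eps:
        for role, var in args.items():
            if role != "ARG0" and not var.startswith("h") and counts[var] == 1:
                found.append((pred, role, var))
    return found

def top_linkage(top, hcons, labels):
    """Classify how TOP reaches an EP label: 'direct', 'qeq', or 'unlinked'."""
    if top in labels:
        return "direct"
    for hi, rel, lo in hcons:
        if hi == top and rel == "qeq" and lo in labels:
            return "qeq"
    return "unlinked"

for pred, role, var in unlinked_arguments(eps):
    # An unfilled argument would more typically be i-typed.
    print(f"{pred} {role}={var}: unlinked, type '{var[0]}'")

# Spanish-style (assumed): TOP is the label of the highest EP itself.
print(top_linkage("h1", [], {"h1"}))
# ERG-style: TOP reaches the label through a qeq.
print(top_linkage("h0", [("h0", "qeq", "h1")], {"h1"}))
```

With this toy input, the first check reports the ARG3 of `_hacer_v` and the ARG1 of `_junto_a` as unlinked, and the second distinguishes the direct TOP link from the qeq one.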
If you want _junto_a to link to the pron EP, then you might be saying that juntos is a property of Ellos instead of a manner of hacen música. In the ERG, you can get a parse like this with, e.g., "United people play music" or "They together play music".