-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
occupant-id not stable in some cases #4236
Comments
I played trying to reproduce the problem, but did not get it. Does the problem appear from time to time in your server, even if you are not able to reproduce it voluntarily on demand? In that case, you could apply a small patch to the modue that generates an ID that we can later decrypt, to determine what is the difference between the two ones diff --git a/src/mod_muc_occupantid.erl b/src/mod_muc_occupantid.erl
index fabd69e28..42449e563 100644
--- a/src/mod_muc_occupantid.erl
+++ b/src/mod_muc_occupantid.erl
@@ -74,8 +74,8 @@ add_occupantid_packet(Packet, RoomJid) ->
xmpp:set_subtag(Packet, OccupantElement).
calculate_occupantid(From, RoomJid) ->
- Term = {jid:remove_resource(From), get_salt(RoomJid)},
- misc:term_to_base64(crypto:hash(sha256, io_lib:format("~p", [Term]))).
+ Term = {jid:remove_resource(From), RoomJid, get_salt(RoomJid)},
+ misc:term_to_base64(io_lib:format("~p", [Term])).
%%%
%%% Table storing rooms' salt Then, the ID is a very long string like: <occupant-id id='g2wAAAABbAAAAANhe2wAAAAJbAAAAANhe2wAAAANawADamlkYSxsAAAABWE8YTxrAAcidXNlcjIiYT5hPmphLGwAAAAFYTxhPGsACyJsb2NhbGhvc3QiYT5hPmphLGsABDw8Pj5hLGwAAAAFYTxhPGsAByJ1c2VyMiJhPmE+amEsbAAAAAVhPGE8awALImxvY2FsaG9zdCJhPmE+amEsawAEPDw+PmphfWphLGEKawABIGwAAAADYXtsAAAAD2sAA2ppZGEsbAAAAAVhPGE8awAHInNhbGExImE+YT5qYSxsAAAABWE8YTxrABYiY29uZmVyZW5jZS5sb2NhbGhvc3QiYT5hPmphLGsABDw8Pj5hLGwAAAAFYTxhPGsAByJzYWxhMSJhPmE+amEsYQpsAAAAAmwAAAAEYSBrAAIgIGEgYSBqYSBqbAAAAAVhPGE8awAWImNvbmZlcmVuY2UubG9jYWxob3N0ImE+YT5qYSxrAAQ8PD4+amF9amEsYQprAAEgbAAAAAVhPGE8awAWIjE1OTU4NTQzNDU1MjgyNzcwNjEyImE+YT5qamF9amo='
xmlns='urn:xmpp:occupant-id:0'/> That can be easily decrypted to know what user, what room and what room salt were used: misc:base64_to_term("g2wAAAABbAAAAANhe2wAAAAJbAAAAANhe2wAAAANawADamlkYSxsAAAABWE8YTxrAAcidXNlcjIiYT5hPmphLGwAAAAFYTxhPGsACyJsb2NhbGhvc3QiYT5hPmphLGsABDw8Pj5hLGwAAAAFYTxhPGsAByJ1c2VyMiJhPmE+amEsbAAAAAVhPGE8awALImxvY2FsaG9zdCJhPmE+amEsawAEPDw+PmphfWphLGEKawABIGwAAAADYXtsAAAAD2sAA2ppZGEsbAAAAAVhPGE8awAHInNhbGExImE+YT5qYSxsAAAABWE8YTxrABYiY29uZmVyZW5jZS5sb2NhbGhvc3QiYT5hPmphLGsABDw8Pj5hLGwAAAAFYTxhPGsAByJzYWxhMSJhPmE+amEsYQpsAAAAAmwAAAAEYSBrAAIgIGEgYSBqYSBqbAAAAAVhPGE8awAWImNvbmZlcmVuY2UubG9jYWxob3N0ImE+YT5qYSxrAAQ8PD4+amF9amEsYQprAAEgbAAAAAVhPGE8awAWIjE1OTU4NTQzNDU1MjgyNzcwNjEyImE+YT5qamF9amo=").
{term,[[123,
[[123,
["jid",44,
[60,60,"\"user2\"",62,62],
44,
[60,60,"\"localhost\"",62,62],
44,"<<>>",44,
[60,60,"\"user2\"",62,62],
44,
[60,60,"\"localhost\"",62,62],
44,"<<>>"],
125],
44,10," ",
[123,
["jid",44,
[60,60,"\"sala1\"",62,62],
44,
[60,60,"\"conference.localhost\"",62,62],
44,"<<>>",44,
[60,60,"\"sala1\"",62,62],
44,10,
[[32," ",32,32],32],
[60,60,"\"conference.localhost\"",62|...],
44,"<<>>"],
125],
44,10," ",
[60,60,"\"15958543455282770612\"",62,62]],
125]]} If you compile ejabberd from source code, simply apply the patch, recompile, install the file and restart the module or restart the whole ejabberd. If you cannot compile ejabberd yourself, ping me and I'll generate an updated module *.beam file so you can install it. |
It could be that room was recreated between those two stanzas, that would change salt (which i think is expected). Only thing that's worth checking is if room hibernation doesn't trigger salt regeneration - it that case we don't want to do that. Also Badlop was there any reason why you didn't want to put salt in muc_room state but used external mnesia table? |
Yes, the reasoning was something like this:
Right, that could be the case here, and it makes sense that the occupantid changes, as the room is different (may be from a different creator, owner, ... both rooms just have the same roomname). If that's what happened in this case, it will be revealed when there are new logs that have occupantid we can unencrypt to determine their source information.
Aha, that could be the case here too, and in that case the module should be improved. Looking at the code, a room salt is only removed when the room is removed (and the remove_room hook is called):
Maybe the room is non-persistent, using mnesia storage, got hibernated and the module deleted its salt when got unhibernated. I'll check and update this comment. |
Another thing related to hibernation even if hibernation doesn't reset salt (and i think we call remove_room hook only when room is truly removed, and not just hibernated), could be restoring room on different node, i don't think that mnesia table is synced between nodes, so if room was restored on different node, which didn't have it, it will make new one. But i think it most probably room that is not set to persistent, and it just get deleted when everyone leaves and then get recreated. |
A temporary (that is, not-persistent) room is deleted when the last occupant leaves and it has no subscribers. I was able to reproduce the problem:
I wonder what was the requirement to forget_room when restoring a temporary room. Removing it solves this particular problem: diff --git a/src/mod_muc.erl b/src/mod_muc.erl
index e297d866f..3184fe90a 100644
--- a/src/mod_muc.erl
+++ b/src/mod_muc.erl
@@ -903,14 +903,7 @@ load_room(RMod, Host, ServerHost, Room, ResetHibernationTime) ->
Res2;
_ ->
?DEBUG("Restore hibernated non-persistent room: ~ts", [Room]),
- Res = start_room(RMod, Host, ServerHost, Room, Opts0),
- case erlang:function_exported(Mod, get_subscribed_rooms, 3) of
- true ->
- ok;
- _ ->
- forget_room(ServerHost, Host, Room)
- end,
- Res
+ start_room(RMod, Host, ServerHost, Room, Opts0)
end
end.
|
Hi, i think its a fair assumption that a salt is regenerated after a room was destroyed. the room in question here is a group chat with >60 people, and its configured as persistent (at least today), i think it unlikely that it was destroyed. A bit more info, we could see the problem happening consistently, as in messages always had a different id than the presence. It was not a case of send a message on one day, and check the presence on the next day. so both occupant-ids where present at the same time, not a case of occupant-id changed because of an event and from that moment on it was different everywhere. I try to get more logs from the user, maybe we can see what the occupant-id was over a longer period of time. |
@badlop Ah, missed that forget room call. That call was there to delete entry from db, as we will not delete this otherwise, we probably should add function that just delete room from db, and have other that remove from db and calls that hook. @lovetox Do you know if server where this is deployed use clustering or is that just single computer? |
its 404.city, we try to reach the admin, but until now no success |
EDIT: Sorry got this wrong the first time. Seems like @prefiks found the reason
|
This still happening, could it be a problem with mutli devices joined in the same MUC from the same user? |
I've had this issue occur occasionally for a while and I do have two devices I connect with. But I just met someone else who has the same issue and apparently they only have one device that is connecting to the account they have the issue with. |
Looks like we can reproduce as well (this was in the DivestOS Mobile chat - xmpp:divestos-mobile@conference.konvers.me?join): This happened when sending messages after reconnecting after the server restarted (we were connected before server restarted) The person I spoke with also experienced the same issue as well after sending a message, so probably not to do with multiple clients but just the server restarting. In case this information is helpful, the server the account I am connecting is from is conversations.im |
Environment
Bug description
I have 2 different users in Gajim that reported a problem with occupant-id in 2 different MUCs on different servers.
When we investigated we found that the occupant-id in the presence is different from the occupant-id the server adds to their messages.
Multiple device where joined in the MUC, if that could be a issue.
We have no reproducible case for now, but it happens.
Could you please check the code, under which conditions this could happen?
The text was updated successfully, but these errors were encountered: