-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add shutdown_socket()
function for follow up of #505
#506
base: main
Are you sure you want to change the base?
Conversation
shutdown_socket()
function.shutdown_socket()
function.
shutdown_socket()
function.shutdown_socket()
function for follow up of #505
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice refactoring. Just left minor suggestions.
core/federated/RTI/rti_remote.c
Outdated
@@ -1030,9 +1009,7 @@ void send_reject(int* socket_id, unsigned char error_code) { | |||
lf_print_warning("RTI failed to write MSG_TYPE_REJECT message on the socket."); | |||
} | |||
// Close the socket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: how about adding more information for this call? For example, "Close the socket without reading until EOF."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I fixed it as suggested.
core/federated/RTI/rti_remote.c
Outdated
@@ -1418,9 +1395,7 @@ void lf_connect_to_federates(int socket_descriptor) { | |||
if (!authenticate_federate(&socket_id)) { | |||
lf_print_warning("RTI failed to authenticate the incoming federate."); | |||
// Close the socket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I fixed it as suggested.
core/federated/RTI/rti_remote.c
Outdated
@@ -1488,8 +1463,7 @@ void* respond_to_erroneous_connections(void* nothing) { | |||
lf_print_warning("RTI failed to write FEDERATION_ID_DOES_NOT_MATCH to erroneous incoming connection."); | |||
} | |||
// Close the socket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I fixed it as suggested.
The macOS test is not passing for some unknown reason. The CI test log seems to not work starting from The reason for not passing is that the program blocks on the final shutdown phase. First, this is how the LF code is intended to work.
This normally works on Linux, thus passing all tests. However, this fails on Mac. I also checked that the
The RTI(15045) is in This error can be reproduced by the lingua-franca's repo on branch @hokeun @edwardalee Could anyone help investigating this problem? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice and clean!
@edwardalee Would you please review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've requested what I think is just a documentation change, but also a sanity check to make sure the right mutex is held by all callers of this new function.
* Shutdown and close the socket. If read_before_closing is false, it just immediately calls shutdown() with SHUT_RDWR | ||
* and close(). If read_before_closing is true, it calls shutdown with SHUT_WR, only disallowing further writing. Then, | ||
* it calls read() until EOF is received, and discards all received bytes. | ||
* @param socket Pointer to the socket descriptor to shutdown and close. | ||
* @param read_before_closing If true, read until EOF before closing the socket. | ||
* @return int 0 for success and -1 for an error. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this function assumes the lf_outbound_socket_mutex
mutex lock is held when the function is called. This needs to be stated in the documentation and checked for all calls to the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't really assume the mutex lock is held always.
In normal socket shutdown cases (normal termination), they do assume there is a mutex lock, for example, federate.c
's close_inbound_socket()
, close_outbound_socket()
, and rti_remote.c
's handle_federate_failed()
, handle_federate_resign()
does hold the mutexes.
However, some corner cases that handle abnormal termination, do not assume that there are mutex locks.
For example, in federate.c
's listen_to_rti_TCP()
function, when the 1byte header read fails, it does not lock any mutexes, and just shutdowns the socket.
Also, in rti_remote.c
's respond_to_erroneous_connections()
function, which is running in a separate thread, it does not lock any mutexes and just shutdowns the socket.
So, what I thought was that the user should handle the mutex if it is needed.
Of course, I could add one line in the documentation of this function, such as
Mutex locks should be considered before calling this function.
or would there be any other suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the code correct without this specific mutex? Note that it can't be any mutex... All callers have to use the same mutex...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, professor, I think I didn't understand your question in the first place.
What I understood was, "Don't you need this lf_outbound_socket_mutex
inside this function?" Am I correct? For that question, as I explained, some cases need mutexes, and some cases do not. So, I thought that the function caller should set the mutex on their own.
If I misunderstood your question, could you please clarify your question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Among other things, the function does this:
int shutdown_socket(int* socket, bool read_before_closing) {
if (*socket == -1) {
lf_print_log("Socket is already closed.");
return 0;
}
...
*socket = -1;
return 0;
}
If this function can be called from multiple threads, then two threads can both get past the first check and then proceed to try to simultaneously close the socket. The only way this code is safe is if you can assure that no more than one thread can call this function.
If more than one thread can call the function, then the above code is only correct if all threads that can call the function acquire the same mutex before calling it. Otherwise, the code is not correct.
One way to ensure safety is to assure that anywhere this is called without acquiring a mutex, only one thread is running. Alternatively, you would need to assure that no other thread that might be running can call this function. This may be quite difficult to assure.
One solution would be to acquire a mutex within this function. But this may be inefficient because some callers may already hold the relevant mutex. In the absence of such a local mutex, you would need to assure that all callers hold the mutex, and it would have to be clearly documented that this is required to call this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added another mutex in the shutdown_socket()
function, named as socket_mutex
, and updated the PR description.
So, now when the function shutdown_socket()
is called, the shutdown_mutex
is first locked, and unlocked in the end of the function, preventing other shutdown_socket()
function calls creating a potential memory access issues in the critical section.
To remove some inefficiency, I removed all unnecessary mutex locks that were acquired only for shutdown.
For example, the mutex lock in the function below is removed.
static void close_inbound_socket(int fed_id) {
// LF_MUTEX_LOCK(&socket_mutex); // Now removed.
if (_fed.sockets_for_inbound_p2p_connections[fed_id] >= 0) {
shutdown_socket(&_fed.sockets_for_inbound_p2p_connections[fed_id], false);
}
// LF_MUTEX_UNLOCK(&socket_mutex); // Now removed.
}
Another change is that I removed the socket_mutex
itself, which acted as the inbound_socket_mutex
for federates. It was only used before calling the shutdown_socket()
function, so I removed it because there are mutex locks inside the shutdown_socket
function now.
…ate a function to call the initialize of the mutex.
This PR follow ups #505.
Please merge with lf-lang/lingua-franca#2474.
New Integrated Function:
shutdown_socket()
int shutdown_socket(int* socket, bool read_before_closing);
Arguments
socket
: A pointer to the socket descriptor that needs to be shut down and closed.read_before_closing
: A boolean indicating whether the socket should read any remaining incoming data before closing:FIN
packet (SHUT_WR
) and waits forEOF
(0-length message) from the peer.SHUT_RDWR
).Return Values
0
: Indicates successful shutdown and closure of the socket.-1
: Indicates a failure during shutdown or closure, with errno set to describe the specific error.Function Description
This function gracefully shuts down and closes a socket.
shutdown(SHUT_RDWR)
to immediately stop both sending and receiving data. Then callsclose()
.shutdown(SHUT_WR)
to signal the end of writing but allows reading to continue.EOF
(indicated byread()
returning 0), and discards all received bytes.This function holds a new mutex named
shutdown_mutex
, which ensures preventing deadlocks.Refactoring on
close_inbound_socket()
andclose_outbound_socket()
fromfederate.c
close_inbound_socket()
The original code looked not updated. There were no use case when the argument
flag
was not 1. I removed theflag
argument, and all unused code.close_outbound_socket()
The original flags were only set by this one line code.
int flag = _lf_normal_termination ? 1 : -1;
So it depends on
_lf_normal_termination
. However, this function can directly use_lf_normal_termination
, so I removed theflag
, and directly used_lf_normal_termination
.Minor implementation details for macOS.
When the federate sends the
MSG_TYPE_RESIGN
signal, the RTI callsshutdown(SHUT_WR)
. Then, in the federate side, the thread fromlisten_to_rti_TCP()
'sread_from_socket()
'sread()
returns with a EOF. Then, it callsshutdown_socket()
.However, the
shutdown()
fails, with anerrno
Socket is not connected
on only MacOS. even though the connection is half open. However, since the connection is half p[em, the federate should still send aFIN_ACK
packet to the RTI, because the RTI is blocked on theread()
.Thus, the federate does not return on failure of
shutdown()
, and calls theclose()
, to ensure theFIN_ACK
packet is sent.To summarize, even when
shutdown()
fails, the code will tryclose()
, to ensure the socket is closed.Minor addition of logic in
federate.c
'shandle_tagged_message()
In
federate.c
'shandle_tagged_message()
, when the received message already missed the tag, it closes the inbound socket, however does not return -1 for failure of reading the message. This is logic is added.Some mutex locks removed.
socket_mutex
was only used as federate's inbound socket mutex. Especially, it was only used when closing inbound sockets. The newshutdown_socket()
function holds its own mutex, so removed the duplication of mutex locks.