
Remove tracing errors from tests execution output #1142

Conversation

josecelano
Member

@josecelano josecelano commented Dec 18, 2024

When you run the tests, you see errors like this:

running 1 test
  2024-12-18T11:06:54.443275Z ERROR tower_http::trace::on_failure: response failed, classification: Status code: 500 Internal Server Error, latency: 0 ms
    at /home/josecelano/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tower-http-0.6.2/src/trace/on_failure.rs:93

test servers::api::v1::contract::context::auth_key::should_fail_when_the_auth_key_cannot_be_deleted ... ok

Those errors are written by tracing to the standard error output. They are enabled by adding these lines to the tests:

INIT.call_once(|| {
    tracing_stderr_init(LevelFilter::ERROR);
});

Errors are expected; they are the current API behaviour, which we need to change in the next major API version. For now, we will capture them (instead of showing them in the output) and assert that they were written by tracing. This is something we should do every time we write to the logs: logging is an essential feature of the application, and it should be tested, too.

This PR adds a new tracing initialization function for tests, tracing_init_with_capturer, that can be used to capture logs and assert that they contain certain messages. For example:

async fn should_fail_when_the_auth_key_cannot_be_deleted() {
    let log_capturer = Arc::new(Mutex::new(LogCapturer::new()));

    INIT.call_once(|| {
        tracing_init_with_capturer(LevelFilter::ERROR, log_capturer.clone());
    });

    let env = Started::new(&configuration::ephemeral().into()).await;

    let seconds_valid = 60;
    let auth_key = env
        .tracker
        .generate_auth_key(Some(Duration::from_secs(seconds_valid)))
        .await
        .unwrap();

    force_database_error(&env.tracker);

    let response = Client::new(env.get_connection_info())
        .delete_auth_key(&auth_key.key.to_string())
        .await;

    assert_failed_to_delete_key(response).await;

    // We expect to see a 500 error; it's the current API behavior
    assert!(log_capturer.lock().unwrap().contains("ERROR"));
    assert!(log_capturer.lock().unwrap().contains("tower_http"));
    assert!(log_capturer.lock().unwrap().contains("500 Internal Server Error"));

    env.stop().await;
}
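For reference, the capturer can be a tracing writer backed by a shared buffer. This is a minimal sketch of how tracing_init_with_capturer and LogCapturer could be implemented (the PR's actual internals may differ); tracing_subscriber accepts any closure returning an io::Write as a MakeWriter:

use std::io;
use std::sync::{Arc, Mutex};

use tracing::level_filters::LevelFilter;

// Shared buffer that accumulates everything tracing writes.
#[derive(Default)]
pub struct LogCapturer {
    buf: String,
}

impl LogCapturer {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn contains(&self, needle: &str) -> bool {
        self.buf.contains(needle)
    }
}

// io::Write adapter that appends formatted events to the shared buffer.
struct CapturerWriter(Arc<Mutex<LogCapturer>>);

impl io::Write for CapturerWriter {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        let text = String::from_utf8_lossy(data).into_owned();
        self.0.lock().unwrap().buf.push_str(&text);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

pub fn tracing_init_with_capturer(filter: LevelFilter, capturer: Arc<Mutex<LogCapturer>>) {
    tracing_subscriber::fmt()
        .with_max_level(filter)
        // A `Fn() -> impl io::Write` closure works as a `MakeWriter`.
        .with_writer(move || CapturerWriter(capturer.clone()))
        .init();
}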

Subtasks

  • Create a tracing "writer" to capture logs: log capturer.
  • Modify one test as an example.
  • Modify the rest of the tests currently printing errors to the output when you run tests.

Future PRs

  • Modify all tests that affect logs, asserting lines that should be written included.

@josecelano josecelano linked an issue Dec 18, 2024 that may be closed by this pull request
@josecelano josecelano requested a review from mario-nt December 18, 2024 12:09
@josecelano josecelano requested a review from da2ce7 December 18, 2024 12:09
@josecelano josecelano marked this pull request as draft December 18, 2024 12:10
@josecelano josecelano force-pushed the 1069-sdterror-output-from-tracing-running-tests branch from bce82b3 to b57516c on December 18, 2024 12:18
@josecelano
Member Author

I had not realized we were using the tracing init function that installs a Subscriber as the global default. So, my solution does not work when you run tests in parallel; it only works when you run a single test:

cargo test should_fail_when_the_auth_key_cannot_be_deleted

I have to find a way to capture logs independently for each test. The log lines do not have any identifier that lets us filter the logs written during a given test's execution. Besides, it's not an error we write into the logs directly. Maybe we should disable the tower_http::trace::on_failure handler and replace it with our own error; that would allow us to change it. However, we would still need a way to identify which errors came from which test.
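For context, this is the limitation (a standalone demonstration, not code from this PR): tracing allows only one global default subscriber per process, so only the first test to initialize it wins:

use tracing_subscriber::FmtSubscriber;

fn main() {
    tracing::subscriber::set_global_default(FmtSubscriber::new())
        .expect("the first install succeeds");

    // A second install fails: parallel tests cannot each register
    // their own capturing subscriber as the global default.
    assert!(tracing::subscriber::set_global_default(FmtSubscriber::new()).is_err());
}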

cc @da2ce7 @mario-nt

@josecelano
Member Author

I see three solutions:

1. Global Log Capturer

We can create a global log capturer (a static) shared by all tests. The problem here is identifying the logs produced by the test we are executing, so we need to introduce an identifier. We can override the error written by tower_http (see here) and include the request id. Since we also have the request-id in the response, we can get it from the response header and search the logs for lines containing that request-id.
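A sketch of what overriding the default tower_http failure event could look like (getting the request id into the message is omitted here; the closure shape follows tower_http's OnFailure support):

use std::time::Duration;

use axum::{routing::get, Router};
use tower_http::classify::ServerErrorsFailureClass;
use tower_http::trace::TraceLayer;

fn router() -> Router {
    Router::new()
        .route("/", get(|| async { "ok" }))
        .layer(TraceLayer::new_for_http().on_failure(
            // Our own failure event instead of tower_http's default one;
            // a request id could be added here so tests can filter by it.
            |class: ServerErrorsFailureClass, latency: Duration, _span: &tracing::Span| {
                tracing::error!(%class, ?latency, "response failed");
            },
        ))
}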

2. Write logs to files

We can add new settings to the tracker to write logs to files. For example:

[logging]
#output = 'std'
#output = 'file' 
file_path = 'path/to/log'
format = 'plain'
#format = 'json'

When we create the ephemeral test env, we can set output to file, and read the logs from the file after executing the test and before stopping the env.
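A minimal sketch of what honoring that setting could look like (the helper name and config plumbing are hypothetical; Mutex<File> implements tracing's MakeWriter):

use std::fs::File;
use std::io;
use std::sync::Mutex;

use tracing::level_filters::LevelFilter;

// Hypothetical helper: choose the tracing writer from the `[logging]`
// settings sketched above ('std' -> stderr, 'file' -> the given path).
pub fn tracing_init_from_config(output: &str, file_path: &str, filter: LevelFilter) {
    let builder = tracing_subscriber::fmt().with_max_level(filter);

    match output {
        "file" => {
            let file = File::create(file_path).expect("cannot create the log file");
            // `Mutex<File>` serializes writes from concurrent events.
            builder.with_writer(Mutex::new(file)).init();
        }
        _ => builder.with_writer(io::stderr).init(),
    }
}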

3. Add a logs collector

We could create a new middleware, a tracing wrapper, or some other mechanism to send logs to an external service, for example via an HTTP API. This could be useful in the future, but I think it's too complex for now.

I prefer the second option, even if it can be slower because of writing to and reading from disk, at least for these tests. For the other tests, maybe we should find a way to mock tracing in unit tests.

cc @da2ce7 @mario-nt

@josecelano
Member Author

Today, @da2ce7 and I agreed on implementing option 1 because, with option 2, the tracing configuration is also initialized once. Therefore, all tests would use the same log file.

@josecelano
Member Author

josecelano commented Dec 19, 2024


There are a couple of things to consider before implementing option 1, especially the tracing-test crate.

@josecelano
Member Author

Hi @da2ce7, I found the tracing-test crate, which does what we want to do in option 1. However, we have to disable the production tracing initialization with an env var when we run the tests (because we can't set a second global subscriber).

I've tried to use it, but I'm having a problem: I see the logs written in the tests and some others, but I don't see some of the logs written from the test env. I have created a minimal example with nested levels of spawned tasks, and it works, but some people have reported similar problems.

I will keep trying with the crate, but if I don't find the problem, we can continue with our custom solution. The crate is nice because it uses macros and removes boilerplate code from the tests, but its code is simple. It's similar to ours: it has a static buffer that captures the output. Maybe we can change the configuration to use tracing_stdout_init or tracing_capturer_init depending on an env var, so we can use it in any test without adding a new attribute to each test. You would only need to set the env var when you run the tests, and it would be available. I can try that; I hope I will not run into the same problem. Maybe a custom solution can be more flexible for us, since we have all that code to start/stop environments.
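For reference, this is roughly how the crate is used (based on the tracing-test docs, not code from this repo; the #[traced_test] macro installs a per-test capturing subscriber and injects logs_contain):

use tracing_test::traced_test;

#[tokio::test]
#[traced_test]
async fn writes_the_expected_error() {
    tracing::error!("response failed");

    // `logs_contain` is generated by the macro and checks the events
    // captured while this test was running.
    assert!(logs_contain("response failed"));
}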

@josecelano
Member Author

I've created a sample repo to reproduce the problem:

https://github.com/josecelano/tracing-test-bug

It does not work with threads, so I'm going to continue with our custom solution, provided I don't run into the same problem there.

@josecelano josecelano force-pushed the 1069-sdterror-output-from-tracing-running-tests branch from b57516c to b1e7ef5 on December 20, 2024 11:10
@josecelano
Member Author

josecelano commented Dec 20, 2024

Superseded by #1148

@josecelano josecelano closed this Dec 20, 2024