Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ClusterFuzzLite integration #122

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

DavidKorczynski
Copy link

This adds fuzzing by way of ClusterFuzzLite, which is a GitHub action that will perform a short amount of fuzzing for new PRs. The goal is to use fuzzing to catch bugs that may be introduced by new PRs.

I added a fuzzer that targets simd_parse_string, and currently set the timeout of CFLite to 180 seconds. CFLite will flag if the fuzzer finds any issues in the code introduced by a PR.

If you'd like to test this the way ClusterFuzzLite runs it (by way of OSS-Fuzz) you can use the steps:

git clone https://github.com/google/oss-fuzz
git clone https://github.com/DavidKorczynski/simdzone simdzone
cd simdzone
git checkout clusterfuzzlite

# Build the fuzzers in .clusterfuzzlite
python3 ../oss-fuzz/infra/helper.py build_fuzzers --external $PWD

# Run the fuzzer for 10 seconds
python3 ../oss-fuzz/infra/helper.py run_fuzzer --external $PWD zone_parse_string_fuzzer -- -max_total_time=10

Signed-off-by: David Korczynski <david@adalogics.com>
@k0ekk0ek
Copy link
Contributor

Hi @DavidKorczynski. I've been wanting to add fuzzing and didn't even know about ClusterFuzzLite. Thanks! I've done a bit of reading up and fuzzing using ClusterFuzzLite seems like a great idea. I do have some reservations with regards to the fuzzing harness. The input string is required to be null-terminated and padded with ZONE_BLOCK_SIZE. This is one of the assumptions the parser makes in order to efficiently process as much complete blocks as possible. Apart from that, DNS presentation format is rather peculiar. It's a tabular serialization format where the schema changes with the resource record (RR) type, so in order to properly test parsing functions we should control that a little bit (eventually). The general layout is <owner> <class> <ttl> <type> <rdata>. The <class> and <ttl> are optional, the <owner> is too, but if it's left out, the <owner> of the previous RR is used. Then if the <type> is A, rdata is expected to be an IPv4 address, if the type is NS, rdata is expected to be a domain name. There's a lot more record types and even presentation formats. If no valid type is used, the actual parser functions will not be called.

All of that being said, this is a great start! Inputting random data will surely catch some bugs.

I'm currently working on getting documentation ready for a first release and plan to add more testing afterwards. Adding your work while adding tests is probably a good plan. I'll get to this in a couple of weeks. With documentation done it'll also be easier to explain some of the limitations/requirements of the api.

Thanks again for issuing the PR!

@DavidKorczynski
Copy link
Author

I do have some reservations with regards to the fuzzing harness. The input string is required to be null-terminated and padded with ZONE_BLOCK_SIZE. This is one of the assumptions the parser makes in order to efficiently process as much complete blocks as possible.

Should the null-termination happen before/after padding? I added changes such that the there is padding and the null-termination happens after the padding, let me know if this loos better!

Signed-off-by: David Korczynski <david@adalogics.com>
@k0ekk0ek
Copy link
Contributor

Null-termination should happen before the padding. So, given foo bar baz\0..., to determine the length of the final token, i.e. baz, the pointer to baz\0... is subtracted from the pointer to \0....

Signed-off-by: David Korczynski <david@adalogics.com>
@DavidKorczynski
Copy link
Author

Null-termination should happen before the padding. So, given foo bar baz\0..., to determine the length of the final token, i.e. baz, the pointer to baz\0... is subtracted from the pointer to \0....

Got it -- the fuzzer now null-terminates before the padding.

@k0ekk0ek
Copy link
Contributor

Thanks @DavidKorczynski! Looks right. I'll give this a try first thing next week.

@k0ekk0ek
Copy link
Contributor

@DavidKorczynski, I'm looking into this. I think I want to be able to run this locally too, i.e. compile with -fsanitize=fuzzer. I'm looking into how all the compiler options correlate, e.g. we currently have an option named SANITIZER that can be used like cmake -DSANITIZER=address,... .., that needs to be updated. And I'm thinking we want a corpus too. I'm not very familiar with all available options/tools for fuzzing yet. I suspect I'll update the build instructions in this PR and use the rest as-is. I'll report back asap.

options.default_ttl = 3600;
options.default_class = 1;

zone_parse_string(&parser, &options, &buffers, null_terminated, size_of_input,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor remark: size_of_input here should be size.

@k0ekk0ek
Copy link
Contributor

I studied some examples and came up with something that integrates nicely(?) If -DBUILD_FUZZING=on is passed to the cmake configure step, the sanitizers are automatically enabled unless sanitizers are manually specified in -DSANITIZE=.... I reused the fuzzer target from the PR. I'll have a look at integrating it into this PR tomorrow and I'll expand the corpus a bit.

@DavidKorczynski
Copy link
Author

Corpus is nice, but due to code size I'm not sure it's a concern as such in this case. I think increasing the amount of seconds spend fuzzing may have more impact.

Looks good with the flags! There may also be a benefit to having an option that purely relies on forwarding the CFLAGS/CXXFLAGS because these are what OSS-Fuzz provides.

Feel free to use the fuzzers however you like though -- no need to merge this PR as such, I'm just happy if we can get continuous fuzzing up and running for simdzone :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants