Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault when using filtration vertex_degree #16

Closed
MonkeyBreaker opened this issue Mar 20, 2020 · 11 comments
Closed

Segmentation fault when using filtration vertex_degree #16

MonkeyBreaker opened this issue Mar 20, 2020 · 11 comments

Comments

@MonkeyBreaker
Copy link
Collaborator

Hi,

Currently working on the library, I just stumbled on a segmentation fault.
To reproduce the segmentation fault do as follow:

./flagser --out ./test/tmp_d5_vertex_degree --out-format barcode --filtration vertex_degree --undirected --max-dim 1 ./test/d5.flag

I would expect it does not SIGSEGV.
I'm currently busy with other tasks, but if I have some time, I'll try to explore the issue.

Let met know if this is expected behaviour, even if I would expect more an exception to be raised.

Take care and have a nice day,
Julián

@luetge
Copy link
Owner

luetge commented Mar 21, 2020

Hi Julián,

Could you quickly share the exact commit+the operating system you are running the code on? I could not reproduce the problem on my machine (latest master commit, MacOS). Does it consistently segfault?

daniel ~/programming/flagser master [?? 3]
$ ./flagser --out ./test/tmp_d5_vertex_degree --out-format barcode --filtration vertex_degree --undirected --max-dim 1 ./test/d5.flag
The dimensions of the mod-2 homology of the full complex are:

dim H_0 = 1
dim H_1 = 0
daniel ~/programming/flagser master [?? 4]
$ 

Best,
Daniel

@MonkeyBreaker
Copy link
Collaborator Author

Hi Daniel,

Thank you for the feedback.

I work on Ubuntu 18.04, my compiler is GCC version 7.5.0.
I work as you on latest commit on master.
It consistently segfaults, I cleaned everything, recompiled and error still raises.

Best,
Julián

@MonkeyBreaker
Copy link
Collaborator Author

MonkeyBreaker commented Mar 23, 2020

I profiled the error, it's raised in compute_filtration_t, line 83.

The segfaults raises because bdry.vertex(i) returns numbers greater than 32000. In my opinion, in method vertex it's reading pass vertices, when I look at the values of index and i in method vertex, they both equal to 1. I suppose, it then tries to read vertices[2], which is out of bounds.

I tried to change the rule from vertices[index + (index < i ? 0 : 1)]; to vertices[index + (index <= i ? 0 : 1)];, this fixes the segmentation fault, but then my barcodes are wrong.

Do you have an idea how to fix this ?

A supposition on why this does work on your machine, is because the memory is maybe initialized to 0, then if it reads out of bouds, it's only reading 0. This is only a supposition.

Best,
Julián

@MonkeyBreaker
Copy link
Collaborator Author

MonkeyBreaker commented Mar 23, 2020

Well I tested changing as follow in compute_filtration_t:

from:

boundary_filtration[i] = current_filtration[bdry.vertex(i)];

To:

auto tmp = bdry.vertex(i);
boundary_filtration[i] = current_filtration[tmp > size ? 0 : tmp];

No segfaults is raised and the barcodes are good. BTW, I let the condition original vertices[index + (index < i ? 0 : 1)];.

It's a hacky way of achieving something that seems to work, but I would like to have your feedback.

Best,
Julián

PS: Sorry for the spam ...

@luetge
Copy link
Owner

luetge commented Apr 4, 2020

Just checked, for me bdry.vertex(i) never gives high values, when I print all those values my output is:

4 
3 
1 
0 
0 
4 
0 
2 
0 
3 
0 
4 
0 
0 
2 
0 
3 
0 
4 
0 

Are you sure that the input file is not modified? Can you print those numbers? If anything, then the problem should lie there, I think. Checked it on MacOS and CentOS.

Thanks for checking!

@MonkeyBreaker
Copy link
Collaborator Author

MonkeyBreaker commented Apr 4, 2020

Hi,

Really interesting the issue I'm encountering. Now that I updated to latest commit, sometimes it work and sometimes it generated the segfault. I slightly modified the printing, because when streams are done in parallel, the output can be quite noisy.

When it works, I encounter this output for bdry.vertex(i):

2omputing the filtration of all edges
0
1
3
0
4
0
2
0
3
0
4
0
4
3
0
4
0
0
0
The dimensions of the mod-2 homology of the full complex are:

dim H_0 = 1
dim H_1 = 0

It's different from yours, because this method is run in parallel, now that I'm thinking about it, maybe I'm encountering a race condition, but strange why only with filtration equals vertex_degree.

When it fails, I've usually something that shows this:

1omputing the filtration of all edges
0
2
0
3
0
4
0
2
0
3
0
4
0
4
32687
32687 vs 5
[2]    24325 segmentation fault (core dumped)  ./flagser --out ./test/tmp_d5_vertex_degree --out-format barcode --filtration

BTW, I'm running my tests with the following command: ./flagser --out ./test/tmp_d5_vertex_degree --out-format barcode --filtration vertex_degree --undirected --max-dim 1 ./test/d5.flag

@MonkeyBreaker
Copy link
Collaborator Author

And about the input data, I printed it when it fails and when it success, both the same:

good bad
graph.vertex_filtration graph.vertex_filtration
0.33 0.33
0.44 0.44
0.89 0.89
0.36 0.36
0.11 0.11
graph.edge_filtration graph.edge_filtration
1.01 1.01
1.18 1.18
1.125 1.125
1.77 1.77
1.655 1.655
1.755 1.755
1.145 1.145
1.34 1.34
1.2 1.2
1.295 1.295

I'll try to explore the possibility of a race condition.
If you have any other suggestion, let me know.

@MonkeyBreaker
Copy link
Collaborator Author

MonkeyBreaker commented Apr 6, 2020

I think I found out the issue, the backtrace (not the complete, just important points) is as follow:

0. boundary_filtration[i] = current_filtration[bdry.vertex(i)]; // i == 1
1. f(prefix, prefix_size); // prefix_size == 2
2. complex.for_each_cell(compute_filtration, 1);
  1. code
  2. code
  3. code

The segfault arise because bdry.vertex(i) returns a big big number (>32000). The reason is because bdry.vertex(i) calls return vertices[index + (index < i ? 0 : 1)];. When printing index + (index < i ? 0 : 1) reveals this:

1 // return vertices[1];
2 // return vertices[2];

Which is correct in the sense of the current implementation:

for (int i = 0; i < size; i++) {
        // Creates object of type directed_flag_complex_boundary_cell_t with bdry.i == i
	auto bdry = cell.boundary(i);

	if (!cell_hash.size()) {
	    std::cerr << (std::to_string(bdry.vertex(i)) + " vs " + std::to_string(current_filtration.size()) + "\n");
	    if (bdry.vertex(i) >= current_filtration.size()) {
	        std::cerr << bdry.vertex(i) << " vs " << current_filtration.size() << std::endl;
	    }
	    boundary_filtration[i] = current_filtration[bdry.vertex(i)];
	} else {
		...
	}
}
// When i == 0 in the previous for loop we obtain this bdry.vertex(0)
return vertices[0 + (0 < 0 ? 0 : 1)]; // vertices[0 + 1]
// When i == 1 in the previous for loop we obtain this bdry.vertex(1)
return vertices[1 + (1 < 1 ? 0 : 1)]; // vertices[1 + 1]

But the problem is that vertices which is a pointer to prefix which is a pointer to std::vector<vertex_index_t> prefix(max_dimension + 1);, and that vector is of size 2.

From the previous line, max_dimension is equal to 1 and combined with the +1, produces a vector of size 2. You see where I'm going 😉 ? As showed before, in bdry.vertex(i) we access positions 1 and 2 of the vector, the problem is that our vector only contains positions 0 and 1.
This means we're reading unallocated memory.

A easy hack would be to change std::vector<vertex_index_t> prefix(max_dimension + 1); to std::vector<vertex_index_t> prefix(max_dimension + 2, 0);, but in my opinion it's quite dirty. Otherwise, my understanding of the algorithm does not allow me to find a clever fix ...

Hopefully all this debug of the code helps you,

Julián

@luetge
Copy link
Owner

luetge commented Apr 10, 2020

Hi,

Thanks for digging this deep! Indeed there is a problem here, to lookup the filtration values of the vertices of an edge we looked at the i-th vertex of the i-th face, i.e. the 0th vertex of the 0th face and the 1st vertex of the first face. That second one is obviously non-existent since a face of an edge only has a single vertex. Switching to reading the 0th vertex of the i-th face fixes that problem.

Please test out the code in #19 , I will merge as soon as you confirm. Thanks!

Best,
Daniel

@MonkeyBreaker
Copy link
Collaborator Author

Hi,

Niceuu !
Fix works like a charm, I also tested for every filtration and also tested the persistences found with each filtration with dataset d5.
Works perfectly 😄 !

Have a nice week-end and take care,
Julián

@luetge
Copy link
Owner

luetge commented Apr 10, 2020

Happy to hear that. Thanks a lot for all the debugging, made it super easy to fix it.

Have a great weekend, and stay healthy!

Cheers,
Daniel

@luetge luetge closed this as completed Apr 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants