Spinroot

A forum for Spin users

You are not logged in.

#1 2012-10-22 15:04:53

Mara
Member
Registered: 2012-05-30
Posts: 24

multicore BFS error (and broken link to the Preprint of Spin 2012 Work

Hi again.
I'm trying to run my verification qith the BFS approach on a multicore machine.

gcc -o pan pan.c -O2 -DBFS_PAR -DNOFAIR -DSAFETY -DBFS_DISK
./pan -E -e -v -u4

The message I get is "diskread: No space left on device" (see below for details).
If instead I specify the DMEMLIM flag

gcc -o pan pan.c -O2 -DBFS_PAR -DMEMLIM=2048 -DNOFAIR -DSAFETY -DBFS_DISK
./pan -E -e -v -u4

I get another error: "diskread: No such file or directory"  (see 2nd error, below, for details).

What I'm doing wrong?
Thank you VERY MUCH for all the answers,

Mara



--------- first error
cpu00: largest shared memory segment: 8160 MB
pan: using 4 cores
diskread: No space left on device
cpu03: pan:1: aborting (at depth 0)
cpu03: stop (bfs_Uerror)
cpu00: states sent       1 recvd       0 stored        1 sleeps:    0,    0, 761
cpu01: stop (early termination)
cpu02: stop (early termination)

(Spin Version 6.2.2 -- 6 June 2012)
    + Multi-Core (using 4 cores)
    + Breadth-First Search
    + Partial Order Reduction

Hash-Compact 4 search for:
    never claim             - (none specified)
    assertion violations    +
    cycle checks           - (disabled by -DSAFETY)
    invalid end states    - (disabled by -E flag)

State-vector 68 byte, depth reached 0, errors: 1
        1 states, stored
           1 nominal states (stored-atomic)
        0 states, matched
        1 transitions (= stored+matched)
        0 atomic steps
hash conflicts:         0 (resolved)

Stats on memory usage (in Megabytes):
    0.000    equivalent memory usage for states (stored*(State-vector + overhead))
  128.000    shared memory used for hash table (-w24)
    0.391    total non-shared memory usage

  672.033    total shared memory usage

------------------------  second error
gcc -o pan pan.c -O2 -DBFS_PAR -DMEMLIM=2048 -DNOFAIR -DSAFETY -DBFS_DISK
./pan -E -e -v -u4
cpu00: largest shared memory segment: 2048 MB
pan: using 4 cores
diskread: No such file or directory
cpu03: pan:1: aborting (at depth 0)
cpu03: stop (bfs_Uerror)
cpu00: states sent       1 recvd       0 stored        1 sleeps:    0,    0, 1834
cpu01: stop (early termination)
cpu02: stop (early termination)

(Spin Version 6.2.2 -- 6 June 2012)
    + Multi-Core (using 4 cores)
    + Breadth-First Search
    + Partial Order Reduction

Hash-Compact 4 search for:
    never claim             - (none specified)
    assertion violations    +
    cycle checks           - (disabled by -DSAFETY)
    invalid end states    - (disabled by -E flag)

State-vector 68 byte, depth reached 0, errors: 1
        1 states, stored
           1 nominal states (stored-atomic)
        0 states, matched
        1 transitions (= stored+matched)
        0 atomic steps
hash conflicts:         0 (resolved)

Stats on memory usage (in Megabytes):
    0.000    equivalent memory usage for states (stored*(State-vector + overhead))
  128.000    shared memory used for hash table (-w24)
    0.391    total non-shared memory usage

  672.033    total shared memory usage

Offline

#2 2012-10-23 04:46:29

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

If I am not mistaken, the -DBFS_DISK option isn't supported yet for parallel breadth-first search.
The compilation should of course have reported that and failed. Sorry about that.

Offline

#3 2012-10-23 11:41:52

Mara
Member
Registered: 2012-05-30
Posts: 24

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

Hi,
thank you very much  for your answer. I read of the -DBFS_DISK otion on the online page:
http://spinroot.com/spin/multicore/bfs.html
in the section New Directives. Maybe will be useful to specify also there that this option is not yet supported.
BTW the link Preprint of Spin 2012 Workshop paper in the same page is broken. Is this paper downloadable elsewere?
Thank you

Mara

Offline

#4 2012-10-23 16:26:12

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

I checked, and you were right: it is supported (though for -DBFS_PAR not quite recommended; it works well for -DBFS though).
I tried to do a run with another model, compiled precisely the same way you did, and running pan the same way you did, but do not see the same behavior. In my case it all works correctly.
One thing I noticed is that you have very little shared memory -- you can increase it by following the steps shown on:
http://spinroot.com/spin/multicore/setup.html
The error you get is extra strange because it is the perror (internal system error from the operating system) after a failed call to "read" -- i.e., a file read operation. How that can result in an "no space left on device" report I don't understand.
Also don't really understand how adding -DMEMLIM can change that into a "No such file" error.
Could you create a small version of your Spin model that can illustrate the problem, so that I can try to reproduce it here?

Re the paper -- you can find a working link at the bottom of this page: http://spinroot.com/gerard/pubs.html
(I'll fix the other link.)

Offline

#5 2012-10-24 09:26:38

Mara
Member
Registered: 2012-05-30
Posts: 24

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

Hi, thank you for your answers!

I tried with this reduced version of my pml program, and still gives me the same messages.
BTW, I have checked my shared memory (I had already followed the instruction on the site) and is kernel.shmmax = 8589934592, confirmed also by the message in first error  cpu00: largest shared memory segment: 8160 MB . I can change it up to 14Gb, ma I have no more than this. Is this quantity you're referring to or mabe you're talking about the   672.033    total shared memory usage?
Thank you again

Mara
Here follows the code (this is a reduced version, i removed a lot of atomic steps, but I did not removed all the contained variables, so the variable list is bigger than the really needed one.)

 byte HC00011_c = 5;
 byte HC00011_s = 10; 
 byte HC00012_c = 5; 
 byte HC00013_c = 5;
 byte HC00014_c = 5;
 byte HC00015_c = 5;
 byte HC00016_c = 5;
 byte HC00017_c = 10;
 byte HC00017_s = 5;
 byte HC00018_c = 10;
 byte HC00019_c = 10;
 byte HC00021_b = 5;
 byte HC00021_c = 5; 
 byte HC00023_c = 5;
 byte HC00023_n = 5;
 byte HC00032_c = 5;
 byte HC00032_p = 5;
 byte HC00034_c = 5;
 byte HC00036_c = 5;
 byte HC00040_c = 5;
 byte HC00040_s = 10;
 byte HC00045_c = 5;
 byte HC00056_c = 5;
 byte HC00066_c = 5;
 byte HC00066_n = 5;
 byte HC00068_c = 5;
 byte HC00083_c = 5;
 byte HC00083_p = 5;
 byte HC00088_c = 5;
 byte HC00094_c = 5;
 byte HC00097_c = 5;
 byte HC00102_c = 5;
 byte HC00109_c = 5;
 byte HC00116_c = 5;
 byte HC00125_c = 5;
 byte HC00126_c = 5;
 byte HC00126_n = 5;
 byte HC00140_c = 5;
 byte HC00163_c = 5;
 byte HC00186_c = 5;
 byte HC00212_c = 5;
 byte HC00214_c = 5;
 byte HC00235_c = 5;
 byte HC00244_c = 5;
 byte HC00244_n = 5;
 byte HC00360_c = 5;
 byte HC00436_c = 5;
 byte HC00460_c = 5;
 byte HC00662_c = 5;
 byte HC01115_c = 5;
 byte HC01162_c = 5;
 byte HC01522_c = 5;
 byte HC02129_c = 5;

/*
Only two processes are generated.
proctype R_proc contains all the reactions, 
proctype assert_proc contains the assert invariant to be checked
*/

proctype R_proc()
	{
	do
	:: atomic{((HC00014_c > 0) && (HC00125_c > 0))  -> HC00014_c=HC00014_c-1; HC00125_c=HC00125_c-1; HC00013_c=HC00013_c+1; HC00056_c=HC00056_c+2} /* REACTION NUMBER 21: 1 HC00014_c + 1 HC00125_c-->1 HC00013_c+2 HC00056_c*/
	:: atomic{((HC00036_c > 0) && (HC00056_c > 1))  -> HC00036_c=HC00036_c-1; HC00056_c=HC00056_c-2; HC00011_c=HC00011_c+2; HC00125_c=HC00125_c+1} /* REACTION NUMBER 87: 1 HC00036_c + 2 HC00056_c-->2 HC00011_c+1 HC00125_c*/
	:: atomic{((HC00094_c > 0))  -> HC00094_c=HC00094_c-1; HC00088_c=HC00088_c+1} /* REACTION NUMBER 192: 1 HC00094_c-->1 HC00088_c*/
	:: atomic{((HC00012_c > 0) && (HC00097_c > 0))  -> HC00012_c=HC00012_c-1; HC00097_c=HC00097_c-1; HC00018_c=HC00018_c+1; HC00662_c=HC00662_c+1} /* REACTION NUMBER 206: 1 HC00012_c + 1 HC00097_c-->1 HC00018_c+1 HC00662_c*/
	:: atomic{((HC00015_c > 0) && (HC00021_c > 0) && (HC00102_c > 0))  -> HC00015_c=HC00015_c-1; HC00021_c=HC00021_c-1; HC00102_c=HC00102_c-1; HC00011_c=HC00011_c+1; HC00016_c=HC00016_c+1; HC00212_c=HC00212_c+1} /* REACTION NUMBER 227: 1 HC00015_c + 1 HC00021_c + 1 HC00102_c-->1 HC00011_c+1 HC00016_c+1 HC00212_c*/
	:: atomic{((HC00011_c > 0) && (HC00045_c > 0) && (HC00140_c > 0))  -> HC00011_c=HC00011_c-1; HC00045_c=HC00045_c-1; HC00140_c=HC00140_c-1; HC00068_c=HC00068_c+1; HC00102_c=HC00102_c+1} /* REACTION NUMBER 230: 1 HC00011_c + 1 HC00045_c + 1 HC00140_c-->1 HC00068_c+1 HC00102_c*/
	:: atomic{((HC00109_c > 0))  -> HC00109_c=HC00109_c-1; HC00116_c=HC00116_c+1} /* REACTION NUMBER 243: 1 HC00109_c-->1 HC00116_c*/
	:: atomic{((HC00013_c > 0) && (HC00019_c > 0) && (HC00116_c > 0))  -> HC00013_c=HC00013_c-1; HC00019_c=HC00019_c-1; HC00116_c=HC00116_c-1; HC00014_c=HC00014_c+1; HC00214_c=HC00214_c+1} /* REACTION NUMBER 253: 1 HC00013_c + 1 HC00019_c + 1 HC00116_c-->1 HC00014_c+1 HC00214_c*/
	:: atomic{((HC00014_c > 0) && (HC00017_c > 0) && (HC00126_c > 0))  -> HC00014_c=HC00014_c-1; HC00017_c=HC00017_c-1; HC00126_c=HC00126_c-1; HC00011_c=HC00011_c+1; HC00013_c=HC00013_c+1; HC01162_c=HC01162_c+1} /* REACTION NUMBER 267: 1 HC00014_c + 1 HC00017_c + 1 HC00126_c-->1 HC00011_c+1 HC00013_c+1 HC01162_c*/
	:: atomic{((HC00066_n > 0) && (HC00244_n > 0))  -> HC00066_n=HC00066_n-1; HC00244_n=HC00244_n-1; HC00023_n=HC00023_n+1; HC00126_n=HC00126_n+1} /* REACTION NUMBER 269: 1 HC00066_n + 1 HC00244_n-->1 HC00023_n+1 HC00126_n*/
	:: atomic{((HC00015_c > 0) && (HC00360_c > 0))  -> HC00015_c=HC00015_c-1; HC00360_c=HC00360_c-1; HC00016_c=HC00016_c+1; HC00140_c=HC00140_c+1} /* REACTION NUMBER 293: 1 HC00015_c + 1 HC00360_c-->1 HC00016_c+1 HC00140_c*/
	od
	}

proctype assert_proc()
{

	timeout -> assert(HC00012_c < 6);
}


	
/* start the two processes*/
	init{
			atomic{run R_proc(); run assert_proc();}
	}
	

Offline

#6 2012-10-24 16:50:52

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

Thanks for the example -- I can reproduce the error as well and will prepare a fix.
For now, a workaround for the problem is if you simply declare the two proctypes as
active proctype R_proc() { ...}
and
active proctype assert_proc() { ... }
and omit the init {..} process

Offline

#7 2012-10-24 18:42:04

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

I've fixed the error -- the correct version will be part of Spin Version 6.2.3
(not sure yet when this will get into the main distribution though)
Thanks for the bug report!

Offline

#8 2012-10-25 09:20:34

Mara
Member
Registered: 2012-05-30
Posts: 24

Re: multicore BFS error (and broken link to the Preprint of Spin 2012 Work

Thank you for all the support!

Mara

Offline

Board footer

Powered by FluxBB