Spinroot

A forum for Spin users


#1 2012-10-22 11:06:50

Mara
Member
Registered: 2012-05-30
Posts: 24

Analysis of pan output and memory adjustment in multicore run

Hi all,
I'm trying to run my verification on a multicore machine (11 GB of free RAM, 8 cores). I've changed shmmax and shmall as suggested on the Spin website.
And then:


gcc -o pan pan.c -DMEMLIM=8192 -DNOFAIR -DSAFETY -DNCORE=4 -DVMAX=88 -DUSE_DISK -DFULL_TRAIL -DCOLLAPSE
pan -E -e -v -w27

I need the full traces, and I need all the traces. The -DVMAX=88 value was suggested by pan.

When I compile pan without the -DUSE_DISK flag, memory is consumed very rapidly, so I specified it. The problem is that 70 GB of disk space is also consumed very quickly, without giving me better results (in terms of obtained traces and portion of the explored tree) than a single-processor run with only 2 GB of RAM!
I also tried the -DCOLLAPSE flag, but nothing changed noticeably.

Is there something I'm missing in the memory or flag settings? Or is this normal behavior, due to the overhead of a multicore run?
Thanks in advance for your attention.
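For reference, the shared-memory limits mentioned above are the Linux kernel parameters kernel.shmmax and kernel.shmall; enlarging them looks roughly like this (a sketch with hypothetical values sized for ~8 GB of shared segments; your machine may need different values):

```shell
# Hypothetical values: allow shared segments of up to 8 GB total.
# shmmax is in bytes, shmall in pages (assuming 4 KB pages here).
sysctl -w kernel.shmmax=8589934592   # 8 GB
sysctl -w kernel.shmall=2097152      # 8 GB / 4 KB per page
```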

Mara

---------------------------------------------

cpu0: step 0: -DMEMLIM=8192 Mb minus hashtable+workqs (1024 + 128.043 Mb) leaves 7039.33 Mb
cpu0: step 1: allocate shared hastable 1024 Mb
cpu0: step 2: pre-allocate shared state memory 7039.33 Mb
cpu0: setting segsize to 1759.83 MB
cpu0: step 2+: pre-allocate memory arena 0 of 1.8e+03 Mb
cpu0: step 2+: pre-allocate memory arena 1 of 1.8e+03 Mb
cpu0: step 2+: pre-allocate memory arena 2 of 1.8e+03 Mb
cpu0: step 2+: pre-allocate memory arena 3 of 1.8e+03 Mb
cpu0: step 3: allocate shared workqueues 128.043 MB
cpu0: step 4: calling fork()
cpu1: starting core_id 1 -- pid 27806
cpu0: starting core_id 0 -- pid 27805
cpu0: local heap 1759 MB
cpu2: starting core_id 2 -- pid 27807
cpu2: local heap 1759 MB
cpu3: starting core_id 3 -- pid 27808
cpu1: local heap 1759 MB
cpu3: local heap 1759 MB
cpu0: created temporary diskfile Q000_000.tmp
cpu2: created temporary diskfile Q000_002.tmp
cpu3: created temporary diskfile Q000_003.tmp
cpu1: created temporary diskfile Q000_001.tmp
cpu0: created temporary diskfile Q001_000.tmp
cpu2: created temporary diskfile Q001_002.tmp
cpu3: created temporary diskfile Q001_003.tmp
cpu1: created temporary diskfile Q001_001.tmp
cpu0: created temporary diskfile Q002_000.tmp
cpu2: created temporary diskfile Q002_002.tmp
cpu3: created temporary diskfile Q002_003.tmp
cpu1: created temporary diskfile Q002_001.tmp
cpu0: created temporary diskfile Q003_000.tmp
cpu2: created temporary diskfile Q003_002.tmp
cpu1: created temporary diskfile Q003_001.tmp
cpu3: created temporary diskfile Q003_003.tmp
cpu0: created temporary diskfile Q004_000.tmp
cpu1: created temporary diskfile Q004_001.tmp
cpu2: created temporary diskfile Q004_002.tmp
cpu2: created temporary diskfile Q005_002.tmp
cpu0: created temporary diskfile Q005_000.tmp
cpu3: created temporary diskfile Q004_003.tmp
cpu1: created temporary diskfile Q005_001.tmp
cpu3: created temporary diskfile Q005_003.tmp
cpu2: created temporary diskfile Q006_002.tmp
cpu0: created temporary diskfile Q006_000.tmp
cpu1: created temporary diskfile Q006_001.tmp
cpu3: created temporary diskfile Q006_003.tmp
cpu2: created temporary diskfile Q007_002.tmp
cpu0: created temporary diskfile Q007_000.tmp
cpu3: created temporary diskfile Q007_003.tmp
cpu2: created temporary diskfile Q008_002.tmp
cpu0: created temporary diskfile Q008_000.tmp
cpu1: created temporary diskfile Q007_001.tmp
cpu3: created temporary diskfile Q008_003.tmp
cpu2: created temporary diskfile Q009_002.tmp
cpu1: created temporary diskfile Q008_001.tmp
cpu0: created temporary diskfile Q009_000.tmp
cpu3: created temporary diskfile Q009_003.tmp
cpu0: created temporary diskfile Q010_000.tmp
cpu1: created temporary diskfile Q009_001.tmp
cpu2: created temporary diskfile Q010_002.tmp
cpu3: created temporary diskfile Q010_003.tmp
cpu1: created temporary diskfile Q010_001.tmp
cpu0: created temporary diskfile Q011_000.tmp
cpu2: created temporary diskfile Q011_002.tmp
cpu3: created temporary diskfile Q011_003.tmp
cpu1: created temporary diskfile Q011_001.tmp
cpu0: created temporary diskfile Q012_000.tmp
cpu2: created temporary diskfile Q012_002.tmp
cpu3: created temporary diskfile Q012_003.tmp
cpu1: created temporary diskfile Q012_001.tmp
cpu0: created temporary diskfile Q013_000.tmp
cpu3: created temporary diskfile Q013_003.tmp
cpu2: created temporary diskfile Q013_002.tmp
cpu0: created temporary diskfile Q014_000.tmp
cpu1: created temporary diskfile Q013_001.tmp
cpu3: created temporary diskfile Q014_003.tmp
cpu2: created temporary diskfile Q014_002.tmp
cpu0: created temporary diskfile Q015_000.tmp
cpu1: created temporary diskfile Q014_001.tmp
cpu3: created temporary diskfile Q015_003.tmp
cpu2: created temporary diskfile Q015_002.tmp
cpu1: created temporary diskfile Q015_001.tmp
cpu0: created temporary diskfile Q016_000.tmp
cpu3: created temporary diskfile Q016_003.tmp
cpu2: created temporary diskfile Q016_002.tmp
cpu3: created temporary diskfile Q017_003.tmp
cpu0: created temporary diskfile Q017_000.tmp
cpu2: created temporary diskfile Q017_002.tmp
cpu1: created temporary diskfile Q016_001.tmp
cpu0: created temporary diskfile Q018_000.tmp
cpu3: created temporary diskfile Q018_003.tmp
cpu1: created temporary diskfile Q017_001.tmp
cpu2: created temporary diskfile Q018_002.tmp
cpu0: created temporary diskfile Q019_000.tmp
cpu1: created temporary diskfile Q018_001.tmp
cpu3: created temporary diskfile Q019_003.tmp
cpu2: created temporary diskfile Q019_002.tmp
cpu2: Depth=      40 States=    1e+06 Transitions= 4.74e+06 Memory=   106.812    t=      365 R=   3e+03
cpu3: Depth=      40 States=    1e+06 Transitions= 4.78e+06 Memory=   106.812    t=      374 R=   3e+03
cpu0: created temporary diskfile Q020_000.tmp
cpu1: Depth=      40 States=    1e+06 Transitions=  5.1e+06 Memory=   106.812    t=      379 R=   3e+03
cpu3: created temporary diskfile Q020_003.tmp
cpu2: created temporary diskfile Q020_002.tmp
cpu1: created temporary diskfile Q019_001.tmp
cpu0: created temporary diskfile Q021_000.tmp
cpu3: created temporary diskfile Q021_003.tmp
cpu1: created temporary diskfile Q020_001.tmp
cpu2: created temporary diskfile Q021_002.tmp
cpu0: created temporary diskfile Q022_000.tmp
cpu2: created temporary diskfile Q022_002.tmp
cpu3: created temporary diskfile Q022_003.tmp
cpu1: created temporary diskfile Q021_001.tmp
cpu0: created temporary diskfile Q023_000.tmp
cpu2: created temporary diskfile Q023_002.tmp
cpu1: created temporary diskfile Q022_001.tmp
cpu3: created temporary diskfile Q023_003.tmp
cpu0: created temporary diskfile Q024_000.tmp
cpu2: created temporary diskfile Q024_002.tmp
cpu1: created temporary diskfile Q023_001.tmp
cpu3: created temporary diskfile Q024_003.tmp
cpu0: Depth=      20 States=    1e+06 Transitions= 4.32e+06 Memory=  1259.486    t=      474 R=   2e+03
cpu1: created temporary diskfile Q024_001.tmp
cpu3: created temporary diskfile Q025_003.tmp
cpu0: created temporary diskfile Q025_000.tmp
cpu2: created temporary diskfile Q025_002.tmp
cpu0: created temporary diskfile Q026_000.tmp
cpu1: created temporary diskfile Q025_001.tmp
cpu2: created temporary diskfile Q026_002.tmp
cpu3: created temporary diskfile Q026_003.tmp
cpu0: created temporary diskfile Q027_000.tmp
cpu1: created temporary diskfile Q026_001.tmp
cpu2: created temporary diskfile Q027_002.tmp
cpu3: created temporary diskfile Q027_003.tmp
cpu0: created temporary diskfile Q028_000.tmp
cpu1: created temporary diskfile Q027_001.tmp
cpu2: created temporary diskfile Q028_002.tmp
cpu3: created temporary diskfile Q028_003.tmp
cpu0: created temporary diskfile Q029_000.tmp
cpu1: created temporary diskfile Q028_001.tmp
cpu2: created temporary diskfile Q029_002.tmp
cpu3: created temporary diskfile Q029_003.tmp
cpu0: created temporary diskfile Q030_000.tmp
cpu1: created temporary diskfile Q029_001.tmp
cpu3: created temporary diskfile Q030_003.tmp
cpu2: created temporary diskfile Q030_002.tmp
cpu1: created temporary diskfile Q030_001.tmp
cpu0: created temporary diskfile Q031_000.tmp
cpu2: created temporary diskfile Q031_002.tmp
cpu3: created temporary diskfile Q031_003.tmp
cpu1: created temporary diskfile Q031_001.tmp
cpu0: created temporary diskfile Q032_000.tmp
cpu3: created temporary diskfile Q032_003.tmp
cpu2: created temporary diskfile Q032_002.tmp
cpu0: created temporary diskfile Q033_000.tmp
cpu1: created temporary diskfile Q032_001.tmp
cpu2: created temporary diskfile Q033_002.tmp
cpu3: created temporary diskfile Q033_003.tmp
cpu0: created temporary diskfile Q034_000.tmp
cpu2: created temporary diskfile Q034_002.tmp
cpu1: created temporary diskfile Q033_001.tmp
cpu3: created temporary diskfile Q034_003.tmp
cpu0: created temporary diskfile Q035_000.tmp
cpu2: created temporary diskfile Q035_002.tmp
cpu1: created temporary diskfile Q034_001.tmp
cpu3: created temporary diskfile Q035_003.tmp
cpu0: created temporary diskfile Q036_000.tmp
cpu2: created temporary diskfile Q036_002.tmp
cpu1: created temporary diskfile Q035_001.tmp
cpu3: created temporary diskfile Q036_003.tmp
cpu0: created temporary diskfile Q037_000.tmp
cpu1: created temporary diskfile Q036_001.tmp
cpu2: created temporary diskfile Q037_002.tmp
cpu1: pan:1: aborting -- disk write failed (disk full?) (at depth 41)
cpu2: pan:1: aborting --pan: wrote prova_meta1_low_values.pml1.trail
cpu0: pan: wrote trailfile
pan: wrote prova_meta1_low_values.pml1.cpu1_trail
cpu1: pan: wrote trailfile
pan: wrote prova_meta1_low_values.pml1.cpu2_trail
cpu2: pan: wrote trailfile
cpu1: stop - Uerror
cpu1: termination initiated (4)
cpu1: wrapup -- 1 error(s)
cpu0: stop - Uerror
cpu0: termination initiated (4)
cpu0: wrapup -- 1 error(s)
cpu2: stop - Uerror
cpu2: termination initiated (4)
cpu2:
(Spin Version 6.2.2 cpu1: dsk_written 36120271 states in 37 files
cpu1: dsk_drained      0 states
cpu2: dsk_written 37095175 states in 38 files
cpu2: dsk_drained      0 states
  1512419 states, stored
  5484542 states, matched
  6996961 transitions (= stored+matched)
        1 atomic steps
hash conflicts:    221721 (resolved)

Stats on memory usage (in Megabytes):
  150.005    equivalent memory usage for states (stored*(State-vector + overhead))
7039.325    shared memory reserved for state storage
        in 4 local heaps of 1759.831 MB each

  161.642    actual memory usage for states
1024.000    memory used for hash table (-w27)
    0.534    memory used for DFS stack (-m10000)
  128.043    shared memory used for work-queues
        in 4 queues of   0.011 MB each + a global q of 128.000 MB
1314.219    total actual memory usage


nr of templates: [ 0:globals 1:chans 2:procs ]
collapse counts: [ 0:7373641 2:1 3:1 4:2 ]
unreached in proctype R_proc
    prova_meta1_low_values.pml:104, state 199, "-end-"
    (1 of 199 states)
unreached in proctype assert_proc
    prova_meta1_low_values.pml:109, state 2, "assert((HC00012_c<6))"
    prova_meta1_low_values.pml:110, state 3, "-end-"
    (2 of 3 states)
unreached in init
    (0 of 4 states)
cpu0: dsk_written 37419599 states in 38 files
cpu0: dsk_drained      0 states
cpu0: done, 0 Mb of shared state memory left
cpu0: done -- got 0 states from queue

cpu0: elapsed time 749 seconds (1.51242e+06 states visited)
cpu0: rate 2020.46 states/second
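For reference, the memory figures at the start of the log follow from the compile- and run-time settings; a quick back-of-the-envelope check (assuming 8-byte hash-table slots on a 64-bit machine; pan's exact numbers are slightly lower due to internal bookkeeping):

```python
# Sketch of pan's memory partitioning for -DMEMLIM=8192, -w27, NCORE=4.
MEMLIM_MB = 8192            # -DMEMLIM=8192 (MB)
NCORE = 4                   # -DNCORE=4
W = 27                      # pan -w27: hash table has 2**27 slots

# 2**27 slots of 8 bytes = 1024 MB, matching the reported
# "memory used for hash table (-w27)".
hashtable_mb = 2**W * 8 / 2**20

workq_mb = 128.043          # shared work-queues, as reported by pan

# The remainder is pre-allocated as shared state memory and split into
# NCORE equal arenas ("segsize"); pan reports 7039.33 and 1759.83 MB,
# a little below these figures because of bookkeeping overhead.
state_mb = MEMLIM_MB - hashtable_mb - workq_mb
segsize_mb = state_mb / NCORE

print(hashtable_mb)          # 1024.0
print(round(segsize_mb, 2))  # ~1759.99
```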


#2 2012-10-23 04:42:15

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: Analysis of pan output and memory adjustment in multicore run

In this case you may be better off with a single-core bitstate run.
Note, though, that you cannot get "all the traces", because that's not how the model checker works.
Spin only guarantees that if at least one counterexample (i.e., error sequence) exists, then it will generate at least one such counterexample. It does not guarantee that it can generate 'all' counterexamples (although in practice it is generally able to produce a lot of them anyway...)
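A single-core bitstate compilation would look roughly like this (a sketch; keep the same run-time flags as before, and size -w to the available RAM -- here 2^32 bits, i.e. 512 MB of bit-array):

```shell
# Sketch: single-core bitstate (supertrace) verification.
# -DBITSTATE replaces exhaustive state storage with a bit array;
# with -wN at run time, the array holds 2^N bits (2^32 bits = 512 MB).
gcc -O2 -o pan pan.c -DBITSTATE -DSAFETY -DNOFAIR -DMEMLIM=8192
./pan -E -e -v -w32
```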


#3 2012-10-23 13:06:25

Mara
Member
Registered: 2012-05-30
Posts: 24

Re: Analysis of pan output and memory adjustment in multicore run

Thank you for your answer, as always.
I'm just thinking:

1) When I'm not using an approximate approach, and Spin does not stop at the first counterexample, I assumed that, theoretically speaking, without memory and time constraints, it would be able to explore the whole computation space, and in doing so would provide me with all the traces. Is this assumption wrong?

And if it is wrong (this is maybe connected with my other question about measuring the portion of the explored space), is it possible to know how Spin behaves during the generation of counterexamples? Could you direct me to a paper or publication with details on this subject?



2) If I understand correctly, I have no hope of benefiting from a multicore machine, since the problem is inherently tied to the size of my model's computation space. So it's better to use a single core with all the available memory, regardless of the storage model or the hashing method. Is this right?

3) You're suggesting that I use an approximate hashing method, which is not guaranteed to find some counterexamples. In this case too, is there no measure of the portion of the computation space that is covered? Is there any literature on the subject?

I know I'm asking "strange" things and making "strange" use of Spin, but I'm working on the verification of biological models, which differ considerably from the systems Spin was designed for.

Thank you very much for all your attention.

Mara


#4 2012-10-23 16:39:39

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: Analysis of pan output and memory adjustment in multicore run

Re 1:
For a depth-first search (which is what you'd need for the verification of LTL, or omega-regular properties in general) there is no guarantee that the search will exercise all complete paths in the reachability graph -- it only guarantees to visit all reachable states. Due to backtracking and the storage of previously visited states, it will not go down every path, e.g., if the path reaches a previously visited state -- even when the prefix of the path is different.
For a breadth-first search things work differently, but here too all states, not all full paths, are explored -- for instance, if there is a cycle in a path, the full path will only be explored with one iteration (or none, if that's a possibility) of the states within the cycle.
So this doesn't really have much to do with explicit decisions in Spin, beyond the use of these classic graph-search algorithms.
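The DFS point can be made concrete with a toy graph: once a state has been visited, the search backtracks, so some complete paths are never enumerated even though every state is reached (a generic sketch of the classic algorithm, not Spin's actual code):

```python
# Toy DFS over a graph with a "diamond": a->b->d and a->c->d.
# Once d has been visited via b, the path a,c,d is cut short at d,
# so not every complete path is generated -- only every state.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

visited = set()
complete_paths = []

def dfs(state, path):
    if state in visited:          # backtrack: state seen before
        return
    visited.add(state)
    path = path + [state]
    if not graph[state]:          # leaf: one fully explored path
        complete_paths.append(path)
    for succ in graph[state]:
        dfs(succ, path)

dfs("a", [])
print(sorted(visited))    # ['a', 'b', 'c', 'd'] -- all states reached
print(complete_paths)     # [['a', 'b', 'd']] -- but a,c,d is missed
```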

Re 2) You're likely running into memory-limit problems, and since the multi-core algorithms use more memory than the single-core ones (they aim to minimize runtime at the expense of using more memory), you're probably better off sticking with the single-core methods. Also, to save memory and to maximize the number of states you can explore, the bitstate option seems best: it allows you to explore a much larger part of the state space than you otherwise could on the machine you're using. (Better still, of course, would be to simplify the model so that there are fewer reachable states to explore.)
The bottom line in this case is that exploring a model with R >> M states on a machine that can only store M states is going to be lossy even in an exhaustive search mode: you will lose R-M states in those runs. It is then far better to use the bitstate method, which may still allow you to reach all R states (though with a non-zero probability that you will miss a few anyway).
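The bitstate idea itself can be sketched in a few lines: instead of storing full state vectors, only hash bits are recorded, so memory per state shrinks dramatically, at the cost of a small probability that two distinct states collide and one is wrongly treated as already visited (a generic one-hash illustration; pan actually uses multiple hash functions per state):

```python
import hashlib

# Toy bitstate store: a bit array of 2**20 bits (128 KB) instead of
# full state vectors. A state is "seen" iff its hash bit is set.
BITS = 2 ** 20
bitarray = bytearray(BITS // 8)

def _hash(state: str) -> int:
    digest = hashlib.sha256(state.encode()).digest()
    return int.from_bytes(digest[:8], "big") % BITS

def seen(state: str) -> bool:
    h = _hash(state)
    return bool(bitarray[h // 8] & (1 << (h % 8)))

def record(state: str) -> None:
    h = _hash(state)
    bitarray[h // 8] |= 1 << (h % 8)

record("state-42")
print(seen("state-42"))   # True
print(seen("state-43"))   # almost certainly False, but a hash
                          # collision would make it True, and that
                          # state would then be (wrongly) skipped
```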


#5 2012-10-24 10:01:35

Mara
Member
Registered: 2012-05-30
Posts: 24

Re: Analysis of pan output and memory adjustment in multicore run

Thank you very much for your extensive attention!
This is really a delicate point. At the moment I'm simply checking an invariant (an assert) at timeout. The idea behind it is that I'm interested in the largest possible number of counterexamples, since they can contain both biologically "normal" executions and "strange" (interesting) ones. Hence my need for "all" the traces. Being able to tell how often a "strange" execution occurs would be really important for me. I think I'll explore the bitstate option, as you suggest.

In the meantime (I report it for the record) I've tried using the minimized automaton representation... it is very, very slow, but so far it has not run out of memory.

I'll also try to simplify the model, but I think the only room left for that, in my case, is reducing the initial values of my variables. This should have a noticeable effect on the number of states to explore, right?
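On that last point: the reachable state space is bounded by the product of the variable ranges, so shrinking each range shrinks the bound multiplicatively (a back-of-the-envelope sketch with hypothetical numbers):

```python
from math import prod

# Hypothetical model with four counters: the state-space upper bound
# is the product of their value ranges (times control locations, etc.).
ranges_before = [10, 10, 10, 10]   # e.g. each variable ranges over 0..9
ranges_after  = [5, 5, 5, 5]       # halving each range

print(prod(ranges_before))  # 10000
print(prod(ranges_after))   # 625 -- a 16x reduction in the bound
```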

Thank you again,
Mara



Powered by FluxBB