Posts by srirajpaul

srirajpaul · General

That would be great.
For powerpc include the barrier while unlocking also.

srirajpaul · General

Hello,

We solved the problem for multicore execution in power7. The problem is that Power 7 has a relaxed memory model.

We solved it as follows. There was no test_and_set defined for power7 in the pan.c file

We included it as :
"
#elif defined(__powerpc64__)
int
tas(volatile int *s)
{
int r;
r = __fetch_and_or(s, 1);
__isync();
return r;
}
"

The __isync() prevents any instructions inside the critical section from moving above the the __fetch_and_or.

Similarly at the end of the critical section, that is where sh_lock[which] is set to zero ( sh_lock[which] = 0; /* unlock */ ) we need a barrier. We should prevent any memory operations inside critical section from moving outside this unlock. For this we can use __lwsync();
We included it as :
"
__lwsync(); sh_lock[which] = 0; /* unlock */
"

__isync and __lwsync are lighter than full sync. __fetch_and_or() gives a warning since its definition uses unsigned int whereas we pass int. These atomic functions require xlc [1] compiler (not gcc).

Possible gcc replacements would be [2]:
__lwsync() -> __sync_synchronize()
__fetch_and_or(s, 1); __isync(); -> __sync_lock_test_and_set()

[1]: http://www-01.ibm.com/support/docview.wss?uid=swg27024742&aid=1
[2]: https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

srirajpaul · Bug Reports

Hi,
Sorry for the post.
The single core run found an assert violation which was a real assert violation.
But we were not able to see it in multicore run because, the trace was not full. We enabled -FULL_TRAIL, but still the trace didn't had the error.
The violation was happening in future. If the trace had printed a few more lines it would have reached the assert violation.

Thank you

srirajpaul · Bug Reports

Hi, I am using version 6.4.0.
The same error comes in older versions also, like 6.2.7.

With 32 cores it took around 3 hours to get this error. I am not sure whether I can get the same error with single core DFS version.
I have given two runs for 24 and 72 hours in the standard (single-core) DFS version. I will report it as soon as it completes.

Thank you

srirajpaul · Bug Reports

Hi,
I am getting an assert failure, but the value is true.

My program is structured as follows:

do
:: case_1 ->
:: case_2 ->
d_step{

.
.
.
printf("%d\n", func(me));
assert(func(me));
}
od;

I am model checking the same in multicore mode(32 threads). This is resulting in an assert failure.
The value of func(me) is printed as 1, yet the assert fails. From the trace I manually
evaluated the value of func(me) and that also turns out to be true.

Model checking does not terminate immediately after the 'claimed' assertion violation.
The process stops after case_2 in the next iteration, and then reports the assertion violation;
this was determined from the trace generated via spin -t.

The assertion violation can be seen at line 360 of output.txt [1]
This assertion is called on line 2965 on the file new_pe.pml [1]
The trace file is new_pe.pml.trail [1]

To get an idea of what would be happening I tried to run using the trace file as
spin -t new_pe.pml > trace.txt

My understanding is that the problem is coming from "proc 5"
Proc 5 represents process_[4] in our model since init{} occupies proc 0.
Line 236 of trace.txt outputs - "[4]: transient_count_phase_node 1"
shows that the assert is evaluated to true and also the d_step is over.
After that other processes are executing. Later control comes back to
process 5 and it executes case_2 (line 2835 of new_pe.pml) and it fails
reporting assertion failure from previous iteration.

The model was a large and hence I changed the Spin source.
I changed Line 16 in Src/dstep.c to "#define MAXDSTEP 8196 /* was 512 */"
The default value was 2048.

I ran it in an AMD opteron platform as follows
I compiled the code as
gcc -O3 -DVECTORSZ=72000 -DCOLLAPSE -DSAFETY -DMEMLIM=62000 -DNCORE=32 -DFULL_TRAIL pan.c -o phaser_f_32_ft
And ran as just ./phaser_f_32_ft

I also tried it in IBM POWER 7 using xlc compiler. There also the same problem happens.

I am running the same program using different starting configuration. The problem occurred only in this one.

Thank you
Sriraj

[1] Files are uploaded at http://srp7.web.rice.edu/

srirajpaul · General

Hi,

I have a model[1] which is quite big because of which I am doing randomized search instead of complete one.
The hash factor comes to be around 2. We are trying to get more coverage be running it with random seeds.
Is there any pointer to how many random seeds need to used to get some good coverage?
Also is there some guidance in the selection of the seeds so that the state space explored don't overlap so much?

Thank you
Sriraj

[1]:http://srp7.web.rice.edu/new_pe.pml

srirajpaul · General

Thank you. Now it is working with different random seeds.

srirajpaul · General

Thank you very much. That will be very helpful.

srirajpaul · General

Is there a time frame by which the SPIN 6.2.7 will be released?

Since the model is large, we get a hash factor as small as 2. We are relying on randomized search to get some more coverage.
Is there some way to get a patched version of SPIN that fixes this problem or is there some work around that we can try at our side?
We usually create the spin executable from the sources.

Thank you
Sriraj

srirajpaul · General

I am trying to do a randomized search process using the -DT_RAND and -DP_RAND option. When I execute the code it runs fine for some seeds, but gets stuck for few others.

For example when I tried to run with -RS38 it worked fine.
But when I tried -RS59 it got stuck. In this case it gets stuck as soon as it find the first collision. When I do CTRL+C and come out, the result shows as follows

State-vector 35684 byte, depth reached 4647, errors: 0
2324 states, stored
1 states, matched
2325 transitions (= stored+matched)
0 atomic steps

I run it as follows
gcc -O2 -DVECTORSZ=64000 -DBITSTATE -DP_RAND -DT_RAND pan.c
./a.out -a -RS59 -w20

The updated model is put at at http://srp7.web.rice.edu/new_pe.pml

Thank you
Sriraj

srirajpaul · General

The model is put at http://srp7.web.rice.edu/new_pe.pml

I tried with the -DBFS_PAR option. It is running in parallel. But irrespective of what is specified using -DNCORE, it always uses n-1 cores, where n is the number of cores in the machine.
Also since it is using BFS, the depth explored is very short. The depth reached usually goes to around 4000 when I run it in the normal DFS way. Since the state space is too large, I am using -DBITSTATE also.

I ran it as:
spin -a new_pe.pml
gcc -DVECTORSZ=64000 -DBITSTATE -DSEP_STATE -DNCORE=2 -DFULL_TRAIL -DVMAX=35000 pan.c
./a.out -a -w20 -Nmorton_order1

In this link [1], the "Liveness Verification" section tells "Using a larger number of cores than 2 is not useful with the exisiting implementation". Does that mean when LTL is used we can use only 2 cores?

[1]: http://www.spinroot.com/spin/multicore/V5_Readme.html

srirajpaul · General

Hi,

I am trying to verify a model using LTLs with multicore options. But when I execute, it only runs on one CPU(CPU 0). Other CPUs show everything as zero. Output of one sample run is as follows.

e.g:
0: Claim morton_order1 (3), from state 6
cpu1: Depth= 0 States= 0 Transitions= 0 Memory= 0.000 t= 185 R= 0
cpu0: Depth= 4329 States= 4.97e+05 Transitions= 9.98e+05 Memory= 251.567 t= 185 R= 3e+03

But the documentation tells it can run on dual core[1]. I am using Spin version 6.2.5.

I complied using options "gcc -DVECTORSZ=64000 -DBITSTATE -DSEP_STATE -DNCORE=2 -DFULL_TRAIL -DVMAX=35000 "
and ran using the options "./a.out -a -w20".

Is there something I am doing wrong that makes it run on only one core?

Also all my LTL's are 'always' properties([]). My assumption was that liveness properties are the ones that come with 'eventually' operator(<>) and safety properties are the one with 'always' operator([]).
Is it true that if we use only 'always' properties we can use N cores and when we use 'eventually' operator it needs to restricted to 2 cores?

Thank you
Sriraj

[1]:http://www.spinroot.com/spin/multicore/V5_Readme.html

srirajpaul · General

Hi,

I am trying to run spin using multicore options. It runs perfectly fine on intel processors. But I want to run it on a POWER7 processor. The pan.c generated doesn't seem to have code for POWER7. So we included the code for POWER7 in it. Specifically what we did was to include the test and set(tas) code for POWER7. There is place in pan.c where it defines tas operation for different processors. There we defined the tas for power 7 as follows using the atomic api in IBM C compiler(xlc)

#else
/*IBM */
int
tas(volatile int *s)
{
int r;
r = __fetch_and_or(s, 1);
return r;
}
//#error missing definition of test and set operation for this platform
#endif

The api docs of fetch_and_or tells [1]:
unsigned int __fetch_and_or (volatile unsigned int* addr, unsigned int val) - Sets bits in the word or doubleword specified by addr by OR-ing that value with the value specified val, in a single atomic operation, and returns the original value of addr.

It runs perfectly on some executions(it gives a speed of around 20000 states/sec for 12 cores). But in some executions it goes very slow and we have to forcefully stop the execution. In those cases the speed of around 100-500 states/sec. While doing Ctrl+c, we take the statcktrace and in all the traces it is in the function Get_Full_Frame in thread 1.

#0 0x00000000100646a8 in Get_Full_Frame (n=0) at pan.c:5636
#1 0x0000000010065818 in Read_Queue (q=0) at pan.c:5982
#2 0x0000000010066714 in mem_get () at pan.c:5502
#3 0x00000000100669e0 in do_the_search () at pan.c:7186
#4 0x0000000010067d9c in run () at pan.c:2574
#5 0x0000000010069850 in main (argc=1, argv=0xfffffffee38) at pan.c:10487

Equivalently we tried with gcc using the api "__sync_fetch_and_or". It is also giving same result(getting stuck in some executions).

Is there something more we need to do while trying to include a new architecture?

[1]:http://publib.boulder.ibm.com/infocenter/cellcomp/v101v121/index.jsp?topic=/com.ibm.xlcpp101.cell.doc/compiler_ref/bif_fetch_and_or_fetch_and_orlp.html

Thank you
Sriraj

Spinroot

#1 Re: General » Running spin with multicore options in POWER7 » 2014-10-26 06:03:19

#2 Re: General » Running spin with multicore options in POWER7 » 2014-10-24 22:43:54

#3 Re: Bug Reports » Assert fails when the value it is asserting is true » 2014-10-07 00:55:22

#4 Re: Bug Reports » Assert fails when the value it is asserting is true » 2014-09-29 20:09:03

#5 Bug Reports » Assert fails when the value it is asserting is true » 2014-09-24 22:47:40

#6 General » Randomized search coverage » 2014-03-08 10:04:01

#7 Re: General » Randomized search process getting stuck for some seeds » 2014-03-05 22:52:27

#8 Re: General » Randomized search process getting stuck for some seeds » 2014-02-28 17:28:28

#9 Re: General » Randomized search process getting stuck for some seeds » 2014-02-24 18:22:29

#10 General » Randomized search process getting stuck for some seeds » 2014-02-21 23:41:27

#11 Re: General » Using LTL with multicore, only one CPU getting used » 2014-02-15 05:43:11

#12 General » Using LTL with multicore, only one CPU getting used » 2014-02-12 07:24:21

#13 General » Running spin with multicore options in POWER7 » 2013-11-18 21:38:32

Board footer