Spinroot

A forum for Spin users

#1 2013-11-18 21:38:32

srirajpaul
Member
Registered: 2013-09-07
Posts: 13

Running spin with multicore options in POWER7

Hi,

I am trying to run Spin with the multicore options. It runs perfectly well on Intel processors, but I want to run it on a POWER7 processor. The generated pan.c does not seem to have code for POWER7, so we added it. Specifically, we added the test-and-set (tas) code for POWER7. There is a place in pan.c that defines the tas operation for different processors. There we defined tas for POWER7 as follows, using the atomic API of the IBM C compiler (xlc):

        #else
                /*IBM */
                int
                tas(volatile int *s)
                {
                        int r;
                        r = __fetch_and_or(s, 1);
                        return r;
                }
                //#error missing definition of test and set operation for this platform
        #endif

The API documentation for __fetch_and_or says [1]:
unsigned int __fetch_and_or (volatile unsigned int* addr, unsigned int val) - Sets bits in the word or doubleword specified by addr by OR-ing that value with the value specified val, in a single atomic operation, and returns the original value of addr.
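
To make the return-value convention concrete, here is a minimal single-threaded sketch. It assumes gcc and uses the builtin __sync_fetch_and_or instead of xlc's __fetch_and_or; the helper names tas_fetch_or and tas_selfcheck are hypothetical and do not come from pan.c. It only illustrates what the caller sees (0 means the lock was free, 1 means it was already taken) and says nothing about memory ordering between threads.

```c
#include <assert.h>

/* Hypothetical helper: test-and-set built on GCC's __sync_fetch_and_or. */
static int tas_fetch_or(volatile int *s)
{
    return __sync_fetch_and_or(s, 1);   /* returns the value before the OR */
}

/* Single-threaded check of the return-value convention. */
static int tas_selfcheck(void)
{
    volatile int lock = 0;
    int first = tas_fetch_or(&lock);    /* lock was free, we now hold it */
    int second = tas_fetch_or(&lock);   /* lock was already set */

    assert(first == 0);
    assert(second == 1);
    assert(lock == 1);
    return first == 0 && second == 1;
}
```

A caller would spin while the helper returns a nonzero value and enter the critical section once it returns 0.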

It runs perfectly in some executions (about 20000 states/sec on 12 cores). But in other executions it becomes very slow and we have to stop the run forcibly. In those cases the speed is around 100-500 states/sec. When we press Ctrl+C we take the stack trace, and in every trace thread 1 is in the function Get_Full_Frame.

#0  0x00000000100646a8 in Get_Full_Frame (n=0) at pan.c:5636
#1  0x0000000010065818 in Read_Queue (q=0) at pan.c:5982
#2  0x0000000010066714 in mem_get () at pan.c:5502
#3  0x00000000100669e0 in do_the_search () at pan.c:7186
#4  0x0000000010067d9c in run () at pan.c:2574
#5  0x0000000010069850 in main (argc=1, argv=0xfffffffee38) at pan.c:10487

We also tried the equivalent with gcc, using the builtin "__sync_fetch_and_or". It gives the same result (getting stuck in some executions).

Is there something more we need to do when adding a new architecture?

[1]:http://publib.boulder.ibm.com/infocenter/cellcomp/v101v121/index.jsp?topic=/com.ibm.xlcpp101.cell.doc/compiler_ref/bif_fetch_and_or_fetch_and_orlp.html

Thank you
Sriraj

#2 2013-11-20 05:14:25

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: Running spin with multicore options in POWER7

It doesn't seem that the way you defined tas will perform the atomic function that is needed.
So effectively, this would not suffice to enforce mutual exclusion.

#3 2014-10-24 22:43:54

srirajpaul
Member
Registered: 2013-09-07
Posts: 13

Re: Running spin with multicore options in POWER7

Hello,

We solved the problem with multicore execution on POWER7. The cause is that POWER7 has a relaxed memory model.

We solved it as follows. No test-and-set was defined for POWER7 in the pan.c file.

We included it as:
"
        #elif defined(__powerpc64__)
                int
                tas(volatile int *s)
                {
                        int r;
                        r = __fetch_and_or(s, 1);
                        __isync();
                        return r;
                }
"

The __isync() prevents any instruction inside the critical section from moving above the __fetch_and_or.

Similarly, at the end of the critical section, where sh_lock[which] is set to zero ( sh_lock[which] = 0;  /* unlock */ ), we need a barrier that prevents any memory operation inside the critical section from moving past the unlock. For this we can use __lwsync().
We included it as:
"
__lwsync(); sh_lock[which] = 0; /* unlock */
"

__isync and __lwsync are lighter than a full sync. __fetch_and_or() gives a warning because its declaration uses unsigned int whereas we pass int. These builtins require the xlc compiler [1] (not gcc).

Possible gcc replacements would be [2]:
__lwsync()   -> __sync_synchronize()
__fetch_and_or(s, 1); __isync();  ->  __sync_lock_test_and_set()

[1]: http://www-01.ibm.com/support/docview.wss?uid=swg27024742&aid=1
[2]: https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
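
For gcc users, the replacements listed above could be combined roughly as follows. This is a sketch under assumptions, not the code that is in pan.c: the names sh_lock_demo, NLOCKS, tas_gcc, lock_demo, unlock_demo and lock_selfcheck are hypothetical. According to the gcc documentation [2], __sync_lock_test_and_set is an acquire barrier and __sync_synchronize is a full barrier, so the unlock here is somewhat heavier than the __lwsync() used with xlc.

```c
#include <assert.h>

#define NLOCKS 4                        /* hypothetical size, for illustration only */
static volatile int sh_lock_demo[NLOCKS];

/* Test-and-set: stores 1 and returns the previous value (acquire barrier). */
static int tas_gcc(volatile int *s)
{
    return __sync_lock_test_and_set(s, 1);
}

static void lock_demo(int which)
{
    while (tas_gcc(&sh_lock_demo[which]) != 0)
        ;                               /* spin until the previous value was 0 */
}

static void unlock_demo(int which)
{
    __sync_synchronize();               /* keep critical-section accesses before the unlock */
    sh_lock_demo[which] = 0;            /* unlock */
}

/* Single-threaded check of the lock/unlock protocol. */
static int lock_selfcheck(void)
{
    lock_demo(0);
    int held = tas_gcc(&sh_lock_demo[0]);        /* 1: lock is already held */
    unlock_demo(0);
    int free_again = tas_gcc(&sh_lock_demo[0]);  /* 0: lock was free, now held again */
    unlock_demo(0);

    assert(held == 1);
    assert(free_again == 0);
    return sh_lock_demo[0] == 0;
}
```

gcc also documents __sync_lock_release, which writes 0 with release semantics and could stand in for the barrier plus store in unlock_demo.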

#4 2014-10-25 21:41:28

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: Running spin with multicore options in POWER7

Thanks very much for figuring this out.
I'll add this to version 6.4.3 -- but I think I'll let it default to:
int r;
r = __sync_lock_test_and_set();
return r;

#5 2014-10-26 06:03:19

srirajpaul
Member
Registered: 2013-09-07
Posts: 13

Re: Running spin with multicore options in POWER7

That would be great.
For powerpc include the barrier while unlocking also.

#6 2014-10-26 19:25:49

spinroot
forum
Registered: 2010-11-18
Posts: 695
Website

Re: Running spin with multicore options in POWER7

> srirajpaul wrote:

> That would be great.
> For powerpc include the barrier while unlocking also.
ack
