Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Nicholas Piggin-2
On Thu, 8 Jun 2017 08:54:00 +0200
Peter Zijlstra <[hidden email]> wrote:

> On Thu, Jun 08, 2017 at 10:32:44AM +1000, Nicholas Piggin wrote:
> > On Wed, 07 Jun 2017 18:15:06 +0200
> > Peter Zijlstra <[hidden email]> wrote:
> >  
> > > Now that the scheduler's rq->lock is RCsc and thus provides full
> > > transitivity between scheduling actions. And since we cannot migrate
> > > current, a task needs a switch-out and a switch-in in order to
> > > migrate, in which case the RCsc provides all the ordering we need.  
> >
> > Hi Peter,
> >
> > I'm actually just working on removing this right now too, so
> > good timing.
> >
> > I think we can't "just" remove it, because it is required to order
> > MMIO on powerpc as well.  
>
> How is MMIO special? That is, there is only MMIO before we call into
> schedule() right? So the rq->lock should be sufficient to order that
> too.

MMIO uses different barriers. spinlock and smp_ type barriers do
not order it.

> >
> > But what I have done is to comment that some other primitives are
> > already providing the hwsync for other, so we don't have to add
> > another one in _switch.  
>
> Right, so this patch relies on the smp_mb__before_spinlock ->
> smp_mb__after_spinlock conversion that makes the rq->lock RCsc and
> should thus provide the required SYNC for migrations.

AFAIKS either one will do, so long as there is a hwsync there. The
point is just that I have added some commentary in the generic and
powerpc parts to make it clear we're relying on that behavior of
the primitive. smp_mb* is not guaranteed to order MMIO, it's just
that it does on powerpc.

> That said, I think you can already use the smp_mb__before_spinlock() as
> that is done with IRQs disabled, but its a more difficult argument. The
> rq->lock RCsc property should be more obvious.

This is what I got.

https://patchwork.ozlabs.org/patch/770154/

But I'm not sure if I followed I'm not sure why it's a more
difficult argument: any time a process moves it must first execute
a hwsync on the current CPU after it has performed all its access
there, and then it must execute hwsync on the new CPU before it
performs any new access.

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Peter Zijlstra-5
On Thu, Jun 08, 2017 at 05:29:38PM +1000, Nicholas Piggin wrote:

> On Thu, 8 Jun 2017 08:54:00 +0200
> Peter Zijlstra <[hidden email]> wrote:
>
> > On Thu, Jun 08, 2017 at 10:32:44AM +1000, Nicholas Piggin wrote:
> > > On Wed, 07 Jun 2017 18:15:06 +0200
> > > Peter Zijlstra <[hidden email]> wrote:
> > >  
> > > > Now that the scheduler's rq->lock is RCsc and thus provides full
> > > > transitivity between scheduling actions. And since we cannot migrate
> > > > current, a task needs a switch-out and a switch-in in order to
> > > > migrate, in which case the RCsc provides all the ordering we need.  
> > >
> > > Hi Peter,
> > >
> > > I'm actually just working on removing this right now too, so
> > > good timing.
> > >
> > > I think we can't "just" remove it, because it is required to order
> > > MMIO on powerpc as well.  
> >
> > How is MMIO special? That is, there is only MMIO before we call into
> > schedule() right? So the rq->lock should be sufficient to order that
> > too.
>
> MMIO uses different barriers. spinlock and smp_ type barriers do
> not order it.

Right, but you only have SYNC, which is what makes it possible at all.

Some of the other architectures are not so lucky and need a different
barrier, ARM for instance needs DSB(ISH) vs the DMB(ISH) provided by
smp_mb(). IA64, MIPS and a few others are in the same boat as ARM.

> > > But what I have done is to comment that some other primitives are
> > > already providing the hwsync for other, so we don't have to add
> > > another one in _switch.  
> >
> > Right, so this patch relies on the smp_mb__before_spinlock ->
> > smp_mb__after_spinlock conversion that makes the rq->lock RCsc and
> > should thus provide the required SYNC for migrations.
>
> AFAIKS either one will do, so long as there is a hwsync there. The
> point is just that I have added some commentary in the generic and
> powerpc parts to make it clear we're relying on that behavior of
> the primitive. smp_mb* is not guaranteed to order MMIO, it's just
> that it does on powerpc.

I'm not particularly happy with the generic comment; I don't feel we
should care that PPC is special here.

> > That said, I think you can already use the smp_mb__before_spinlock() as
> > that is done with IRQs disabled, but its a more difficult argument. The
> > rq->lock RCsc property should be more obvious.
>
> This is what I got.
>
> https://patchwork.ozlabs.org/patch/770154/

Your comment isn't fully correct, smp_cond_load_acquire() isn't
necessarily done by CPUy. It might be easiest to simply refer to the
"Notes on Program-Order guarantees on SMP systems." comment.

> But I'm not sure if I followed I'm not sure why it's a more
> difficult argument: any time a process moves it must first execute
> a hwsync on the current CPU after it has performed all its access
> there, and then it must execute hwsync on the new CPU before it
> performs any new access.

Yeah, its not a terribly difficult argument either way, but I feel the
RSsc rq->lock on is slightly easier.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Nicholas Piggin-2
On Thu, 8 Jun 2017 09:57:20 +0200
Peter Zijlstra <[hidden email]> wrote:

> On Thu, Jun 08, 2017 at 05:29:38PM +1000, Nicholas Piggin wrote:
> > On Thu, 8 Jun 2017 08:54:00 +0200
> > Peter Zijlstra <[hidden email]> wrote:
> >  
> > > On Thu, Jun 08, 2017 at 10:32:44AM +1000, Nicholas Piggin wrote:  
> > > > On Wed, 07 Jun 2017 18:15:06 +0200
> > > > Peter Zijlstra <[hidden email]> wrote:
> > > >    
> > > > > Now that the scheduler's rq->lock is RCsc and thus provides full
> > > > > transitivity between scheduling actions. And since we cannot migrate
> > > > > current, a task needs a switch-out and a switch-in in order to
> > > > > migrate, in which case the RCsc provides all the ordering we need.    
> > > >
> > > > Hi Peter,
> > > >
> > > > I'm actually just working on removing this right now too, so
> > > > good timing.
> > > >
> > > > I think we can't "just" remove it, because it is required to order
> > > > MMIO on powerpc as well.    
> > >
> > > How is MMIO special? That is, there is only MMIO before we call into
> > > schedule() right? So the rq->lock should be sufficient to order that
> > > too.  
> >
> > MMIO uses different barriers. spinlock and smp_ type barriers do
> > not order it.  
>
> Right, but you only have SYNC, which is what makes it possible at all.

Yeah, but a future CPU in theory could implement some other barrier
which provides hwsync ordering for cacheable memory but not uncacheable.
smp_mb* barriers would be able to use that new type of barrier, except
here.

> Some of the other architectures are not so lucky and need a different
> barrier, ARM for instance needs DSB(ISH) vs the DMB(ISH) provided by
> smp_mb(). IA64, MIPS and a few others are in the same boat as ARM.
>
> > > > But what I have done is to comment that some other primitives are
> > > > already providing the hwsync for other, so we don't have to add
> > > > another one in _switch.    
> > >
> > > Right, so this patch relies on the smp_mb__before_spinlock ->
> > > smp_mb__after_spinlock conversion that makes the rq->lock RCsc and
> > > should thus provide the required SYNC for migrations.  
> >
> > AFAIKS either one will do, so long as there is a hwsync there. The
> > point is just that I have added some commentary in the generic and
> > powerpc parts to make it clear we're relying on that behavior of
> > the primitive. smp_mb* is not guaranteed to order MMIO, it's just
> > that it does on powerpc.  
>
> I'm not particularly happy with the generic comment; I don't feel we
> should care that PPC is special here.

I think we do though, because its smp_mb happens to also order mmio.

Your patch I think failed to capture that unless I miss something. It's
not that the rq lock is RCsc that we can remove the hwsync, it's that
the smp_mb__before/after_spinlock has a hwsync in it.

As a counter-example: I think you can implement RCsc spinlocks in
powerpc using ll/sc+isync for the acquire, but that would be insufficient
because no hwsync for MMIO.

> > > That said, I think you can already use the smp_mb__before_spinlock() as
> > > that is done with IRQs disabled, but its a more difficult argument. The
> > > rq->lock RCsc property should be more obvious.  
> >
> > This is what I got.
> >
> > https://patchwork.ozlabs.org/patch/770154/ 
>
> Your comment isn't fully correct, smp_cond_load_acquire() isn't
> necessarily done by CPUy. It might be easiest to simply refer to the
> "Notes on Program-Order guarantees on SMP systems." comment.

True, thanks.

> > But I'm not sure if I followed I'm not sure why it's a more
> > difficult argument: any time a process moves it must first execute
> > a hwsync on the current CPU after it has performed all its access
> > there, and then it must execute hwsync on the new CPU before it
> > performs any new access.  
>
> Yeah, its not a terribly difficult argument either way, but I feel the
> RSsc rq->lock on is slightly easier.

It is neater to have the barrier inside the lock, I think.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Michael Ellerman-2
In reply to this post by Peter Zijlstra-5
Peter Zijlstra <[hidden email]> writes:

> On Thu, Jun 08, 2017 at 05:29:38PM +1000, Nicholas Piggin wrote:
>> On Thu, 8 Jun 2017 08:54:00 +0200
>> Peter Zijlstra <[hidden email]> wrote:
>> >
>> > Right, so this patch relies on the smp_mb__before_spinlock ->
>> > smp_mb__after_spinlock conversion that makes the rq->lock RCsc and
>> > should thus provide the required SYNC for migrations.
>>
>> AFAIKS either one will do, so long as there is a hwsync there. The
>> point is just that I have added some commentary in the generic and
>> powerpc parts to make it clear we're relying on that behavior of
>> the primitive. smp_mb* is not guaranteed to order MMIO, it's just
>> that it does on powerpc.
>
> I'm not particularly happy with the generic comment; I don't feel we
> should care that PPC is special here.

I think it'd be nice if there was *some* comment on the two uses of
smp_mb__after_spinlock(), it's fairly subtle, but I don't think it needs
to mention PPC specifically.


If we have:

arch/powerpc/include/asm/barrier.h:
+/*
+ * This must resolve to hwsync on SMP for the context switch path. See
+ * _switch.
+ */
 #define smp_mb__after_spinlock()   smp_mb()


And then something in _switch() that says "we rely on the
smp_mb__after_spinlock() in the scheduler core being a hwsync", that
should probably be sufficient.

cheers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Nicholas Piggin-2
On Thu, 08 Jun 2017 19:54:30 +1000
Michael Ellerman <[hidden email]> wrote:

> Peter Zijlstra <[hidden email]> writes:
> > On Thu, Jun 08, 2017 at 05:29:38PM +1000, Nicholas Piggin wrote:  
> >> On Thu, 8 Jun 2017 08:54:00 +0200
> >> Peter Zijlstra <[hidden email]> wrote:  
> >> >
> >> > Right, so this patch relies on the smp_mb__before_spinlock ->
> >> > smp_mb__after_spinlock conversion that makes the rq->lock RCsc and
> >> > should thus provide the required SYNC for migrations.  
> >>
> >> AFAIKS either one will do, so long as there is a hwsync there. The
> >> point is just that I have added some commentary in the generic and
> >> powerpc parts to make it clear we're relying on that behavior of
> >> the primitive. smp_mb* is not guaranteed to order MMIO, it's just
> >> that it does on powerpc.  
> >
> > I'm not particularly happy with the generic comment; I don't feel we
> > should care that PPC is special here.  
>
> I think it'd be nice if there was *some* comment on the two uses of
> smp_mb__after_spinlock(), it's fairly subtle, but I don't think it needs
> to mention PPC specifically.
>
>
> If we have:
>
> arch/powerpc/include/asm/barrier.h:
> +/*
> + * This must resolve to hwsync on SMP for the context switch path. See
> + * _switch.
> + */
>  #define smp_mb__after_spinlock()   smp_mb()
>
>
> And then something in _switch() that says "we rely on the
> smp_mb__after_spinlock() in the scheduler core being a hwsync", that
> should probably be sufficient.

I have those, I just also would like one in the core scheduler's use
of smp_mb__after_spinlock(), because it would be easy for core scheduler
change to miss that quirk. Sure we can say that Peter and scheduler
maintainers know about powerpc oddities, but then why shouldn't it also
go into a comment there?

Thanks,
Nick
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Peter Zijlstra-5
On Thu, Jun 08, 2017 at 08:00:15PM +1000, Nicholas Piggin wrote:

> I have those, I just also would like one in the core scheduler's use
> of smp_mb__after_spinlock(), because it would be easy for core scheduler
> change to miss that quirk. Sure we can say that Peter and scheduler
> maintainers know about powerpc oddities, but then why shouldn't it also
> go into a comment there?

So the core scheduler guarantees smp_mb() or equivalent full transitive
ordering happens at schedule() time.

It has for a fairly long time and I don't think we'll ever get rid of
that, its fairly fundamental.

PPC is special in that smp_mb() ends up being the strongest ordering
primitive there is. But note that PPC is not unique, afaict Alpha is in
the same boat. They rely on the MB from the scheduler core.

IA64 OTOH, while they have smp_mb() == mb() still needs SYNC.I in
__switch_to() to serialize against (instruction) cache flushes.

So while I'm all for adding comments explaining what the core provides,
I don't see immediate reasons to call out PPC.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Nicholas Piggin-2
On Thu, 8 Jun 2017 14:45:40 +0200
Peter Zijlstra <[hidden email]> wrote:

> On Thu, Jun 08, 2017 at 08:00:15PM +1000, Nicholas Piggin wrote:
>
> > I have those, I just also would like one in the core scheduler's use
> > of smp_mb__after_spinlock(), because it would be easy for core scheduler
> > change to miss that quirk. Sure we can say that Peter and scheduler
> > maintainers know about powerpc oddities, but then why shouldn't it also
> > go into a comment there?  
>
> So the core scheduler guarantees smp_mb() or equivalent full transitive
> ordering happens at schedule() time.
>
> It has for a fairly long time and I don't think we'll ever get rid of
> that, its fairly fundamental.
>
> PPC is special in that smp_mb() ends up being the strongest ordering
> primitive there is. But note that PPC is not unique, afaict Alpha is in
> the same boat. They rely on the MB from the scheduler core.
>
> IA64 OTOH, while they have smp_mb() == mb() still needs SYNC.I in
> __switch_to() to serialize against (instruction) cache flushes.
>
> So while I'm all for adding comments explaining what the core provides,
> I don't see immediate reasons to call out PPC.

I guess I see your point... okay, will constrain the comment to powerpc
context switch and primitives code. Any fundamental change to such
scheduler barriers I guess would require at least a glance over arch
switch code :)

My plan is to send the powerpc sync removal patch for hopefully 4.13
merge. I'm pretty sure it will be equally happy with your patches.
Unless you can see any problems with it? More eyes would be welcome.

Thanks,
Nick
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [RFC][PATCH 5/5] powerpc: Remove SYNC from _switch

Peter Zijlstra-5
On Thu, Jun 08, 2017 at 11:18:13PM +1000, Nicholas Piggin wrote:
> I guess I see your point... okay, will constrain the comment to powerpc
> context switch and primitives code. Any fundamental change to such
> scheduler barriers I guess would require at least a glance over arch
> switch code :)

Just so.

> My plan is to send the powerpc sync removal patch for hopefully 4.13
> merge. I'm pretty sure it will be equally happy with your patches.
> Unless you can see any problems with it? More eyes would be welcome.

Feel free to Cc me. I don't see a problem with removing that SYNC now,
I'll rebase my patches on top.
Loading...