[PATCH 1/4] powerpc/64s: context switch leave interrupts hard enabled for radix

Nicholas Piggin-2
Commit 4387e9ff25 ("[POWERPC] Fix PMU + soft interrupt disable bug")
hard disabled interrupts over the low level context switch, because
the SLB management can't cope with a PMU interrupt accessing the stack
in that window.

Radix based kernel mapping does not use the SLB so it does not require
interrupts hard disabled here.

This is worth 1-2% in context switch performance on POWER9.

Signed-off-by: Nicholas Piggin <[hidden email]>
---
 arch/powerpc/kernel/entry_64.S |  8 ++++++++
 arch/powerpc/kernel/process.c  | 14 ++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 6f70ea821a07..91f9fdc2d027 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -607,6 +607,14 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
    top of the kernel stack. */
  addi r7,r7,THREAD_SIZE-SWITCH_FRAME_SIZE
 
+ /*
+ * PMU interrupts in radix may come in here. They will use r1, not
+ * PACAKSAVE, so this stack switch will not cause a problem. They
+ * will store to the process stack, which may then be migrated to
+ * another CPU. However the rq lock release on this CPU paired with
+ * the rq lock acquire on the new CPU before the stack becomes
+ * active on the new CPU, will order those stores.
+ */
  mr r1,r8 /* start using new stack pointer */
  std r7,PACAKSAVE(r13)
 
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 5cbb8b1faf7e..45faa9a32a01 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1199,12 +1199,14 @@ struct task_struct *__switch_to(struct task_struct *prev,
 
  __switch_to_tm(prev, new);
 
- /*
- * We can't take a PMU exception inside _switch() since there is a
- * window where the kernel stack SLB and the kernel stack are out
- * of sync. Hard disable here.
- */
- hard_irq_disable();
+ if (!radix_enabled()) {
+ /*
+ * We can't take a PMU exception inside _switch() since there
+ * is a window where the kernel stack SLB and the kernel stack
+ * are out of sync. Hard disable here.
+ */
+ hard_irq_disable();
+ }
 
  /*
  * Call restore_sprs() before calling _switch(). If we move it after
--
2.11.0
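
A standalone sketch of the logic this patch adds to __switch_to(), with
the kernel's radix_enabled() and hard_irq_disable() stubbed out (the
stub definitions and the pre_switch_irq_guard() name are illustrative,
not the kernel's):

    #include <stdbool.h>

    /* Stand-ins for the kernel's macros. radix_enabled() reduces to a
     * check of the MMU feature bits; hard_irq_disable() clears MSR[EE]
     * so that even performance-monitor interrupts cannot fire. */
    static bool radix_enabled(void);
    static void hard_irq_disable(void);

    static void pre_switch_irq_guard(void)
    {
            if (!radix_enabled()) {
                    /* Hash MMU only: a PMU interrupt touching the
                     * kernel stack between the stack-pointer switch
                     * and the SLB update would see a stale stack SLB
                     * entry, so MSR[EE] must be cleared across
                     * _switch(). */
                    hard_irq_disable();
            }
            /* Radix: the kernel mapping does not use the SLB, so the
             * window is harmless and interrupts stay hard enabled. */
    }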


[PATCH 2/4] powerpc/64: context switch avoid reservation-clearing instruction

Nicholas Piggin-2
There is no need to break the reservation in _switch, because we are
guaranteed that the context switch path will include a larx/stcx.

Comment the guarantee and remove the reservation clear from _switch.

This is worth 1-2% in context switch performance.

Signed-off-by: Nicholas Piggin <[hidden email]>
---
 arch/powerpc/kernel/entry_64.S | 11 +++--------
 kernel/sched/core.c            |  6 ++++++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 91f9fdc2d027..273a35926534 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -521,15 +521,10 @@ _GLOBAL(_switch)
 #endif /* CONFIG_SMP */
 
  /*
- * If we optimise away the clear of the reservation in system
- * calls because we know the CPU tracks the address of the
- * reservation, then we need to clear it here to cover the
- * case that the kernel context switch path has no larx
- * instructions.
+ * The kernel context switch path must contain a spin_lock,
+ * which contains larx/stcx, which will clear any reservation
+ * of the task being switched.
  */
-BEGIN_FTR_SECTION
- ldarx r6,0,r1
-END_FTR_SECTION_IFSET(CPU_FTR_STCX_CHECKS_ADDRESS)
 
 BEGIN_FTR_SECTION
 /*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 803c3bc274c4..1f0688ad09d7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2875,6 +2875,12 @@ context_switch(struct rq *rq, struct task_struct *prev,
  rq_unpin_lock(rq, rf);
  spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 
+ /*
+ * Some architectures require that a spin lock is taken before
+ * _switch. The rq_lock satisfies this condition. See powerpc
+ * _switch for details.
+ */
+
  /* Here we just switch the register state and the stack. */
  switch_to(prev, next, prev);
  barrier();
--
2.11.0
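
For readers following the reservation argument: a CPU holds only one
outstanding reservation, and every lock acquisition executes a
larx/stcx. pair. A toy powerpc test-and-set lock (illustrative only,
not the kernel's arch_spin_lock) makes that concrete:

    /* Toy test-and-set spinlock in powerpc inline asm. The lwarx
     * replaces whatever reservation the outgoing task established,
     * and the stwcx. clears it, so no user-level reservation can
     * survive a context switch path that takes a lock. */
    static inline void toy_spin_lock(unsigned int *lock)
    {
            unsigned int old;

            __asm__ __volatile__(
    "1:     lwarx   %0,0,%1\n"      /* load word, establish reservation */
    "       cmpwi   0,%0,0\n"
    "       bne-    1b\n"           /* lock held: spin */
    "       stwcx.  %2,0,%1\n"      /* store iff reservation intact */
    "       bne-    1b\n"           /* reservation lost: retry */
    "       isync\n"                /* acquire barrier */
            : "=&r" (old)
            : "r" (lock), "r" (1)
            : "cr0", "memory");
    }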


[PATCH 3/4] powerpc/64: context switch an hwsync instruction can be avoided

Nicholas Piggin-2
In reply to this post by Nicholas Piggin-2
The hwsync in the context switch code, which prevents MMIO accesses
being reordered from the point of view of a single process if it gets
migrated to a different CPU, is not required, because an hwsync is
performed earlier in the context switch path.

Comment this so it's clear enough if anything changes on the scheduler
or the powerpc sides. Remove the hwsync from _switch.

This improves context switch performance by 2-3% on POWER8.

Cc: Peter Zijlstra <[hidden email]>
Signed-off-by: Nicholas Piggin <[hidden email]>
---
 arch/powerpc/include/asm/barrier.h |  5 +++++
 arch/powerpc/kernel/entry_64.S     | 23 +++++++++++++++++------
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index c0deafc212b8..25d42bd3f114 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -74,6 +74,11 @@ do { \
  ___p1; \
 })
 
+/*
+ * This must resolve to hwsync on SMP for the context switch path.
+ * See _switch, and core scheduler context switch memory ordering
+ * comments.
+ */
 #define smp_mb__before_spinlock()   smp_mb()
 
 #include <asm-generic/barrier.h>
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 273a35926534..fb143859cc68 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -512,13 +512,24 @@ _GLOBAL(_switch)
  std r23,_CCR(r1)
  std r1,KSP(r3) /* Set old stack pointer */
 
-#ifdef CONFIG_SMP
- /* We need a sync somewhere here to make sure that if the
- * previous task gets rescheduled on another CPU, it sees all
- * stores it has performed on this one.
+ /*
+ * On SMP kernels, care must be taken because a task may be
+ * scheduled off CPUx and on to CPUy. Memory ordering must be
+ * considered.
+ *
+ * Cacheable stores on CPUx will be visible when the task is
+ * scheduled on CPUy by virtue of the core scheduler barriers
+ * (see "Notes on Program-Order guarantees on SMP systems." in
+ * kernel/sched/core.c).
+ *
+ * Uncacheable stores in the case of involuntary preemption must
+ * be taken care of. The smp_mb__before_spinlock() in __schedule()
+ * is implemented as hwsync on powerpc, which orders MMIO too. So
+ * long as there is an hwsync in the context switch path, it will
+ * be executed on the source CPU after the task has performed
+ * all MMIO ops on that CPU, and on the destination CPU before the
+ * task performs any MMIO ops there.
  */
- sync
-#endif /* CONFIG_SMP */
 
  /*
  * The kernel context switch path must contain a spin_lock,
--
2.11.0
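
To spell out the ordering argument in the new comment (a sketch of the
reasoning, not code from the series; on powerpc, smp_mb() expands to
the hwsync, i.e. "sync", instruction):

    #define toy_hwsync()    __asm__ __volatile__("sync" : : : "memory")

    /*
     * Task T is preempted on CPUx and resumes on CPUy:
     *
     *   CPUx (T running):
     *     writel(v, ioaddr);            <- T's MMIO store
     *     __schedule()
     *       smp_mb__before_spinlock();  <- hwsync, orders the MMIO store
     *       raw_spin_lock(&rq->lock);
     *       _switch();                  <- no longer needs its own sync
     *
     *   CPUy (T resuming):
     *     The rq lock release on CPUx paired with the acquire on CPUy
     *     orders CPUx's hwsync before anything T does here, so T's
     *     MMIO ops on CPUy cannot be observed ahead of its earlier
     *     MMIO store on CPUx.
     */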


[PATCH 4/4] powerpc/64s: context switch avoid cpabort

Nicholas Piggin-2
In reply to this post by Nicholas Piggin-2
The ISA v3.0B copy-paste facility only requires cpabort when switching
to a process that has foreign real addresses mapped (direct access to
accelerators), to clear a potential copy buffer filled by a previous
thread. There is no accelerator driver implemented yet, so cpabort can
be removed. It can be re-added when a driver is implemented.

POWER9 DD1 requires the copy buffer to always be cleared on context
switch, but if accelerators are not in use, then an unpaired copy from
a dummy region is sufficient to clear data out of the copy buffer.

This increases context switch performance by about 5% on POWER9.

Signed-off-by: Nicholas Piggin <[hidden email]>
---
 arch/powerpc/include/asm/ppc-opcode.h |  8 ++++----
 arch/powerpc/kernel/entry_64.S        |  9 ---------
 arch/powerpc/kernel/process.c         | 27 ++++++++++++++++++++++++++-
 3 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 3a8d278e7421..3b6bbf5a8683 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -189,8 +189,7 @@
 /* sorted alphabetically */
 #define PPC_INST_BHRBE 0x7c00025c
 #define PPC_INST_CLRBHRB 0x7c00035c
-#define PPC_INST_COPY 0x7c00060c
-#define PPC_INST_COPY_FIRST 0x7c20060c
+#define PPC_INST_COPY 0x7c20060c
 #define PPC_INST_CP_ABORT 0x7c00068c
 #define PPC_INST_DCBA 0x7c0005ec
 #define PPC_INST_DCBA_MASK 0xfc0007fe
@@ -223,8 +222,7 @@
 #define PPC_INST_MSGSNDP 0x7c00011c
 #define PPC_INST_MTTMR 0x7c0003dc
 #define PPC_INST_NOP 0x60000000
-#define PPC_INST_PASTE 0x7c00070c
-#define PPC_INST_PASTE_LAST 0x7c20070d
+#define PPC_INST_PASTE 0x7c20070d
 #define PPC_INST_POPCNTB 0x7c0000f4
 #define PPC_INST_POPCNTB_MASK 0xfc0007fe
 #define PPC_INST_POPCNTD 0x7c0003f4
@@ -392,6 +390,8 @@
 
 /* Deal with instructions that older assemblers aren't aware of */
 #define PPC_CP_ABORT stringify_in_c(.long PPC_INST_CP_ABORT)
+#define PPC_COPY(a, b) stringify_in_c(.long PPC_INST_COPY | \
+ ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_DCBAL(a, b) stringify_in_c(.long PPC_INST_DCBAL | \
  __PPC_RA(a) | __PPC_RB(b))
 #define PPC_DCBZL(a, b) stringify_in_c(.long PPC_INST_DCBZL | \
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index fb143859cc68..da9486e2fd89 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -536,15 +536,6 @@ _GLOBAL(_switch)
  * which contains larx/stcx, which will clear any reservation
  * of the task being switched.
  */
-
-BEGIN_FTR_SECTION
-/*
- * A cp_abort (copy paste abort) here ensures that when context switching, a
- * copy from one process can't leak into the paste of another.
- */
- PPC_CP_ABORT
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
-
 #ifdef CONFIG_PPC_BOOK3S
 /* Cancel all explict user streams as they will have no use after context
  * switch and will stop the HW from creating streams itself
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 45faa9a32a01..6273b5d5baec 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1137,6 +1137,11 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 #endif
 }
 
+#ifdef CONFIG_PPC_BOOK3S_64
+#define CP_SIZE 128
+static const u8 dummy_copy_buffer[CP_SIZE] __attribute__((aligned(CP_SIZE)));
+#endif
+
 struct task_struct *__switch_to(struct task_struct *prev,
  struct task_struct *new)
 {
@@ -1226,8 +1231,28 @@ struct task_struct *__switch_to(struct task_struct *prev,
  batch->active = 1;
  }
 
- if (current_thread_info()->task->thread.regs)
+ if (current_thread_info()->task->thread.regs) {
  restore_math(current_thread_info()->task->thread.regs);
+
+ /*
+ * The copy-paste buffer can only store into foreign real
+ * addresses, so unprivileged processes can not see the
+ * data or use it in any way unless they have foreign real
+ * mappings. We don't have a VAS driver that allocates those
+ * yet, so no cpabort is required.
+ */
+ if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
+ /*
+ * DD1 allows paste into normal system memory, so we
+ * do an unpaired copy here to clear the buffer and
+ * prevent a covert channel being set up.
+ *
+ * cpabort is not used because it is quite expensive.
+ */
+ asm volatile(PPC_COPY(%0, %1)
+ : : "r"(dummy_copy_buffer), "r"(0));
+ }
+ }
 #endif /* CONFIG_PPC_STD_MMU_64 */
 
  return last;
--
2.11.0
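
For reference, a hand-expansion of the new PPC_COPY() macro, assuming
the standard ___PPC_RA/___PPC_RB field helpers from ppc-opcode.h and
ignoring the stringify_in_c(.long ...) wrapper that turns it into an
inline-asm literal (the TOY_ names are illustrative):

    /* 0x7c20060c is the "copy" opcode with its L field set, i.e. the
     * encoding previously named PPC_INST_COPY_FIRST. */
    #define TOY___PPC_RA(a)    (((a) & 0x1f) << 16)   /* RA field */
    #define TOY___PPC_RB(b)    (((b) & 0x1f) << 11)   /* RB field */
    #define TOY_PPC_COPY(a, b) (0x7c20060c | TOY___PPC_RA(a) | \
                                TOY___PPC_RB(b))

    /* The DD1 workaround in __switch_to() emits one such copy from the
     * 128-byte-aligned dummy buffer:
     *
     *     asm volatile(PPC_COPY(%0, %1)
     *                  : : "r"(dummy_copy_buffer), "r"(0));
     *
     * An unpaired copy overwrites the copy buffer's contents, so the
     * previous task's data cannot leak through a later paste, without
     * paying for a full cpabort. */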


Re: [PATCH 3/4] powerpc/64: context switch an hwsync instruction can be avoided

Peter Zijlstra-5
In reply to this post by Nicholas Piggin-2
On Fri, Jun 09, 2017 at 01:36:08AM +1000, Nicholas Piggin wrote:

> The hwsync in the context switch code, which prevents MMIO accesses
> being reordered from the point of view of a single process if it gets
> migrated to a different CPU, is not required, because an hwsync is
> performed earlier in the context switch path.
>
> Comment this so it's clear enough if anything changes on the scheduler
> or the powerpc sides. Remove the hwsync from _switch.
>
> This improves context switch performance by 2-3% on POWER8.
>
> Cc: Peter Zijlstra <[hidden email]>
> Signed-off-by: Nicholas Piggin <[hidden email]>

Acked-by: Peter Zijlstra (Intel) <[hidden email]>

Re: [PATCH 2/4] powerpc/64: context switch avoid reservation-clearing instruction

Michael Ellerman-2
In reply to this post by Nicholas Piggin-2
Nicholas Piggin <[hidden email]> writes:

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 803c3bc274c4..1f0688ad09d7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2875,6 +2875,12 @@ context_switch(struct rq *rq, struct task_struct *prev,
>   rq_unpin_lock(rq, rf);
>   spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
>  
> + /*
> + * Some architectures require that a spin lock is taken before
> + * _switch. The rq_lock satisfies this condition. See powerpc
> + * _switch for details.
> + */
> +
>   /* Here we just switch the register state and the stack. */
>   switch_to(prev, next, prev);
>   barrier();

I dropped this hunk, if you want to merge it you can resend it and get
an ack from Peterz.

cheers

Re: [1/4] powerpc/64s: context switch leave interrupts hard enabled for radix

Michael Ellerman-3
In reply to this post by Nicholas Piggin-2
On Thu, 2017-06-08 at 15:36:06 UTC, Nicholas Piggin wrote:

> Commit 4387e9ff25 ("[POWERPC] Fix PMU + soft interrupt disable bug")
> hard disabled interrupts over the low level context switch, because
> the SLB management can't cope with a PMU interrupt accessing the stack
> in that window.
>
> Radix based kernel mapping does not use the SLB so it does not require
> interrupts hard disabled here.
>
> This is worth 1-2% in context switch performance on POWER9.
>
> Signed-off-by: Nicholas Piggin <[hidden email]>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e4c0fc5f72bca11432297168338aef

cheers