[PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Pingfan Liu-2
commit 52cdbdd49853 ("driver core: correct device's shutdown order")
places an assumption of supplier<-consumer order on the process of probe.
But it turns out to break down the parent <- child order in some scene.
E.g in pci, a bridge is enabled by pci core, and behind it, the devices
have been probed. Then comes the bridge's module, which enables extra
feature(such as hotplug) on this bridge.
This will break the parent<-children order and cause failure when
"kexec -e" in some scenario.

v2 -> v3:
It is a little hard to impose both "parent<-child" and "supplier<-consumer"
on devices_kset. Hence v3 drops this method, postpones the issue to shutdown
time instead of probing, and utilizes device-tree info during shutdown
instead of the item's seq inside devices_kset.

Pingfan Liu (4):
  drivers/base: fold the routine of device's shutdown into a func
  drivers/base: utilize device tree info to shutdown devices
  drivers/base: clean up the usage of devices_kset_move_last()
  Revert "driver core: correct device's shutdown order"

 drivers/base/base.h    |   1 -
 drivers/base/core.c    | 196 +++++++++++++++++++++++--------------------------
 drivers/base/dd.c      |   8 --
 include/linux/device.h |   1 +
 4 files changed, 92 insertions(+), 114 deletions(-)

Cc: Greg Kroah-Hartman <[hidden email]>
Cc: Rafael J. Wysocki <[hidden email]>
Cc: Grygorii Strashko <[hidden email]>
Cc: Christoph Hellwig <[hidden email]>
Cc: Bjorn Helgaas <[hidden email]>
Cc: Dave Young <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]

--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv3 1/4] drivers/base: fold the routine of device's shutdown into a func

Pingfan Liu-2
Pack the code into a function to ease the using and reading.

Cc: Greg Kroah-Hartman <[hidden email]>
Cc: Rafael J. Wysocki <[hidden email]>
Cc: Grygorii Strashko <[hidden email]>
Cc: Christoph Hellwig <[hidden email]>
Cc: Bjorn Helgaas <[hidden email]>
Cc: Dave Young <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Signed-off-by: Pingfan Liu <[hidden email]>
---
 drivers/base/core.c | 100 +++++++++++++++++++++++++++-------------------------
 1 file changed, 52 insertions(+), 48 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index df3e1a4..a48868f 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2802,12 +2802,62 @@ int device_move(struct device *dev, struct device *new_parent,
 }
 EXPORT_SYMBOL_GPL(device_move);
 
+static void __device_shutdown(struct device *dev)
+{
+ struct device *parent;
+ /*
+ * hold reference count of device's parent to
+ * prevent it from being freed because parent's
+ * lock is to be held
+ */
+ parent = get_device(dev->parent);
+ get_device(dev);
+ /*
+ * Make sure the device is off the kset list, in the
+ * event that dev->*->shutdown() doesn't remove it.
+ */
+ list_del_init(&dev->kobj.entry);
+ spin_unlock(&devices_kset->list_lock);
+
+ /* hold lock to avoid race with probe/release */
+ if (parent)
+ device_lock(parent);
+ device_lock(dev);
+
+ /* Don't allow any more runtime suspends */
+ pm_runtime_get_noresume(dev);
+ pm_runtime_barrier(dev);
+
+ if (dev->class && dev->class->shutdown_pre) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown_pre\n");
+ dev->class->shutdown_pre(dev);
+ }
+ if (dev->bus && dev->bus->shutdown) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown\n");
+ dev->bus->shutdown(dev);
+ } else if (dev->driver && dev->driver->shutdown) {
+ if (initcall_debug)
+ dev_info(dev, "shutdown\n");
+ dev->driver->shutdown(dev);
+ }
+
+ device_unlock(dev);
+ if (parent)
+ device_unlock(parent);
+
+ put_device(dev);
+ put_device(parent);
+ spin_lock(&devices_kset->list_lock);
+}
+
 /**
  * device_shutdown - call ->shutdown() on each device to shutdown.
  */
 void device_shutdown(void)
 {
- struct device *dev, *parent;
+ struct device *dev;
 
  spin_lock(&devices_kset->list_lock);
  /*
@@ -2818,53 +2868,7 @@ void device_shutdown(void)
  while (!list_empty(&devices_kset->list)) {
  dev = list_entry(devices_kset->list.prev, struct device,
  kobj.entry);
-
- /*
- * hold reference count of device's parent to
- * prevent it from being freed because parent's
- * lock is to be held
- */
- parent = get_device(dev->parent);
- get_device(dev);
- /*
- * Make sure the device is off the kset list, in the
- * event that dev->*->shutdown() doesn't remove it.
- */
- list_del_init(&dev->kobj.entry);
- spin_unlock(&devices_kset->list_lock);
-
- /* hold lock to avoid race with probe/release */
- if (parent)
- device_lock(parent);
- device_lock(dev);
-
- /* Don't allow any more runtime suspends */
- pm_runtime_get_noresume(dev);
- pm_runtime_barrier(dev);
-
- if (dev->class && dev->class->shutdown_pre) {
- if (initcall_debug)
- dev_info(dev, "shutdown_pre\n");
- dev->class->shutdown_pre(dev);
- }
- if (dev->bus && dev->bus->shutdown) {
- if (initcall_debug)
- dev_info(dev, "shutdown\n");
- dev->bus->shutdown(dev);
- } else if (dev->driver && dev->driver->shutdown) {
- if (initcall_debug)
- dev_info(dev, "shutdown\n");
- dev->driver->shutdown(dev);
- }
-
- device_unlock(dev);
- if (parent)
- device_unlock(parent);
-
- put_device(dev);
- put_device(parent);
-
- spin_lock(&devices_kset->list_lock);
+ __device_shutdown(dev);
  }
  spin_unlock(&devices_kset->list_lock);
 }
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Pingfan Liu-2
In reply to this post by Pingfan Liu-2
commit 52cdbdd49853 ("driver core: correct device's shutdown order")
places an assumption of supplier<-consumer order on the process of probe.
But it turns out to break down the parent <- child order in some scene.
E.g in pci, a bridge is enabled by pci core, and behind it, the devices
have been probed. Then comes the bridge's module, which enables extra
feature(such as hotplug) on this bridge. This will break the
parent<-children order and cause failure when "kexec -e" in some scenario.

The detailed description of the scenario:
An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
to some issue. For this case, the bridge is moved after its children in
devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
write back buffer in flight due to the former shutdown of the bridge which
clears the BusMaster bit.

It is a little hard to impose both "parent<-child" and "supplier<-consumer"
order on devices_kset. Take the following scene:
step0: before a consumer's probing, (note child_a is supplier of consumer_a)
  [ consumer-X, child_a, ...., child_z] [... consumer_a, ..., consumer_z, ...] supplier-X
                                         ^^^^^^^^^^ affected range ^^^^^^^^^^
step1: when probing, moving consumer-X after supplier-X
  [ child_a, ...., child_z] [.... consumer_a, ..., consumer_z, ...] supplier-X, consumer-X
step2: the children of consumer-X should be re-ordered to maintain the seq
  [... consumer_a, ..., consumer_z, ....] supplier-X  [consumer-X, child_a, ...., child_z]
step3: the consumer_a should be re-ordered to maintain the seq
  [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., child_z]

It requires two nested recursion to drain out all out-of-order item in
"affected range". To avoid such complicated code, this patch suggests
to utilize the info in device tree, instead of using the order of
devices_kset during shutdown. It iterates the device tree, and firstly
shutdown a device's children and consumers. After this patch, the buggy
commit is hollow and left to clean.

Cc: Greg Kroah-Hartman <[hidden email]>
Cc: Rafael J. Wysocki <[hidden email]>
Cc: Grygorii Strashko <[hidden email]>
Cc: Christoph Hellwig <[hidden email]>
Cc: Bjorn Helgaas <[hidden email]>
Cc: Dave Young <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Signed-off-by: Pingfan Liu <[hidden email]>
---
 drivers/base/core.c    | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/device.h |  1 +
 2 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index a48868f..684b994 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
  INIT_LIST_HEAD(&dev->links.consumers);
  INIT_LIST_HEAD(&dev->links.suppliers);
  dev->links.status = DL_DEV_NO_DRIVER;
+ dev->shutdown = false;
 }
 EXPORT_SYMBOL_GPL(device_initialize);
 
@@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
  * lock is to be held
  */
  parent = get_device(dev->parent);
- get_device(dev);
  /*
  * Make sure the device is off the kset list, in the
  * event that dev->*->shutdown() doesn't remove it.
@@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
  dev_info(dev, "shutdown\n");
  dev->driver->shutdown(dev);
  }
-
+ dev->shutdown = true;
  device_unlock(dev);
  if (parent)
  device_unlock(parent);
 
- put_device(dev);
  put_device(parent);
  spin_lock(&devices_kset->list_lock);
 }
 
+/* shutdown dev's children and consumer firstly, then itself */
+static int device_for_each_child_shutdown(struct device *dev)
+{
+ struct klist_iter i;
+ struct device *child;
+ struct device_link *link;
+
+ /* already shutdown, then skip this sub tree */
+ if (dev->shutdown)
+ return 0;
+
+ if (!dev->p)
+ goto check_consumers;
+
+ /* there is breakage of lock in __device_shutdown(), and the redundant
+ * ref++ on srcu protected consumer is harmless since shutdown is not
+ * hot path.
+ */
+ get_device(dev);
+
+ klist_iter_init(&dev->p->klist_children, &i);
+ while ((child = next_device(&i)))
+ device_for_each_child_shutdown(child);
+ klist_iter_exit(&i);
+
+check_consumers:
+ list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
+ if (!link->consumer->shutdown)
+ device_for_each_child_shutdown(link->consumer);
+ }
+
+ __device_shutdown(dev);
+ put_device(dev);
+ return 0;
+}
+
 /**
  * device_shutdown - call ->shutdown() on each device to shutdown.
  */
 void device_shutdown(void)
 {
  struct device *dev;
+ int idx;
 
+ idx = device_links_read_lock();
  spin_lock(&devices_kset->list_lock);
  /*
  * Walk the devices list backward, shutting down each in turn.
@@ -2866,11 +2903,12 @@ void device_shutdown(void)
  * devices offline, even as the system is shutting down.
  */
  while (!list_empty(&devices_kset->list)) {
- dev = list_entry(devices_kset->list.prev, struct device,
+ dev = list_entry(devices_kset->list.next, struct device,
  kobj.entry);
- __device_shutdown(dev);
+ device_for_each_child_shutdown(dev);
  }
  spin_unlock(&devices_kset->list_lock);
+ device_links_read_unlock(idx);
 }
 
 /*
diff --git a/include/linux/device.h b/include/linux/device.h
index 055a69d..8a0f784 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1003,6 +1003,7 @@ struct device {
  bool offline:1;
  bool of_node_reused:1;
  bool dma_32bit_limit:1;
+ bool shutdown:1; /* one direction: false->true */
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv3 3/4] drivers/base: clean up the usage of devices_kset_move_last()

Pingfan Liu-2
In reply to this post by Pingfan Liu-2
Clean up the referring to the code in commit 52cdbdd49853 ("driver core:
correct device's shutdown order"). So later we can revert it safely.

Cc: Greg Kroah-Hartman <[hidden email]>
Cc: Rafael J. Wysocki <[hidden email]>
Cc: Grygorii Strashko <[hidden email]>
Cc: Christoph Hellwig <[hidden email]>
Cc: Bjorn Helgaas <[hidden email]>
Cc: Dave Young <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Signed-off-by: Pingfan Liu <[hidden email]>
---
 drivers/base/core.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 684b994..db3deb8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -127,13 +127,6 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
 {
  struct device_link *link;
 
- /*
- * Devices that have not been registered yet will be put to the ends
- * of the lists during the registration, so skip them here.
- */
- if (device_is_registered(dev))
- devices_kset_move_last(dev);
-
  if (device_pm_initialized(dev))
  device_pm_move_last(dev);
 
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCHv3 4/4] Revert "driver core: correct device's shutdown order"

Pingfan Liu-2
In reply to this post by Pingfan Liu-2
This reverts commit 52cdbdd49853dfa856082edb0f4c4c0249d9df07.
Since the device_shutdown() does not rely on the order in devices_kset any
more, reverting that commit safely.

Cc: Greg Kroah-Hartman <[hidden email]>
Cc: Rafael J. Wysocki <[hidden email]>
Cc: Grygorii Strashko <[hidden email]>
Cc: Christoph Hellwig <[hidden email]>
Cc: Bjorn Helgaas <[hidden email]>
Cc: Dave Young <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Signed-off-by: Pingfan Liu <[hidden email]>
---
 drivers/base/base.h |  1 -
 drivers/base/core.c | 49 -------------------------------------------------
 drivers/base/dd.c   |  8 --------
 3 files changed, 58 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index a75c302..5bc9064 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -135,7 +135,6 @@ extern void device_unblock_probing(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
-extern void devices_kset_move_last(struct device *dev);
 
 #if defined(CONFIG_MODULES) && defined(CONFIG_SYSFS)
 extern void module_add_driver(struct module *mod, struct device_driver *drv);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index db3deb8..570aeee 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1259,52 +1259,6 @@ static DEVICE_ATTR_RO(dev);
 struct kset *devices_kset;
 
 /**
- * devices_kset_move_before - Move device in the devices_kset's list.
- * @deva: Device to move.
- * @devb: Device @deva should come before.
- */
-static void devices_kset_move_before(struct device *deva, struct device *devb)
-{
- if (!devices_kset)
- return;
- pr_debug("devices_kset: Moving %s before %s\n",
- dev_name(deva), dev_name(devb));
- spin_lock(&devices_kset->list_lock);
- list_move_tail(&deva->kobj.entry, &devb->kobj.entry);
- spin_unlock(&devices_kset->list_lock);
-}
-
-/**
- * devices_kset_move_after - Move device in the devices_kset's list.
- * @deva: Device to move
- * @devb: Device @deva should come after.
- */
-static void devices_kset_move_after(struct device *deva, struct device *devb)
-{
- if (!devices_kset)
- return;
- pr_debug("devices_kset: Moving %s after %s\n",
- dev_name(deva), dev_name(devb));
- spin_lock(&devices_kset->list_lock);
- list_move(&deva->kobj.entry, &devb->kobj.entry);
- spin_unlock(&devices_kset->list_lock);
-}
-
-/**
- * devices_kset_move_last - move the device to the end of devices_kset's list.
- * @dev: device to move
- */
-void devices_kset_move_last(struct device *dev)
-{
- if (!devices_kset)
- return;
- pr_debug("devices_kset: Moving %s to end of list\n", dev_name(dev));
- spin_lock(&devices_kset->list_lock);
- list_move_tail(&dev->kobj.entry, &devices_kset->list);
- spin_unlock(&devices_kset->list_lock);
-}
-
-/**
  * device_create_file - create sysfs attribute file for device.
  * @dev: device.
  * @attr: device attribute descriptor.
@@ -2776,15 +2730,12 @@ int device_move(struct device *dev, struct device *new_parent,
  break;
  case DPM_ORDER_DEV_AFTER_PARENT:
  device_pm_move_after(dev, new_parent);
- devices_kset_move_after(dev, new_parent);
  break;
  case DPM_ORDER_PARENT_BEFORE_DEV:
  device_pm_move_before(new_parent, dev);
- devices_kset_move_before(new_parent, dev);
  break;
  case DPM_ORDER_DEV_LAST:
  device_pm_move_last(dev);
- devices_kset_move_last(dev);
  break;
  }
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 1435d72..6ebcd65 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -434,14 +434,6 @@ static int really_probe(struct device *dev, struct device_driver *drv)
  goto probe_failed;
  }
 
- /*
- * Ensure devices are listed in devices_kset in correct order
- * It's important to move Dev to the end of devices_kset before
- * calling .probe, because it could be recursive and parent Dev
- * should always go first
- */
- devices_kset_move_last(dev);
-
  if (dev->bus->probe) {
  ret = dev->bus->probe(dev);
  if (ret)
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Pingfan Liu-2
In reply to this post by Pingfan Liu-2
On Tue, Jul 3, 2018 at 3:51 PM Lukas Wunner <[hidden email]> wrote:

>
> On Tue, Jul 03, 2018 at 02:50:40PM +0800, Pingfan Liu wrote:
> > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > places an assumption of supplier<-consumer order on the process of probe.
> > But it turns out to break down the parent <- child order in some scene.
> > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > have been probed. Then comes the bridge's module, which enables extra
> > feature(such as hotplug) on this bridge. This will break the
> > parent<-children order and cause failure when "kexec -e" in some scenario.
> >
> > The detailed description of the scenario:
> > An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> > match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> > to some issue. For this case, the bridge is moved after its children in
> > devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> > write back buffer in flight due to the former shutdown of the bridge which
> > clears the BusMaster bit.
>
> If you revert commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services
> during shutdown"), does the issue go away?

Yes, it is gone.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Andy Shevchenko-2
In reply to this post by Pingfan Liu-2
I think Pavel would be interested to see this as well (he is doing
some parallel device shutdown stuff)

On Tue, Jul 3, 2018 at 9:50 AM, Pingfan Liu <[hidden email]> wrote:

> commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> places an assumption of supplier<-consumer order on the process of probe.
> But it turns out to break down the parent <- child order in some scene.
> E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> have been probed. Then comes the bridge's module, which enables extra
> feature(such as hotplug) on this bridge. This will break the
> parent<-children order and cause failure when "kexec -e" in some scenario.
>
> The detailed description of the scenario:
> An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> to some issue. For this case, the bridge is moved after its children in
> devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> write back buffer in flight due to the former shutdown of the bridge which
> clears the BusMaster bit.
>
> It is a little hard to impose both "parent<-child" and "supplier<-consumer"
> order on devices_kset. Take the following scene:
> step0: before a consumer's probing, (note child_a is supplier of consumer_a)
>   [ consumer-X, child_a, ...., child_z] [... consumer_a, ..., consumer_z, ...] supplier-X
>                                          ^^^^^^^^^^ affected range ^^^^^^^^^^
> step1: when probing, moving consumer-X after supplier-X
>   [ child_a, ...., child_z] [.... consumer_a, ..., consumer_z, ...] supplier-X, consumer-X
> step2: the children of consumer-X should be re-ordered to maintain the seq
>   [... consumer_a, ..., consumer_z, ....] supplier-X  [consumer-X, child_a, ...., child_z]
> step3: the consumer_a should be re-ordered to maintain the seq
>   [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., child_z]
>
> It requires two nested recursion to drain out all out-of-order item in
> "affected range". To avoid such complicated code, this patch suggests
> to utilize the info in device tree, instead of using the order of
> devices_kset during shutdown. It iterates the device tree, and firstly
> shutdown a device's children and consumers. After this patch, the buggy
> commit is hollow and left to clean.
>
> Cc: Greg Kroah-Hartman <[hidden email]>
> Cc: Rafael J. Wysocki <[hidden email]>
> Cc: Grygorii Strashko <[hidden email]>
> Cc: Christoph Hellwig <[hidden email]>
> Cc: Bjorn Helgaas <[hidden email]>
> Cc: Dave Young <[hidden email]>
> Cc: [hidden email]
> Cc: [hidden email]
> Signed-off-by: Pingfan Liu <[hidden email]>
> ---
>  drivers/base/core.c    | 48 +++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/device.h |  1 +
>  2 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index a48868f..684b994 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
>         INIT_LIST_HEAD(&dev->links.consumers);
>         INIT_LIST_HEAD(&dev->links.suppliers);
>         dev->links.status = DL_DEV_NO_DRIVER;
> +       dev->shutdown = false;
>  }
>  EXPORT_SYMBOL_GPL(device_initialize);
>
> @@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
>          * lock is to be held
>          */
>         parent = get_device(dev->parent);
> -       get_device(dev);
>         /*
>          * Make sure the device is off the kset list, in the
>          * event that dev->*->shutdown() doesn't remove it.
> @@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
>                         dev_info(dev, "shutdown\n");
>                 dev->driver->shutdown(dev);
>         }
> -
> +       dev->shutdown = true;
>         device_unlock(dev);
>         if (parent)
>                 device_unlock(parent);
>
> -       put_device(dev);
>         put_device(parent);
>         spin_lock(&devices_kset->list_lock);
>  }
>
> +/* shutdown dev's children and consumer firstly, then itself */
> +static int device_for_each_child_shutdown(struct device *dev)
> +{
> +       struct klist_iter i;
> +       struct device *child;
> +       struct device_link *link;
> +
> +       /* already shutdown, then skip this sub tree */
> +       if (dev->shutdown)
> +               return 0;
> +
> +       if (!dev->p)
> +               goto check_consumers;
> +
> +       /* there is breakage of lock in __device_shutdown(), and the redundant
> +        * ref++ on srcu protected consumer is harmless since shutdown is not
> +        * hot path.
> +        */
> +       get_device(dev);
> +
> +       klist_iter_init(&dev->p->klist_children, &i);
> +       while ((child = next_device(&i)))
> +               device_for_each_child_shutdown(child);
> +       klist_iter_exit(&i);
> +
> +check_consumers:
> +       list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
> +               if (!link->consumer->shutdown)
> +                       device_for_each_child_shutdown(link->consumer);
> +       }
> +
> +       __device_shutdown(dev);
> +       put_device(dev);
> +       return 0;
> +}
> +
>  /**
>   * device_shutdown - call ->shutdown() on each device to shutdown.
>   */
>  void device_shutdown(void)
>  {
>         struct device *dev;
> +       int idx;
>
> +       idx = device_links_read_lock();
>         spin_lock(&devices_kset->list_lock);
>         /*
>          * Walk the devices list backward, shutting down each in turn.
> @@ -2866,11 +2903,12 @@ void device_shutdown(void)
>          * devices offline, even as the system is shutting down.
>          */
>         while (!list_empty(&devices_kset->list)) {
> -               dev = list_entry(devices_kset->list.prev, struct device,
> +               dev = list_entry(devices_kset->list.next, struct device,
>                                 kobj.entry);
> -               __device_shutdown(dev);
> +               device_for_each_child_shutdown(dev);
>         }
>         spin_unlock(&devices_kset->list_lock);
> +       device_links_read_unlock(idx);
>  }
>
>  /*
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 055a69d..8a0f784 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -1003,6 +1003,7 @@ struct device {
>         bool                    offline:1;
>         bool                    of_node_reused:1;
>         bool                    dma_32bit_limit:1;
> +       bool                    shutdown:1; /* one direction: false->true */
>  };
>
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
> --
> 2.7.4
>



--
With Best Regards,
Andy Shevchenko
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 3/4] drivers/base: clean up the usage of devices_kset_move_last()

Rafael J. Wysocki-2
In reply to this post by Pingfan Liu-2
On Tuesday, July 3, 2018 8:50:41 AM CEST Pingfan Liu wrote:

> Clean up the referring to the code in commit 52cdbdd49853 ("driver core:
> correct device's shutdown order"). So later we can revert it safely.
>
> Cc: Greg Kroah-Hartman <[hidden email]>
> Cc: Rafael J. Wysocki <[hidden email]>
> Cc: Grygorii Strashko <[hidden email]>
> Cc: Christoph Hellwig <[hidden email]>
> Cc: Bjorn Helgaas <[hidden email]>
> Cc: Dave Young <[hidden email]>
> Cc: [hidden email]
> Cc: [hidden email]
> Signed-off-by: Pingfan Liu <[hidden email]>
> ---
>  drivers/base/core.c | 7 -------
>  1 file changed, 7 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 684b994..db3deb8 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -127,13 +127,6 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
>  {
>   struct device_link *link;
>  
> - /*
> - * Devices that have not been registered yet will be put to the ends
> - * of the lists during the registration, so skip them here.
> - */
> - if (device_is_registered(dev))
> - devices_kset_move_last(dev);
> -
>   if (device_pm_initialized(dev))
>   device_pm_move_last(dev);

You can't do this.

If you do it, that will break power management in some situations.

Thanks,
Rafael

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Rafael J. Wysocki-2
In reply to this post by Pingfan Liu-2
On Tuesday, July 3, 2018 8:50:38 AM CEST Pingfan Liu wrote:
> commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> places an assumption of supplier<-consumer order on the process of probe.
> But it turns out to break down the parent <- child order in some scene.
> E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> have been probed. Then comes the bridge's module, which enables extra
> feature(such as hotplug) on this bridge.

So what *exactly* does happen in that case?

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Pavel Tatashin
In reply to this post by Andy Shevchenko-2
Thank you Andy for the heads up. I might need to rebase my work
(http://lkml.kernel.org/r/20180629182541.6735-1-pasha.tatashin@...)
based on this change. But, it is possible it is going to be harder to
parallelize based on device tree. I will need to think about it.

Pavel

On Tue, Jul 3, 2018 at 6:59 AM Andy Shevchenko
<[hidden email]> wrote:

>
> I think Pavel would be interested to see this as well (he is doing
> some parallel device shutdown stuff)
>
> On Tue, Jul 3, 2018 at 9:50 AM, Pingfan Liu <[hidden email]> wrote:
> > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > places an assumption of supplier<-consumer order on the process of probe.
> > But it turns out to break down the parent <- child order in some scene.
> > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > have been probed. Then comes the bridge's module, which enables extra
> > feature(such as hotplug) on this bridge. This will break the
> > parent<-children order and cause failure when "kexec -e" in some scenario.
> >
> > The detailed description of the scenario:
> > An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> > match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> > to some issue. For this case, the bridge is moved after its children in
> > devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> > write back buffer in flight due to the former shutdown of the bridge which
> > clears the BusMaster bit.
> >
> > It is a little hard to impose both "parent<-child" and "supplier<-consumer"
> > order on devices_kset. Take the following scene:
> > step0: before a consumer's probing, (note child_a is supplier of consumer_a)
> >   [ consumer-X, child_a, ...., child_z] [... consumer_a, ..., consumer_z, ...] supplier-X
> >                                          ^^^^^^^^^^ affected range ^^^^^^^^^^
> > step1: when probing, moving consumer-X after supplier-X
> >   [ child_a, ...., child_z] [.... consumer_a, ..., consumer_z, ...] supplier-X, consumer-X
> > step2: the children of consumer-X should be re-ordered to maintain the seq
> >   [... consumer_a, ..., consumer_z, ....] supplier-X  [consumer-X, child_a, ...., child_z]
> > step3: the consumer_a should be re-ordered to maintain the seq
> >   [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., child_z]
> >
> > It requires two nested recursion to drain out all out-of-order item in
> > "affected range". To avoid such complicated code, this patch suggests
> > to utilize the info in device tree, instead of using the order of
> > devices_kset during shutdown. It iterates the device tree, and firstly
> > shutdown a device's children and consumers. After this patch, the buggy
> > commit is hollow and left to clean.
> >
> > Cc: Greg Kroah-Hartman <[hidden email]>
> > Cc: Rafael J. Wysocki <[hidden email]>
> > Cc: Grygorii Strashko <[hidden email]>
> > Cc: Christoph Hellwig <[hidden email]>
> > Cc: Bjorn Helgaas <[hidden email]>
> > Cc: Dave Young <[hidden email]>
> > Cc: [hidden email]
> > Cc: [hidden email]
> > Signed-off-by: Pingfan Liu <[hidden email]>
> > ---
> >  drivers/base/core.c    | 48 +++++++++++++++++++++++++++++++++++++++++++-----
> >  include/linux/device.h |  1 +
> >  2 files changed, 44 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index a48868f..684b994 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
> >         INIT_LIST_HEAD(&dev->links.consumers);
> >         INIT_LIST_HEAD(&dev->links.suppliers);
> >         dev->links.status = DL_DEV_NO_DRIVER;
> > +       dev->shutdown = false;
> >  }
> >  EXPORT_SYMBOL_GPL(device_initialize);
> >
> > @@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
> >          * lock is to be held
> >          */
> >         parent = get_device(dev->parent);
> > -       get_device(dev);
> >         /*
> >          * Make sure the device is off the kset list, in the
> >          * event that dev->*->shutdown() doesn't remove it.
> > @@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
> >                         dev_info(dev, "shutdown\n");
> >                 dev->driver->shutdown(dev);
> >         }
> > -
> > +       dev->shutdown = true;
> >         device_unlock(dev);
> >         if (parent)
> >                 device_unlock(parent);
> >
> > -       put_device(dev);
> >         put_device(parent);
> >         spin_lock(&devices_kset->list_lock);
> >  }
> >
> > +/* shutdown dev's children and consumer firstly, then itself */
> > +static int device_for_each_child_shutdown(struct device *dev)
> > +{
> > +       struct klist_iter i;
> > +       struct device *child;
> > +       struct device_link *link;
> > +
> > +       /* already shutdown, then skip this sub tree */
> > +       if (dev->shutdown)
> > +               return 0;
> > +
> > +       if (!dev->p)
> > +               goto check_consumers;
> > +
> > +       /* there is breakage of lock in __device_shutdown(), and the redundant
> > +        * ref++ on srcu protected consumer is harmless since shutdown is not
> > +        * hot path.
> > +        */
> > +       get_device(dev);
> > +
> > +       klist_iter_init(&dev->p->klist_children, &i);
> > +       while ((child = next_device(&i)))
> > +               device_for_each_child_shutdown(child);
> > +       klist_iter_exit(&i);
> > +
> > +check_consumers:
> > +       list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
> > +               if (!link->consumer->shutdown)
> > +                       device_for_each_child_shutdown(link->consumer);
> > +       }
> > +
> > +       __device_shutdown(dev);
> > +       put_device(dev);
> > +       return 0;
> > +}
> > +
> >  /**
> >   * device_shutdown - call ->shutdown() on each device to shutdown.
> >   */
> >  void device_shutdown(void)
> >  {
> >         struct device *dev;
> > +       int idx;
> >
> > +       idx = device_links_read_lock();
> >         spin_lock(&devices_kset->list_lock);
> >         /*
> >          * Walk the devices list backward, shutting down each in turn.
> > @@ -2866,11 +2903,12 @@ void device_shutdown(void)
> >          * devices offline, even as the system is shutting down.
> >          */
> >         while (!list_empty(&devices_kset->list)) {
> > -               dev = list_entry(devices_kset->list.prev, struct device,
> > +               dev = list_entry(devices_kset->list.next, struct device,
> >                                 kobj.entry);
> > -               __device_shutdown(dev);
> > +               device_for_each_child_shutdown(dev);
> >         }
> >         spin_unlock(&devices_kset->list_lock);
> > +       device_links_read_unlock(idx);
> >  }
> >
> >  /*
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 055a69d..8a0f784 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -1003,6 +1003,7 @@ struct device {
> >         bool                    offline:1;
> >         bool                    of_node_reused:1;
> >         bool                    dma_32bit_limit:1;
> > +       bool                    shutdown:1; /* one direction: false->true */
> >  };
> >
> >  static inline struct device *kobj_to_dev(struct kobject *kobj)
> > --
> > 2.7.4
> >
>
>
>
> --
> With Best Regards,
> Andy Shevchenko
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Pingfan Liu-2
In reply to this post by Rafael J. Wysocki-2
On Tue, Jul 3, 2018 at 10:36 PM Rafael J. Wysocki <[hidden email]> wrote:

>
> On Tuesday, July 3, 2018 8:50:38 AM CEST Pingfan Liu wrote:
> > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > places an assumption of supplier<-consumer order on the process of probe.
> > But it turns out to break down the parent <- child order in some scene.
> > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > have been probed. Then comes the bridge's module, which enables extra
> > feature(such as hotplug) on this bridge.
>
> So what *exactly* does happen in that case?
>
I saw the  shpc_probe() is called on the bridge, although the probing
failed on that bare-metal. But if it success, then it will enable the
hotplug feature on the bridge.

Thanks,
Pingfan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Pingfan Liu-2
In reply to this post by Pingfan Liu-2
On Tue, Jul 3, 2018 at 5:26 PM Pingfan Liu <[hidden email]> wrote:

>
> On Tue, Jul 3, 2018 at 3:51 PM Lukas Wunner <[hidden email]> wrote:
> >
> > On Tue, Jul 03, 2018 at 02:50:40PM +0800, Pingfan Liu wrote:
> > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > > places an assumption of supplier<-consumer order on the process of probe.
> > > But it turns out to break down the parent <- child order in some scene.
> > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > > have been probed. Then comes the bridge's module, which enables extra
> > > feature(such as hotplug) on this bridge. This will break the
> > > parent<-children order and cause failure when "kexec -e" in some scenario.
> > >
> > > The detailed description of the scenario:
> > > An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> > > match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> > > to some issue. For this case, the bridge is moved after its children in
> > > devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> > > write back buffer in flight due to the former shutdown of the bridge which
> > > clears the BusMaster bit.
> >
> > If you revert commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services
> > during shutdown"), does the issue go away?
>
> Yes, it is gone.

Have not figured out why the issue was gone. But I think it just cover
some fault.

re-fetch the boot log of mainline kernel without any patch, and filter
out the pci domain 0004
grep "devices_kset: Moving 0004:" newlog.txt
[    2.114986] devices_kset: Moving 0004:00:00.0 to end of list   <---
pcie port drive's probe, but it failed
[    2.115192] devices_kset: Moving 0004:01:00.0 to end of list
[    2.115591] devices_kset: Moving 0004:02:02.0 to end of list
[    2.115923] devices_kset: Moving 0004:02:0a.0 to end of list
[    2.116141] devices_kset: Moving 0004:02:0b.0 to end of list
[    2.116358] devices_kset: Moving 0004:02:0c.0 to end of list
[    3.181860] devices_kset: Moving 0004:03:00.0 to end of list  <---
the ata disk controller which sits behind the bridge
[   10.267081] devices_kset: Moving 0004:00:00.0 to end of list  <---
shpc_probe() on this bridge, failed too.

Hence we have the bridge (parent) after the child in devices_kset.

Thanks,
Pingfan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 3/4] drivers/base: clean up the usage of devices_kset_move_last()

Pingfan Liu-2
In reply to this post by Rafael J. Wysocki-2
On Tue, Jul 3, 2018 at 10:28 PM Rafael J. Wysocki <[hidden email]> wrote:

>
> On Tuesday, July 3, 2018 8:50:41 AM CEST Pingfan Liu wrote:
> > Clean up the referring to the code in commit 52cdbdd49853 ("driver core:
> > correct device's shutdown order"). So later we can revert it safely.
> >
> > Cc: Greg Kroah-Hartman <[hidden email]>
> > Cc: Rafael J. Wysocki <[hidden email]>
> > Cc: Grygorii Strashko <[hidden email]>
> > Cc: Christoph Hellwig <[hidden email]>
> > Cc: Bjorn Helgaas <[hidden email]>
> > Cc: Dave Young <[hidden email]>
> > Cc: [hidden email]
> > Cc: [hidden email]
> > Signed-off-by: Pingfan Liu <[hidden email]>
> > ---
> >  drivers/base/core.c | 7 -------
> >  1 file changed, 7 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 684b994..db3deb8 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -127,13 +127,6 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
> >  {
> >       struct device_link *link;
> >
> > -     /*
> > -      * Devices that have not been registered yet will be put to the ends
> > -      * of the lists during the registration, so skip them here.
> > -      */
> > -     if (device_is_registered(dev))
> > -             devices_kset_move_last(dev);
> > -
> >       if (device_pm_initialized(dev))
> >               device_pm_move_last(dev);
>
> You can't do this.
>
> If you do it, that will break power management in some situations.
>
Could you shed light on it? I had a quick browsing of pm code, but it
is a big function, and I got lost in it.
If the above code causes failure, then does it imply that the seq in
devices_kset should be the same as dpm_list? But in device_shutdown(),
it only intersect with pm by
pm_runtime_get_noresume(dev) and pm_runtime_barrier(dev). How do these
function affect the seq in dpm_list? Need your help to see how to
handle this issue.

Thanks and regards,
Pingfan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 3/4] drivers/base: clean up the usage of devices_kset_move_last()

Rafael J. Wysocki-2
On Wednesday, July 4, 2018 6:40:09 AM CEST Pingfan Liu wrote:

> On Tue, Jul 3, 2018 at 10:28 PM Rafael J. Wysocki <[hidden email]> wrote:
> >
> > On Tuesday, July 3, 2018 8:50:41 AM CEST Pingfan Liu wrote:
> > > Clean up the referring to the code in commit 52cdbdd49853 ("driver core:
> > > correct device's shutdown order"). So later we can revert it safely.
> > >
> > > Cc: Greg Kroah-Hartman <[hidden email]>
> > > Cc: Rafael J. Wysocki <[hidden email]>
> > > Cc: Grygorii Strashko <[hidden email]>
> > > Cc: Christoph Hellwig <[hidden email]>
> > > Cc: Bjorn Helgaas <[hidden email]>
> > > Cc: Dave Young <[hidden email]>
> > > Cc: [hidden email]
> > > Cc: [hidden email]
> > > Signed-off-by: Pingfan Liu <[hidden email]>
> > > ---
> > >  drivers/base/core.c | 7 -------
> > >  1 file changed, 7 deletions(-)
> > >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index 684b994..db3deb8 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -127,13 +127,6 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
> > >  {
> > >       struct device_link *link;
> > >
> > > -     /*
> > > -      * Devices that have not been registered yet will be put to the ends
> > > -      * of the lists during the registration, so skip them here.
> > > -      */
> > > -     if (device_is_registered(dev))
> > > -             devices_kset_move_last(dev);
> > > -
> > >       if (device_pm_initialized(dev))
> > >               device_pm_move_last(dev);
> >
> > You can't do this.
> >
> > If you do it, that will break power management in some situations.
> >
> Could you shed light on it? I had a quick browsing of pm code, but it
> is a big function, and I got lost in it.
> If the above code causes failure, then does it imply that the seq in
> devices_kset should be the same as dpm_list?

Generally, yes it should.

> But in device_shutdown(), it only intersect with pm by
> pm_runtime_get_noresume(dev) and pm_runtime_barrier(dev). How do these
> function affect the seq in dpm_list?

They are not related to dpm_list directly.

However, if you shut down a supplier device before its consumer and that
involves power management, then the consumer shutdown may fail and lock up
the system

I asked you elsewhere to clearly describe the problem you are trying to
address.  Please do that in the first place.

Thanks,
Rafael

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Rafael J. Wysocki-2
In reply to this post by Pingfan Liu-2
On Wednesday, July 4, 2018 4:47:07 AM CEST Pingfan Liu wrote:

> On Tue, Jul 3, 2018 at 10:36 PM Rafael J. Wysocki <[hidden email]> wrote:
> >
> > On Tuesday, July 3, 2018 8:50:38 AM CEST Pingfan Liu wrote:
> > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > > places an assumption of supplier<-consumer order on the process of probe.
> > > But it turns out to break down the parent <- child order in some scene.
> > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > > have been probed. Then comes the bridge's module, which enables extra
> > > feature(such as hotplug) on this bridge.
> >
> > So what *exactly* does happen in that case?
> >
> I saw the  shpc_probe() is called on the bridge, although the probing
> failed on that bare-metal. But if it success, then it will enable the
> hotplug feature on the bridge.

I don't understand what you are saying here, sorry.

device_reorder_to_tail() walks the entire device hierarchy below the target
and moves all of the children in there *after* their parents.

How can it break "the parent <- child order" then?

Thanks,
Rafael

Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

kbuild test robot
In reply to this post by Pingfan Liu-2
Hi Pingfan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on driver-core/driver-core-testing]
[also build test WARNING on v4.18-rc3 next-20180704]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Pingfan-Liu/drivers-base-bugfix-for-supplier-consumer-ordering-in-device_kset/20180703-184317
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org)
   mm/mempool.c:228: warning: Function parameter or member 'pool' not described in 'mempool_init'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ibss' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.connect' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.keys' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ie' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ie_len' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.bssid' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ssid' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.default_key' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.default_mgmt_key' not described in 'wireless_dev'
   include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.prev_bssid_valid' not described in 'wireless_dev'
   include/net/mac80211.h:2282: warning: Function parameter or member 'radiotap_timestamp.units_pos' not described in 'ieee80211_hw'
   include/net/mac80211.h:2282: warning: Function parameter or member 'radiotap_timestamp.accuracy' not described in 'ieee80211_hw'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.rates' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.rts_cts_rate_idx' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.use_rts' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.use_cts_prot' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.short_preamble' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.skip_table' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.jiffies' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.vif' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.hw_key' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.flags' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'control.enqueue_time' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'ack' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'ack.cookie' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.rates' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.ack_signal' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.ampdu_ack_len' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.ampdu_len' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.antenna' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.tx_time' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.is_valid_ack_signal' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'status.status_driver_data' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'driver_rates' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'pad' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:955: warning: Function parameter or member 'rate_driver_data' not described in 'ieee80211_tx_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg.signal' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg.chain_signal' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.filtered' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.retry_failed' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.retry_count' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.lost_packets' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_tdls_pkt_time' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.msdu_retries' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.msdu_failed' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_ack' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_ack_signal' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.ack_signal_filled' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.avg_ack_signal' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.packets' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.bytes' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.last_rate' not described in 'sta_info'
   net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.msdu' not described in 'sta_info'
   kernel/sched/fair.c:3760: warning: Function parameter or member 'flags' not described in 'attach_entity_load_avg'
   include/linux/device.h:93: warning: bad line: this bus.
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.active' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.active' not described in 'dma_buf'
   include/linux/dma-fence-array.h:54: warning: Function parameter or member 'work' not described in 'dma_fence_array'
   include/linux/gpio/driver.h:142: warning: Function parameter or member 'request_key' not described in 'gpio_irq_chip'
   include/linux/iio/hw-consumer.h:1: warning: no structured comments found
   include/linux/device.h:94: warning: bad line: this bus.
>> include/linux/device.h:1008: warning: Function parameter or member 'shutdown' not described in 'device'
   include/linux/input/sparse-keymap.h:46: warning: Function parameter or member 'sw' not described in 'key_entry'
   include/linux/regulator/driver.h:227: warning: Function parameter or member 'resume_early' not described in 'regulator_ops'
   drivers/regulator/core.c:4465: warning: Excess function parameter 'state' description in 'regulator_suspend_late'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw0' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw1' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw2' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw3' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.eadm' not described in 'irb'
   drivers/usb/dwc3/gadget.c:510: warning: Excess function parameter 'dwc' description in 'dwc3_gadget_start_config'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_pin' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_unpin' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_res_obj' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_get_sg_table' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_import_sg_table' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_vmap' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_vunmap' not described in 'drm_driver'
   include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_mmap' not described in 'drm_driver'
   drivers/gpu/drm/i915/i915_vma.h:48: warning: cannot understand function prototype: 'struct i915_vma '
   drivers/gpu/drm/i915/i915_vma.h:1: warning: no structured comments found
   include/drm/tinydrm/tinydrm.h:34: warning: Function parameter or member 'fb_dirty' not described in 'tinydrm_device'
   drivers/gpu/drm/tinydrm/mipi-dbi.c:272: warning: Function parameter or member 'crtc_state' not described in 'mipi_dbi_enable_flush'
   drivers/gpu/drm/tinydrm/mipi-dbi.c:272: warning: Function parameter or member 'plane_state' not described in 'mipi_dbi_enable_flush'

vim +1008 include/linux/device.h

^1da177e Linus Torvalds 2005-04-16 @1008  

:::::: The code at line 1008 was first introduced by commit
:::::: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2

:::::: TO: Linus Torvalds <[hidden email]>
:::::: CC: Linus Torvalds <[hidden email]>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

.config.gz (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 3/4] drivers/base: clean up the usage of devices_kset_move_last()

Pingfan Liu-2
In reply to this post by Rafael J. Wysocki-2
On Wed, Jul 4, 2018 at 6:18 PM Rafael J. Wysocki <[hidden email]> wrote:

>
> On Wednesday, July 4, 2018 6:40:09 AM CEST Pingfan Liu wrote:
> > On Tue, Jul 3, 2018 at 10:28 PM Rafael J. Wysocki <[hidden email]> wrote:
> > >
> > > On Tuesday, July 3, 2018 8:50:41 AM CEST Pingfan Liu wrote:
> > > > Clean up the referring to the code in commit 52cdbdd49853 ("driver core:
> > > > correct device's shutdown order"). So later we can revert it safely.
> > > >
> > > > Cc: Greg Kroah-Hartman <[hidden email]>
> > > > Cc: Rafael J. Wysocki <[hidden email]>
> > > > Cc: Grygorii Strashko <[hidden email]>
> > > > Cc: Christoph Hellwig <[hidden email]>
> > > > Cc: Bjorn Helgaas <[hidden email]>
> > > > Cc: Dave Young <[hidden email]>
> > > > Cc: [hidden email]
> > > > Cc: [hidden email]
> > > > Signed-off-by: Pingfan Liu <[hidden email]>
> > > > ---
> > > >  drivers/base/core.c | 7 -------
> > > >  1 file changed, 7 deletions(-)
> > > >
> > > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > > index 684b994..db3deb8 100644
> > > > --- a/drivers/base/core.c
> > > > +++ b/drivers/base/core.c
> > > > @@ -127,13 +127,6 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
> > > >  {
> > > >       struct device_link *link;
> > > >
> > > > -     /*
> > > > -      * Devices that have not been registered yet will be put to the ends
> > > > -      * of the lists during the registration, so skip them here.
> > > > -      */
> > > > -     if (device_is_registered(dev))
> > > > -             devices_kset_move_last(dev);
> > > > -
> > > >       if (device_pm_initialized(dev))
> > > >               device_pm_move_last(dev);
> > >
> > > You can't do this.
> > >
> > > If you do it, that will break power management in some situations.
> > >
> > Could you shed light on it? I had a quick browsing of pm code, but it
> > is a big function, and I got lost in it.
> > If the above code causes failure, then does it imply that the seq in
> > devices_kset should be the same as dpm_list?
>
> Generally, yes it should.
>
> > But in device_shutdown(), it only intersect with pm by
> > pm_runtime_get_noresume(dev) and pm_runtime_barrier(dev). How do these
> > function affect the seq in dpm_list?
>
> They are not related to dpm_list directly.
>
> However, if you shut down a supplier device before its consumer and that
> involves power management, then the consumer shutdown may fail and lock up
> the system
>
Ah, get your point. The patch in this series "[PATCHv3 2/4]
drivers/base: utilize device tree info to shutdown devices" still obey
the shutdown order "parent<-child" and "supplier<-consumer". It just
utilizes device-tree info to achieve this, since it turns out not easy
to maintain such order in devices_kset. As I described in the commit
log of [2/4], it needs two nested recursion, and should consider the
breakage of devices_kset's spinlock.

> I asked you elsewhere to clearly describe the problem you are trying to
> address.  Please do that in the first place.
>
OK, I will reply your question in [0/4]

Thanks,
Pingfan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Pingfan Liu-2
In reply to this post by Rafael J. Wysocki-2
On Wed, Jul 4, 2018 at 6:23 PM Rafael J. Wysocki <[hidden email]> wrote:

>
> On Wednesday, July 4, 2018 4:47:07 AM CEST Pingfan Liu wrote:
> > On Tue, Jul 3, 2018 at 10:36 PM Rafael J. Wysocki <[hidden email]> wrote:
> > >
> > > On Tuesday, July 3, 2018 8:50:38 AM CEST Pingfan Liu wrote:
> > > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > > > places an assumption of supplier<-consumer order on the process of probe.
> > > > But it turns out to break down the parent <- child order in some scene.
> > > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > > > have been probed. Then comes the bridge's module, which enables extra
> > > > feature(such as hotplug) on this bridge.
> > >
> > > So what *exactly* does happen in that case?
> > >
> > I saw the  shpc_probe() is called on the bridge, although the probing
> > failed on that bare-metal. But if it success, then it will enable the
> > hotplug feature on the bridge.
>
> I don't understand what you are saying here, sorry.
>
On the system, I observe the following:
[    2.114986] devices_kset: Moving 0004:00:00.0 to end of list
<---pcie port drive's probe, but it failed
[    2.115192] devices_kset: Moving 0004:01:00.0 to end of list
[    2.115591] devices_kset: Moving 0004:02:02.0 to end of list
[    2.115923] devices_kset: Moving 0004:02:0a.0 to end of list
[    2.116141] devices_kset: Moving 0004:02:0b.0 to end of list
[    2.116358] devices_kset: Moving 0004:02:0c.0 to end of list
[    3.181860] devices_kset: Moving 0004:03:00.0 to end of list
<---the ata disk controller which sits behind the bridge
[   10.267081] devices_kset: Moving 0004:00:00.0 to end of list
 <---shpc_probe() on this bridge, failed too.

As you can the the parent device "0004:00:00.0" is moved twice, and
finally, it is after the "0004:03:00.0", this will break the
"parent<-child" order in devices_kset. This is caused by the code
really_probe()->devices_kset_move_last(). Apparently, it makes
assumption that child device's probing comes after its parent's. But
it does not stand up in the case.

> device_reorder_to_tail() walks the entire device hierarchy below the target
> and moves all of the children in there *after* their parents.
>
As described, the bug is not related with device_reorder_to_tail(), it
is related with really_probe()->devices_kset_move_last(). So [2/4]
uses different method to achieve the "parent<-child" and
"supplier<-consumer" order. The [3/4] clean up some code in
device_reorder_to_tail(), since I need to revert the commit.

> How can it break "the parent <- child order" then?
>
As described, it does not, just not be in use any longer.

Thanks and regards,
Pingfan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

Rafael J. Wysocki-4
On Thu, Jul 5, 2018 at 4:44 AM, Pingfan Liu <[hidden email]> wrote:

> On Wed, Jul 4, 2018 at 6:23 PM Rafael J. Wysocki <[hidden email]> wrote:
>>
>> On Wednesday, July 4, 2018 4:47:07 AM CEST Pingfan Liu wrote:
>> > On Tue, Jul 3, 2018 at 10:36 PM Rafael J. Wysocki <[hidden email]> wrote:
>> > >
>> > > On Tuesday, July 3, 2018 8:50:38 AM CEST Pingfan Liu wrote:
>> > > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
>> > > > places an assumption of supplier<-consumer order on the process of probe.
>> > > > But it turns out to break down the parent <- child order in some scene.
>> > > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
>> > > > have been probed. Then comes the bridge's module, which enables extra
>> > > > feature(such as hotplug) on this bridge.
>> > >
>> > > So what *exactly* does happen in that case?
>> > >
>> > I saw the  shpc_probe() is called on the bridge, although the probing
>> > failed on that bare-metal. But if it success, then it will enable the
>> > hotplug feature on the bridge.
>>
>> I don't understand what you are saying here, sorry.
>>
> On the system, I observe the following:
> [    2.114986] devices_kset: Moving 0004:00:00.0 to end of list
> <---pcie port drive's probe, but it failed
> [    2.115192] devices_kset: Moving 0004:01:00.0 to end of list
> [    2.115591] devices_kset: Moving 0004:02:02.0 to end of list
> [    2.115923] devices_kset: Moving 0004:02:0a.0 to end of list
> [    2.116141] devices_kset: Moving 0004:02:0b.0 to end of list
> [    2.116358] devices_kset: Moving 0004:02:0c.0 to end of list
> [    3.181860] devices_kset: Moving 0004:03:00.0 to end of list
> <---the ata disk controller which sits behind the bridge
> [   10.267081] devices_kset: Moving 0004:00:00.0 to end of list
>  <---shpc_probe() on this bridge, failed too.
>
> As you can the the parent device "0004:00:00.0" is moved twice, and
> finally, it is after the "0004:03:00.0", this will break the
> "parent<-child" order in devices_kset. This is caused by the code
> really_probe()->devices_kset_move_last(). Apparently, it makes
> assumption that child device's probing comes after its parent's. But
> it does not stand up in the case.
>
>> device_reorder_to_tail() walks the entire device hierarchy below the target
>> and moves all of the children in there *after* their parents.
>>
> As described, the bug is not related with device_reorder_to_tail(), it
> is related with really_probe()->devices_kset_move_last().

OK, so calling devices_kset_move_last() from really_probe() clearly is
a mistake.

I'm not really sure what the intention of it was as the changelog of
commit 52cdbdd49853d doesn't really explain that (why would it be
insufficient without that change?) and I'm quite sure that in the
majority of cases it is unnecessary.

I *think* that it attempted to cover a use case similar to the device
links one, but it should have moved children along with the parent
every time like device_link_add() does.

> So [2/4] uses different method to achieve the "parent<-child" and
> "supplier<-consumer" order. The [3/4] clean up some code in
> device_reorder_to_tail(), since I need to revert the commit.

OK, let me look at that one.

Thanks,
Rafael
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

Rafael J. Wysocki-2
In reply to this post by Pingfan Liu-2
On Tuesday, July 3, 2018 8:50:40 AM CEST Pingfan Liu wrote:

> commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> places an assumption of supplier<-consumer order on the process of probe.
> But it turns out to break down the parent <- child order in some scene.
> E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> have been probed. Then comes the bridge's module, which enables extra
> feature(such as hotplug) on this bridge. This will break the
> parent<-children order and cause failure when "kexec -e" in some scenario.
>
> The detailed description of the scenario:
> An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> to some issue. For this case, the bridge is moved after its children in
> devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> write back buffer in flight due to the former shutdown of the bridge which
> clears the BusMaster bit.
>
> It is a little hard to impose both "parent<-child" and "supplier<-consumer"
> order on devices_kset. Take the following scene:
> step0: before a consumer's probing, (note child_a is supplier of consumer_a)
>   [ consumer-X, child_a, ...., child_z] [... consumer_a, ..., consumer_z, ...] supplier-X
>                                          ^^^^^^^^^^ affected range ^^^^^^^^^^
> step1: when probing, moving consumer-X after supplier-X
>   [ child_a, ...., child_z] [.... consumer_a, ..., consumer_z, ...] supplier-X, consumer-X
> step2: the children of consumer-X should be re-ordered to maintain the seq
>   [... consumer_a, ..., consumer_z, ....] supplier-X  [consumer-X, child_a, ...., child_z]
> step3: the consumer_a should be re-ordered to maintain the seq
>   [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., child_z]
>
> It requires two nested recursion to drain out all out-of-order item in
> "affected range". To avoid such complicated code, this patch suggests
> to utilize the info in device tree, instead of using the order of
> devices_kset during shutdown. It iterates the device tree, and firstly
> shutdown a device's children and consumers. After this patch, the buggy
> commit is hollow and left to clean.
>
> Cc: Greg Kroah-Hartman <[hidden email]>
> Cc: Rafael J. Wysocki <[hidden email]>
> Cc: Grygorii Strashko <[hidden email]>
> Cc: Christoph Hellwig <[hidden email]>
> Cc: Bjorn Helgaas <[hidden email]>
> Cc: Dave Young <[hidden email]>
> Cc: [hidden email]
> Cc: [hidden email]
> Signed-off-by: Pingfan Liu <[hidden email]>
> ---
>  drivers/base/core.c    | 48 +++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/device.h |  1 +
>  2 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index a48868f..684b994 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
>   INIT_LIST_HEAD(&dev->links.consumers);
>   INIT_LIST_HEAD(&dev->links.suppliers);
>   dev->links.status = DL_DEV_NO_DRIVER;
> + dev->shutdown = false;
>  }
>  EXPORT_SYMBOL_GPL(device_initialize);
>  
> @@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
>   * lock is to be held
>   */
>   parent = get_device(dev->parent);
> - get_device(dev);

Why is the get_/put_device() not needed any more?

>   /*
>   * Make sure the device is off the kset list, in the
>   * event that dev->*->shutdown() doesn't remove it.
> @@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
>   dev_info(dev, "shutdown\n");
>   dev->driver->shutdown(dev);
>   }
> -
> + dev->shutdown = true;
>   device_unlock(dev);
>   if (parent)
>   device_unlock(parent);
>  
> - put_device(dev);
>   put_device(parent);
>   spin_lock(&devices_kset->list_lock);
>  }
>  
> +/* shutdown dev's children and consumer firstly, then itself */
> +static int device_for_each_child_shutdown(struct device *dev)

Confusing name.

What about device_shutdown_subordinate()?

> +{
> + struct klist_iter i;
> + struct device *child;
> + struct device_link *link;
> +
> + /* already shutdown, then skip this sub tree */
> + if (dev->shutdown)
> + return 0;
> +
> + if (!dev->p)
> + goto check_consumers;
> +
> + /* there is breakage of lock in __device_shutdown(), and the redundant
> + * ref++ on srcu protected consumer is harmless since shutdown is not
> + * hot path.
> + */
> + get_device(dev);
> +
> + klist_iter_init(&dev->p->klist_children, &i);
> + while ((child = next_device(&i)))
> + device_for_each_child_shutdown(child);

Why don't you use device_for_each_child() here?

> + klist_iter_exit(&i);
> +
> +check_consumers:
> + list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
> + if (!link->consumer->shutdown)
> + device_for_each_child_shutdown(link->consumer);
> + }
> +
> + __device_shutdown(dev);
> + put_device(dev);

Possible reference counter imbalance AFAICS.

> + return 0;
> +}

Well, instead of doing this dance, we might as well walk dpm_list here as it
is in the right order.

Of course, that would require dpm_list to be available for CONFIG_PM unset,
but it may be a better approach long term.

> +
>  /**
>   * device_shutdown - call ->shutdown() on each device to shutdown.
>   */
>  void device_shutdown(void)
>  {
>   struct device *dev;
> + int idx;
>  
> + idx = device_links_read_lock();
>   spin_lock(&devices_kset->list_lock);
>   /*
>   * Walk the devices list backward, shutting down each in turn.
> @@ -2866,11 +2903,12 @@ void device_shutdown(void)
>   * devices offline, even as the system is shutting down.
>   */
>   while (!list_empty(&devices_kset->list)) {
> - dev = list_entry(devices_kset->list.prev, struct device,
> + dev = list_entry(devices_kset->list.next, struct device,
>   kobj.entry);
> - __device_shutdown(dev);
> + device_for_each_child_shutdown(dev);
>   }
>   spin_unlock(&devices_kset->list_lock);
> + device_links_read_unlock(idx);
>  }
>  
>  /*
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 055a69d..8a0f784 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -1003,6 +1003,7 @@ struct device {
>   bool offline:1;
>   bool of_node_reused:1;
>   bool dma_32bit_limit:1;
> + bool shutdown:1; /* one direction: false->true */
>  };
>  
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
>

If the device_kset_move_last() in really_probe() is the only problem,
I'd rather try to fix that one in the first place.

Why is it needed?

Thanks,
Rafael

12