Patchwork [0/2] cpufreq/opp: rework regulator initialization

Submitter Sudeep Holla
Date Feb. 8, 2019, 11:39 a.m.
Message ID <20190208113904.GB7913@e107155-lin>
Permalink /patch/721495/
State New

Comments

Sudeep Holla - Feb. 8, 2019, 11:39 a.m.
On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > >
> > > The driver could do the I2C transactions in its suspend/resume
> > > callbacks and do nothing in online/offline if those are part of
> > > system-wide suspend/resume.
> >
> > These are per-policy things that we need to do, not sure if driver
> > suspend/resume is a good place for that. It is more for a case where
> > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > hot-unplugged during system suspend and hotplugged later on. This is
> > more like complete removal/addition of devices instead of
> > suspend/resume.
>
> No, it isn't.  We don't remove devices on offline.  We migrate stuff
> away from them and (opportunistically) power them down.
>
> If this is system suspend, the driver kind of knows that offline will
> take place, so it can prepare for it.  Likewise, when online takes
> place during system-wide resume, it generally is known that this is
> system-wide resume (there is a flag to indicate that in CPU hotplug),
> it can be "smart" and avoid accessing suspended devices.  Deferring
> the frequency set up until the driver resume time should do the trick
> I suppose.

I agree. The reason we generally don't see this on boot is that all
the CPUs are brought online before CPUfreq is initialised, whereas during
system suspend we call cpufreq_online() from the hotplug state machine,
which in turn calls ->init().

So, as Rafael suggests, we need some trick, but can it be done in
the core itself? I may be missing something, but how about the patch
below:

Regards,
Sudeep

--
Rafael J. Wysocki - Feb. 8, 2019, 12:03 p.m.
On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > >
> > > > The driver could do the I2C transactions in its suspend/resume
> > > > callbacks and do nothing in online/offline if those are part of
> > > > system-wide suspend/resume.
> > >
> > > These are per-policy things that we need to do, not sure if driver
> > > suspend/resume is a good place for that. It is more for a case where
> > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > hot-unplugged during system suspend and hotplugged later on. This is
> > > more like complete removal/addition of devices instead of
> > > suspend/resume.
> >
> > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > away from them and (opportunistically) power them down.
> >
> > If this is system suspend, the driver kind of knows that offline will
> > take place, so it can prepare for it.  Likewise, when online takes
> > place during system-wide resume, it generally is known that this is
> > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > it can be "smart" and avoid accessing suspended devices.  Deferring
> > the frequency set up until the driver resume time should do the trick
> > I suppose.
>
> I agree. The reason we don't see this generally on boot is because all
> the CPUs are brought online before CPUfreq is initialised. While during
> system suspend, we call cpufreq_online which in turn calls ->init in
> the hotplug state machine.
>
> So as Rafael suggests we need to do some trick, but can it be done in
> the core itself ? I may be missing something, but how about the patch
> below:
>
> Regards,
> Sudeep
>
> --
> diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> index e35a886e00bc..7d8b0b99f91d 100644
> --- i/drivers/cpufreq/cpufreq.c
> +++ w/drivers/cpufreq/cpufreq.c
> @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
>                 policy->max = policy->user_policy.max;
>         }
>
> -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> +           !cpufreq_suspended) {
>                 policy->cur = cpufreq_driver->get(policy->cpu);
>                 if (!policy->cur) {
>                         pr_err("%s: ->get() failed\n", __func__);

It looks like we need to skip the "initial freq check" block below.

Also, this doesn't really help in the case where the driver's ->init()
messes things up.

> @@ -1702,6 +1703,11 @@ void cpufreq_resume(void)
>                                 pr_err("%s: Failed to start governor for policy: %p\n",
>                                        __func__, policy);
>                 }
> +               policy->cur = cpufreq_driver->get(policy->cpu);
> +               if (!policy->cur) {
> +                       pr_err("%s: ->get() failed\n", __func__);
> +                       goto out_destroy_policy;
> +               }
>         }
>  }
>
Sudeep Holla - Feb. 8, 2019, 12:09 p.m.
On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > > >
> > > > > The driver could do the I2C transactions in its suspend/resume
> > > > > callbacks and do nothing in online/offline if those are part of
> > > > > system-wide suspend/resume.
> > > >
> > > > These are per-policy things that we need to do, not sure if driver
> > > > suspend/resume is a good place for that. It is more for a case where
> > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > > hot-unplugged during system suspend and hotplugged later on. This is
> > > > more like complete removal/addition of devices instead of
> > > > suspend/resume.
> > >
> > > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > > away from them and (opportunistically) power them down.
> > >
> > > If this is system suspend, the driver kind of knows that offline will
> > > take place, so it can prepare for it.  Likewise, when online takes
> > > place during system-wide resume, it generally is known that this is
> > > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > > it can be "smart" and avoid accessing suspended devices.  Deferring
> > > the frequency set up until the driver resume time should do the trick
> > > I suppose.
> >
> > I agree. The reason we don't see this generally on boot is because all
> > the CPUs are brought online before CPUfreq is initialised. While during
> > system suspend, we call cpufreq_online which in turn calls ->init in
> > the hotplug state machine.
> >
> > So as Rafael suggests we need to do some trick, but can it be done in
> > the core itself ? I may be missing something, but how about the patch
> > below:
> >
> > Regards,
> > Sudeep
> >
> > --
> > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> > index e35a886e00bc..7d8b0b99f91d 100644
> > --- i/drivers/cpufreq/cpufreq.c
> > +++ w/drivers/cpufreq/cpufreq.c
> > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
> >                 policy->max = policy->user_policy.max;
> >         }
> >
> > -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> > +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> > +           !cpufreq_suspended) {
> >                 policy->cur = cpufreq_driver->get(policy->cpu);
> >                 if (!policy->cur) {
> >                         pr_err("%s: ->get() failed\n", __func__);
> 
> It looks like we need to skip the "initial freq check" block below.
> 

Indeed, I copy-pasted an earlier version of the diff. I found that I had
even used a wrong goto label, which I fixed along with the additional
check in the "initial freq check" block when I tried to compile :).

> Also this doesn't really help the case when the driver ->init() messes
> up with things.
>

Yes, in that case additional logic in the driver is also needed. I am fine
if we force the driver to deal with this issue, but was wondering if we can
make it generic. Also, I was just trying to avoid adding _suspend/resume
to the driver just to avoid this issue.

--
Regards,
Sudeep
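
For illustration, the adjustment discussed in this exchange (skipping both the ->get() read and the "initial freq check" while the core is suspended) can be modelled in a small userspace sketch. This is not the kernel code: struct policy_model, struct driver_model, suspended_model, fake_get() and online_model() are hypothetical stand-ins, and the real "initial freq check" block has further conditions that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are not the kernel's definitions. */
struct policy_model {
	unsigned int cpu;
	unsigned int cur;      /* last known frequency in kHz, 0 = unknown */
	bool freq_checked;     /* set when the modelled "initial freq check" ran */
};

struct driver_model {
	unsigned int (*get)(unsigned int cpu); /* may need a suspended bus */
	bool has_setpolicy;                    /* models ->setpolicy != NULL */
};

static bool suspended_model; /* models the core's cpufreq_suspended flag */
static int get_calls;        /* counts how often the fake ->get() ran */

/* Hypothetical ->get() that always reports 1.2 GHz. */
static unsigned int fake_get(unsigned int cpu)
{
	(void)cpu;
	get_calls++;
	return 1200000;
}

/* Returns 0 on success, -1 if ->get() reports 0 kHz. */
static int online_model(const struct driver_model *drv, struct policy_model *p)
{
	bool can_query;

	assert(drv && p);
	can_query = drv->get && !drv->has_setpolicy && !suspended_model;

	if (can_query) {
		p->cur = drv->get(p->cpu);
		if (!p->cur)
			return -1;
	}

	/*
	 * The "initial freq check" relies on a valid policy->cur, so it is
	 * skipped under the same condition as the read itself.
	 */
	if (can_query)
		p->freq_checked = true;

	return 0;
}
```

With suspended_model set, online_model() returns success without calling fake_get(); once the flag is cleared, the same call reads the frequency and runs the modelled check.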
Rafael J. Wysocki - Feb. 8, 2019, 12:23 p.m.
On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote:
> > > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > >
> > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote:
> > > > > > There are cpufreq driver suspend and resume callbacks, maybe use them?
> > > > > >
> > > > > > The driver could do the I2C transactions in its suspend/resume
> > > > > > callbacks and do nothing in online/offline if those are part of
> > > > > > system-wide suspend/resume.
> > > > >
> > > > > These are per-policy things that we need to do, not sure if driver
> > > > > suspend/resume is a good place for that. It is more for a case where
> > > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are
> > > > > hot-unplugged during system suspend and hotplugged later on. This is
> > > > > more like complete removal/addition of devices instead of
> > > > > suspend/resume.
> > > >
> > > > No, it isn't.  We don't remove devices on offline.  We migrate stuff
> > > > away from them and (opportunistically) power them down.
> > > >
> > > > If this is system suspend, the driver kind of knows that offline will
> > > > take place, so it can prepare for it.  Likewise, when online takes
> > > > place during system-wide resume, it generally is known that this is
> > > > system-wide resume (there is a flag to indicate that in CPU hotplug),
> > > > it can be "smart" and avoid accessing suspended devices.  Deferring
> > > > the frequency set up until the driver resume time should do the trick
> > > > I suppose.
> > >
> > > I agree. The reason we don't see this generally on boot is because all
> > > the CPUs are brought online before CPUfreq is initialised. While during
> > > system suspend, we call cpufreq_online which in turn calls ->init in
> > > the hotplug state machine.
> > >
> > > So as Rafael suggests we need to do some trick, but can it be done in
> > > the core itself ? I may be missing something, but how about the patch
> > > below:
> > >
> > > Regards,
> > > Sudeep
> > >
> > > --
> > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
> > > index e35a886e00bc..7d8b0b99f91d 100644
> > > --- i/drivers/cpufreq/cpufreq.c
> > > +++ w/drivers/cpufreq/cpufreq.c
> > > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
> > >                 policy->max = policy->user_policy.max;
> > >         }
> > >
> > > -       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
> > > +       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
> > > +           !cpufreq_suspended) {
> > >                 policy->cur = cpufreq_driver->get(policy->cpu);
> > >                 if (!policy->cur) {
> > >                         pr_err("%s: ->get() failed\n", __func__);
> >
> > It looks like we need to skip the "initial freq check" block below.
> >
>
> Indeed, copy pasted an earlier version of diff. I found that I even
> used a goto label wrong which I fixed along with the additional check
> in "initial freq check" when I tried to compile :).
>
> > Also this doesn't really help the case when the driver ->init() messes
> > up with things.
> >
>
> Yes, in that case additional logic in the driver is also needed. I am fine
> if we force the driver to deal with this issue, but was wondering if we can
> make it generic. Also, I was just trying to avoid adding _suspend/resume
> to the driver just to avoid this issue.

I was wondering if cpufreq_offline()/online() could be invoked from
cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs
it (there could be a driver flag to indicate that).

If they are made to exit immediately when cpufreq_suspended is set (and
the requisite driver flag is set too), that might work AFAICS.
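
As a rough illustration of the idea in this message, the userspace sketch below models hotplug callbacks that return early while the core is suspended and a driver flag is set, with the offline/online work done from the suspend/resume paths for the nonboot CPUs instead. All names are hypothetical: the thread does not name the flag, and DRV_FLAG_DEFER_HOTPLUG_MODEL, suspend_model(), resume_model() and the callbacks are assumptions made only for this example. It also does not handle CPUs that fail to come back online.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4
/* Hypothetical driver flag; no such flag is named in the thread. */
#define DRV_FLAG_DEFER_HOTPLUG_MODEL 0x1u

static unsigned int drv_flags_model;     /* models cpufreq_driver->flags */
static bool suspended_model;             /* models cpufreq_suspended */
static bool policy_active[NR_CPUS_MODEL];
static int init_calls;                   /* how often ->init() would run */

/* Models the real online work, including the driver's ->init(). */
static void do_online(unsigned int cpu)
{
	if (!policy_active[cpu]) {
		policy_active[cpu] = true;
		init_calls++;
	}
}

static void do_offline(unsigned int cpu)
{
	policy_active[cpu] = false;
}

static bool defer_hotplug(void)
{
	return suspended_model &&
	       (drv_flags_model & DRV_FLAG_DEFER_HOTPLUG_MODEL);
}

/* CPU hotplug callbacks: exit immediately during system-wide suspend. */
static void hotplug_online_cb(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	if (defer_hotplug())
		return;
	do_online(cpu);
}

static void hotplug_offline_cb(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	if (defer_hotplug())
		return;
	do_offline(cpu);
}

/* Suspend path: take the nonboot CPUs offline while devices still work. */
static void suspend_model(void)
{
	if (drv_flags_model & DRV_FLAG_DEFER_HOTPLUG_MODEL)
		for (unsigned int cpu = 1; cpu < NR_CPUS_MODEL; cpu++)
			do_offline(cpu);
	suspended_model = true;
}

/* Resume path: bring them back once devices have been resumed. */
static void resume_model(void)
{
	suspended_model = false;
	if (drv_flags_model & DRV_FLAG_DEFER_HOTPLUG_MODEL)
		for (unsigned int cpu = 1; cpu < NR_CPUS_MODEL; cpu++)
			do_online(cpu);
}
```

In this model the driver's ->init() is never reached from the hotplug path between suspend_model() and resume_model(), which is the window in which the I2C device would be unavailable.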
Sudeep Holla - Feb. 8, 2019, 2:28 p.m.
On Fri, Feb 08, 2019 at 01:23:37PM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >

[...]

> > Yes, in that case additional logic in the driver is also needed. I am fine
> > if we force the driver to deal with this issue, but was wondering if we can
> > make it generic. Also, I was just trying to avoid adding _suspend/resume
> > to the driver just to avoid this issue.
>
> I was wondering if cpufreq_offline()/online() could be invoked from
> cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs
> it (there could be a driver flag to indicate that).
>
> If they are made to exit immediately when cpufreq_suspended is set (and
> the requisite driver flag is set too), that might work AFAICS.

Yes, that sounds feasible. It should be fine to assume it's safe to call
cpufreq_online() even on a CPU that might have failed to come online, or
that didn't reach the CPUHP state from which the CPUFreq callback is
executed, or am I missing something?

--
Regards,
Sudeep

Patch

diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c
index e35a886e00bc..7d8b0b99f91d 100644
--- i/drivers/cpufreq/cpufreq.c
+++ w/drivers/cpufreq/cpufreq.c
@@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu)
                policy->max = policy->user_policy.max;
        }

-       if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
+       if (cpufreq_driver->get && !cpufreq_driver->setpolicy &&
+           !cpufreq_suspended) {
                policy->cur = cpufreq_driver->get(policy->cpu);
                if (!policy->cur) {
                        pr_err("%s: ->get() failed\n", __func__);
@@ -1702,6 +1703,11 @@ void cpufreq_resume(void)
                                pr_err("%s: Failed to start governor for policy: %p\n",
                                       __func__, policy);
                }
+               policy->cur = cpufreq_driver->get(policy->cpu);
+               if (!policy->cur) {
+                       pr_err("%s: ->get() failed\n", __func__);
+                       goto out_destroy_policy;
+               }
        }
 }
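
Note on the second hunk: as acknowledged in the thread, the goto label is wrong there (cpufreq_resume() returns void and has no out_destroy_policy label), and the hunk calls ->get() without checking that the driver provides it. A minimal userspace sketch of a resume-side refresh that avoids both problems is given here for illustration only. It is not the fix that was eventually applied; struct policy_model, get_fn, fake_get() and refresh_cur_on_resume() are hypothetical names, and keeping the previous value on failure is a choice made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the per-policy state; not the kernel struct. */
struct policy_model {
	unsigned int cpu;
	unsigned int cur; /* last known frequency in kHz */
};

typedef unsigned int (*get_fn)(unsigned int cpu);

/*
 * Re-read the current frequency of every policy after resume.
 * Returns the number of policies whose frequency could not be read.
 * A failed read is counted and skipped instead of jumping to an
 * error label, and the previous value of ->cur is kept.
 */
static int refresh_cur_on_resume(get_fn get, struct policy_model *policies,
				 size_t n)
{
	int failures = 0;

	assert(policies || n == 0);

	if (!get) /* driver without ->get(): nothing to refresh */
		return 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int freq = get(policies[i].cpu);

		if (!freq) {
			failures++;
			continue;
		}
		policies[i].cur = freq;
	}
	return failures;
}

/* Hypothetical ->get(): fails for CPU 4, succeeds otherwise. */
static unsigned int fake_get(unsigned int cpu)
{
	return cpu == 4 ? 0 : 1000000 + cpu;
}
```

Called with two policies on CPUs 0 and 4, this model updates the first, leaves the second at its old value and reports one failure.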