Patchwork [V2] PCI: dwc ep: cache config until DBI regs available

login
register
mail settings
Submitter Stephen Warren
Date Nov. 26, 2018, 11:09 p.m.
Message ID <20181126230958.8175-1-swarren@wwwdotorg.org>
Download mbox | patch
Permalink /patch/665509/
State New
Headers show

Comments

Stephen Warren - Nov. 26, 2018, 11:09 p.m.
From: Stephen Warren <swarren@nvidia.com>

Some implementations of the DWC PCIe endpoint controller do not allow
access to DBI registers until the attached host has started REFCLK,
released PERST, and the endpoint driver has initialized clocking of the
DBI registers based on that. One such system is NVIDIA's T194 SoC. The
PCIe endpoint subsystem and DWC driver currently don't work on such
hardware, since they assume that all endpoint configuration can happen
at any arbitrary time.

Enhance the DWC endpoint driver to support such systems by caching all
endpoint configuration in software, and only writing the configuration
to hardware once it's been initialized. This is implemented by splitting
all endpoint controller ops into two functions; the first which simply
records/caches the desired configuration whenever called by the
associated function driver and optionally calls the second, and the
second which actually programs the configuration into hardware, which
may be called either by the first function, or later when it's known
that the DBI registers are available.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
v2:
Replace hw_regs_available with hw_regs_not_available. A 0 value in this
field is now equivalent to the existing behaviour, so that drivers that
allocate struct dw_pcie_ep with kzalloc or equivalent do not need to
explicitly set this new field in order to maintain existing behaviour.

Note: I have only compiled-tested this patch in the context of the
mainline kernel since NVIDIA Tegra PCIe EP support is not yet present.
However, I've built and tested an equivalent patch in our downstream
kernel. I did have to manually port it to mainline due to conflicts
though.

This patch was developed on top of "PCI: designware: don't hard-code
DBI/ATU offset" which I sent a little while back, so may rely on that
for patch context.
---
 .../pci/controller/dwc/pcie-designware-ep.c   | 197 ++++++++++++++----
 drivers/pci/controller/dwc/pcie-designware.h  |  26 ++-
 2 files changed, 179 insertions(+), 44 deletions(-)
Stephen Warren - Dec. 3, 2018, 4:31 p.m.
On 11/26/18 4:09 PM, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> Some implementations of the DWC PCIe endpoint controller do not allow
> access to DBI registers until the attached host has started REFCLK,
> released PERST, and the endpoint driver has initialized clocking of the
> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
> PCIe endpoint subsystem and DWC driver currently don't work on such
> hardware, since they assume that all endpoint configuration can happen
> at any arbitrary time.
> 
> Enhance the DWC endpoint driver to support such systems by caching all
> endpoint configuration in software, and only writing the configuration
> to hardware once it's been initialized. This is implemented by splitting
> all endpoint controller ops into two functions; the first which simply
> records/caches the desired configuration whenever called by the
> associated function driver and optionally calls the second, and the
> second which actually programs the configuration into hardware, which
> may be called either by the first function, or later when it's known
> that the DBI registers are available.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v2:
> Replace hw_regs_available with hw_regs_not_available. A 0 value in this
> field is now equivalent to the existing behaviour, so that drivers that
> allocate struct dw_pcie_ep with kzalloc or equivalent do not need to
> explicitly set this new field in order to maintain existing behaviour.
> 
> Note: I have only compiled-tested this patch in the context of the
> mainline kernel since NVIDIA Tegra PCIe EP support is not yet present.
> However, I've built and tested an equivalent patch in our downstream
> kernel. I did have to manually port it to mainline due to conflicts
> though.

Any thoughts on this? Thanks.
Gustavo Pimentel - Dec. 4, 2018, 11:02 a.m.
[CC Kishon]

On 26/11/2018 23:09, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> Some implementations of the DWC PCIe endpoint controller do not allow
> access to DBI registers until the attached host has started REFCLK,
> released PERST, and the endpoint driver has initialized clocking of the
> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
> PCIe endpoint subsystem and DWC driver currently don't work on such
> hardware, since they assume that all endpoint configuration can happen
> at any arbitrary time.
> 
> Enhance the DWC endpoint driver to support such systems by caching all
> endpoint configuration in software, and only writing the configuration
> to hardware once it's been initialized. This is implemented by splitting
> all endpoint controller ops into two functions; the first which simply
> records/caches the desired configuration whenever called by the
> associated function driver and optionally calls the second, and the
> second which actually programs the configuration into hardware, which
> may be called either by the first function, or later when it's known
> that the DBI registers are available.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v2:
> Replace hw_regs_available with hw_regs_not_available. A 0 value in this
> field is now equivalent to the existing behaviour, so that drivers that
> allocate struct dw_pcie_ep with kzalloc or equivalent do not need to
> explicitly set this new field in order to maintain existing behaviour.
> 
> Note: I have only compiled-tested this patch in the context of the
> mainline kernel since NVIDIA Tegra PCIe EP support is not yet present.
> However, I've built and tested an equivalent patch in our downstream
> kernel. I did have to manually port it to mainline due to conflicts
> though.
> 
> This patch was developed on top of "PCI: designware: don't hard-code
> DBI/ATU offset" which I sent a little while back, so may rely on that
> for patch context.
> ---
>  .../pci/controller/dwc/pcie-designware-ep.c   | 197 ++++++++++++++----
>  drivers/pci/controller/dwc/pcie-designware.h  |  26 ++-
>  2 files changed, 179 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 880210366e71..e69b6ef504ed 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -73,11 +73,10 @@ static u8 dw_pcie_ep_find_capability(struct dw_pcie *pci, u8 cap)
>  	return __dw_pcie_ep_find_next_cap(pci, next_cap_ptr, cap);
>  }
>  
> -static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
> -				   struct pci_epf_header *hdr)
> +static void dw_pcie_ep_write_header_regs(struct dw_pcie_ep *ep)
>  {
> -	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	struct pci_epf_header *hdr = &(ep->cached_hdr);
>  
>  	dw_pcie_dbi_ro_wr_en(pci);
>  	dw_pcie_writew_dbi(pci, PCI_VENDOR_ID, hdr->vendorid);
> @@ -94,6 +93,19 @@ static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
>  	dw_pcie_writeb_dbi(pci, PCI_INTERRUPT_PIN,
>  			   hdr->interrupt_pin);
>  	dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
> +static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
> +				   struct pci_epf_header *hdr)
> +{
> +	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> +
> +	ep->cached_hdr = *hdr;
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
> +	dw_pcie_ep_write_header_regs(ep);
>  
>  	return 0;
>  }
> @@ -112,6 +124,15 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
>  		return -EINVAL;
>  	}
>  
> +	ep->cached_inbound_atus[free_win].bar = bar;
> +	ep->cached_inbound_atus[free_win].cpu_addr = cpu_addr;
> +	ep->cached_inbound_atus[free_win].as_type = as_type;
> +	ep->cached_bars[bar].atu_index = free_win;
> +	set_bit(free_win, ep->ib_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
>  	ret = dw_pcie_prog_inbound_atu(pci, free_win, bar, cpu_addr,
>  				       as_type);
>  	if (ret < 0) {
> @@ -119,9 +140,6 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
>  		return ret;
>  	}
>  
> -	ep->bar_to_atu[bar] = free_win;
> -	set_bit(free_win, ep->ib_window_map);
> -
>  	return 0;
>  }
>  
> @@ -137,27 +155,66 @@ static int dw_pcie_ep_outbound_atu(struct dw_pcie_ep *ep, phys_addr_t phys_addr,
>  		return -EINVAL;
>  	}
>  
> +	ep->cached_outbound_atus[free_win].addr = phys_addr;
> +	ep->cached_outbound_atus[free_win].pci_addr = pci_addr;
> +	ep->cached_outbound_atus[free_win].size = size;
> +	set_bit(free_win, ep->ob_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
>  	dw_pcie_prog_outbound_atu(pci, free_win, PCIE_ATU_TYPE_MEM,
>  				  phys_addr, pci_addr, size);
>  
> -	set_bit(free_win, ep->ob_window_map);
> -	ep->outbound_addr[free_win] = phys_addr;
> -
>  	return 0;
>  }
>  
> +static void dw_pcie_ep_clear_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
> +				      enum pci_barno bar)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	int flags = ep->cached_bars[bar].flags;
> +	u32 atu_index = ep->cached_bars[bar].atu_index;
> +
> +	__dw_pcie_ep_reset_bar(pci, bar, flags);
> +
> +	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
> +}
> +
>  static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no,
>  				 struct pci_epf_bar *epf_bar)
>  {
>  	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> -	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	enum pci_barno bar = epf_bar->barno;
> -	u32 atu_index = ep->bar_to_atu[bar];
> +	u32 atu_index = ep->cached_bars[bar].atu_index;
>  
> -	__dw_pcie_ep_reset_bar(pci, bar, epf_bar->flags);
> -
> -	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
>  	clear_bit(atu_index, ep->ib_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return;
> +
> +	dw_pcie_ep_clear_bar_regs(ep, func_no, bar);
> +}
> +
> +static void dw_pcie_ep_set_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
> +				    enum pci_barno bar)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	size_t size = ep->cached_bars[bar].size;
> +	int flags = ep->cached_bars[bar].flags;
> +	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
> +
> +	dw_pcie_dbi_ro_wr_en(pci);
> +
> +	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
> +	dw_pcie_writel_dbi(pci, reg, flags);
> +
> +	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
> +		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
> +		dw_pcie_writel_dbi(pci, reg + 4, 0);
> +	}
> +
> +	dw_pcie_dbi_ro_wr_dis(pci);
>  }
>  
>  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
> @@ -165,12 +222,10 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
>  {
>  	int ret;
>  	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> -	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	enum pci_barno bar = epf_bar->barno;
>  	size_t size = epf_bar->size;
>  	int flags = epf_bar->flags;
>  	enum dw_pcie_as_type as_type;
> -	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
>  
>  	if (!(flags & PCI_BASE_ADDRESS_SPACE))
>  		as_type = DW_PCIE_AS_MEM;
> @@ -181,17 +236,13 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
>  	if (ret)
>  		return ret;
>  
> -	dw_pcie_dbi_ro_wr_en(pci);
> +	ep->cached_bars[bar].size = size;
> +	ep->cached_bars[bar].flags = flags;
>  
> -	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
> -	dw_pcie_writel_dbi(pci, reg, flags);
> -
> -	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
> -		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
> -		dw_pcie_writel_dbi(pci, reg + 4, 0);
> -	}
> +	if (ep->hw_regs_not_available)
> +		return 0;
>  
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	dw_pcie_ep_set_bar_regs(ep, func_no, bar);
>  
>  	return 0;
>  }
> @@ -202,7 +253,7 @@ static int dw_pcie_find_index(struct dw_pcie_ep *ep, phys_addr_t addr,
>  	u32 index;
>  
>  	for (index = 0; index < ep->num_ob_windows; index++) {
> -		if (ep->outbound_addr[index] != addr)
> +		if (ep->cached_outbound_atus[index].addr != addr)
>  			continue;
>  		*atu_index = index;
>  		return 0;
> @@ -223,8 +274,12 @@ static void dw_pcie_ep_unmap_addr(struct pci_epc *epc, u8 func_no,
>  	if (ret < 0)
>  		return;
>  
> -	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
>  	clear_bit(atu_index, ep->ob_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return;
> +
> +	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
>  }
>  
>  static int dw_pcie_ep_map_addr(struct pci_epc *epc, u8 func_no,
> @@ -254,7 +309,10 @@ static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 func_no)
>  		return -EINVAL;
>  
>  	reg = ep->msi_cap + PCI_MSI_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msi_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSI_FLAGS_ENABLE))
>  		return -EINVAL;
>  
> @@ -273,12 +331,18 @@ static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts)
>  		return -EINVAL;
>  
>  	reg = ep->msi_cap + PCI_MSI_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msi_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSI_FLAGS_QMASK;
>  	val |= (interrupts << 1) & PCI_MSI_FLAGS_QMASK;
> -	dw_pcie_dbi_ro_wr_en(pci);
> -	dw_pcie_writew_dbi(pci, reg, val);
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	ep->cached_msi_flags = val;
> +	if (!ep->hw_regs_not_available) {
> +		dw_pcie_dbi_ro_wr_en(pci);
> +		dw_pcie_writew_dbi(pci, reg, val);
> +		dw_pcie_dbi_ro_wr_dis(pci);
> +	}
>  
>  	return 0;
>  }
> @@ -293,7 +357,10 @@ static int dw_pcie_ep_get_msix(struct pci_epc *epc, u8 func_no)
>  		return -EINVAL;
>  
>  	reg = ep->msix_cap + PCI_MSIX_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msix_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSIX_FLAGS_ENABLE))
>  		return -EINVAL;
>  
> @@ -312,16 +379,48 @@ static int dw_pcie_ep_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
>  		return -EINVAL;
>  
>  	reg = ep->msix_cap + PCI_MSIX_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msix_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSIX_FLAGS_QSIZE;
>  	val |= interrupts;
> -	dw_pcie_dbi_ro_wr_en(pci);
> -	dw_pcie_writew_dbi(pci, reg, val);
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	ep->cached_msix_flags = val;
> +	if (!ep->hw_regs_not_available) {
> +		dw_pcie_dbi_ro_wr_en(pci);
> +		dw_pcie_writew_dbi(pci, reg, val);
> +		dw_pcie_dbi_ro_wr_dis(pci);
> +	}
>  
>  	return 0;
>  }
>  
> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
> +{
> +	struct dw_pcie_ep *ep = &(pci->ep);
> +	int i;
> +
> +	ep->hw_regs_not_available = false;
> +
> +	dw_pcie_ep_write_header_regs(ep);
> +	for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
> +		dw_pcie_prog_inbound_atu(pci, i,
> +			ep->cached_inbound_atus[i].bar,
> +			ep->cached_inbound_atus[i].cpu_addr,
> +			ep->cached_inbound_atus[i].as_type);
> +		dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
> +	}
> +	for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
> +		dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
> +			ep->cached_outbound_atus[i].addr,
> +			ep->cached_outbound_atus[i].pci_addr,
> +			ep->cached_outbound_atus[i].size);
> +	dw_pcie_dbi_ro_wr_en(pci);
> +	dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
> +	dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
> +	dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
>  static int dw_pcie_ep_raise_irq(struct pci_epc *epc, u8 func_no,
>  				enum pci_epc_irq_type type, u16 interrupt_num)
>  {
> @@ -330,6 +429,9 @@ static int dw_pcie_ep_raise_irq(struct pci_epc *epc, u8 func_no,
>  	if (!ep->ops->raise_irq)
>  		return -EINVAL;
>  
> +	if (ep->hw_regs_not_available)
> +		return -EAGAIN;
> +
>  	return ep->ops->raise_irq(ep, func_no, type, interrupt_num);
>  }
>  
> @@ -391,6 +493,9 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	bool has_upper;
>  	int ret;
>  
> +	if (ep->hw_regs_not_available)
> +		return -EAGAIN;
> +
>  	if (!ep->msi_cap)
>  		return -EINVAL;
>  
> @@ -436,6 +541,9 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	void __iomem *msix_tbl;
>  	int ret;
>  
> +	if (ep->hw_regs_not_available)
> +		return -EAGAIN;
> +
>  	reg = ep->msix_cap + PCI_MSIX_TABLE;
>  	tbl_offset = dw_pcie_readl_dbi(pci, reg);
>  	bir = (tbl_offset & PCI_MSIX_TABLE_BIR);
> @@ -494,7 +602,6 @@ void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
>  int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  {
>  	int ret;
> -	void *addr;
>  	struct pci_epc *epc;
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	struct device *dev = pci->dev;
> @@ -543,11 +650,17 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	if (!ep->ob_window_map)
>  		return -ENOMEM;
>  
> -	addr = devm_kcalloc(dev, ep->num_ob_windows, sizeof(phys_addr_t),
> -			    GFP_KERNEL);
> -	if (!addr)
> +	ep->cached_inbound_atus = devm_kzalloc(dev,
> +		sizeof(ep->cached_inbound_atus[0]) * ep->num_ib_windows,
> +		GFP_KERNEL);
> +	if (!ep->cached_inbound_atus)
> +		return -ENOMEM;
> +
> +	ep->cached_outbound_atus = devm_kzalloc(dev,
> +		sizeof(ep->cached_outbound_atus[0]) * ep->num_ob_windows,
> +		GFP_KERNEL);
> +	if (!ep->cached_outbound_atus)
>  		return -ENOMEM;
> -	ep->outbound_addr = addr;
>  
>  	epc = devm_pci_epc_create(dev, &epc_ops);
>  	if (IS_ERR(epc)) {
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index e895f37b1dee..c5fa5aecda0e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -192,8 +192,6 @@ struct dw_pcie_ep {
>  	phys_addr_t		phys_base;
>  	size_t			addr_size;
>  	size_t			page_size;
> -	u8			bar_to_atu[6];
> -	phys_addr_t		*outbound_addr;
>  	unsigned long		*ib_window_map;
>  	unsigned long		*ob_window_map;
>  	u32			num_ib_windows;
> @@ -202,6 +200,25 @@ struct dw_pcie_ep {
>  	phys_addr_t		msi_mem_phys;
>  	u8			msi_cap;	/* MSI capability offset */
>  	u8			msix_cap;	/* MSI-X capability offset */
> +	bool			hw_regs_not_available;
> +	struct pci_epf_header	cached_hdr;
> +	struct {
> +		size_t			size;
> +		int			flags;
> +		u8			atu_index;
> +	}			cached_bars[6];
> +	struct {
> +		enum pci_barno		bar;
> +		dma_addr_t		cpu_addr;
> +		enum dw_pcie_as_type	as_type;
> +	}			*cached_inbound_atus;
> +	struct {
> +		phys_addr_t		addr;
> +		u64			pci_addr;
> +		size_t			size;
> +	}			*cached_outbound_atus;
> +	u32			cached_msi_flags;
> +	u32			cached_msix_flags;
>  };
>  
>  struct dw_pcie_ops {
> @@ -363,6 +380,7 @@ static inline int dw_pcie_allocate_domains(struct pcie_port *pp)
>  void dw_pcie_ep_linkup(struct dw_pcie_ep *ep);
>  int dw_pcie_ep_init(struct dw_pcie_ep *ep);
>  void dw_pcie_ep_exit(struct dw_pcie_ep *ep);
> +void dw_pcie_ep_set_regs_available(struct dw_pcie *pci);
>  int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no);
>  int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
>  			     u8 interrupt_num);
> @@ -383,6 +401,10 @@ static inline void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
>  {
>  }
>  
> +static inline void dw_pcie_ep_set_regs_available(struct dw_pcie *pci)
> +{
> +}
> +
>  static inline int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no)
>  {
>  	return 0;
> 

The best person to give any kind of input about this is Kishon, since he
implemented the EP from the beginning.

Kishon, can you comment this patch?

Regards,
Gustavo
Stephen Warren - Dec. 10, 2018, 6:04 p.m.
On 12/4/18 4:02 AM, Gustavo Pimentel wrote:
> [CC Kishon]
> 
> On 26/11/2018 23:09, Stephen Warren wrote:
>> From: Stephen Warren <swarren@nvidia.com>
>>
>> Some implementations of the DWC PCIe endpoint controller do not allow
>> access to DBI registers until the attached host has started REFCLK,
>> released PERST, and the endpoint driver has initialized clocking of the
>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>> PCIe endpoint subsystem and DWC driver currently don't work on such
>> hardware, since they assume that all endpoint configuration can happen
>> at any arbitrary time.
>>
>> Enhance the DWC endpoint driver to support such systems by caching all
>> endpoint configuration in software, and only writing the configuration
>> to hardware once it's been initialized. This is implemented by splitting
>> all endpoint controller ops into two functions; the first which simply
>> records/caches the desired configuration whenever called by the
>> associated function driver and optionally calls the second, and the
>> second which actually programs the configuration into hardware, which
>> may be called either by the first function, or later when it's known
>> that the DBI registers are available.

Kishon, any thoughts on this? Thanks.
Kishon Vijay Abraham I - Dec. 11, 2018, 4:36 a.m.
Hi,

On 27/11/18 4:39 AM, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> Some implementations of the DWC PCIe endpoint controller do not allow
> access to DBI registers until the attached host has started REFCLK,
> released PERST, and the endpoint driver has initialized clocking of the
> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
> PCIe endpoint subsystem and DWC driver currently don't work on such
> hardware, since they assume that all endpoint configuration can happen
> at any arbitrary time.
> 
> Enhance the DWC endpoint driver to support such systems by caching all
> endpoint configuration in software, and only writing the configuration
> to hardware once it's been initialized. This is implemented by splitting
> all endpoint controller ops into two functions; the first which simply
> records/caches the desired configuration whenever called by the
> associated function driver and optionally calls the second, and the
> second which actually programs the configuration into hardware, which
> may be called either by the first function, or later when it's known
> that the DBI registers are available.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v2:
> Replace hw_regs_available with hw_regs_not_available. A 0 value in this
> field is now equivalent to the existing behaviour, so that drivers that
> allocate struct dw_pcie_ep with kzalloc or equivalent do not need to
> explicitly set this new field in order to maintain existing behaviour.
> 
> Note: I have only compiled-tested this patch in the context of the
> mainline kernel since NVIDIA Tegra PCIe EP support is not yet present.
> However, I've built and tested an equivalent patch in our downstream
> kernel. I did have to manually port it to mainline due to conflicts
> though.
> 
> This patch was developed on top of "PCI: designware: don't hard-code
> DBI/ATU offset" which I sent a little while back, so may rely on that
> for patch context.
> ---
>  .../pci/controller/dwc/pcie-designware-ep.c   | 197 ++++++++++++++----
>  drivers/pci/controller/dwc/pcie-designware.h  |  26 ++-
>  2 files changed, 179 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 880210366e71..e69b6ef504ed 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -73,11 +73,10 @@ static u8 dw_pcie_ep_find_capability(struct dw_pcie *pci, u8 cap)
>  	return __dw_pcie_ep_find_next_cap(pci, next_cap_ptr, cap);
>  }
>  
> -static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
> -				   struct pci_epf_header *hdr)
> +static void dw_pcie_ep_write_header_regs(struct dw_pcie_ep *ep)
>  {
> -	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	struct pci_epf_header *hdr = &(ep->cached_hdr);
>  
>  	dw_pcie_dbi_ro_wr_en(pci);
>  	dw_pcie_writew_dbi(pci, PCI_VENDOR_ID, hdr->vendorid);
> @@ -94,6 +93,19 @@ static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
>  	dw_pcie_writeb_dbi(pci, PCI_INTERRUPT_PIN,
>  			   hdr->interrupt_pin);
>  	dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
> +static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
> +				   struct pci_epf_header *hdr)
> +{
> +	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> +
> +	ep->cached_hdr = *hdr;
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
> +	dw_pcie_ep_write_header_regs(ep);
>  
>  	return 0;
>  }
> @@ -112,6 +124,15 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
>  		return -EINVAL;
>  	}
>  
> +	ep->cached_inbound_atus[free_win].bar = bar;
> +	ep->cached_inbound_atus[free_win].cpu_addr = cpu_addr;
> +	ep->cached_inbound_atus[free_win].as_type = as_type;
> +	ep->cached_bars[bar].atu_index = free_win;
> +	set_bit(free_win, ep->ib_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
>  	ret = dw_pcie_prog_inbound_atu(pci, free_win, bar, cpu_addr,
>  				       as_type);
>  	if (ret < 0) {
> @@ -119,9 +140,6 @@ static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
>  		return ret;
>  	}
>  
> -	ep->bar_to_atu[bar] = free_win;
> -	set_bit(free_win, ep->ib_window_map);
> -
>  	return 0;
>  }
>  
> @@ -137,27 +155,66 @@ static int dw_pcie_ep_outbound_atu(struct dw_pcie_ep *ep, phys_addr_t phys_addr,
>  		return -EINVAL;
>  	}
>  
> +	ep->cached_outbound_atus[free_win].addr = phys_addr;
> +	ep->cached_outbound_atus[free_win].pci_addr = pci_addr;
> +	ep->cached_outbound_atus[free_win].size = size;
> +	set_bit(free_win, ep->ob_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return 0;
> +
>  	dw_pcie_prog_outbound_atu(pci, free_win, PCIE_ATU_TYPE_MEM,
>  				  phys_addr, pci_addr, size);
>  
> -	set_bit(free_win, ep->ob_window_map);
> -	ep->outbound_addr[free_win] = phys_addr;
> -
>  	return 0;
>  }
>  
> +static void dw_pcie_ep_clear_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
> +				      enum pci_barno bar)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	int flags = ep->cached_bars[bar].flags;
> +	u32 atu_index = ep->cached_bars[bar].atu_index;
> +
> +	__dw_pcie_ep_reset_bar(pci, bar, flags);
> +
> +	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
> +}
> +
>  static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no,
>  				 struct pci_epf_bar *epf_bar)
>  {
>  	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> -	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	enum pci_barno bar = epf_bar->barno;
> -	u32 atu_index = ep->bar_to_atu[bar];
> +	u32 atu_index = ep->cached_bars[bar].atu_index;
>  
> -	__dw_pcie_ep_reset_bar(pci, bar, epf_bar->flags);
> -
> -	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
>  	clear_bit(atu_index, ep->ib_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return;
> +
> +	dw_pcie_ep_clear_bar_regs(ep, func_no, bar);
> +}
> +
> +static void dw_pcie_ep_set_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
> +				    enum pci_barno bar)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	size_t size = ep->cached_bars[bar].size;
> +	int flags = ep->cached_bars[bar].flags;
> +	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
> +
> +	dw_pcie_dbi_ro_wr_en(pci);
> +
> +	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
> +	dw_pcie_writel_dbi(pci, reg, flags);
> +
> +	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
> +		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
> +		dw_pcie_writel_dbi(pci, reg + 4, 0);
> +	}
> +
> +	dw_pcie_dbi_ro_wr_dis(pci);
>  }
>  
>  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
> @@ -165,12 +222,10 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
>  {
>  	int ret;
>  	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> -	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	enum pci_barno bar = epf_bar->barno;
>  	size_t size = epf_bar->size;
>  	int flags = epf_bar->flags;
>  	enum dw_pcie_as_type as_type;
> -	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
>  
>  	if (!(flags & PCI_BASE_ADDRESS_SPACE))
>  		as_type = DW_PCIE_AS_MEM;
> @@ -181,17 +236,13 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
>  	if (ret)
>  		return ret;
>  
> -	dw_pcie_dbi_ro_wr_en(pci);
> +	ep->cached_bars[bar].size = size;
> +	ep->cached_bars[bar].flags = flags;
>  
> -	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
> -	dw_pcie_writel_dbi(pci, reg, flags);
> -
> -	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
> -		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
> -		dw_pcie_writel_dbi(pci, reg + 4, 0);
> -	}
> +	if (ep->hw_regs_not_available)
> +		return 0;
>  
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	dw_pcie_ep_set_bar_regs(ep, func_no, bar);
>  
>  	return 0;
>  }
> @@ -202,7 +253,7 @@ static int dw_pcie_find_index(struct dw_pcie_ep *ep, phys_addr_t addr,
>  	u32 index;
>  
>  	for (index = 0; index < ep->num_ob_windows; index++) {
> -		if (ep->outbound_addr[index] != addr)
> +		if (ep->cached_outbound_atus[index].addr != addr)
>  			continue;
>  		*atu_index = index;
>  		return 0;
> @@ -223,8 +274,12 @@ static void dw_pcie_ep_unmap_addr(struct pci_epc *epc, u8 func_no,
>  	if (ret < 0)
>  		return;
>  
> -	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
>  	clear_bit(atu_index, ep->ob_window_map);
> +
> +	if (ep->hw_regs_not_available)
> +		return;
> +
> +	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
>  }
>  
>  static int dw_pcie_ep_map_addr(struct pci_epc *epc, u8 func_no,
> @@ -254,7 +309,10 @@ static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 func_no)
>  		return -EINVAL;
>  
>  	reg = ep->msi_cap + PCI_MSI_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msi_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSI_FLAGS_ENABLE))
>  		return -EINVAL;
>  
> @@ -273,12 +331,18 @@ static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts)
>  		return -EINVAL;
>  
>  	reg = ep->msi_cap + PCI_MSI_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msi_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSI_FLAGS_QMASK;
>  	val |= (interrupts << 1) & PCI_MSI_FLAGS_QMASK;
> -	dw_pcie_dbi_ro_wr_en(pci);
> -	dw_pcie_writew_dbi(pci, reg, val);
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	ep->cached_msi_flags = val;
> +	if (!ep->hw_regs_not_available) {
> +		dw_pcie_dbi_ro_wr_en(pci);
> +		dw_pcie_writew_dbi(pci, reg, val);
> +		dw_pcie_dbi_ro_wr_dis(pci);
> +	}
>  
>  	return 0;
>  }
> @@ -293,7 +357,10 @@ static int dw_pcie_ep_get_msix(struct pci_epc *epc, u8 func_no)
>  		return -EINVAL;
>  
>  	reg = ep->msix_cap + PCI_MSIX_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msix_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSIX_FLAGS_ENABLE))
>  		return -EINVAL;
>  
> @@ -312,16 +379,48 @@ static int dw_pcie_ep_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
>  		return -EINVAL;
>  
>  	reg = ep->msix_cap + PCI_MSIX_FLAGS;
> -	val = dw_pcie_readw_dbi(pci, reg);
> +	if (ep->hw_regs_not_available)
> +		val = ep->cached_msix_flags;
> +	else
> +		val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSIX_FLAGS_QSIZE;
>  	val |= interrupts;
> -	dw_pcie_dbi_ro_wr_en(pci);
> -	dw_pcie_writew_dbi(pci, reg, val);
> -	dw_pcie_dbi_ro_wr_dis(pci);
> +	ep->cached_msix_flags = val;
> +	if (!ep->hw_regs_not_available) {
> +		dw_pcie_dbi_ro_wr_en(pci);
> +		dw_pcie_writew_dbi(pci, reg, val);
> +		dw_pcie_dbi_ro_wr_dis(pci);
> +	}
>  
>  	return 0;
>  }
>  
> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
> +{

When will this function be invoked? Does the wrapper get an interrupt when
refclk is enabled where this function will be invoked?
> +	struct dw_pcie_ep *ep = &(pci->ep);
> +	int i;
> +
> +	ep->hw_regs_not_available = false;

This can race with epc_ops.

> +
> +	dw_pcie_ep_write_header_regs(ep);
> +	for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
> +		dw_pcie_prog_inbound_atu(pci, i,
> +			ep->cached_inbound_atus[i].bar,
> +			ep->cached_inbound_atus[i].cpu_addr,
> +			ep->cached_inbound_atus[i].as_type);

Depending on the context in which this function is invoked, programming
inbound/outbound ATU can also race with EPC ops.
> +		dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
> +	}
> +	for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
> +		dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
> +			ep->cached_outbound_atus[i].addr,
> +			ep->cached_outbound_atus[i].pci_addr,
> +			ep->cached_outbound_atus[i].size);
> +	dw_pcie_dbi_ro_wr_en(pci);
> +	dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
> +	dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
> +	dw_pcie_dbi_ro_wr_dis(pci);

IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
ready to be initialized or not. Only if the epc_init indicates it's ready to be
initialized, the endpoint function driver should go ahead with further
initialization. Or else it should wait for a notification from EPC to indicate
when it's ready to be initialized.

Thanks
Kishon
Stephen Warren - Dec. 11, 2018, 5:23 p.m.
On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On 27/11/18 4:39 AM, Stephen Warren wrote:
>> From: Stephen Warren <swarren@nvidia.com>
>>
>> Some implementations of the DWC PCIe endpoint controller do not allow
>> access to DBI registers until the attached host has started REFCLK,
>> released PERST, and the endpoint driver has initialized clocking of the
>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>> PCIe endpoint subsystem and DWC driver currently don't work on such
>> hardware, since they assume that all endpoint configuration can happen
>> at any arbitrary time.
>>
>> Enhance the DWC endpoint driver to support such systems by caching all
>> endpoint configuration in software, and only writing the configuration
>> to hardware once it's been initialized. This is implemented by splitting
>> all endpoint controller ops into two functions; the first which simply
>> records/caches the desired configuration whenever called by the
>> associated function driver and optionally calls the second, and the
>> second which actually programs the configuration into hardware, which
>> may be called either by the first function, or later when it's known
>> that the DBI registers are available.

>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c

>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>> +{
> 
> When will this function be invoked? Does the wrapper get an interrupt when
> refclk is enabled where this function will be invoked?

Yes, there's an IRQ from the HW that indicates when PEXRST is released. 
I don't recall right now if this IRQ is something that exists for all 
DWC instantiations, or is Tegra-specific.

>> +	struct dw_pcie_ep *ep = &(pci->ep);
>> +	int i;
>> +
>> +	ep->hw_regs_not_available = false;
> 
> This can race with epc_ops.
> 
>> +
>> +	dw_pcie_ep_write_header_regs(ep);
>> +	for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>> +		dw_pcie_prog_inbound_atu(pci, i,
>> +			ep->cached_inbound_atus[i].bar,
>> +			ep->cached_inbound_atus[i].cpu_addr,
>> +			ep->cached_inbound_atus[i].as_type);
> 
> Depending on the context in which this function is invoked, programming
> inbound/outbound ATU can also race with EPC ops.
 >
>> +		dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>> +	}
>> +	for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>> +		dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>> +			ep->cached_outbound_atus[i].addr,
>> +			ep->cached_outbound_atus[i].pci_addr,
>> +			ep->cached_outbound_atus[i].size);
>> +	dw_pcie_dbi_ro_wr_en(pci);
>> +	dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>> +	dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>> +	dw_pcie_dbi_ro_wr_dis(pci);
> 
> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
> ready to be initialized or not. Only if the epc_init indicates it's ready to be
> initialized, the endpoint function driver should go ahead with further
> initialization. Or else it should wait for a notification from EPC to indicate
> when it's ready to be initialized.

(Did you mean epf op or epc op?)

I'm not sure how exactly how that would work; do you want the DWC core 
driver or the endpoint subsystem to poll that epc op to find out when 
the HW is ready to be initialized? Or do you envisage the controller 
driver still calling dw_pcie_set_regs_available() (possibly renamed), 
which in turn calls ->epc_init() calls for some reason?

If you don't want to cache the endpoint configuration, perhaps you want:

a) Endpoint function doesn't pro-actively call the endpoint controller 
functions to configure the endpoint.

b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() 
(or whatever name), which lets the core know the HW can be configured. 
Perhaps this schedules a work queue item to implement locking to avoid 
the races you mentioned.

c) Endpoint core calls pci_epf_init() which calls epf op ->init().

One gotcha with this approach, which the caching approach helps avoid:

Once PEXRST is released, the system must respond to PCIe enumeration 
requests within 50ms. Thus, SW must very quickly respond to the IRQ 
indicating PEXRST release and program the endpoint configuration into 
HW. By caching the configuration in the DWC driver and 
immediately/synchronously applying it in the PEXRST IRQ handler, we 
reduce the number of steps and amount of code taken to program the HW, 
so it should get done pretty quickly. If instead we call back into the 
endpoint function driver's ->init() op, we run the risk of that op doing 
other stuff besides just calling the endpoint HW configuration APIs 
(e.g. perhaps the function driver defers memory buffer allocation or 
IOVA programming to that ->init function) which in turns makes it much 
less likely the 50ms requirement will be hit. Perhaps we can solve this 
by naming the op well and providing lots of comments, but my guess is 
that endpoint function authors won't notice that...
Stephen Warren - Dec. 14, 2018, 5:01 p.m.
On 12/11/18 10:23 AM, Stephen Warren wrote:
> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>> From: Stephen Warren <swarren@nvidia.com>
>>>
>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>> access to DBI registers until the attached host has started REFCLK,
>>> released PERST, and the endpoint driver has initialized clocking of the
>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>> hardware, since they assume that all endpoint configuration can happen
>>> at any arbitrary time.
>>>
>>> Enhance the DWC endpoint driver to support such systems by caching all
>>> endpoint configuration in software, and only writing the configuration
>>> to hardware once it's been initialized. This is implemented by splitting
>>> all endpoint controller ops into two functions; the first which simply
>>> records/caches the desired configuration whenever called by the
>>> associated function driver and optionally calls the second, and the
>>> second which actually programs the configuration into hardware, which
>>> may be called either by the first function, or later when it's known
>>> that the DBI registers are available.
> 
>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c 
>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
> 
>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>> +{
>>
>> When will this function be invoked? Does the wrapper get an interrupt 
>> when
>> refclk is enabled where this function will be invoked?
> 
> Yes, there's an IRQ from the HW that indicates when PEXRST is released. 
> I don't recall right now if this IRQ is something that exists for all 
> DWC instantiations, or is Tegra-specific.
> 
>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>> +    int i;
>>> +
>>> +    ep->hw_regs_not_available = false;
>>
>> This can race with epc_ops.
>>
>>> +
>>> +    dw_pcie_ep_write_header_regs(ep);
>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>> +            ep->cached_inbound_atus[i].bar,
>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>> +            ep->cached_inbound_atus[i].as_type);
>>
>> Depending on the context in which this function is invoked, programming
>> inbound/outbound ATU can also race with EPC ops.
>  >
>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>> +    }
>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>> +            ep->cached_outbound_atus[i].addr,
>>> +            ep->cached_outbound_atus[i].pci_addr,
>>> +            ep->cached_outbound_atus[i].size);
>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>
>> IMHO we should add a new epc ops ->epc_init() which indicates if the 
>> EPC is
>> ready to be initialized or not. Only if the epc_init indicates it's 
>> ready to be
>> initialized, the endpoint function driver should go ahead with further
>> initialization. Or else it should wait for a notification from EPC to 
>> indicate
>> when it's ready to be initialized.
> 
> (Did you mean epf op or epc op?)
> 
> I'm not sure how exactly how that would work; do you want the DWC core 
> driver or the endpoint subsystem to poll that epc op to find out when 
> the HW is ready to be initialized? Or do you envisage the controller 
> driver still calling dw_pcie_set_regs_available() (possibly renamed), 
> which in turn calls ->epc_init() calls for some reason?
> 
> If you don't want to cache the endpoint configuration, perhaps you want:
> 
> a) Endpoint function doesn't pro-actively call the endpoint controller 
> functions to configure the endpoint.
> 
> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() 
> (or whatever name), which lets the core know the HW can be configured. 
> Perhaps this schedules a work queue item to implement locking to avoid 
> the races you mentioned.
> 
> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
> 
> One gotcha with this approach, which the caching approach helps avoid:
> 
> Once PEXRST is released, the system must respond to PCIe enumeration 
> requests within 50ms. Thus, SW must very quickly respond to the IRQ 
> indicating PEXRST release and program the endpoint configuration into 
> HW. By caching the configuration in the DWC driver and 
> immediately/synchronously applying it in the PEXRST IRQ handler, we 
> reduce the number of steps and amount of code taken to program the HW, 
> so it should get done pretty quickly. If instead we call back into the 
> endpoint function driver's ->init() op, we run the risk of that op doing 
> other stuff besides just calling the endpoint HW configuration APIs 
> (e.g. perhaps the function driver defers memory buffer allocation or 
> IOVA programming to that ->init function) which in turns makes it much 
> less likely the 50ms requirement will be hit. Perhaps we can solve this 
> by naming the op well and providing lots of comments, but my guess is 
> that endpoint function authors won't notice that...

Kishon,

Do you have any further details exactly how you'd prefer this to work? 
Does the approach I describe in points a/b/c above sound like what you 
want? Thanks.
Kishon Vijay Abraham I - Dec. 19, 2018, 2:37 p.m.
Hi,

On 14/12/18 10:31 PM, Stephen Warren wrote:
> On 12/11/18 10:23 AM, Stephen Warren wrote:
>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>>> From: Stephen Warren <swarren@nvidia.com>
>>>>
>>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>>> access to DBI registers until the attached host has started REFCLK,
>>>> released PERST, and the endpoint driver has initialized clocking of the
>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>>> hardware, since they assume that all endpoint configuration can happen
>>>> at any arbitrary time.
>>>>
>>>> Enhance the DWC endpoint driver to support such systems by caching all
>>>> endpoint configuration in software, and only writing the configuration
>>>> to hardware once it's been initialized. This is implemented by splitting
>>>> all endpoint controller ops into two functions; the first which simply
>>>> records/caches the desired configuration whenever called by the
>>>> associated function driver and optionally calls the second, and the
>>>> second which actually programs the configuration into hardware, which
>>>> may be called either by the first function, or later when it's known
>>>> that the DBI registers are available.
>>
>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>
>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>>> +{
>>>
>>> When will this function be invoked? Does the wrapper get an interrupt when
>>> refclk is enabled where this function will be invoked?
>>
>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I
>> don't recall right now if this IRQ is something that exists for all DWC
>> instantiations, or is Tegra-specific.
>>
>>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>>> +    int i;
>>>> +
>>>> +    ep->hw_regs_not_available = false;
>>>
>>> This can race with epc_ops.
>>>
>>>> +
>>>> +    dw_pcie_ep_write_header_regs(ep);
>>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>>> +            ep->cached_inbound_atus[i].bar,
>>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>>> +            ep->cached_inbound_atus[i].as_type);
>>>
>>> Depending on the context in which this function is invoked, programming
>>> inbound/outbound ATU can also race with EPC ops.
>>  >
>>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>>> +    }
>>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>>> +            ep->cached_outbound_atus[i].addr,
>>>> +            ep->cached_outbound_atus[i].pci_addr,
>>>> +            ep->cached_outbound_atus[i].size);
>>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>>
>>> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
>>> ready to be initialized or not. Only if the epc_init indicates it's ready to be
>>> initialized, the endpoint function driver should go ahead with further
>>> initialization. Or else it should wait for a notification from EPC to indicate
>>> when it's ready to be initialized.
>>
>> (Did you mean epf op or epc op?)
>>
>> I'm not sure how exactly how that would work; do you want the DWC core driver
>> or the endpoint subsystem to poll that epc op to find out when the HW is
>> ready to be initialized? Or do you envisage the controller driver still
>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls
>> ->epc_init() calls for some reason?
>>
>> If you don't want to cache the endpoint configuration, perhaps you want:
>>
>> a) Endpoint function doesn't pro-actively call the endpoint controller
>> functions to configure the endpoint.
>>
>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or
>> whatever name), which lets the core know the HW can be configured. Perhaps
>> this schedules a work queue item to implement locking to avoid the races you
>> mentioned.
>>
>> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
>>
>> One gotcha with this approach, which the caching approach helps avoid:
>>
>> Once PEXRST is released, the system must respond to PCIe enumeration requests
>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST
>> release and program the endpoint configuration into HW. By caching the
>> configuration in the DWC driver and immediately/synchronously applying it in
>> the PEXRST IRQ handler, we reduce the number of steps and amount of code
>> taken to program the HW, so it should get done pretty quickly. If instead we
>> call back into the endpoint function driver's ->init() op, we run the risk of
>> that op doing other stuff besides just calling the endpoint HW configuration
>> APIs (e.g. perhaps the function driver defers memory buffer allocation or
>> IOVA programming to that ->init function) which in turns makes it much less
>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming
>> the op well and providing lots of comments, but my guess is that endpoint
>> function authors won't notice that...
> 
> Kishon,
> 
> Do you have any further details exactly how you'd prefer this to work? Does the
> approach I describe in points a/b/c above sound like what you want? Thanks.

Agree with your PERST comment.

What I have in mind is we add a new epc_init() ops. I feel there are more uses
for it (For e.g I have an internal patch which uses epc_init to initialize DMA.
Hopefully I'll post it soon).
If you look at pci_epf_test, pci_epf_test_bind() is where the function actually
starts to write to HW (i.e using pci_epc_*).
So before the endpoint function invokes pci_epc_write_header(), it should
invoke epc_init(). Only if that succeeds, it should go ahead with other
initialization.
If epc_init_* fails, we can have a particular error value to indicate the
controller is waiting for clock from host (so that we don't return error from
->bind()). Once the controller receives the clock, it can send an atomic
notification to the endpoint function driver to indicate it is ready to be
initialized. (Atomic notification makes it easy to handle for multi function
endpoint devices.)
The endpoint function can then initialize the controller.
I think except for pci_epf_test_alloc_space() all other functions are
configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be
moved to pci_epf_test_probe() so there are no expensive operations to be done
once the controller is ready to be initialized.
I have epc_init() and the atomic notification part already implemented and I'm
planning to post it before next week. Once that is merged, we might have to
reorder function in pci_epf_test driver and you have to return the correct
error value for epc_init() if the clock is not there.

Thanks
Kishon
Stephen Warren - Jan. 2, 2019, 4:34 p.m.
On 12/19/18 7:37 AM, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On 14/12/18 10:31 PM, Stephen Warren wrote:
>> On 12/11/18 10:23 AM, Stephen Warren wrote:
>>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>>>> Hi,
>>>>
>>>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>>>> From: Stephen Warren <swarren@nvidia.com>
>>>>>
>>>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>>>> access to DBI registers until the attached host has started REFCLK,
>>>>> released PERST, and the endpoint driver has initialized clocking of the
>>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>>>> hardware, since they assume that all endpoint configuration can happen
>>>>> at any arbitrary time.
>>>>>
>>>>> Enhance the DWC endpoint driver to support such systems by caching all
>>>>> endpoint configuration in software, and only writing the configuration
>>>>> to hardware once it's been initialized. This is implemented by splitting
>>>>> all endpoint controller ops into two functions; the first which simply
>>>>> records/caches the desired configuration whenever called by the
>>>>> associated function driver and optionally calls the second, and the
>>>>> second which actually programs the configuration into hardware, which
>>>>> may be called either by the first function, or later when it's known
>>>>> that the DBI registers are available.
>>>
>>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>
>>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>>>> +{
>>>>
>>>> When will this function be invoked? Does the wrapper get an interrupt when
>>>> refclk is enabled where this function will be invoked?
>>>
>>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I
>>> don't recall right now if this IRQ is something that exists for all DWC
>>> instantiations, or is Tegra-specific.
>>>
>>>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>>>> +    int i;
>>>>> +
>>>>> +    ep->hw_regs_not_available = false;
>>>>
>>>> This can race with epc_ops.
>>>>
>>>>> +
>>>>> +    dw_pcie_ep_write_header_regs(ep);
>>>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>>>> +            ep->cached_inbound_atus[i].bar,
>>>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>>>> +            ep->cached_inbound_atus[i].as_type);
>>>>
>>>> Depending on the context in which this function is invoked, programming
>>>> inbound/outbound ATU can also race with EPC ops.
>>>   >
>>>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>>>> +    }
>>>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>>>> +            ep->cached_outbound_atus[i].addr,
>>>>> +            ep->cached_outbound_atus[i].pci_addr,
>>>>> +            ep->cached_outbound_atus[i].size);
>>>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>>>
>>>> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
>>>> ready to be initialized or not. Only if the epc_init indicates it's ready to be
>>>> initialized, the endpoint function driver should go ahead with further
>>>> initialization. Or else it should wait for a notification from EPC to indicate
>>>> when it's ready to be initialized.
>>>
>>> (Did you mean epf op or epc op?)
>>>
>>> I'm not sure how exactly how that would work; do you want the DWC core driver
>>> or the endpoint subsystem to poll that epc op to find out when the HW is
>>> ready to be initialized? Or do you envisage the controller driver still
>>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls
>>> ->epc_init() calls for some reason?
>>>
>>> If you don't want to cache the endpoint configuration, perhaps you want:
>>>
>>> a) Endpoint function doesn't pro-actively call the endpoint controller
>>> functions to configure the endpoint.
>>>
>>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or
>>> whatever name), which lets the core know the HW can be configured. Perhaps
>>> this schedules a work queue item to implement locking to avoid the races you
>>> mentioned.
>>>
>>> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
>>>
>>> One gotcha with this approach, which the caching approach helps avoid:
>>>
>>> Once PEXRST is released, the system must respond to PCIe enumeration requests
>>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST
>>> release and program the endpoint configuration into HW. By caching the
>>> configuration in the DWC driver and immediately/synchronously applying it in
>>> the PEXRST IRQ handler, we reduce the number of steps and amount of code
>>> taken to program the HW, so it should get done pretty quickly. If instead we
>>> call back into the endpoint function driver's ->init() op, we run the risk of
>>> that op doing other stuff besides just calling the endpoint HW configuration
>>> APIs (e.g. perhaps the function driver defers memory buffer allocation or
>>> IOVA programming to that ->init function) which in turns makes it much less
>>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming
>>> the op well and providing lots of comments, but my guess is that endpoint
>>> function authors won't notice that...
>>
>> Kishon,
>>
>> Do you have any further details exactly how you'd prefer this to work? Does the
>> approach I describe in points a/b/c above sound like what you want? Thanks.
> 
> Agree with your PERST comment.
> 
> What I have in mind is we add a new epc_init() ops. I feel there are more uses
> for it (For e.g I have an internal patch which uses epc_init to initialize DMA.
> Hopefully I'll post it soon).
> If you look at pci_epf_test, pci_epf_test_bind() is where the function actually
> starts to write to HW (i.e using pci_epc_*).
> So before the endpoint function invokes pci_epc_write_header(), it should
> invoke epc_init(). Only if that succeeds, it should go ahead with other
> initialization.
> If epc_init_* fails, we can have a particular error value to indicate the
> controller is waiting for clock from host (so that we don't return error from
> ->bind()). Once the controller receives the clock, it can send an atomic
> notification to the endpoint function driver to indicate it is ready to be
> initialized. (Atomic notification makes it easy to handle for multi function
> endpoint devices.)
> The endpoint function can then initialize the controller.
> I think except for pci_epf_test_alloc_space() all other functions are
> configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be
> moved to pci_epf_test_probe() so there are no expensive operations to be done
> once the controller is ready to be initialized.
> I have epc_init() and the atomic notification part already implemented and I'm
> planning to post it before next week. Once that is merged, we might have to
> reorder function in pci_epf_test driver and you have to return the correct
> error value for epc_init() if the clock is not there.

Kishon, did you manage to post the patches that implement epc_init()? If 
so, a link would be appreciated. Thanks.
Kishon Vijay Abraham I - Jan. 4, 2019, 8:02 a.m.
Hi Stephen,

On 02/01/19 10:04 PM, Stephen Warren wrote:
> On 12/19/18 7:37 AM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On 14/12/18 10:31 PM, Stephen Warren wrote:
>>> On 12/11/18 10:23 AM, Stephen Warren wrote:
>>>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>>>>> Hi,
>>>>>
>>>>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>>>>> From: Stephen Warren <swarren@nvidia.com>
>>>>>>
>>>>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>>>>> access to DBI registers until the attached host has started REFCLK,
>>>>>> released PERST, and the endpoint driver has initialized clocking of the
>>>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>>>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>>>>> hardware, since they assume that all endpoint configuration can happen
>>>>>> at any arbitrary time.
>>>>>>
>>>>>> Enhance the DWC endpoint driver to support such systems by caching all
>>>>>> endpoint configuration in software, and only writing the configuration
>>>>>> to hardware once it's been initialized. This is implemented by splitting
>>>>>> all endpoint controller ops into two functions; the first which simply
>>>>>> records/caches the desired configuration whenever called by the
>>>>>> associated function driver and optionally calls the second, and the
>>>>>> second which actually programs the configuration into hardware, which
>>>>>> may be called either by the first function, or later when it's known
>>>>>> that the DBI registers are available.
>>>>
>>>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>
>>>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>>>>> +{
>>>>>
>>>>> When will this function be invoked? Does the wrapper get an interrupt when
>>>>> refclk is enabled where this function will be invoked?
>>>>
>>>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I
>>>> don't recall right now if this IRQ is something that exists for all DWC
>>>> instantiations, or is Tegra-specific.
>>>>
>>>>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>>>>> +    int i;
>>>>>> +
>>>>>> +    ep->hw_regs_not_available = false;
>>>>>
>>>>> This can race with epc_ops.
>>>>>
>>>>>> +
>>>>>> +    dw_pcie_ep_write_header_regs(ep);
>>>>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>>>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>>>>> +            ep->cached_inbound_atus[i].bar,
>>>>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>>>>> +            ep->cached_inbound_atus[i].as_type);
>>>>>
>>>>> Depending on the context in which this function is invoked, programming
>>>>> inbound/outbound ATU can also race with EPC ops.
>>>>   >
>>>>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>>>>> +    }
>>>>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>>>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>>>>> +            ep->cached_outbound_atus[i].addr,
>>>>>> +            ep->cached_outbound_atus[i].pci_addr,
>>>>>> +            ep->cached_outbound_atus[i].size);
>>>>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>>>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>>>>
>>>>> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
>>>>> ready to be initialized or not. Only if the epc_init indicates it's ready
>>>>> to be
>>>>> initialized, the endpoint function driver should go ahead with further
>>>>> initialization. Or else it should wait for a notification from EPC to
>>>>> indicate
>>>>> when it's ready to be initialized.
>>>>
>>>> (Did you mean epf op or epc op?)
>>>>
>>>> I'm not sure how exactly how that would work; do you want the DWC core driver
>>>> or the endpoint subsystem to poll that epc op to find out when the HW is
>>>> ready to be initialized? Or do you envisage the controller driver still
>>>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls
>>>> ->epc_init() calls for some reason?
>>>>
>>>> If you don't want to cache the endpoint configuration, perhaps you want:
>>>>
>>>> a) Endpoint function doesn't pro-actively call the endpoint controller
>>>> functions to configure the endpoint.
>>>>
>>>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or
>>>> whatever name), which lets the core know the HW can be configured. Perhaps
>>>> this schedules a work queue item to implement locking to avoid the races you
>>>> mentioned.
>>>>
>>>> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
>>>>
>>>> One gotcha with this approach, which the caching approach helps avoid:
>>>>
>>>> Once PEXRST is released, the system must respond to PCIe enumeration requests
>>>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST
>>>> release and program the endpoint configuration into HW. By caching the
>>>> configuration in the DWC driver and immediately/synchronously applying it in
>>>> the PEXRST IRQ handler, we reduce the number of steps and amount of code
>>>> taken to program the HW, so it should get done pretty quickly. If instead we
>>>> call back into the endpoint function driver's ->init() op, we run the risk of
>>>> that op doing other stuff besides just calling the endpoint HW configuration
>>>> APIs (e.g. perhaps the function driver defers memory buffer allocation or
>>>> IOVA programming to that ->init function) which in turns makes it much less
>>>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming
>>>> the op well and providing lots of comments, but my guess is that endpoint
>>>> function authors won't notice that...
>>>
>>> Kishon,
>>>
>>> Do you have any further details exactly how you'd prefer this to work? Does the
>>> approach I describe in points a/b/c above sound like what you want? Thanks.
>>
>> Agree with your PERST comment.
>>
>> What I have in mind is we add a new epc_init() ops. I feel there are more uses
>> for it (For e.g I have an internal patch which uses epc_init to initialize DMA.
>> Hopefully I'll post it soon).
>> If you look at pci_epf_test, pci_epf_test_bind() is where the function actually
>> starts to write to HW (i.e using pci_epc_*).
>> So before the endpoint function invokes pci_epc_write_header(), it should
>> invoke epc_init(). Only if that succeeds, it should go ahead with other
>> initialization.
>> If epc_init_* fails, we can have a particular error value to indicate the
>> controller is waiting for clock from host (so that we don't return error from
>> ->bind()). Once the controller receives the clock, it can send an atomic
>> notification to the endpoint function driver to indicate it is ready to be
>> initialized. (Atomic notification makes it easy to handle for multi function
>> endpoint devices.)
>> The endpoint function can then initialize the controller.
>> I think except for pci_epf_test_alloc_space() all other functions are
>> configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be
>> moved to pci_epf_test_probe() so there are no expensive operations to be done
>> once the controller is ready to be initialized.
>> I have epc_init() and the atomic notification part already implemented and I'm
>> planning to post it before next week. Once that is merged, we might have to
>> reorder function in pci_epf_test driver and you have to return the correct
>> error value for epc_init() if the clock is not there.
> 
> Kishon, did you manage to post the patches that implement epc_init()? If so, a
> link would be appreciated. Thanks.

I haven't posted the patches yet. Sorry for the delay. Give me some more time
please (till next week).

Thanks
Kishon
Kishon Vijay Abraham I - Jan. 8, 2019, 12:05 p.m.
Hi Stephen,

On 04/01/19 1:32 PM, Kishon Vijay Abraham I wrote:
> Hi Stephen,
> 
> On 02/01/19 10:04 PM, Stephen Warren wrote:
>> On 12/19/18 7:37 AM, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On 14/12/18 10:31 PM, Stephen Warren wrote:
>>>> On 12/11/18 10:23 AM, Stephen Warren wrote:
>>>>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>>>>>> From: Stephen Warren <swarren@nvidia.com>
>>>>>>>
>>>>>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>>>>>> access to DBI registers until the attached host has started REFCLK,
>>>>>>> released PERST, and the endpoint driver has initialized clocking of the
>>>>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>>>>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>>>>>> hardware, since they assume that all endpoint configuration can happen
>>>>>>> at any arbitrary time.
>>>>>>>
>>>>>>> Enhance the DWC endpoint driver to support such systems by caching all
>>>>>>> endpoint configuration in software, and only writing the configuration
>>>>>>> to hardware once it's been initialized. This is implemented by splitting
>>>>>>> all endpoint controller ops into two functions; the first which simply
>>>>>>> records/caches the desired configuration whenever called by the
>>>>>>> associated function driver and optionally calls the second, and the
>>>>>>> second which actually programs the configuration into hardware, which
>>>>>>> may be called either by the first function, or later when it's known
>>>>>>> that the DBI registers are available.
>>>>>
>>>>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>>
>>>>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>>>>>> +{
>>>>>>
>>>>>> When will this function be invoked? Does the wrapper get an interrupt when
>>>>>> refclk is enabled where this function will be invoked?
>>>>>
>>>>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I
>>>>> don't recall right now if this IRQ is something that exists for all DWC
>>>>> instantiations, or is Tegra-specific.
>>>>>
>>>>>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    ep->hw_regs_not_available = false;
>>>>>>
>>>>>> This can race with epc_ops.
>>>>>>
>>>>>>> +
>>>>>>> +    dw_pcie_ep_write_header_regs(ep);
>>>>>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>>>>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>>>>>> +            ep->cached_inbound_atus[i].bar,
>>>>>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>>>>>> +            ep->cached_inbound_atus[i].as_type);
>>>>>>
>>>>>> Depending on the context in which this function is invoked, programming
>>>>>> inbound/outbound ATU can also race with EPC ops.
>>>>>   >
>>>>>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>>>>>> +    }
>>>>>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>>>>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>>>>>> +            ep->cached_outbound_atus[i].addr,
>>>>>>> +            ep->cached_outbound_atus[i].pci_addr,
>>>>>>> +            ep->cached_outbound_atus[i].size);
>>>>>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>>>>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>>>>>
>>>>>> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is
>>>>>> ready to be initialized or not. Only if the epc_init indicates it's ready
>>>>>> to be
>>>>>> initialized, the endpoint function driver should go ahead with further
>>>>>> initialization. Or else it should wait for a notification from EPC to
>>>>>> indicate
>>>>>> when it's ready to be initialized.
>>>>>
>>>>> (Did you mean epf op or epc op?)
>>>>>
>>>>> I'm not sure how exactly how that would work; do you want the DWC core driver
>>>>> or the endpoint subsystem to poll that epc op to find out when the HW is
>>>>> ready to be initialized? Or do you envisage the controller driver still
>>>>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls
>>>>> ->epc_init() calls for some reason?
>>>>>
>>>>> If you don't want to cache the endpoint configuration, perhaps you want:
>>>>>
>>>>> a) Endpoint function doesn't pro-actively call the endpoint controller
>>>>> functions to configure the endpoint.
>>>>>
>>>>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or
>>>>> whatever name), which lets the core know the HW can be configured. Perhaps
>>>>> this schedules a work queue item to implement locking to avoid the races you
>>>>> mentioned.
>>>>>
>>>>> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
>>>>>
>>>>> One gotcha with this approach, which the caching approach helps avoid:
>>>>>
>>>>> Once PEXRST is released, the system must respond to PCIe enumeration requests
>>>>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST
>>>>> release and program the endpoint configuration into HW. By caching the
>>>>> configuration in the DWC driver and immediately/synchronously applying it in
>>>>> the PEXRST IRQ handler, we reduce the number of steps and amount of code
>>>>> taken to program the HW, so it should get done pretty quickly. If instead we
>>>>> call back into the endpoint function driver's ->init() op, we run the risk of
>>>>> that op doing other stuff besides just calling the endpoint HW configuration
>>>>> APIs (e.g. perhaps the function driver defers memory buffer allocation or
>>>>> IOVA programming to that ->init function) which in turns makes it much less
>>>>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming
>>>>> the op well and providing lots of comments, but my guess is that endpoint
>>>>> function authors won't notice that...
>>>>
>>>> Kishon,
>>>>
>>>> Do you have any further details exactly how you'd prefer this to work? Does the
>>>> approach I describe in points a/b/c above sound like what you want? Thanks.
>>>
>>> Agree with your PERST comment.
>>>
>>> What I have in mind is we add a new epc_init() ops. I feel there are more uses
>>> for it (For e.g I have an internal patch which uses epc_init to initialize DMA.
>>> Hopefully I'll post it soon).
>>> If you look at pci_epf_test, pci_epf_test_bind() is where the function actually
>>> starts to write to HW (i.e using pci_epc_*).
>>> So before the endpoint function invokes pci_epc_write_header(), it should
>>> invoke epc_init(). Only if that succeeds, it should go ahead with other
>>> initialization.
>>> If epc_init_* fails, we can have a particular error value to indicate the
>>> controller is waiting for clock from host (so that we don't return error from
>>> ->bind()). Once the controller receives the clock, it can send an atomic
>>> notification to the endpoint function driver to indicate it is ready to be
>>> initialized. (Atomic notification makes it easy to handle for multi function
>>> endpoint devices.)
>>> The endpoint function can then initialize the controller.
>>> I think except for pci_epf_test_alloc_space() all other functions are
>>> configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be
>>> moved to pci_epf_test_probe() so there are no expensive operations to be done
>>> once the controller is ready to be initialized.
>>> I have epc_init() and the atomic notification part already implemented and I'm
>>> planning to post it before next week. Once that is merged, we might have to
>>> reorder function in pci_epf_test driver and you have to return the correct
>>> error value for epc_init() if the clock is not there.
>>
>> Kishon, did you manage to post the patches that implement epc_init()? If so, a
>> link would be appreciated. Thanks.
> 
> I haven't posted the patches yet. Sorry for the delay. Give me some more time
> please (till next week).

I have posted one set of cleanup for EPC features [1] by introducing
epc_get_features(). Some of the things I initially thought should be in
epc_init actually fits in epc_get_features. However I still believe for your
usecase we should introduce ->epc_init().

Thanks
Kishon

[1] -> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1891393.html
> 
> Thanks
> Kishon
>

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 880210366e71..e69b6ef504ed 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -73,11 +73,10 @@  static u8 dw_pcie_ep_find_capability(struct dw_pcie *pci, u8 cap)
 	return __dw_pcie_ep_find_next_cap(pci, next_cap_ptr, cap);
 }
 
-static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
-				   struct pci_epf_header *hdr)
+static void dw_pcie_ep_write_header_regs(struct dw_pcie_ep *ep)
 {
-	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	struct pci_epf_header *hdr = &(ep->cached_hdr);
 
 	dw_pcie_dbi_ro_wr_en(pci);
 	dw_pcie_writew_dbi(pci, PCI_VENDOR_ID, hdr->vendorid);
@@ -94,6 +93,19 @@  static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
 	dw_pcie_writeb_dbi(pci, PCI_INTERRUPT_PIN,
 			   hdr->interrupt_pin);
 	dw_pcie_dbi_ro_wr_dis(pci);
+}
+
+static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
+				   struct pci_epf_header *hdr)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+
+	ep->cached_hdr = *hdr;
+
+	if (ep->hw_regs_not_available)
+		return 0;
+
+	dw_pcie_ep_write_header_regs(ep);
 
 	return 0;
 }
@@ -112,6 +124,15 @@  static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
 		return -EINVAL;
 	}
 
+	ep->cached_inbound_atus[free_win].bar = bar;
+	ep->cached_inbound_atus[free_win].cpu_addr = cpu_addr;
+	ep->cached_inbound_atus[free_win].as_type = as_type;
+	ep->cached_bars[bar].atu_index = free_win;
+	set_bit(free_win, ep->ib_window_map);
+
+	if (ep->hw_regs_not_available)
+		return 0;
+
 	ret = dw_pcie_prog_inbound_atu(pci, free_win, bar, cpu_addr,
 				       as_type);
 	if (ret < 0) {
@@ -119,9 +140,6 @@  static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
 		return ret;
 	}
 
-	ep->bar_to_atu[bar] = free_win;
-	set_bit(free_win, ep->ib_window_map);
-
 	return 0;
 }
 
@@ -137,27 +155,66 @@  static int dw_pcie_ep_outbound_atu(struct dw_pcie_ep *ep, phys_addr_t phys_addr,
 		return -EINVAL;
 	}
 
+	ep->cached_outbound_atus[free_win].addr = phys_addr;
+	ep->cached_outbound_atus[free_win].pci_addr = pci_addr;
+	ep->cached_outbound_atus[free_win].size = size;
+	set_bit(free_win, ep->ob_window_map);
+
+	if (ep->hw_regs_not_available)
+		return 0;
+
 	dw_pcie_prog_outbound_atu(pci, free_win, PCIE_ATU_TYPE_MEM,
 				  phys_addr, pci_addr, size);
 
-	set_bit(free_win, ep->ob_window_map);
-	ep->outbound_addr[free_win] = phys_addr;
-
 	return 0;
 }
 
+static void dw_pcie_ep_clear_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
+				      enum pci_barno bar)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	int flags = ep->cached_bars[bar].flags;
+	u32 atu_index = ep->cached_bars[bar].atu_index;
+
+	__dw_pcie_ep_reset_bar(pci, bar, flags);
+
+	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
+}
+
 static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no,
 				 struct pci_epf_bar *epf_bar)
 {
 	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
-	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	enum pci_barno bar = epf_bar->barno;
-	u32 atu_index = ep->bar_to_atu[bar];
+	u32 atu_index = ep->cached_bars[bar].atu_index;
 
-	__dw_pcie_ep_reset_bar(pci, bar, epf_bar->flags);
-
-	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_INBOUND);
 	clear_bit(atu_index, ep->ib_window_map);
+
+	if (ep->hw_regs_not_available)
+		return;
+
+	dw_pcie_ep_clear_bar_regs(ep, func_no, bar);
+}
+
+static void dw_pcie_ep_set_bar_regs(struct dw_pcie_ep *ep, u8 func_no,
+				    enum pci_barno bar)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	size_t size = ep->cached_bars[bar].size;
+	int flags = ep->cached_bars[bar].flags;
+	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
+
+	dw_pcie_dbi_ro_wr_en(pci);
+
+	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
+	dw_pcie_writel_dbi(pci, reg, flags);
+
+	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
+		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
+		dw_pcie_writel_dbi(pci, reg + 4, 0);
+	}
+
+	dw_pcie_dbi_ro_wr_dis(pci);
 }
 
 static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
@@ -165,12 +222,10 @@  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
 {
 	int ret;
 	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
-	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	enum pci_barno bar = epf_bar->barno;
 	size_t size = epf_bar->size;
 	int flags = epf_bar->flags;
 	enum dw_pcie_as_type as_type;
-	u32 reg = PCI_BASE_ADDRESS_0 + (4 * bar);
 
 	if (!(flags & PCI_BASE_ADDRESS_SPACE))
 		as_type = DW_PCIE_AS_MEM;
@@ -181,17 +236,13 @@  static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no,
 	if (ret)
 		return ret;
 
-	dw_pcie_dbi_ro_wr_en(pci);
+	ep->cached_bars[bar].size = size;
+	ep->cached_bars[bar].flags = flags;
 
-	dw_pcie_writel_dbi2(pci, reg, lower_32_bits(size - 1));
-	dw_pcie_writel_dbi(pci, reg, flags);
-
-	if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64) {
-		dw_pcie_writel_dbi2(pci, reg + 4, upper_32_bits(size - 1));
-		dw_pcie_writel_dbi(pci, reg + 4, 0);
-	}
+	if (ep->hw_regs_not_available)
+		return 0;
 
-	dw_pcie_dbi_ro_wr_dis(pci);
+	dw_pcie_ep_set_bar_regs(ep, func_no, bar);
 
 	return 0;
 }
@@ -202,7 +253,7 @@  static int dw_pcie_find_index(struct dw_pcie_ep *ep, phys_addr_t addr,
 	u32 index;
 
 	for (index = 0; index < ep->num_ob_windows; index++) {
-		if (ep->outbound_addr[index] != addr)
+		if (ep->cached_outbound_atus[index].addr != addr)
 			continue;
 		*atu_index = index;
 		return 0;
@@ -223,8 +274,12 @@  static void dw_pcie_ep_unmap_addr(struct pci_epc *epc, u8 func_no,
 	if (ret < 0)
 		return;
 
-	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
 	clear_bit(atu_index, ep->ob_window_map);
+
+	if (ep->hw_regs_not_available)
+		return;
+
+	dw_pcie_disable_atu(pci, atu_index, DW_PCIE_REGION_OUTBOUND);
 }
 
 static int dw_pcie_ep_map_addr(struct pci_epc *epc, u8 func_no,
@@ -254,7 +309,10 @@  static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 func_no)
 		return -EINVAL;
 
 	reg = ep->msi_cap + PCI_MSI_FLAGS;
-	val = dw_pcie_readw_dbi(pci, reg);
+	if (ep->hw_regs_not_available)
+		val = ep->cached_msi_flags;
+	else
+		val = dw_pcie_readw_dbi(pci, reg);
 	if (!(val & PCI_MSI_FLAGS_ENABLE))
 		return -EINVAL;
 
@@ -273,12 +331,18 @@  static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts)
 		return -EINVAL;
 
 	reg = ep->msi_cap + PCI_MSI_FLAGS;
-	val = dw_pcie_readw_dbi(pci, reg);
+	if (ep->hw_regs_not_available)
+		val = ep->cached_msi_flags;
+	else
+		val = dw_pcie_readw_dbi(pci, reg);
 	val &= ~PCI_MSI_FLAGS_QMASK;
 	val |= (interrupts << 1) & PCI_MSI_FLAGS_QMASK;
-	dw_pcie_dbi_ro_wr_en(pci);
-	dw_pcie_writew_dbi(pci, reg, val);
-	dw_pcie_dbi_ro_wr_dis(pci);
+	ep->cached_msi_flags = val;
+	if (!ep->hw_regs_not_available) {
+		dw_pcie_dbi_ro_wr_en(pci);
+		dw_pcie_writew_dbi(pci, reg, val);
+		dw_pcie_dbi_ro_wr_dis(pci);
+	}
 
 	return 0;
 }
@@ -293,7 +357,10 @@  static int dw_pcie_ep_get_msix(struct pci_epc *epc, u8 func_no)
 		return -EINVAL;
 
 	reg = ep->msix_cap + PCI_MSIX_FLAGS;
-	val = dw_pcie_readw_dbi(pci, reg);
+	if (ep->hw_regs_not_available)
+		val = ep->cached_msix_flags;
+	else
+		val = dw_pcie_readw_dbi(pci, reg);
 	if (!(val & PCI_MSIX_FLAGS_ENABLE))
 		return -EINVAL;
 
@@ -312,16 +379,48 @@  static int dw_pcie_ep_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
 		return -EINVAL;
 
 	reg = ep->msix_cap + PCI_MSIX_FLAGS;
-	val = dw_pcie_readw_dbi(pci, reg);
+	if (ep->hw_regs_not_available)
+		val = ep->cached_msix_flags;
+	else
+		val = dw_pcie_readw_dbi(pci, reg);
 	val &= ~PCI_MSIX_FLAGS_QSIZE;
 	val |= interrupts;
-	dw_pcie_dbi_ro_wr_en(pci);
-	dw_pcie_writew_dbi(pci, reg, val);
-	dw_pcie_dbi_ro_wr_dis(pci);
+	ep->cached_msix_flags = val;
+	if (!ep->hw_regs_not_available) {
+		dw_pcie_dbi_ro_wr_en(pci);
+		dw_pcie_writew_dbi(pci, reg, val);
+		dw_pcie_dbi_ro_wr_dis(pci);
+	}
 
 	return 0;
 }
 
+void dw_pcie_set_regs_available(struct dw_pcie *pci)
+{
+	struct dw_pcie_ep *ep = &(pci->ep);
+	int i;
+
+	ep->hw_regs_not_available = false;
+
+	dw_pcie_ep_write_header_regs(ep);
+	for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
+		dw_pcie_prog_inbound_atu(pci, i,
+			ep->cached_inbound_atus[i].bar,
+			ep->cached_inbound_atus[i].cpu_addr,
+			ep->cached_inbound_atus[i].as_type);
+		dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
+	}
+	for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
+		dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
+			ep->cached_outbound_atus[i].addr,
+			ep->cached_outbound_atus[i].pci_addr,
+			ep->cached_outbound_atus[i].size);
+	dw_pcie_dbi_ro_wr_en(pci);
+	dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
+	dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
+	dw_pcie_dbi_ro_wr_dis(pci);
+}
+
 static int dw_pcie_ep_raise_irq(struct pci_epc *epc, u8 func_no,
 				enum pci_epc_irq_type type, u16 interrupt_num)
 {
@@ -330,6 +429,9 @@  static int dw_pcie_ep_raise_irq(struct pci_epc *epc, u8 func_no,
 	if (!ep->ops->raise_irq)
 		return -EINVAL;
 
+	if (ep->hw_regs_not_available)
+		return -EAGAIN;
+
 	return ep->ops->raise_irq(ep, func_no, type, interrupt_num);
 }
 
@@ -391,6 +493,9 @@  int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 	bool has_upper;
 	int ret;
 
+	if (ep->hw_regs_not_available)
+		return -EAGAIN;
+
 	if (!ep->msi_cap)
 		return -EINVAL;
 
@@ -436,6 +541,9 @@  int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
 	void __iomem *msix_tbl;
 	int ret;
 
+	if (ep->hw_regs_not_available)
+		return -EAGAIN;
+
 	reg = ep->msix_cap + PCI_MSIX_TABLE;
 	tbl_offset = dw_pcie_readl_dbi(pci, reg);
 	bir = (tbl_offset & PCI_MSIX_TABLE_BIR);
@@ -494,7 +602,6 @@  void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
 int dw_pcie_ep_init(struct dw_pcie_ep *ep)
 {
 	int ret;
-	void *addr;
 	struct pci_epc *epc;
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	struct device *dev = pci->dev;
@@ -543,11 +650,17 @@  int dw_pcie_ep_init(struct dw_pcie_ep *ep)
 	if (!ep->ob_window_map)
 		return -ENOMEM;
 
-	addr = devm_kcalloc(dev, ep->num_ob_windows, sizeof(phys_addr_t),
-			    GFP_KERNEL);
-	if (!addr)
+	ep->cached_inbound_atus = devm_kzalloc(dev,
+		sizeof(ep->cached_inbound_atus[0]) * ep->num_ib_windows,
+		GFP_KERNEL);
+	if (!ep->cached_inbound_atus)
+		return -ENOMEM;
+
+	ep->cached_outbound_atus = devm_kzalloc(dev,
+		sizeof(ep->cached_outbound_atus[0]) * ep->num_ob_windows,
+		GFP_KERNEL);
+	if (!ep->cached_outbound_atus)
 		return -ENOMEM;
-	ep->outbound_addr = addr;
 
 	epc = devm_pci_epc_create(dev, &epc_ops);
 	if (IS_ERR(epc)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index e895f37b1dee..c5fa5aecda0e 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -192,8 +192,6 @@  struct dw_pcie_ep {
 	phys_addr_t		phys_base;
 	size_t			addr_size;
 	size_t			page_size;
-	u8			bar_to_atu[6];
-	phys_addr_t		*outbound_addr;
 	unsigned long		*ib_window_map;
 	unsigned long		*ob_window_map;
 	u32			num_ib_windows;
@@ -202,6 +200,25 @@  struct dw_pcie_ep {
 	phys_addr_t		msi_mem_phys;
 	u8			msi_cap;	/* MSI capability offset */
 	u8			msix_cap;	/* MSI-X capability offset */
+	bool			hw_regs_not_available;
+	struct pci_epf_header	cached_hdr;
+	struct {
+		size_t			size;
+		int			flags;
+		u8			atu_index;
+	}			cached_bars[6];
+	struct {
+		enum pci_barno		bar;
+		dma_addr_t		cpu_addr;
+		enum dw_pcie_as_type	as_type;
+	}			*cached_inbound_atus;
+	struct {
+		phys_addr_t		addr;
+		u64			pci_addr;
+		size_t			size;
+	}			*cached_outbound_atus;
+	u32			cached_msi_flags;
+	u32			cached_msix_flags;
 };
 
 struct dw_pcie_ops {
@@ -363,6 +380,7 @@  static inline int dw_pcie_allocate_domains(struct pcie_port *pp)
 void dw_pcie_ep_linkup(struct dw_pcie_ep *ep);
 int dw_pcie_ep_init(struct dw_pcie_ep *ep);
 void dw_pcie_ep_exit(struct dw_pcie_ep *ep);
+void dw_pcie_ep_set_regs_available(struct dw_pcie *pci);
 int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no);
 int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 			     u8 interrupt_num);
@@ -383,6 +401,10 @@  static inline void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
 {
 }
 
+static inline void dw_pcie_ep_set_regs_available(struct dw_pcie *pci)
+{
+}
+
 static inline int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no)
 {
 	return 0;