Merge branch 'net-next-2024-12-02--15-00' into HEAD
Your Name committed Dec 2, 2024
2 parents b87df96 + 221156b commit 3558094
Showing 1,878 changed files with 56,845 additions and 18,622 deletions.
9 changes: 9 additions & 0 deletions .clippy.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# SPDX-License-Identifier: GPL-2.0

check-private-items = true

disallowed-macros = [
# The `clippy::dbg_macro` lint only works with `std::dbg!`, thus we simulate
# it here, see: https://github.com/rust-lang/rust-clippy/issues/11303.
{ path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool" },
]
1 change: 1 addition & 0 deletions .gitignore
@@ -103,6 +103,7 @@ modules.order
# We don't want to ignore the following even if they are dot-files
#
!.clang-format
!.clippy.toml
!.cocciconfig
!.editorconfig
!.get_maintainer.ignore
25 changes: 25 additions & 0 deletions Documentation/ABI/testing/debugfs-hisi-migration
@@ -0,0 +1,25 @@
What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/dev_data
Date: Jan 2025
KernelVersion: 6.13
Contact: Longfang Liu <[email protected]>
Description: Read the configuration data and some status data
required for device live migration. These data include device
status data, queue configuration data, some task configuration
data and device attribute data. The output format of the data
is defined by the live migration driver.

What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/migf_data
Date: Jan 2025
KernelVersion: 6.13
Contact: Longfang Liu <[email protected]>
Description: Read the data from the last completed live migration.
This data includes the same device status data as in "dev_data".
The migf_data is the dev_data that is migrated.

What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/cmd_state
Date: Jan 2025
KernelVersion: 6.13
Contact: Longfang Liu <[email protected]>
Description: Used to obtain the device command sending and receiving
channel status. Returns failure or success logs based on the
results.
11 changes: 11 additions & 0 deletions Documentation/ABI/testing/sysfs-bus-pci
@@ -163,6 +163,17 @@ Description:
will be present in sysfs. Writing 1 to this file
will perform reset.

What: /sys/bus/pci/devices/.../reset_subordinate
Date: October 2024
Contact: [email protected]
Description:
This is visible only for bridge devices. Writing 1 to this
file attempts to reset all devices attached through the
subordinate bus of the bridge. This affects every device
attached to the system through this bridge, similar to
writing 1 to each device's individual "reset" file, so use
with caution.
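
As a rough userspace sketch of the write described above, the helper below
writes the character 1 to an attribute file. The name `write_one()` is a
hypothetical helper and not part of the kernel ABI; the caller supplies the
full path of the bridge's reset_subordinate file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the single character '1' to a sysfs
 * attribute such as .../reset_subordinate.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_one(const char *path)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	ssize_t n = write(fd, "1", 1);
	int err = (n == 1) ? 0 : (n < 0 ? -errno : -EIO);

	close(fd);
	return err;
}
```

Returning a negative errno value lets the caller distinguish a missing
attribute (for example on a non-bridge device) from a permission problem.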

What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
13 changes: 11 additions & 2 deletions Documentation/ABI/testing/sysfs-fs-f2fs
@@ -311,10 +311,13 @@ Description: Do background GC aggressively when set. Set to 0 by default.
GC approach and turns SSR mode on.
gc urgent low(2): lowers the bar of checking I/O idling in
order to process outstanding discard commands and GC a
little bit aggressively. uses cost benefit GC approach.
little bit aggressively. always uses cost benefit GC approach,
and will override age-threshold GC approach if ATGC is enabled
at the same time.
gc urgent mid(3): does GC forcibly in a period of given
gc_urgent_sleep_time and executes a mid level of I/O idling check.
uses cost benefit GC approach.
always uses cost benefit GC approach, and will override
age-threshold GC approach if ATGC is enabled at the same time.

What: /sys/fs/f2fs/<disk>/gc_urgent_sleep_time
Date: August 2017
@@ -819,3 +822,9 @@ Description: It controls the valid block ratio threshold not to trigger excessiv
for zoned devices. The initial value of it is 95(%). F2FS will stop the
background GC thread from initiating GC for sections having valid blocks
exceeding the ratio.

What: /sys/fs/f2fs/<disk>/max_read_extent_count
Date: November 2024
Contact: "Chao Yu" <[email protected]>
Description: It controls the maximum read extent count per inode. The default
threshold is 10240.
29 changes: 29 additions & 0 deletions Documentation/PCI/endpoint/pci-endpoint.rst
@@ -117,6 +117,35 @@ by the PCI endpoint function driver.
The PCI endpoint function driver should use pci_epc_mem_free_addr() to
free the memory space allocated using pci_epc_mem_alloc_addr().

* pci_epc_map_addr()

A PCI endpoint function driver should use pci_epc_map_addr() to map the CPU
address of local memory obtained with pci_epc_mem_alloc_addr() to a RC PCI
address.

* pci_epc_unmap_addr()

A PCI endpoint function driver should use pci_epc_unmap_addr() to unmap the
CPU address of local memory mapped to a RC address with pci_epc_map_addr().

* pci_epc_mem_map()

A PCI endpoint controller may impose constraints on the RC PCI addresses that
can be mapped. The function pci_epc_mem_map() allows endpoint function
drivers to allocate and map controller memory while handling such
constraints. This function will determine the size of the memory that must be
allocated with pci_epc_mem_alloc_addr() for successfully mapping a RC PCI
address range. This function will also indicate the size of the PCI address
range that was actually mapped, which can be less than the requested size, as
well as the offset into the allocated memory to use for accessing the mapped
RC PCI address range.

* pci_epc_mem_unmap()

A PCI endpoint function driver can use pci_epc_mem_unmap() to unmap and free
controller memory that was allocated and mapped using pci_epc_mem_map().
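
To make the constraint handling more concrete, the standalone sketch below
computes the allocation size and the offset under one assumed constraint: that
the controller can only map RC PCI addresses starting on a power-of-two
boundary. The `struct map_plan` type, the `plan_mapping()` helper and the
alignment rule itself are illustrative assumptions, not kernel interfaces. The
sketch also does not model the case where less than the requested size is
mapped.

```c
#include <stdint.h>

/* Hypothetical result type: not a kernel structure. */
struct map_plan {
	uint64_t map_pci_addr; /* aligned RC PCI address actually mapped */
	uint64_t map_size;     /* size of controller memory to allocate */
	uint64_t offset;       /* offset of the requested address in the mapping */
};

/*
 * Assumed constraint: a mapping must start on an 'align' boundary and
 * cover a multiple of 'align' bytes. 'align' must be a non-zero power
 * of two.
 */
static struct map_plan plan_mapping(uint64_t pci_addr, uint64_t size,
				    uint64_t align)
{
	struct map_plan p;

	/* Round the requested address down to the alignment boundary. */
	p.map_pci_addr = pci_addr & ~(align - 1);
	p.offset = pci_addr - p.map_pci_addr;
	/* Round offset + size up to the next multiple of align. */
	p.map_size = (p.offset + size + align - 1) & ~(align - 1);
	return p;
}
```

With a 4 KiB alignment, a request for 0x100 bytes at RC PCI address 0x1f80
yields a mapping at 0x1000 of size 0x2000, accessed at offset 0xf80, because
the requested range crosses an alignment boundary.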


Other EPC APIs
~~~~~~~~~~~~~~

1 change: 1 addition & 0 deletions Documentation/PCI/index.rst
@@ -18,3 +18,4 @@ PCI Bus Subsystem
pcieaer-howto
endpoint/index
boot-interrupts
tph
14 changes: 9 additions & 5 deletions Documentation/PCI/pciebus-howto.rst
@@ -217,8 +217,12 @@ capability structure except the PCI Express capability structure,
that is shared between many drivers including the service drivers.
RMW Capability accessors (pcie_capability_clear_and_set_word(),
pcie_capability_set_word(), and pcie_capability_clear_word()) protect
a selected set of PCI Express Capability Registers (Link Control
Register and Root Control Register). Any change to those registers
should be performed using RMW accessors to avoid problems due to
concurrent updates. For the up-to-date list of protected registers,
see pcie_capability_clear_and_set_word().
a selected set of PCI Express Capability Registers:

* Link Control Register
* Root Control Register
* Link Control 2 Register

Any change to those registers should be performed using RMW accessors to
avoid problems due to concurrent updates. For the up-to-date list of
protected registers, see pcie_capability_clear_and_set_word().
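
The reason for funnelling updates through RMW accessors can be illustrated
with a small userspace model. This is only a sketch: `struct shared_reg` and
`reg_clear_and_set()` are hypothetical stand-ins, and a pthread mutex is used
here in place of whatever locking the kernel accessors use internally.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a register shared by several drivers. */
struct shared_reg {
	pthread_mutex_t lock;
	uint16_t val;
};

/*
 * Clear the bits in 'clear', then set the bits in 'set', as a single
 * read-modify-write step performed while holding the lock, so that a
 * concurrent updater cannot overwrite bits it did not intend to touch.
 */
static void reg_clear_and_set(struct shared_reg *r, uint16_t clear,
			      uint16_t set)
{
	pthread_mutex_lock(&r->lock);
	r->val = (uint16_t)((r->val & ~clear) | set);
	pthread_mutex_unlock(&r->lock);
}
```

Without the lock, two callers could both read the old value and each write
back a result that silently drops the other's change.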
132 changes: 132 additions & 0 deletions Documentation/PCI/tph.rst
@@ -0,0 +1,132 @@
.. SPDX-License-Identifier: GPL-2.0
===========
TPH Support
===========

:Copyright: 2024 Advanced Micro Devices, Inc.
:Authors: - Eric van Tassell <[email protected]>
- Wei Huang <[email protected]>


Overview
========

TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
to provide optimization hints for requests that target memory space.
These hints, in a format called Steering Tags (STs), are embedded in the
requester's TLP headers, enabling the system hardware, such as the Root
Complex, to better manage platform resources for these requests.

For example, on platforms with TPH-based direct data cache injection
support, an endpoint device can include appropriate STs in its DMA
traffic to specify which cache the data should be written to. This allows
the CPU core to have a higher probability of getting data from cache,
potentially improving performance and reducing latency in data
processing.


How to Use TPH
==============

TPH is presented as an optional extended capability in PCIe. The Linux
kernel handles TPH discovery during boot, but it is up to the device
driver to request TPH enablement if it is to be utilized. Once enabled,
the driver uses the provided API to obtain the Steering Tag for the
target memory and to program the ST into the device's ST table.

Enable TPH support in Linux
---------------------------

To support TPH, the kernel must be built with the CONFIG_PCIE_TPH option
enabled.

Manage TPH
----------

To enable TPH for a device, use the following function::

int pcie_enable_tph(struct pci_dev *pdev, int mode);

This function enables TPH support for a device with a specific ST mode.
Currently supported modes include:

* PCI_TPH_ST_NS_MODE - No ST Mode
* PCI_TPH_ST_IV_MODE - Interrupt Vector Mode
* PCI_TPH_ST_DS_MODE - Device Specific Mode

`pcie_enable_tph()` checks whether the requested mode is actually
supported by the device before enabling. The device driver can figure out
which TPH mode is supported and can be properly enabled based on the
return value of `pcie_enable_tph()`.
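
One way a driver might act on that return value is to try the modes in order
of preference and keep the first one that succeeds. The sketch below models
this in plain C so it can run outside the kernel. It assumes that a zero
return means success; `enable_fn`, `enable_first_supported()`,
`fake_enable()` and the `MODE_*` numbers are hypothetical stand-ins, not
kernel definitions.

```c
#include <stddef.h>

/* Illustrative mode numbers only; real drivers use the PCI_TPH_ST_* macros. */
enum { MODE_NS = 0, MODE_IV = 1, MODE_DS = 2 };

/*
 * Hypothetical callback shaped like pcie_enable_tph(): assumed to return
 * 0 on success and non-zero when the mode cannot be enabled.
 */
typedef int (*enable_fn)(void *dev, int mode);

/*
 * Try each candidate mode in preference order and return the first one
 * the callback accepts, or -1 if none could be enabled.
 */
static int enable_first_supported(enable_fn enable, void *dev,
				  const int *modes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (enable(dev, modes[i]) == 0)
			return modes[i];
	}
	return -1;
}

/* Stand-in for a device: a bitmask of the mode numbers it supports. */
static int fake_enable(void *dev, int mode)
{
	unsigned int supported = *(unsigned int *)dev;

	return (supported & (1u << mode)) ? 0 : -1;
}
```

In a real driver the callback would be replaced by direct calls to
`pcie_enable_tph()` with the device's `struct pci_dev`.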

To disable TPH, use the following function::

void pcie_disable_tph(struct pci_dev *pdev);

Manage ST
---------

Steering Tags are platform specific. The PCIe specification does not
specify where STs come from. Instead, the PCI Firmware Specification
defines an ACPI _DSM method (see the `Revised _DSM for Cache Locality TPH
Features ECN <https://members.pcisig.com/wg/PCI-SIG/document/15470>`_) for
retrieving STs for target memory of various properties. This is the method
supported in this implementation.

To retrieve a Steering Tag for a target memory associated with a specific
CPU, use the following function::

int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
unsigned int cpu_uid, u16 *tag);

The `type` argument specifies the memory type, either volatile or
persistent, of the target memory. The `cpu_uid` argument specifies the
CPU with which the memory is associated.

After the ST value is retrieved, the device driver can use the following
function to write the ST into the device::

int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index,
u16 tag);

The `index` argument is the index of the ST table entry the Steering Tag
will be written into. `pcie_tph_set_st_entry()` will figure out the proper
location of the ST table, either in the MSI-X table or in the TPH Extended
Capability space, and write the Steering Tag into the ST entry pointed to
by the `index` argument.

It is completely up to the driver to decide how to use these TPH
functions. For example, a network device driver can use the TPH APIs above
to update the Steering Tag when the interrupt affinity of a RX/TX queue
has been changed. Here is sample code for an IRQ affinity notifier:

.. code-block:: c

    static void irq_affinity_notified(struct irq_affinity_notify *notify,
                                      const cpumask_t *mask)
    {
         struct drv_irq *irq;
         unsigned int cpu_id;
         u16 tag;

         irq = container_of(notify, struct drv_irq, affinity_notify);
         cpumask_copy(irq->cpu_mask, mask);

         /* Pick a right CPU as the target - here is just an example */
         cpu_id = cpumask_first(irq->cpu_mask);

         if (pcie_tph_get_cpu_st(irq->pdev, TPH_MEM_TYPE_VM, cpu_id,
                                 &tag))
             return;

         if (pcie_tph_set_st_entry(irq->pdev, irq->msix_nr, tag))
             return;
    }

Disable TPH system-wide
-----------------------

There is a kernel command line option available to control the TPH feature:

* "notph": TPH will be disabled for all endpoint devices.
1 change: 1 addition & 0 deletions Documentation/admin-guide/kernel-parameters.rst
@@ -174,6 +174,7 @@ is applicable::
SCSI Appropriate SCSI support is enabled.
A lot of drivers have their options described inside
the Documentation/scsi/ sub-directory.
SDW SoundWire support is enabled.
SECURITY Different security models are enabled.
SELINUX SELinux support is enabled.
SERIAL Serial support is enabled.
18 changes: 18 additions & 0 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -4686,6 +4686,10 @@
nomio [S390] Do not use MIO instructions.
norid [S390] ignore the RID field and force use of
one PCI domain per PCI function
notph [PCIE] If the PCIE_TPH kernel config parameter
is enabled, this kernel boot option can be used
to disable PCIe TLP Processing Hints support
system-wide.

pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power
Management.
@@ -6071,6 +6075,10 @@
non-zero "wait" parameter. See weight_single
and weight_many.

sdw_mclk_divider=[SDW]
Specify the MCLK divider for Intel SoundWire buses in
case the BIOS does not provide the clock rate properly.

skew_tick= [KNL,EARLY] Offset the periodic timer tick per cpu to mitigate
xtime_lock contention on larger systems, and/or RCU lock
contention on all systems with CONFIG_MAXSMP set.
@@ -6158,6 +6166,16 @@
For more information see Documentation/mm/slub.rst.
(slub_nomerge legacy name also accepted for now)

slab_strict_numa [MM]
Support memory policies on a per object level
in the slab allocator. The default is for memory
policies to be applied at the folio level when
a new folio is needed or a partial folio is
retrieved from the lists. Increases overhead
in the slab fastpaths but gains more accurate
NUMA kernel object placement which helps with slow
interconnects in NUMA systems.

slram= [HW,MTD]

smart2= [HW]
5 changes: 5 additions & 0 deletions Documentation/admin-guide/media/index.rst
@@ -20,6 +20,11 @@ Documentation/driver-api/media/index.rst
- for driver development information and Kernel APIs used by
media devices;

Documentation/process/debugging/media_specific_debugging_guide.rst

- for advice about essential tools and techniques to debug drivers on this
subsystem

.. toctree::
:caption: Table of Contents
:maxdepth: 2
10 changes: 10 additions & 0 deletions Documentation/admin-guide/sysctl/fs.rst
@@ -337,3 +337,13 @@ Each "watch" costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes
on a 64-bit one.
The current default value for ``max_user_watches`` is 4% of the
available low memory, divided by the "watch" cost in bytes.

5. /proc/sys/fs/fuse - Configuration options for FUSE filesystems
=====================================================================

This directory contains the following configuration options for FUSE
filesystems:

``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
setting/getting the maximum number of pages that can be used for servicing
requests in FUSE.
9 changes: 9 additions & 0 deletions Documentation/admin-guide/sysctl/kernel.rst
@@ -401,6 +401,15 @@ The upper bound on the number of tasks that are checked.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.


hung_task_detect_count
======================

Indicates the total number of tasks that have been detected as hung since
the system boot.

This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
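
A monitoring tool could poll this counter from userspace. The sketch below
assumes the file holds a single unsigned decimal number; `read_counter()` is a
hypothetical helper name, and the caller is expected to pass the path
/proc/sys/kernel/hung_task_detect_count.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one unsigned decimal value from a procfs
 * file such as /proc/sys/kernel/hung_task_detect_count.
 * Returns 0 on success and stores the value in *out, or -1 on failure.
 */
static int read_counter(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	if (!f)
		return -1;

	int ok = (fscanf(f, "%lu", out) == 1);

	fclose(f);
	return ok ? 0 : -1;
}
```

A failure return also covers kernels built without
``CONFIG_DETECT_HUNG_TASK``, where the file does not exist.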


hung_task_timeout_secs
======================

19 changes: 19 additions & 0 deletions Documentation/arch/riscv/hwprobe.rst
@@ -239,6 +239,9 @@ The following keys are defined:
ratified in commit 98918c844281 ("Merge pull request #1217 from
riscv/zawrs") of riscv-isa-manual.

* :c:macro:`RISCV_HWPROBE_EXT_SUPM`: The Supm extension is supported as
defined in version 1.0 of the RISC-V Pointer Masking extensions.

* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to
:c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was
mistakenly classified as a bitmask rather than a value.
@@ -274,3 +277,19 @@ The following keys are defined:
represent the highest userspace virtual address usable.

* :c:macro:`RISCV_HWPROBE_KEY_TIME_CSR_FREQ`: Frequency (in Hz) of `time CSR`.

* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF`: An enum value describing the
performance of misaligned vector accesses on the selected set of processors.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN`: The performance of misaligned
vector accesses is unknown.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW`: 32-bit misaligned accesses using vector
registers are slower than the equivalent quantity of byte accesses via vector registers.
Misaligned accesses may be supported directly in hardware, or trapped and emulated by software.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_FAST`: 32-bit misaligned accesses using vector
registers are faster than the equivalent quantity of byte accesses via vector registers.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED`: Misaligned vector accesses are
not supported at all and will generate a misaligned address fault.