kernel.sys.v5.16

-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYYvEbgAKCRCRxhvAZXjc
 og17AQDj+gsxk2lT4GsRo+WrI9qegGSvYHaxbOoqqSL6rHrrsQD+IU92dwVfuUXE
 oP+De6/TBmsdygnlECxITp8p4ByhGAM=
 =wi2X
 -----END PGP SIGNATURE-----

Merge tag 'kernel.sys.v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull prctl updates from Christian Brauner:
 "This contains the missing prctl uapi pieces for PR_SCHED_CORE.

  In order to activate core scheduling the caller is expected to specify
  the scope of the new core scheduling domain.

  For example, passing 2 in the 4th argument of

     prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, <pid>,  2, 0);

  would indicate that the new core scheduling domain encompasses all
  tasks in the process group of <pid>. Specifying 0 would only create a
  core scheduling domain for the thread identified by <pid> and 2 would
  encompass the whole thread-group of <pid>.

  Note, the values 0, 1, and 2 correspond to PIDTYPE_PID, PIDTYPE_TGID,
  and PIDTYPE_PGID. A first version tried to expose those values
  directly to which I objected because:

   - PIDTYPE_* is an enum that is kernel internal which we should not
     expose to userspace directly.

   - PIDTYPE_* indicates what a given struct pid is used for it doesn't
     express a scope.

  But what the 4th argument of PR_SCHED_CORE prctl() expresses is the
  scope of the operation, i.e. the scope of the core scheduling domain
  at creation time. So Eugene's patch now simply introduces three new
  defines PR_SCHED_CORE_SCOPE_THREAD, PR_SCHED_CORE_SCOPE_THREAD_GROUP,
  and PR_SCHED_CORE_SCOPE_PROCESS_GROUP. They simply express what
  happens.

  This has been on the mailing list for quite a while with all relevant
  scheduler folks Cced. I announced multiple times that I'd pick this up
  if I don't see or her anyone else doing it. None of this touches
  proper scheduler code but only concerns uapi so I think this is fine.

  With core scheduling being quite common now for vm managers (e.g.
  moving individual vcpu threads into their own core scheduling domain)
  and container managers (e.g. moving the init process into its own core
  scheduling domain and letting all created children inherit it) having
  to rely on raw numbers passed as the 4th argument in prctl() is a bit
  annoying and everyone is starting to come up with their own defines"

* tag 'kernel.sys.v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument
This commit is contained in:
Linus Torvalds 2021-11-10 16:10:47 -08:00
commit a41b74451b
3 changed files with 10 additions and 2 deletions

View file

@ -61,8 +61,9 @@ arg3:
``pid`` of the task for which the operation applies.
arg4:
``pid_type`` for which the operation applies. It is of type ``enum pid_type``.
For example, if arg4 is ``PIDTYPE_TGID``, then the operation of this command
``pid_type`` for which the operation applies. It is one of
``PR_SCHED_CORE_SCOPE_``-prefixed macro constants. For example, if arg4
is ``PR_SCHED_CORE_SCOPE_THREAD_GROUP``, then the operation of this command
will be performed for all tasks in the task group of ``pid``.
arg5:

View file

@ -268,5 +268,8 @@ struct prctl_mm_map {
# define PR_SCHED_CORE_SHARE_TO 2 /* push core_sched cookie to pid */
# define PR_SCHED_CORE_SHARE_FROM 3 /* pull core_sched cookie to pid */
# define PR_SCHED_CORE_MAX 4
# define PR_SCHED_CORE_SCOPE_THREAD 0
# define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1
# define PR_SCHED_CORE_SCOPE_PROCESS_GROUP 2
#endif /* _LINUX_PRCTL_H */

View file

@ -135,6 +135,10 @@ int sched_core_share_pid(unsigned int cmd, pid_t pid, enum pid_type type,
if (!static_branch_likely(&sched_smt_present))
return -ENODEV;
BUILD_BUG_ON(PR_SCHED_CORE_SCOPE_THREAD != PIDTYPE_PID);
BUILD_BUG_ON(PR_SCHED_CORE_SCOPE_THREAD_GROUP != PIDTYPE_TGID);
BUILD_BUG_ON(PR_SCHED_CORE_SCOPE_PROCESS_GROUP != PIDTYPE_PGID);
if (type > PIDTYPE_PGID || cmd >= PR_SCHED_CORE_MAX || pid < 0 ||
(cmd != PR_SCHED_CORE_GET && uaddr))
return -EINVAL;