-
@awhite456 I've been thinking about this a little bit in a different context recently. You make a good point that there should ideally be an incentive to provide complete information. I think the problem, stated another way, is that at the moment it's not possible to tell whether a process was started by the kernel or whether the telemetry provider decided not to include the information, and that information is important.

Option 3 solves the nested object issue. It's limited in usefulness, though, since PIDs do not persist across process launches (for the vast majority of processes), so one needs to correlate to determine which parent process is the right one rather than having it unambiguously specified. With that comes a reasonable possibility of the information being inaccurate, which is far from ideal for forensics.

Option 2 could work, but a parent process is just another process, so much of the time it will have a parent process of its own, and that's important information. That being the case, you'll want to know what its parent is, and you're in the same boat as option 4.

That leaves option 1, which, as you point out, means one can simply provide not-very-much information.

Thinking out loud a bit, these are probably the ideal requirements with regard to tracking parents of a process:
If we translate those into schema conditions to meet all of those
This would accomplish the following:
There are some in-progress thoughts in there, so let me know what you think. I suppose there's some other option where it's clear that the telemetry provider has chosen not to provide the parent process info.
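To illustrate the PID-reuse concern with option 3, here is a minimal Python sketch of what correlating a bare parent PID might involve. Everything in it is hypothetical: the `ProcessStart` record, the `start_time` field, and the sample paths are made up for illustration and are not taken from the schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProcessStart:
    """Hypothetical record of a process launch seen in telemetry."""
    pid: int
    start_time: int  # seconds since epoch (illustrative only)
    path: str


def candidates_for_parent(parent_pid: int, starts: List[ProcessStart]) -> List[ProcessStart]:
    """Every recorded launch that used the given PID."""
    return [s for s in starts if s.pid == parent_pid]


def resolve_parent(parent_pid: int, child_start_time: int,
                   starts: List[ProcessStart]) -> Optional[ProcessStart]:
    """Heuristic guess: the latest launch of that PID at or before the child's start."""
    earlier = [s for s in candidates_for_parent(parent_pid, starts)
               if s.start_time <= child_start_time]
    return max(earlier, key=lambda s: s.start_time, default=None)


# The same PID was used by two different processes over time.
starts = [
    ProcessStart(pid=500, start_time=1000, path="/usr/bin/bash"),
    ProcessStart(pid=500, start_time=2000, path="/usr/bin/python3"),
]

print(len(candidates_for_parent(500, starts)))  # two candidates for one parent PID
print(resolve_parent(500, 2005, starts))        # guess for a child started at t=2005
```

The guess is only as good as the recorded launches and their timestamps. If a launch event was dropped or clocks are skewed, the heuristic can pick the wrong parent, which is the inaccuracy risk described above.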
-
Process required fields were reduced to allow this to work.
-
Problem - A less detailed `process` object could be more compliant

The current version of the schema for the `process` object has `parent_process` as a recommended field, which points to another instance of the `process` object. This means that any constraints applied to fields in the `process` object are also applied to the `parent_process`.

For telemetry producers that provide less detail about the `parent_process`, this causes a situation where they may be more compliant if they do not provide a `parent_process` at all, rather than provide an incomplete `parent_process` object. Take as an example an event that contains only a `pid` for the `parent_process` and no other parent information: a more compliant transformation would be to skip `parent_process`, which is only recommended, rather than add `parent_process` and have missing required fields for `cmd_line` and `path`.

Telemetry producers should be incentivized to populate as many fields as possible, not fewer, indicating that this aspect of the schema needs to change. For reference, an example of the minimum compliant process with `parent_process` populated is shown below:

An example of a provider that only provides a `pid` for the parent, and subsequently would not populate `parent_process` to remain compliant, is shown below:

Current state

The current state of required fields and `parent_process` for the `process` object are as follows:

Option 1 - Make `cmd_line` and `path` recommended + make `parent_process` required

To change the requirements on fields for the `parent_process`, we could change the requirements on the root `process` object.

While this is the simplest solution to maintain, it also significantly lowers the bar for what constitutes a process object, as the `cmd_line` and `path` fields would no longer be required. This may result in compliant providers that do not provide enough information to be useful.

To handle the example case, where only the `pid` of the parent is provided, only the `pid` field would be able to be required. The minimum compliant process object would be as follows:

Option 2 - Create new `parent_process` object + make `parent_process` required

To change the requirements on fields for the `parent_process`, we could create a new object that clones the `process` object and has lower requirements.

This would allow the requirements of any field in the `parent_process` object to be modified independently of the `process` object, but would add additional overhead and complexity, as two process-like objects would now need to be maintained.

To handle the example case, where only the `pid` of the parent is provided, only the `pid` field would be required in the parent object. The minimum compliant process object would be as follows:

Option 3 - Integrate `parent_process` fields into `process` and make `parent_pid` required

To change the requirements on fields for the `parent_process`, we could move the fields from this object into the `process` object with the prefix `parent_`.

This would significantly increase the number of fields in the `process` object, as each would need to be duplicated with the prefix `parent_`, and it would remove the ability to nest multiple `parent_process` objects. It would, however, result in only one object to maintain and the ability to set individual requirements on any parent field.

To handle the example case, where only the `pid` of the parent is provided, only the `parent_pid` field would be required out of the `parent_` fields. The minimum compliant process object would be as follows:

Option 4 - Make `parent_process` required

To prevent providers being compliant by specifying less data, we could make `parent_process` required to force providers to provide as much data as they are able to.

This would not significantly change the format of the `process` object, but would result in some non-compliant providers specifying fields with blank values.

To handle the example case, where only the `pid` of the parent is provided, only the `pid` field would be populated in the parent and the other fields would be blank. The non-compliant process object from this provider would be as follows:
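To make the incentive problem from the "Problem" section concrete, here is a minimal Python sketch. It is not the schema's actual validation logic: the `missing_required` helper, the two requirement sets, and the sample values are hypothetical, and it assumes `pid` is required alongside `cmd_line` and `path`. The same rules are applied to the nested `parent_process`, as described above.

```python
# Assumption: today's required fields on `process` (pid is assumed, not confirmed).
CURRENT_REQUIRED = {"pid", "cmd_line", "path"}
# Assumption: a relaxed set where cmd_line and path become recommended.
RELAXED_REQUIRED = {"pid"}


def missing_required(process, required, prefix="process"):
    """List dotted paths of required fields that are absent.

    A nested `parent_process` is checked against the same requirement set,
    because it is just another instance of the process object.
    """
    missing = [f"{prefix}.{name}" for name in sorted(required) if name not in process]
    parent = process.get("parent_process")
    if parent is not None:
        missing += missing_required(parent, required, f"{prefix}.parent_process")
    return missing


# Illustrative values only.
root = {"pid": 4242, "cmd_line": "/usr/bin/example --flag", "path": "/usr/bin/example"}
# A provider that knows only the parent's pid and chooses to report it.
with_parent = {**root, "parent_process": {"pid": 1}}

print(missing_required(root, CURRENT_REQUIRED))         # parent omitted
print(missing_required(with_parent, CURRENT_REQUIRED))  # pid-only parent included
print(missing_required(with_parent, RELAXED_REQUIRED))  # same event, relaxed rules
```

Under the assumed current rules, omitting the parent yields no missing fields, while reporting the pid-only parent yields two, so the provider that shares more is the less compliant one. With the relaxed set the pid-only parent passes. The sketch does not model making `parent_process` itself required, which would also need a rule for processes that have no parent.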