Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy API Integration #499

Merged
merged 20 commits into from
Dec 20, 2024
Merged

Policy API Integration #499

merged 20 commits into from
Dec 20, 2024

Conversation

rlxdev
Copy link
Collaborator

@rlxdev rlxdev commented Nov 14, 2024

ScubaGoggles with Google's Policy API Implementation

This is the complete integration of the version of ScubaGoggles containing the
implementation of Googles's Policy API.

  • All Rego Tests Pass, including 185 new tests covering the baselines
    implemented by the Policy API.
  • Regal runs on all Rego code with no issues.
  • PYLint runs on all Python code with no issues.
  • Mitchel's "smoke" tests pass (locally, after modifying for credentials &
    scubagws tenant).
  • This version has been tested on Windows 10, with varying Python versions,
    and RedHat RHEL 8 linux.

Changes Introduced with this Version

Google Policy API

The Google Policy API is used for all baselines that can be assessed by the data
returned by the API. Google returns all settings available for the top-level
orgunit, and then only the settings that differ from the top-level orgunit for
sub-orgunits and groups.

The Policy API data is available to the rego code in the input data in the
"policies" section. The settings are grouped by orgunit or orgunit/group name.
There is no distinction between orgunits and groups.

The Policy API is queried using Google's AuthorizedSession class because
access to the API is not yet available using the Google API client, which is
used for the other APIs queried by ScubaGoggles (reports, directory, groups).
There have been no issues using this method to access the Policy API.

Because of the way Rego handles undefined variables, basically returning
undefined when a variable is not defined, the Python code fetching the Policy
API data must verify that all expected settings are present in the top-level
orgunit's data. It's expected that missing settings will be rare, but they
can't be ignored because missing settings will lead to incorrect results.
ScubaGoggles doesn't have to abort on any missing settings, so currently a
warning is issued for any missing settings.

As with the log events, the Rego code does the work to determine whether the
requirement has been met for each baseline covered by the Policy API. In
implementing the Policy API in Rego, the log event code has been kept along
with the new Policy API code for debugging/testing during the development of
the Policy API implementation. This code will be removed in a subsequent
release.

ScubaGoggles Package Distribution & Installation

This version of ScubaGoggles is able to be packaged using the standard Python
packaging software, using the setuptools back-end. Being able to install
ScubaGoggles and its dependencies as a package using pip makes the
installation much easier for users and provides a separation between the
distributed code and the user's data (credentials and reports).

Unfortunately, the current configuration in the repository does not conform to a
structure that can be packaged. This has led to subtle issues, such as circular
import errors when running ScubaGoggles, and the incorrect placement of
ScubaGoggles-related directories in the top-level site-packages directory in
the Python environment.

For this version, directories that are included in the package have been
moved below the scubagoggles directory:

  • baselines
  • docs
  • rego
  • sample-config-files
  • sample-report
  • Testing
  • utils

The current use of setup.py, with package configuration included in the file,
is deprecated by setuptools. This version now uses pyproject.toml for
package configuration.

The package is built using the scubagoggles/utils/build.sh script. It
produces a "wheel" file (.whl) that is installable using pip.

In addition to installing ScubaGoggles, users have to specify:

  • location where the reports will be created.
  • Google OAuth credentials file (credentials.json).
  • location of the OPA executable.

For this version, users will run scubagoggles setup to provide those
locations. Unless the user explicitly changes a location, the setup is a
one-time step, even after software updates.

What are the advantages of making ScubaGoggles an installable package:

  • Package installation is easier for users. They will not have to install
    dependent packages separately.
  • User's will not have to change their working directory to run ScubaGoggles.
  • The scubagoggles command will always work and there won't be a need to have
    them try Python scuba.py.
  • Upgrading ScubaGoggles will be easy and can be done in an existing (virtual)
    environment. User's will not have to relocate files, such as the
    credentials.json or OPA executable.

Versioning

There are two forms of the version number used for ScubaGoggles: the software
version number (e.g., "0.3.0"), and what I call the "version suffix" that
proliferates throughout the files referencing baseline identifiers. There are
multiple places in the code that define the version number. Whenever there is a
version upgrade, the change touches nearly every file in the repository.

For this version, the "official" version number is defined in only one place
(scubagoggles/__init__.py). All other references to the version number are
derived from this value. The version number is managed using the scubagoggles version command.

The Python code must use the Version class when referencing the version
number. For the Rego code, the version suffix used in the baseline identifiers
is accessible in the input written by the Python code.

Baseline identifiers in the Rego code are defined once in a variable for each
baseline, and all references to the identifier are to use this variable
(including the test Rego code). For example:

CommonControlsId3_2 := utils.PolicyIdWithSuffix("GWS.COMMONCONTROLS.3.2")

The utils.PolicyIdWithSuffix() function handles the addition of the version
suffix at the end of the identifier. There must be no reference to the
version suffix in any comments in the Rego code (in fact, there should be
no occurrences of the hard-coded version suffix in any Rego file). It
is also not necessary to reference a specific version suffix in the user
documentation.

The "drift rules" and baseline Markdown files will continue to have the
version suffix included in the baseline version. The version suffixes in
these files will be modified as part of the version upgrade process.

When a version upgrade occurs, these are the only files that are modified
in the process:

  • README.md
  • drift-rules/GWS Drift Monitoring Rules - Calendar.csv
  • drift-rules/GWS Drift Monitoring Rules - Chat.csv
  • drift-rules/GWS Drift Monitoring Rules - Classroom.csv
  • drift-rules/GWS Drift Monitoring Rules - Common Controls as of 11-14-23.csv
  • drift-rules/GWS Drift Monitoring Rules - Drive and Docs.csv
  • drift-rules/GWS Drift Monitoring Rules - Gmail.csv
  • drift-rules/GWS Drift Monitoring Rules - Groups.csv
  • drift-rules/GWS Drift Monitoring Rules - Meet.csv
  • drift-rules/GWS Drift Monitoring Rules - Sites.csv
  • scubagoggles/__init__.py
  • scubagoggles/baselines/calendar.md
  • scubagoggles/baselines/chat.md
  • scubagoggles/baselines/classroom.md
  • scubagoggles/baselines/commoncontrols.md
  • scubagoggles/baselines/drive.md
  • scubagoggles/baselines/gmail.md
  • scubagoggles/baselines/groups.md
  • scubagoggles/baselines/meet.md
  • scubagoggles/baselines/sites.md

Pandas Dependency Dropped

The pandas module was included as a requirement because it was used to create
an HTML table for the reports. Pandas is a "beast" of a module, and also
requires another "beast" in numpy. The table creation is easy in this case,
and it allows us to drop a module that really isn't being used for its
intended purpose (data analysis). Having pandas dropped as a requirement also
lessens the potential issues users will have when installing ScubaGoggles.

No "sudo" for Running OPA

There was some issue with OPA in the past that apparently required the use of
"sudo" (privilege escalation) to run it on macOS and perhaps linux. This
issue seems to be fixed, so the ability to use "sudo" for running OPA has
been removed. OPA is basically a language interpreter like Python - they are
user mode programs that should never require privilege escalation. If this
happens again, the onus is on OPA to fix the issue, rather than trying to
work-around it.

Python Logging Facility Enabled

The Python logging facility has been enabled, so logging may be inserted into
the code where appropriate using log.{debug, info, warning, error, critical}()
function calls. The current default level that is displayed to the user is
warning (meaning that warning level messages and above will be displayed and
not lower level (debug, info)), but this can be altered on the command line
using the --log option.

Downloading OPA

With users installing ScubaGoggles as a Python package, they can't easily
run the download_opa.py script. To continue to provide users the ability
to download OPA, the scubagoggles getopa command performs the same task, but
also ensures the executable has the correct permissions to run on macOS and
linux.

Ability to Purge Output Directories

This might be more useful for developers running ScubaGoggles often for testing,
but there's now the ability via scubagoggles purge to delete older versions
of the reports output directories.

Closes #498
Closes #449
Closes #360
Closes #430
Closes #412
Closes #305
Closes #465
Closes #83

Follow-up issues

🧪 Testing

These changes have been continuously tested during the implementation of the Policy API settings released by Google.
As mentioned in the description, all automated tests (PYLint, regal, Rego tests) pass.

✅ Pre-approval checklist

  • This PR has an informative and human-readable title.
  • If applicable, All future TODOs are captured in issues, which are referenced in the PR description.
  • The relevant issues PR resolves are linked preferably via closing keywords.
  • All relevant type-of-change labels have been added.
  • I have read and agree to the CONTRIBUTING.md document.
  • These code changes follow cisagov code standards.
  • All relevant repo and/or project documentation has been updated to reflect the changes in this PR.
  • Tests have been added and/or modified to cover the changes in this PR.
  • All new and existing tests pass.

✅ Pre-merge Checklist

  • This PR has been smoke tested to ensure main is in a functional state when this PR is merged.
  • Squash all commits into one PR level commit using the Squash and merge button.

✅ Post-merge Checklist

  • Delete the branch to clean up.
  • Close issues resolved by this PR if the closing keywords did not activate.

@rlxdev rlxdev linked an issue Nov 14, 2024 that may be closed by this pull request
@adhilto adhilto requested review from adhilto, buidav and snarve November 14, 2024 22:10
@adhilto
Copy link
Collaborator

adhilto commented Nov 18, 2024

Tagging @mdueltgen @jkaufman-mitre for awareness, this PR contains baseline changes. No substantive changes to the policies, just changes along the line of changing the policies section headers to "Policies" for consistency.

Copy link
Collaborator

@adhilto adhilto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've barely scratched the surface of the testing that needs to be done, but here's a first round of feedback.

Two types of feedback:

  • Change now
  • Change soon but after this PR.

I've left individual comments on changes that should be made before this PR is merged.
I'm listing the changes that can wait here in this comment. In the spirit of keeping the scope of this PR from expanding any further, these changes can wait but should be tracked as issues before this PR is merged.

Simplify the rego code
Controls that have policy API support should exclusively use the policy API. No need to maintain the old reports code. Would greatly simplify the code. (I know you note this in the description but we still need to track it as an issue).

Rework the baseline version logic.
This code assumes that all controls will have the same version number. This is not a safe assumption. Pre-version 1.0, this will always be the case. But after the 1.0 release, if a control needs to be updated, only the version number of that control will be updated. For reference, see the Exchange Online baseline. Note that not all controls have the same version number (e.g., MS.EXO.1.1v1 and MS.EXO.2.2v2).

Update the "prerequisites" field of the rego tests
As noted in a comment in the reporter, "If Prerequisites is not defined, assume the test just depends on the reports API." This is no longer a safe assumption.

Update the report details message for controls relying on log events
Clearly indicate the controls that can only be checked via the log events. These checks give imperfect visibility which should be highlighted to the users. For example, if the log events for the top level OU is show compliance, ScubaGoggles will currently report "Requirement met for all OUs and groups," but in reality we cannot know that with full confidence.

scubagoggles/auth.py Outdated Show resolved Hide resolved
scubagoggles/docs/prerequisites/Prerequisites.md Outdated Show resolved Hide resolved
scubagoggles/docs/troubleshooting/Troubleshooting.md Outdated Show resolved Hide resolved
scubagoggles/docs/usage/Config.md Outdated Show resolved Hide resolved
scubagoggles/docs/usage/Config.md Outdated Show resolved Hide resolved
scubagoggles/user_setup.py Outdated Show resolved Hide resolved
scubagoggles/scuba_argument_parser.py Show resolved Hide resolved
scubagoggles/auth.py Show resolved Hide resolved
scubagoggles/docs/prerequisites/Prerequisites.md Outdated Show resolved Hide resolved
scubagoggles/main.py Outdated Show resolved Hide resolved
@rlxdev
Copy link
Collaborator Author

rlxdev commented Nov 19, 2024

Simplify the rego code Controls that have policy API support should exclusively use the policy API. No need to maintain the old reports code. Would greatly simplify the code. (I know you note this in the description but we still need to track it as an issue).

This has been the plan all along. I approached this conservatively and preserved the log event code to keep the testing working.
My thought is to remove this in the 1.1 version, and it can be done in a methodical way throughout.

Rework the baseline version logic. This code assumes that all controls will have the same version number. This is not a safe assumption. Pre-version 1.0, this will always be the case. But after the 1.0 release, if a control needs to be updated, only the version number of that control will be updated. For reference, see the Exchange Online baseline. Note that not all controls have the same version number (e.g., MS.EXO.1.1v1 and MS.EXO.2.2v2).

The way I've implemented the "version suffix" changes doesn't preclude the possibility of having different suffixes for certain baseline policies. When we do end up having different suffixes on the baselines, it will not require much additional coding. It must be implemented such that there is only one source of the baseline version, most likely the markdown file where the baseline is defined. The correct suffix for the policies can be obtained from the source and driven across the system through the code instead of hard-coding the suffix in multiple places.

@adhilto adhilto mentioned this pull request Nov 25, 2024
9 tasks
README.md Outdated Show resolved Hide resolved
docs/installation/OPA.md Outdated Show resolved Hide resolved
docs/installation/OPA.md Outdated Show resolved Hide resolved
docs/prerequisites/Prerequisites.md Outdated Show resolved Hide resolved
scubagoggles/scuba_argument_parser.py Show resolved Hide resolved
Copy link
Collaborator

@snarve snarve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Testing section, it would help to have instructions on how to update any dependencies (automatically by running specific scripts/commands) or download any required packages.
Since this PR covers major updates, assuming running the tool now requires a different command (or atleast different flags), listing the instructions out in the Testing section would help.

image

scubagoggles/rego/Drive.rego Outdated Show resolved Hide resolved
scubagoggles/rego/Groups.rego Outdated Show resolved Hide resolved
scubagoggles/rego/Groups.rego Outdated Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Drive.rego Show resolved Hide resolved
scubagoggles/rego/Commoncontrols.rego Outdated Show resolved Hide resolved
scubagoggles/rego/Commoncontrols.rego Outdated Show resolved Hide resolved
scubagoggles/rego/Commoncontrols.rego Show resolved Hide resolved
… (see 11/6 change)

commoncontrols 4.1: use "friendly" value in non-compliance message
docs/installation/DownloadAndInstall.md Show resolved Hide resolved
docs/installation/DownloadAndInstall.md Show resolved Hide resolved
docs/installation/DownloadAndInstall.md Show resolved Hide resolved
docs/installation/DownloadAndInstall.md Show resolved Hide resolved
docs/installation/OPA.md Outdated Show resolved Hide resolved
docs/installation/DownloadAndInstall.md Show resolved Hide resolved
scuba_argument_parser: fix bug in converting argument value data types
drive rego: fix 6.1 for subOUs/groups
policy_api: add ability to dump Google's raw policy response
installation & OPA instructions: add more detail about setup downloading OPA
.github/workflows/run_release.yml Show resolved Hide resolved
scubagoggles/main.py Show resolved Hide resolved
scubagoggles/config.py Show resolved Hide resolved
scubagoggles/run_rego.py Show resolved Hide resolved
scubagoggles/reporter/md_parser.py Show resolved Hide resolved
@buidav buidav requested review from buidav, snarve and adhilto December 19, 2024 22:05
Copy link
Collaborator

@adhilto adhilto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job addressing my feedback. I still have a few concerns, but I don't see them as big enough to block the policy API, so I propose we open follow-up issues for those and merge this as it is now.

My remaining concerns:

  • The Windows version of the smoke test is failing. It passes when run locally and the Mac version passes, so a recommend disabling the Windows workflow for now and revisiting it later.
  • The installation process. I am no longer concerned by the so-called "conflicting" config files I brought up earlier, but still share some of David and Sheetal's concerns. Also I definitely see your changes here as a step in the right direction. I agree with David that there should be a .scubagoggles folder in the home directory, where the .scubagoggles initial config file lives. I also recommend that be the default location for the OPA executable. I also agree with Sheetal that prompts presented by the setup process are confusing and that some of the documentation needs tweaking.
  • The issue with Takeout for services without individual admin control described in my other comment.

Again, I'm ok with these final things being addressed later (as long as we open issues). Thanks for addressing all my other feedback!

Copy link
Collaborator

@snarve snarve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending comments addressed in issue: #531

Copy link
Collaborator

@snarve snarve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job Roy!

Any pending comments are added in issue: #531

Most of the comments are about the setup process.

@adhilto adhilto merged commit 2219d49 into main Dec 20, 2024
72 of 97 checks passed
@adhilto adhilto deleted the 498-policy-api-integration branch December 20, 2024 00:23
@buidav buidav mentioned this pull request Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment