generated from finos/standards-project-blueprint
-
Notifications
You must be signed in to change notification settings - Fork 42
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Detailed MLDE and Data warehouse features (#608)
Co-authored-by: Damien Burks <[email protected]>
- Loading branch information
1 parent
2be7b51
commit 5d63205
Showing
5 changed files
with
155 additions
and
54 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
title: Machine Learning Development Environment | ||
id: CCC.MLDE | ||
description: | | ||
Machine Learning Development Environment refers to the suite of tools, | ||
infrastructure, and processes that facilitate the development, testing, | ||
deployment, and maintenance of machine learning models. | ||
release_details: | ||
- version: "2025.06" | ||
assurance_level: None | ||
threat_model_url: None | ||
threat_model_author: None | ||
red_team: None | ||
red_team_exercise_url: None | ||
release_manager: | ||
name: Damien Burks | ||
github_id: damienjburks | ||
company: Citi | ||
summary: None | ||
change_log: [] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,61 +1,111 @@ | ||
common_features: | ||
- CCC.F01 # Encryption in Transit Enabled by Default | ||
- CCC.F02 # Encryption at Rest Enabled by Default | ||
- CCC.F03 # Access/Activity Logs | ||
- CCC.F04 # Transaction Rate Limits | ||
- CCC.F06 # Identity-Based Access Control | ||
- CCC.F07 # Event Notifications | ||
- CCC.F08 # Multi-zone Deployment | ||
- CCC.F09 # Monitoring | ||
- CCC.F11 # Backup | ||
- CCC.F12 # Restore | ||
- CCC.F14 # API Access | ||
- CCC.F21 # Replication | ||
- CCC.F01 # Encryption in Transit Enabled by Default | ||
- CCC.F02 # Encryption at Rest Enabled by Default | ||
- CCC.F03 # Access/Activity Logs | ||
- CCC.F04 # Transaction Rate Limits | ||
- CCC.F06 # Identity-Based Access Control | ||
- CCC.F07 # Event Notifications | ||
- CCC.F08 # Multi-zone Deployment | ||
- CCC.F11 # Backup | ||
- CCC.F12 # Restore | ||
- CCC.F14 # API Access | ||
- CCC.F19 # On-demand Scaling | ||
- CCC.F21 # Replication | ||
|
||
features: | ||
- id: CCC.DataWar.F01 # Dataset Management | ||
title: Dataset Management | ||
- id: CCC.DataWar.F01 | ||
title: Centralized Data Repository | ||
description: | | ||
Provides the ability to create, manage, and organize | ||
datasets within the Cloud Data Warehouse for structured | ||
data storage. | ||
# references: # Temporary for validation purposes | ||
# AWS: https://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum.html | ||
# GCP: https://cloud.google.com/bigquery/docs/datasets | ||
# Azure: https://learn.microsoft.com/en-us/azure/synapse-analytics/synapse-link/connect-synapse-link-sql-database | ||
Acts as a centralized repository where data from various | ||
sources is consolidated, making it easier to manage and | ||
analyze large volumes of data. | ||
- id: CCC.DataWar.F02 | ||
title: Optimized Query Performance | ||
description: | | ||
Handles complex queries on large datasets efficiently using | ||
techniques such as indexing and partitioning. | ||
- id: CCC.DataWar.F02 # Managed Views | ||
title: Managed Views | ||
- id: CCC.DataWar.F03 | ||
title: Scalability | ||
description: | | ||
Supports creating views that allow sharing query results | ||
with specific users or groups without providing access to | ||
underlying data. | ||
# references: | ||
# AWS: https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_VIEW.html | ||
# GCP: https://cloud.google.com/bigquery/docs/authorized-views | ||
# Azure: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/create-use-views | ||
Ability to scale with growing data volumes and handle multiple | ||
queries simultaneously without compromising the performance. | ||
- id: CCC.DataWar.F04 | ||
title: Column Storage | ||
description: | | ||
Stores data in columns rather than rows for efficient | ||
data retrieval. | ||
- id: CCC.DataWar.F03 # Column-Level Security | ||
title: Column-Level Security | ||
- id: CCC.DataWar.F05 | ||
title: SQL Based Querying | ||
description: | | ||
Allows setting access policies at the column level to restrict access to sensitive data fields within tables. | ||
# references: | ||
# AWS: https://aws.amazon.com/blogs/big-data/achieve-finer-grained-data- | ||
# security-with-column-level-access-control-in-amazon-redshift/ | ||
# GCP: https://cloud.google.com/bigquery/docs/column-level-security | ||
# Azure: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/column-level-security | ||
Supports SQL based querying on the data sets with specific | ||
enhancements and optimization for data warehousing. | ||
- id: CCC.DataWar.F06 | ||
title: Data Types | ||
description: | | ||
Ability to store processed structured and semi-structured | ||
data optimized for querying and analysis. | ||
- id: CCC.DataWar.F07 | ||
title: Massively Parallel Processing (MPP) | ||
description: | | ||
Distributes queries across multiple nodes for increased performance. | ||
- id: CCC.DataWar.F08 | ||
title: Materialized Views | ||
description: | | ||
Ability to store results od a query into physical tables for faster | ||
data retrieval and improved query performance for complex queries. | ||
- id: CCC.DataWar.F09 | ||
title: Column-Level Security | ||
description: | | ||
Allows setting access policies at the column level to | ||
restrict access to sensitive data fields within tables. | ||
- id: CCC.DataWar.F05 # Row-Level Security | ||
- id: CCC.DataWar.F10 | ||
title: Row-Level Security | ||
description: | | ||
Enables setting access policies at the row level to | ||
control access to subsets of data within a table based | ||
on user roles. | ||
# references: | ||
# AWS: https://docs.aws.amazon.com/redshift/latest/dg/t_rls.html | ||
# GCP: https://cloud.google.com/bigquery/docs/row-level-security-intro | ||
# Azure: https://techcommunity.microsoft.com/t5/azure-synapse-analytics- | ||
# blog/how-to-implement-row-level-security-in-serverless-sql-pools/ba-p/2354759 | ||
- id: CCC.DataWar.F11 | ||
title: Integration with Data Sources | ||
description: | | ||
Seamless integration with various data sources such as object | ||
storage, relational and non-relational databases, data streams | ||
and data lakes. | ||
- id: CCC.DataWar.F12 | ||
title: Integration with ETL | ||
description: | | ||
Integration with services that perform extract, transform and | ||
load data from various sources into the data warehouse. Unstructured | ||
data in transformed to structured or semi-structured data before | ||
ingestion to the data warehouse using ETL tools. | ||
- id: CCC.DataWar.F13 | ||
title: Integration with ML | ||
description: | | ||
Build-in integration with machine learning services for enhanced | ||
processing of large volumes of complex data with ML models for | ||
predictive analytics, automated insights and more. ML can be used | ||
in data cleansing and transformation for improved data quality as well. | ||
- id: CCC.DataWar.F14 | ||
title: Real-time Monitoring | ||
description: | | ||
Ability to continuously track and analyze data as it is ingested, | ||
processed and stored to ensure data quality, operational efficiency, | ||
scalability and security. | ||
- id: CCC.DataWar.F15 | ||
title: Cross-Region Replication | ||
description: | | ||
Ability to replicate data to multiple regions for high availability, | ||
disaster recovery and low-latency access. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
title: Data Warehouse | ||
id: CCC.DataWar | ||
description: | | ||
A data warehouse is a centralized repository designed to | ||
store, manage, and analyze large volumes of data from | ||
various sources. | ||
release_details: | ||
- version: "2025.06" | ||
assurance_level: None | ||
threat_model_url: None | ||
threat_model_author: None | ||
red_team: None | ||
red_team_exercise_url: None | ||
release_manager: | ||
name: Damien Burks | ||
github_id: damienjburks | ||
company: Citi | ||
summary: None | ||
change_log: [] |