Merge pull request #1066 from great-expectations/0.9.0
0.9.0
jcampbell authored Feb 18, 2020
2 parents 13f8884 + 0356cec commit 2bb5fcc
Showing 333 changed files with 25,853 additions and 20,006 deletions.
4 changes: 4 additions & 0 deletions .isort.cfg
@@ -0,0 +1,4 @@
[settings]
multi_line_output=3
force_grid_wrap=0
include_trailing_comma=1
24 changes: 13 additions & 11 deletions .travis.yml
@@ -3,21 +3,20 @@ branches:
- master
- develop
- /^\d.*/
- aylr.*
# dist: xenial
language: python
cache: pip
os:
- linux
- linux
env:
global:
secure: kEzD0ak7sXZspOv4014PRC5+DWQirM1NsQyTfwuZt7u0flTCtYg80Sw4lOuYZmtdztXrY/XresnK6HpCUkFtwwVuODDnrOnj1nO4l4Sb9awoxZn6zo3Q5TyC/sflDrakPsxwKEvxIH1cvzWqRwzmGXc5LEqn5SJyHR8RBhx6zUSgt8KIXLkfLTu0naj4dJH0Hmu4pqjRTmekwfoFHnYJ7aWopZChvxc6Mo2FmFFhl7Zt70EKh6RRB+aUG5xwvwf5iE+YX9YUQfqnA+2ys3O3m4wWGTxwJcdt8ZoexlYjAnSrOePdVO7EvosaBQYEyyU3xS28DzKPQXGoyJLCLNuAmzpxq4yRzPCJJF4Swl0yqY4+ab2OJrNP4A7MuMrQUDdNebUA4BY6PoTM82zfVkDCxf/fDqYPgMOAlNJb+R3jUffRJVlVq/M4sCr3sbGrnzELeTAf63jNIyH0U5LNQzzQUAT/Kyiimey17YqC9RhOUTNrojTkyXJ+ckc7z24yEfT1hUo6DAMM2b0ko/iB1MT3+ujT6qxamHe5imeX+4W1PpZsqkaRgo/dvhIsYoVPFeoBkY/N7PLgIwXhBuuFCH61p2Kx+z8hs34lVZ5GYVHDjeoDEEfm3SQrOuM2uASL9rGWj5yXiUs0OtTjmmKgEBZtNQ9JWHPY0AAdnfgS/GhWm54=
matrix:
include:
- dist: trusty
python: 2.7
env: PANDAS=0.22.0
# - dist: trusty
# python: 2.7
# env: PANDAS=0.23.4
# - dist: trusty
# python: 2.7
# env: PANDAS=0.24.2
- dist: trusty
python: 2.7
env: PANDAS=latest
@@ -40,20 +39,23 @@ matrix:
# python: 3.7
# env: PANDAS=latest
addons:
-  postgresql: "9.4"
+  postgresql: '9.4'
services:
- postgresql
- mysql
- postgresql
- mysql
install:
# - ./travis-java.sh
- sudo apt-get install -y pandoc
- pip install --only-binary=numpy,scipy numpy scipy
- if [ "$PANDAS" = "latest" ]; then pip install pandas; else pip install pandas==$PANDAS; fi
- pip install -r requirements-dev.txt
- pip install pytest-slack # Send a webhook when on travis
before_script:
- psql -c 'create database test_ci;' -U postgres
- mysql -u root --password="" -e 'create database test_ci;'
script:
-  - pytest --cov=great_expectations tests/
+  - pytest --cov=great_expectations --slack_hook=$SLACK_WEBHOOK --slack_report_link=$TRAVIS_BUILD_WEB_URL --slack_channel=notifications-great_expectations tests/
after_success:
- coveralls
- bash <(curl -s https://codecov.io/bash)

120 changes: 95 additions & 25 deletions docs/changelog/changelog.rst
@@ -1,5 +1,73 @@
.. _changelog:

0.9.0
-----------------

Version 0.9.0 is a major update to Great Expectations! The DataContext has continued to evolve into a powerful tool
for ensuring that Expectation Suites can properly represent the way users think about their data. Upgrading will
make it much easier to store and share expectation suites, and to build data docs that support your whole team.
You'll also get awesome new features, including improvements to the look of data docs and the ability to choose and
store metrics for building flexible data quality dashboards.

The changes for version 0.9.0 fall into several broad areas:

1. Onboarding

Release 0.9.0 of Great Expectations makes it much easier to get started with the project. The `init` flow has grown
to support a much wider array of use cases and uses more natural language, rather than introducing Great
Expectations concepts too early. You can more easily configure different backends and datasources, take advantage
of guided walkthroughs to find and profile data, and share project configurations with colleagues.

If you have already completed the `init` flow using a previous version of Great Expectations, you do not need to
rerun the command. However, **there are some small changes to your configuration that will be required**. See
:ref:`migrating_versions` for details.

2. CLI Command Improvements

With this release we have introduced a consistent naming pattern for accessing subcommands based on the noun (a
Great Expectations object like `suite` or `docs`) and verb (an action like `edit` or `new`). The new user experience
will allow us to more naturally organize access to CLI tools as new functionality is added.

3. Expectation Suite Naming and Namespace Changes

Defining shared expectation suites and validating data from different sources is much easier in this release. The
DataContext, which manages storage and configuration of expectations, validations, profiling, and data docs, no
longer requires that expectation suites live in a datasource-specific “namespace.” Instead, you should name suites
with the logical name corresponding to your data, making it easy to share them or validate against different data
sources. For example, the expectation suite "npi" for National Provider Identifier data can now be shared across
teams who access the same logical data in local systems using Pandas, on a distributed Spark cluster, or via a
relational database.

Batch Kwargs, or instructions for a datasource to build a batch of data, are similarly freed from a required
namespace, and you can more easily integrate Great Expectations into workflows where you do not need to use a
BatchKwargsGenerator (usually because you have a batch of data ready to validate, such as in a table or a known
directory).

The most noticeable impact of this API change is in the complete removal of the DataAssetIdentifier class. For
example, the `create_expectation_suite` and `get_batch` methods now no longer require a data_asset_name parameter,
relying only on the expectation_suite_name and batch_kwargs to do their job. Similarly, there is no more asset name
normalization required. See the upgrade guide for more information.
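
To make the shift concrete, here is a minimal, pure-Python sketch of the new coordinates. The keys and values below are illustrative assumptions, not the exact great_expectations API:

```python
# Illustrative sketch only: shows how one logical suite name can pair with
# backend-specific batch kwargs in 0.9.0. Keys and names are assumptions,
# not the exact great_expectations API.

suite_name = "npi"  # logical name for the data, shared across teams

# Each backend describes its batch differently; the suite name stays the same.
pandas_kwargs = {"datasource": "files", "path": "/data/npi.csv"}
spark_kwargs = {"datasource": "spark", "path": "s3a://bucket/npi/"}
sql_kwargs = {"datasource": "warehouse", "table": "npi"}

# In 0.9.0, a call along the lines of context.get_batch(batch_kwargs,
# suite_name) needs no data_asset_name; each pairing below is all that
# is required to identify and validate a batch.
for batch_kwargs in (pandas_kwargs, spark_kwargs, sql_kwargs):
    print(suite_name, "->", batch_kwargs)
```

The point of the sketch is that the same logical "npi" suite validates all three batches, which is exactly what the removal of DataAssetIdentifier enables.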

4. Metrics and Evaluation Parameter Stores

Metrics have received much more love in this release of Great Expectations! We've improved the system for declaring
evaluation parameters that support dependencies between different expectation suites, so you can easily identify a
particular field in the result of one expectation to use as the input into another. And the MetricsStore is now much
more flexible, supporting a new ValidationAction that makes it possible to select metrics from a validation result
to be saved in a database where they can power a dashboard.
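
The dependency mechanism can be sketched as follows. The ``$PARAMETER`` key is real Great Expectations syntax, but the URN layout shown here is an illustrative assumption rather than the exact format:

```python
# Sketch of a cross-suite evaluation parameter. The "$PARAMETER" key is the
# real mechanism; the URN layout below is an illustrative assumption.

upstream_suite = "npi_staging"
metric = "expect_table_row_count_to_equal.result.observed_value"
param_urn = "urn:great_expectations:validations:%s:%s" % (upstream_suite, metric)

# A downstream expectation references the parameter instead of a hard-coded
# value, so it always compares against the upstream validation result.
expectation = {
    "expectation_type": "expect_table_row_count_to_equal",
    "kwargs": {"value": {"$PARAMETER": param_urn}},
}

print(expectation["kwargs"]["value"]["$PARAMETER"])
```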

5. Internal Type Changes and Improvements

Finally, in this release, we have done a lot of work under the hood to make things more robust, including updating
all of the internal objects to be more strongly typed. That change, while largely invisible to end users, paves the
way for some really exciting opportunities for extending Great Expectations as we build a bigger community around
the project.


We are really excited about this release, and encourage you to upgrade right away to take advantage of the more
flexible naming and simpler API for creating, accessing, and sharing your expectations. As always, feel free to join
us on Slack with any questions you don't see addressed!


0.8.9__develop
-----------------
@@ -53,7 +121,7 @@
* Add support to S3 generator for retrieving directories by specifying the `directory_assets` configuration
* Fix warning regarding implicit class_name during init flow
* Expose build_generator API publicly on datasources
-* Allow configuration of known extensions and return more informative message when SubdirReaderGenerator cannot find
+* Allow configuration of known extensions and return more informative message when SubdirReaderBatchKwargsGenerator cannot find
relevant files.
* Add support for allow_relative_error on internal dataset quantile functions, and add support for
build_continuous_partition_object in Redshift
@@ -140,8 +208,8 @@ Highlights include:

3. Partitioners: Batch Kwargs are clarified and enhanced to help easily reference well-known chunks of data using a
partition_id. Batch ID and Batch Fingerprint help round out support for enhanced metadata around data
-  assets that GE validates. See :ref:`batch_identifiers` for more information. The `GlobReaderGenerator`,
-  `QueryGenerator`, `S3Generator`, `SubdirReaderGenerator`, and `TableGenerator` all support partition_id for
+  assets that GE validates. See :ref:`batch_identifiers` for more information. The `GlobReaderBatchKwargsGenerator`,
+  `QueryBatchKwargsGenerator`, `S3GlobReaderBatchKwargsGenerator`, `SubdirReaderBatchKwargsGenerator`, and `TableBatchKwargsGenerator` all support partition_id for
easily accessing data assets.

4. Other Improvements:
Expand All @@ -166,7 +234,7 @@ v0.7.10
-----------------
* Fix an issue in generated documentation where the Home button failed to return to the index
* Add S3 Generator to module docs and improve module docs formatting
-* Add support for views to QueryGenerator
+* Add support for views to QueryBatchKwargsGenerator
* Add success/failure icons to index page
* Return to uniform histogram creation during profiling to avoid large partitions for internal performance reasons

@@ -200,7 +268,7 @@ v0.7.8
- PY2 failure on encountering unicode (#676)


-v.0.7.7
+0.7.7
-----------------
* Standardize the way that plugin module loading works. DataContext will begin to use the new-style class and plugin
identification moving forward; yml configs should specify class_name and module_name (with module_name optional for
@@ -226,7 +294,7 @@ v.0.7.7
- Add run_id to path for validation files


-v.0.7.6
+0.7.6
-----------------
* New Validation Renderer! Supports turning validation results into HTML and displays differences between the expected
and the observed attributes of a dataset.
@@ -239,11 +307,11 @@ v.0.7.6
* Bug fixes: improved internal logic of rendering data documentation, slack notification, and CLI profile command when
datasource argument was not provided.

-v.0.7.5
+0.7.5
-----------------
* Fix missing requirement for pypandoc brought in from markdown support for notes rendering.

-v.0.7.4
+0.7.4
-----------------
* Fix numerous rendering bugs and formatting issues for rendering documentation.
* Add support for pandas extension dtypes in pandas backend of expect_column_values_to_be_of_type and
@@ -254,7 +322,7 @@ v.0.7.4
* Add support for rendering expectation_suite and expectation_level notes from meta in docs.
* Fix minor formatting issue in readthedocs documentation.

-v.0.7.3
+0.7.3
-----------------
* BREAKING: Harmonize expect_column_values_to_be_of_type and expect_column_values_to_be_in_type_list semantics in
Pandas with other backends, including support for None type and type_list parameters to support profiling.
@@ -270,15 +338,15 @@ v.0.7.3
* Allow user to specify data_assets to profile via CLI
* Support CLI rendering of expectation_suite and EVR-based documentation

-v.0.7.2
+0.7.2
-----------------
* Improved error detection and handling in CLI "add datasource" feature
* Fixes in rendering of profiling results (descriptive renderer of validation results)
* Query Generator of SQLAlchemy datasource adds tables in non-default schemas to the data asset namespace
* Added convenience methods to display HTML renderers of sections in Jupyter notebooks
* Implemented prescriptive rendering of expectations for most expectation types

-v.0.7.1
+0.7.1
------------

* Added documentation/tutorials/videos for onboarding and new profiling and documentation features
@@ -297,7 +365,7 @@ v.0.7.1
* Other internal improvements and bug fixes


-v.0.7.0
+0.7.0
------------

Version 0.7 of Great Expectations is HUGE. It introduces several major new features
@@ -372,13 +440,13 @@ to top-level names.
* Documentation reorganization and improvements
* Introduce custom exceptions for more detailed error logs

-v.0.6.1
+0.6.1
------------
* Re-add testing (and support) for py2
* NOTE: Support for SqlAlchemyDataset and SparkDFDataset is enabled via optional install \
(e.g. ``pip install great_expectations[sqlalchemy]`` or ``pip install great_expectations[spark]``)

-v.0.6.0
+0.6.0
------------
* Add support for SparkDFDataset and caching (HUGE work from @cselig)
* Migrate distributional expectations to new testing framework
@@ -389,13 +457,13 @@ v.0.6.0
We anticipate this will become the future default behavior.
* BREAKING CHANGE: Drop official support pandas < 0.22

-v.0.5.1
+0.5.1
---------------
* **Fix** issue where no result_format available for expect_column_values_to_be_null caused error
* Use vectorized computation in pandas (#443, #445; thanks @RoyalTS)


-v.0.5.0
+0.5.0
----------------
* Restructured class hierarchy to have a more generic DataAsset parent that maintains expectation logic separate \
from the tabular organization of Dataset expectations
@@ -412,7 +480,7 @@ v.0.5.0
* Minor documentation, warning, and testing improvements (thanks @zdog).


-v.0.4.5
+0.4.5
----------------
* Add a new autoinspect API and remove default expectations.
* Improve details for expect_table_columns_to_match_ordered_list (#379, thanks @rlshuhart)
@@ -434,14 +502,14 @@ v.0.4.5
* Improve internal testing suite (thanks @anhollis and @ccnobbli)
* Consistently use value_set instead of mixing value_set and values_set (thanks @njsmith8)

-v.0.4.4
+0.4.4
----------------
* Improve CLI help and set CLI return value to the number of unmet expectations
* Add error handling for empty columns to SqlAlchemyDataset, and associated tests
* **Fix** broken support for older pandas versions (#346)
* **Fix** pandas deepcopy issue (#342)

-v.0.4.3
+0.4.3
-------
* Improve type lists in expect_column_type_to_be[_in_list] (thanks @smontanaro and @ccnobbli)
* Update cli to use entry_points for conda compatibility, and add version option to cli
@@ -454,7 +522,7 @@ v.0.4.3
* Implement expect_column_value_lenghts_to_[be_between|equal] for SQAlchemy (thanks @ccnobbli)
* **Fix** PandasDataset subclasses to inherit child class

-v.0.4.2
+0.4.2
-------
* **Fix** bugs in expect_column_values_to_[not]_be_null: computing unexpected value percentages and handling all-null (thanks @ccnobbli)
* Support mysql use of Decimal type (thanks @bouke-nederstigt)
@@ -465,11 +533,11 @@ v.0.4.2

* **Fix** documentation errors and other small errors (thanks @roblim, @ccnobbli)

-v.0.4.1
+0.4.1
-------
* Correct inclusion of new data_context module in source distribution

-v.0.4.0
+0.4.0
-------
* Initial implementation of data context API and SqlAlchemyDataset including implementations of the following \
expectations:
@@ -505,19 +573,21 @@ v.0.4.0
* Behind-the-scenes improvements to testing framework to ensure parity across data contexts.
* Documentation improvements, bug-fixes, and internal api improvements

-v.0.3.2
+0.3.2
-------
* Include requirements file in source dist to support conda

-v.0.3.1
+0.3.1
--------
* **Fix** infinite recursion error when building custom expectations
* Catch dateutil parsing overflow errors

-v.0.2
+0.2
-----
* Distributional expectations and associated helpers are improved and renamed to be more clear regarding the tests they apply
* Expectation decorators have been refactored significantly to streamline implementing expectations and support custom expectations
* API and examples for custom expectations are available
* New output formats are available for all expectations
* Significant improvements to test suite and compatibility

*Last updated*: |lastupdate|
2 changes: 2 additions & 0 deletions docs/community.rst
@@ -35,3 +35,5 @@ Contribute code or documentation

We welcome contributions to Great Expectations. Please start with our :ref:`contributing` guide and don't be shy
with questions!

*last updated*: |lastupdate|
4 changes: 3 additions & 1 deletion docs/conf.py
@@ -39,7 +39,9 @@
'sphinx.ext.coverage',
# 'sphinx.ext.mathjax'
'sphinx.ext.napoleon',
-    'sphinxcontrib.contentui'
+    'sphinxcontrib.contentui',
+    'sphinxcontrib.lastupdate',
+    'sphinx.ext.autosectionlabel'
]

# Add any paths that contain templates here, relative to this directory.