Merge branch 'main' of https://github.com/danswer-ai/danswer into bugfix/celery_light_backoff

# Conflicts:
#	backend/danswer/background/celery/tasks/connector_deletion/tasks.py
#	backend/danswer/background/celery/tasks/indexing/tasks.py
#	backend/danswer/background/celery/tasks/pruning/tasks.py
#	backend/danswer/background/celery/tasks/shared/tasks.py
rkuo-danswer committed Oct 25, 2024
2 parents a74594e + 4ca3820 commit 9e555ed
Showing 129 changed files with 13,445 additions and 797 deletions.
30 changes: 17 additions & 13 deletions .github/pull_request_template.md
@@ -6,20 +6,24 @@
[Describe the tests you ran to verify your changes]


## Accepted Risk
[Any know risks or failure modes to point out to reviewers]
## Accepted Risk (provide if relevant)
N/A


## Related Issue(s)
[If applicable, link to the issue(s) this PR addresses]
## Related Issue(s) (provide if relevant)
N/A


## Checklist:
- [ ] All of the automated tests pass
- [ ] All PR comments are addressed and marked resolved
- [ ] If there are migrations, they have been rebased to latest main
- [ ] If there are new dependencies, they are added to the requirements
- [ ] If there are new environment variables, they are added to all of the deployment methods
- [ ] If there are new APIs that don't require auth, they are added to PUBLIC_ENDPOINT_SPECS
- [ ] Docker images build and basic functionalities work
- [ ] Author has done a final read through of the PR right before merge
## Mental Checklist:
- All of the automated tests pass
- All PR comments are addressed and marked resolved
- If there are migrations, they have been rebased to latest main
- If there are new dependencies, they are added to the requirements
- If there are new environment variables, they are added to all of the deployment methods
- If there are new APIs that don't require auth, they are added to PUBLIC_ENDPOINT_SPECS
- Docker images build and basic functionalities work
- Author has done a final read through of the PR right before merge

## Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
- [ ] This PR should be backported (make sure to check that the backport attempt succeeds)
7 changes: 6 additions & 1 deletion .github/workflows/nightly-close-stale-issues.yml
@@ -1,8 +1,13 @@
name: 'Close stale issues and PRs'
name: 'Nightly - Close stale issues and PRs'
on:
schedule:
- cron: '0 11 * * *' # Runs every day at 3 AM PST / 4 AM PDT / 11 AM UTC

permissions:
# contents: write # only for delete-branch option
issues: write
pull-requests: write

jobs:
stale:
runs-on: ubuntu-latest
92 changes: 92 additions & 0 deletions .github/workflows/pr-backport-autotrigger.yml
@@ -0,0 +1,92 @@
name: Backport on Merge

on:
pull_request:
types: [closed]

jobs:
backport:
if: github.event.pull_request.merged == true
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0 # Fetch all history for all branches and tags

- name: Set up Git
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
- name: Check for Backport Checkbox
id: checkbox-check
run: |
PR_BODY="${{ github.event.pull_request.body }}"
if [[ "$PR_BODY" == *"[x] This PR should be backported"* ]]; then
echo "backport=true" >> $GITHUB_OUTPUT
else
echo "backport=false" >> $GITHUB_OUTPUT
fi
- name: List and sort release branches
id: list-branches
run: |
git fetch --all --tags
BRANCHES=$(git for-each-ref --format='%(refname:short)' refs/remotes/origin/release/* | sed 's|origin/release/||' | sort -Vr)
BETA=$(echo "$BRANCHES" | head -n 1)
STABLE=$(echo "$BRANCHES" | head -n 2 | tail -n 1)
echo "beta=$BETA" >> $GITHUB_OUTPUT
echo "stable=$STABLE" >> $GITHUB_OUTPUT
# Fetch latest tags for beta and stable
LATEST_BETA_TAG=$(git tag -l "v*.*.0-beta.*" | sort -Vr | head -n 1)
LATEST_STABLE_TAG=$(git tag -l "v*.*.*" | grep -v -- "-beta" | sort -Vr | head -n 1)
# Increment latest beta tag
NEW_BETA_TAG=$(echo $LATEST_BETA_TAG | awk -F '[.-]' '{print $1 "." $2 ".0-beta." ($NF+1)}')
# Increment latest stable tag
NEW_STABLE_TAG=$(echo $LATEST_STABLE_TAG | awk -F '.' '{print $1 "." $2 "." ($3+1)}')
echo "latest_beta_tag=$LATEST_BETA_TAG" >> $GITHUB_OUTPUT
echo "latest_stable_tag=$LATEST_STABLE_TAG" >> $GITHUB_OUTPUT
echo "new_beta_tag=$NEW_BETA_TAG" >> $GITHUB_OUTPUT
echo "new_stable_tag=$NEW_STABLE_TAG" >> $GITHUB_OUTPUT
- name: Echo branch and tag information
run: |
echo "Beta branch: ${{ steps.list-branches.outputs.beta }}"
echo "Stable branch: ${{ steps.list-branches.outputs.stable }}"
echo "Latest beta tag: ${{ steps.list-branches.outputs.latest_beta_tag }}"
echo "Latest stable tag: ${{ steps.list-branches.outputs.latest_stable_tag }}"
echo "New beta tag: ${{ steps.list-branches.outputs.new_beta_tag }}"
echo "New stable tag: ${{ steps.list-branches.outputs.new_stable_tag }}"
- name: Trigger Backport
if: steps.checkbox-check.outputs.backport == 'true'
run: |
set -e
echo "Backporting to beta ${{ steps.list-branches.outputs.beta }} and stable ${{ steps.list-branches.outputs.stable }}"
# Fetch all history for all branches and tags
git fetch --prune --unshallow
# Checkout the beta branch
git checkout ${{ steps.list-branches.outputs.beta }}
# Cherry-pick the merge commit from the merged PR
git cherry-pick -m 1 ${{ github.event.pull_request.merge_commit_sha }} || {
echo "Cherry-pick to beta failed due to conflicts."
exit 1
}
# Create new beta tag
git tag ${{ steps.list-branches.outputs.new_beta_tag }}
# Push the changes and tag to the beta branch
git push origin ${{ steps.list-branches.outputs.beta }}
git push origin ${{ steps.list-branches.outputs.new_beta_tag }}
# Checkout the stable branch
git checkout ${{ steps.list-branches.outputs.stable }}
# Cherry-pick the merge commit from the merged PR
git cherry-pick -m 1 ${{ github.event.pull_request.merge_commit_sha }} || {
echo "Cherry-pick to stable failed due to conflicts."
exit 1
}
# Create new stable tag
git tag ${{ steps.list-branches.outputs.new_stable_tag }}
# Push the changes and tag to the stable branch
git push origin ${{ steps.list-branches.outputs.stable }}
git push origin ${{ steps.list-branches.outputs.new_stable_tag }}
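
The checkbox test and the two awk tag bumps in the workflow above can be sketched in Python, which makes the string handling easier to check by hand. This is an illustrative sketch only and not part of the commit: the function names `backport_requested`, `next_beta_tag`, and `next_stable_tag` are hypothetical, and the sample tags used to exercise them are made up.

```python
import re

# Text the workflow looks for in the PR body (a checked box from the PR template).
BACKPORT_MARKER = "[x] This PR should be backported"


def backport_requested(pr_body: str) -> bool:
    """Mirror the bash glob match: true only if the checked-box text appears."""
    return BACKPORT_MARKER in pr_body


def next_beta_tag(latest_beta_tag: str) -> str:
    """Bump the trailing beta counter, like awk -F '[.-]' in the workflow."""
    # "v0.9.0-beta.3" splits into ["v0", "9", "0", "beta", "3"]
    fields = re.split(r"[.-]", latest_beta_tag)
    return f"{fields[0]}.{fields[1]}.0-beta.{int(fields[-1]) + 1}"


def next_stable_tag(latest_stable_tag: str) -> str:
    """Bump the patch number, like awk -F '.' in the workflow."""
    # "v0.8.2" splits into ["v0", "8", "2"]
    fields = latest_stable_tag.split(".")
    return f"{fields[0]}.{fields[1]}.{int(fields[2]) + 1}"
```

One difference from the shell version: awk silently treats a non-numeric field as 0, whereas `int()` here raises `ValueError`, so a malformed tag fails loudly in this sketch.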
2 changes: 1 addition & 1 deletion README.md
@@ -68,7 +68,7 @@ We also have built-in support for deployment on Kubernetes. Files for that can b

## 🚧 Roadmap
* Chat/Prompt sharing with specific teammates and user groups.
* Multi-Model model support, chat with images, video etc.
* Multimodal model support, chat with images, video etc.
* Choosing between LLMs and parameters during chat session.
* Tool calling and agent configurations options.
* Organizational understanding and ability to locate and suggest experts from your team.
9 changes: 7 additions & 2 deletions backend/alembic/env.py
@@ -14,6 +14,7 @@
from danswer.db.models import Base
from celery.backends.database.session import ResultModelBase # type: ignore
from danswer.db.engine import get_all_tenant_ids
from shared_configs.configs import POSTGRES_DEFAULT_SCHEMA

# Alembic Config object
config = context.config
@@ -57,11 +58,15 @@ def get_schema_options() -> tuple[str, bool, bool]:
if "=" in pair:
key, value = pair.split("=", 1)
x_args[key.strip()] = value.strip()
schema_name = x_args.get("schema", "public")
schema_name = x_args.get("schema", POSTGRES_DEFAULT_SCHEMA)
create_schema = x_args.get("create_schema", "true").lower() == "true"
upgrade_all_tenants = x_args.get("upgrade_all_tenants", "false").lower() == "true"

if MULTI_TENANT and schema_name == "public" and not upgrade_all_tenants:
if (
MULTI_TENANT
and schema_name == POSTGRES_DEFAULT_SCHEMA
and not upgrade_all_tenants
):
raise ValueError(
"Cannot run default migrations in public schema when multi-tenancy is enabled. "
"Please specify a tenant-specific schema."
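
The option parsing and the multi-tenant guard in this hunk can be exercised in isolation with a small sketch. It is not the file's actual code: the `pairs` and `multi_tenant` parameters are introduced here so the logic can run without Alembic, and `DEFAULT_SCHEMA = "public"` is an assumption standing in for `POSTGRES_DEFAULT_SCHEMA`, whose real value is not shown in this diff.

```python
from typing import Iterable

# Assumption: stands in for shared_configs.configs.POSTGRES_DEFAULT_SCHEMA.
DEFAULT_SCHEMA = "public"


def get_schema_options(
    pairs: Iterable[str], multi_tenant: bool
) -> tuple[str, bool, bool]:
    """Parse "key=value" strings and apply the guard shown in the hunk."""
    x_args: dict[str, str] = {}
    for pair in pairs:
        if "=" in pair:
            key, value = pair.split("=", 1)
            x_args[key.strip()] = value.strip()

    schema_name = x_args.get("schema", DEFAULT_SCHEMA)
    create_schema = x_args.get("create_schema", "true").lower() == "true"
    upgrade_all_tenants = x_args.get("upgrade_all_tenants", "false").lower() == "true"

    # Refuse to migrate the default schema in multi-tenant mode unless the
    # caller explicitly asked to upgrade every tenant.
    if multi_tenant and schema_name == DEFAULT_SCHEMA and not upgrade_all_tenants:
        raise ValueError(
            "Cannot run default migrations in public schema when multi-tenancy "
            "is enabled. Please specify a tenant-specific schema."
        )
    return schema_name, create_schema, upgrade_all_tenants
```

With `multi_tenant=True`, passing no arguments raises, while `["schema=tenant_a"]` or `["upgrade_all_tenants=true"]` passes the guard.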
2 changes: 1 addition & 1 deletion backend/danswer/__init__.py
@@ -1,3 +1,3 @@
import os

__version__ = os.environ.get("DANSWER_VERSION", "") or "0.3-dev"
__version__ = os.environ.get("DANSWER_VERSION", "") or "Development"
9 changes: 9 additions & 0 deletions backend/danswer/access/models.py
@@ -70,3 +70,12 @@ def build(
user_groups=set(user_groups),
is_public=is_public,
)


default_public_access = DocumentAccess(
external_user_emails=set(),
external_user_group_ids=set(),
user_emails=set(),
user_groups=set(),
is_public=True,
)
98 changes: 64 additions & 34 deletions backend/danswer/auth/users.py
@@ -93,7 +93,8 @@
from danswer.utils.telemetry import optional_telemetry
from danswer.utils.telemetry import RecordType
from danswer.utils.variable_functionality import fetch_versioned_implementation
from shared_configs.configs import current_tenant_id
from shared_configs.configs import CURRENT_TENANT_ID_CONTEXTVAR
from shared_configs.configs import POSTGRES_DEFAULT_SCHEMA

logger = setup_logger()

@@ -187,7 +188,7 @@ def verify_email_domain(email: str) -> None:

def get_tenant_id_for_email(email: str) -> str:
if not MULTI_TENANT:
return "public"
return POSTGRES_DEFAULT_SCHEMA
# Implement logic to get tenant_id from the mapping table
with Session(get_sqlalchemy_engine()) as db_session:
result = db_session.execute(
@@ -233,35 +234,62 @@ async def create(
safe: bool = False,
request: Optional[Request] = None,
) -> User:
verify_email_is_invited(user_create.email)
verify_email_domain(user_create.email)
if hasattr(user_create, "role"):
user_count = await get_user_count()
if user_count == 0 or user_create.email in get_default_admin_user_emails():
user_create.role = UserRole.ADMIN
else:
user_create.role = UserRole.BASIC
user = None
try:
user = await super().create(user_create, safe=safe, request=request) # type: ignore
except exceptions.UserAlreadyExists:
user = await self.get_by_email(user_create.email)
# Handle case where user has used product outside of web and is now creating an account through web
if (
not user.has_web_login
and hasattr(user_create, "has_web_login")
and user_create.has_web_login
):
user_update = UserUpdate(
password=user_create.password,
has_web_login=True,
role=user_create.role,
is_verified=user_create.is_verified,
)
user = await self.update(user_update, user)
else:
raise exceptions.UserAlreadyExists()
return user
tenant_id = (
get_tenant_id_for_email(user_create.email)
if MULTI_TENANT
else POSTGRES_DEFAULT_SCHEMA
)
except exceptions.UserNotExists:
raise HTTPException(status_code=401, detail="User not found")

if not tenant_id:
raise HTTPException(
status_code=401, detail="User does not belong to an organization"
)

async with get_async_session_with_tenant(tenant_id) as db_session:
token = CURRENT_TENANT_ID_CONTEXTVAR.set(tenant_id)

verify_email_is_invited(user_create.email)
verify_email_domain(user_create.email)
if MULTI_TENANT:
tenant_user_db = SQLAlchemyUserAdminDB(db_session, User, OAuthAccount)
self.user_db = tenant_user_db
self.database = tenant_user_db

if hasattr(user_create, "role"):
user_count = await get_user_count()
if (
user_count == 0
or user_create.email in get_default_admin_user_emails()
):
user_create.role = UserRole.ADMIN
else:
user_create.role = UserRole.BASIC
user = None
try:
user = await super().create(user_create, safe=safe, request=request) # type: ignore
except exceptions.UserAlreadyExists:
user = await self.get_by_email(user_create.email)
# Handle case where user has used product outside of web and is now creating an account through web
if (
not user.has_web_login
and hasattr(user_create, "has_web_login")
and user_create.has_web_login
):
user_update = UserUpdate(
password=user_create.password,
has_web_login=True,
role=user_create.role,
is_verified=user_create.is_verified,
)
user = await self.update(user_update, user)
else:
raise exceptions.UserAlreadyExists()

CURRENT_TENANT_ID_CONTEXTVAR.reset(token)
return user

async def on_after_login(
self,
@@ -302,7 +330,9 @@ async def oauth_callback(
# Get tenant_id from mapping table
try:
tenant_id = (
get_tenant_id_for_email(account_email) if MULTI_TENANT else "public"
get_tenant_id_for_email(account_email)
if MULTI_TENANT
else POSTGRES_DEFAULT_SCHEMA
)
except exceptions.UserNotExists:
raise HTTPException(status_code=401, detail="User not found")
@@ -312,15 +342,15 @@ async def oauth_callback(

token = None
async with get_async_session_with_tenant(tenant_id) as db_session:
token = current_tenant_id.set(tenant_id)
token = CURRENT_TENANT_ID_CONTEXTVAR.set(tenant_id)

verify_email_in_whitelist(account_email, tenant_id)
verify_email_domain(account_email)

if MULTI_TENANT:
tenant_user_db = SQLAlchemyUserAdminDB(db_session, User, OAuthAccount)
self.user_db = tenant_user_db
self.database = tenant_user_db
self.database = tenant_user_db # type: ignore

oauth_account_dict = {
"oauth_name": oauth_name,
@@ -402,7 +432,7 @@ async def oauth_callback(
user.oidc_expiry = None # type: ignore

if token:
current_tenant_id.reset(token)
CURRENT_TENANT_ID_CONTEXTVAR.reset(token)

return user
