Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a setting to use a different REST url during SSR execution #3358

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

atarix83
Copy link
Contributor

@atarix83 atarix83 commented Sep 26, 2024

References

Add references/links to any related issues or PRs. These may include:

Description

This pull request provide the possibility to use a different DSpace REST url during the SSR

Instructions for Reviewers

List of changes in this PR:

  • Added a new setting ssrBaseUrl where to specify a different DSpace REST url to use during SSR
  • Added a new interceptor which replaces the base url used to contact the REST server, if a different DSpace REST url is provided
  • Changed the behaviour of the thumbnail component. The thumbnail component now renders exclusively in the browser. This change is necessary to prevent issues during server-side rendering (SSR). When a page containing a thumbnail is rendered on the server, the HTML content delivered to the browser would attempt to download the thumbnail using the URL specified in the ssrBaseUrl property. If this URL was not publicly accessible, it would result in an error. By rendering the thumbnail only in the browser, we avoid this problem.
  • An additional change is required with the ServerHardRedirectService. In case the url to redirect contains the ssrBaseUrl it's reinstated with the public base url. This is necessary otherwise download bitstream page doesn't work.

Include guidance for how to test or review your PR.
Unfortunately it is difficult to test this PR. The best would be to deploy it in a production like environment and configure the internal base url property.
This might be done on both Angular and REST side, like in the example below where public base url is https://dspace-rest.org/server and internal base url is http://localhost:8080/server.

  • Angular: in your config.prod.yml file set the ssrBaseUrl property, e.g :
rest:
  ssl: true
  host: dspace-rest.org
  port: 443
  nameSpace: /server
  ssrBaseUrl: http://localhost:8080/server
  • REST: in your local.cfg file set the dspace.server.ssr.url property, e.g. :
# Public URL of DSpace backend ('server' webapp). May require a port number if not using standard ports (80 or 443)
# DO NOT end it with '/'.
# This is where REST API and all enabled server modules (OAI-PMH, SWORD, SWORDv2, RDF, etc) will respond.
# NOTE: This URL must be accessible to all DSpace users (should not use 'localhost' in Production)
# and is usually "synced" with the "rest" section in the DSpace User Interface's config.*.yml.
# It corresponds to the URL that you would type into your browser to access the REST API.
dspace.server.url = https://dspace-rest.org/server

# Additional URL of DSpace backend which could be used by DSpace frontend during SSR execution.
# May require a port number if not using standard ports (80 or 443)
# DO NOT end it with '/'.
dspace.server.ssr.url = http://localhost:8080/server

In order to test it locally you could add an alias hostname for localhost ip to be used as public url

Checklist

  • My PR is created against the main branch of code (unless it is a backport or is fixing an issue specific to an older branch).
  • My PR is small in size (e.g. less than 1,000 lines of code, not including comments & specs/tests), or I have provided reasons as to why that's not possible.
  • My PR passes ESLint validation using npm run lint
  • My PR doesn't introduce circular dependencies (verified via npm run check-circ-deps)
  • My PR includes TypeDoc comments for all new (or modified) public methods and classes. It also includes TypeDoc for large or complex private methods.
  • My PR passes all specs/tests and includes new/updated specs or tests based on the Code Testing Guide.
  • My PR aligns with Accessibility guidelines if it makes changes to the user interface.
  • My PR uses i18n (internationalization) keys instead of hardcoded English text, to allow for translations.
  • My PR includes details on how to test it. I've provided clear instructions to reviewers on how to successfully test this fix or feature.
  • If my PR includes new libraries/dependencies (in package.json), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.
  • If my PR includes new features or configurations, I've provided basic technical documentation in the PR itself.
  • If my PR fixes an issue ticket, I've linked them together.

@tdonohue tdonohue added high priority performance / caching Related to performance, caching or embedded objects labels Sep 26, 2024
@tdonohue
Copy link
Member

@atarix83 : While I realize this is still in "draft", I'm assigning this to @artlowel and I for review. @artlowel , I know you have a lot on your plate, but this seems like a high priority fix that would be good to get your or your team's feedback on.

@tdonohue tdonohue changed the title [DURACOM-288] Provide a setting to use a different REST url during SS… Provide a setting to use a different REST url during SSR execution Sep 26, 2024
@atarix83 atarix83 marked this pull request as ready for review September 26, 2024 18:05
@artlowel artlowel requested review from ybnd and removed request for artlowel September 27, 2024 07:46
Copy link
Member

@ybnd ybnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything works as expected, but still "leaks" the SSR URL and does double work since the transfer state is ignored

map(() => true),
),
);
// The app state can be transferred only when SSR and CSR are using the same base url for the REST API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate REST URL could be a performance benefit during SSR, but if we discard all cached requests we'll always do double work and CSR performance will suffer.

Could we consider just running a search/replace on the transfer state to replace the SSR URL with the CSR URL?

  • This shouldn't be too heavy of an operation so I think the net impact on performance could still be positive
  • I did a very quick check and replacing all URLs right after the full HTML has been generated takes at most 0.25% of the total time needed for SSR
    • Only tested for /home, not authenticated
    • SSR took ~400ms, HTML search replace took between 0 and 1ms

Otherwise we should at least exclude the transfer state entirely; right now it's sent but never read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your suggestion, but I believe implementing a search/replace operation for URLs might not be a good idea due to performance concerns. Even though the initial tests show a minimal impact, this might not be the case for all routes or under different conditions and it could become a bottleneck.
Moreover adding an extra step in the Server-side rendering process can increase the overall latency, negatively affecting the user experience, particularly for pages with larger amount of request.
Honestly I don't see a real issue to fetch again the request during the CSR.

I'll try to exclude the transfer state entirely

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't see a real issue to fetch again the request during the CSR

My main concern is that it has a significant impact on perceived performance, especially on slower connections.
This counteracts our push towards decreasing the amount of REST requests we send ~ #3110 (comment).

I've made a quick branch where you can toggle both things: ybnd/dspace-angular → task/main/DURACOM-288

  • ui.transferState → whether NGRX state is transferred from SSR to CSR
  • replaceRestUrl → whether the SSR URL is replaced after SSR is done

As far as I can tell, a simple RegEx to replace SSR URLs scales a lot better than DSpace itself

  • A /search page with 1000 results (!) takes ~50000ms to build with SSR on my machine
  • Replacing all URLs (~16000) takes ~20ms

On the other hand, for a /search page with 100 results, Lighthouse reports about double the blocking time if we don't transfer state on CSR.

With URL replacing & state transfer
Without state transfer

@@ -100,6 +101,14 @@ export class ServerInitService extends InitService {
}

private saveAppConfigForCSR(): void {
this.transferState.set<AppConfig>(APP_CONFIG_STATE, environment as AppConfig);
if (isNotEmpty(environment.rest.ssrBaseUrl) && environment.rest.baseUrl !== environment.rest.ssrBaseUrl) {
// Avoid to transfer ssrBaseUrl in order to prevent security issues
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about the potential security concern, good idea.
The state transfer JSON will still list the SSR URL. It's a bit more "hidden" but not secure either.

Copy link

github-actions bot commented Oct 8, 2024

Hi @atarix83,
Conflicts have been detected against the base branch.
Please resolve these conflicts as soon as you can. Thanks!

# Conflicts:
#	src/app/thumbnail/thumbnail.component.spec.ts
@tdonohue tdonohue requested a review from ybnd November 15, 2024 22:03
Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @atarix83 ! I gave this a code review today, and have some minor requests for improvement. I haven't had a chance to fully test this yet, but I plan to do so next week.

src/config/server-config.interface.ts Show resolved Hide resolved
@@ -6,4 +6,6 @@ export class ServerConfig implements Config {
public port: number;
public nameSpace: string;
public baseUrl?: string;
public ssrBaseUrl?: string;
public hasSsrBaseUrl?: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment here to describe how/where this boolean value is set? This doesn't appear to be something we want people to specify in their config.yml, so it would be good to document that this is set automatically based on the value of ssrBaseUrl in your config.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'd recommend we add a comment above this setting to clarify that this should NEVER be copied to config.*.yml. Something like:
This boolean will be automatically set on server startup based on whether "baseUrl" and "ssrBaseUrl" have different values.

@atarix83
Copy link
Contributor Author

atarix83 commented Jan 9, 2025

@tdonohue @ybnd

I've improved my implementation trying to address your feedback. Here the main changes:

  • added a new config property to disable transfer state (see here). By default even if the property is enabled when a different SSR base url is set the state is not transferred from server to client application.
  • added a new config to allow url replacement in the state transferred to the client application. default to false.

@atarix83 atarix83 requested a review from tdonohue January 9, 2025 18:22
Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atarix83 : I was able to test this locally today by using temporary public URLs for the frontend & backend. I ran the frontend & backend on different domains, and ran the frontend in production mode (npm run build:prod; npm run serve:ssr).

Overall, the basics of this PR (and the backend PR) work well. However, I've found a few bugs in my testing:

  1. Authentication doesn't work. This might be a side effect of an error I'm seeing on main, but here's what I'm trying.
    • If I use these PRs with the dspace.server.ssr.url and rest.ssrBaseUrl settings disabled, then authentication works fine. However, whenever I authenticate, in the SSR logs I see an "doesn't contain the link profiles" error. This seems harmless, but it appears on every login
    • If I then enable the dspace.server.ssr.url and rest.ssrBaseUrl settings and point them at http://localhost:8080/, authentication no longer works. After attempting to authenticate, the page hangs and I just see the loading image (on a blank white page). No obvious errors are seen in the logs.
  2. Whenever SSR is triggered, I see a 404 request to an /undefined path. This appears in my browser's DevTools "Network" tab.
    • Enable the dspace.server.ssr.url and rest.ssrBaseUrl settings and point them at http://localhost:8080/
    • Go to any Item page.
    • Click reload in your browser (to trigger SSR). The 404 /undefined path error will appear in your DevTools
  3. Thumbnail images appear to be accessed initially via the private URL.
    • Enable the dspace.server.ssr.url and rest.ssrBaseUrl settings and point them at http://localhost:8080/
    • Go to an Item page, for an Item that has a thumbnail.
    • Click reload in your browser (to trigger SSR)
    • Look closely at the /content request in your browser's DevTools. It will initially be against the private URL ("http://localhost:8080/"). However, once the page redraws, it will send again to the correct public URL.
    • If I add the replaceRestUrl: true setting to my config.prod.yml and try again, then this behavior will no longer occur. It only occurs if replaceRestUrl: false is set.

Overall, this appears to be working. When this feature is enabled, I do visually notice the flash/redraw of the page when transitioning from SSR to CSR. It's much more severe than when this feature is disabled. However, if there's no way to minimize that flash, then we may just need to warn people that the page flash is more severe when this separate REST URL feature is enabled.

I also have a few minor comments inline below on the code. Mostly it's looking good though.

@@ -19,11 +19,32 @@ ui:

# Angular Server Side Rendering (SSR) settings
ssr:
# A boolean flag indicating whether the SSR configuration is enabled
# Defaults to true.
enabled: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this. While it is possible to disable SSR in this manner, I don't think we should encourage anyone to disable SSR entirely -- it will have a detrimental effect on SEO.

So, this enabled setting was purposefully excluded from config.example.yml to discourage anyone from disabling SSR entirely.


# Enable request performance profiling data collection and printing the results in the server console.
# Defaults to false.
enablePerformanceProfiler: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change boolean to false and remove the semicolon. This config.example.yml file is supposed to have default settings listed and be "valid" if they want to copy the config.example.yml into config.prod.yml and edit the defaults. So, this should say:

enablePerformanceProfiler: false

Also is there any risk in enabling this in production? This sounds like a debugging tool, so we may want to mention that we don't recommend enabling in production

# Note: When using an external application cache layer, it's recommended not to transfer the state to avoid caching it.
# Disabling it ensures that dynamic state information is not inadvertently cached, which can improve security and
# ensure that users always use the most up-to-date state.
transferState: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this should have the default value listed:

transferState: true

# REST resources with the internal URL configured, so it is not transferred to the client application, by default.
# Enabling this setting transfers the state to the client application and replaces internal URLs with the public
# URLs used by the client application.
replaceRestUrl: boolean;
Copy link
Member

@tdonohue tdonohue Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this should have the default value listed:

replaceRestUrl: false

Why is this false by default? We may want to add an explanation to describe why someone may want this to be true or false. Are there any side effects to enabling it, or scenarios where you should definitely have it enabled or disabled? Currently, it's unclear to me why it is false and when users would want to set this to true.

Based on the discussion with @ybnd above , it sounds like setting this to true should decrease the number of REST API requests when a page goes through SSR? (As it shouldn't need to make duplicate requests in CSR)

(I like how for transferState above you described a scenario where you might want to set it to false instead of true. So, it'd be nice to do that here as well.)

@@ -6,4 +6,6 @@ export class ServerConfig implements Config {
public port: number;
public nameSpace: string;
public baseUrl?: string;
public ssrBaseUrl?: string;
public hasSsrBaseUrl?: boolean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'd recommend we add a comment above this setting to clarify that this should NEVER be copied to config.*.yml. Something like:
This boolean will be automatically set on server startup based on whether "baseUrl" and "ssrBaseUrl" have different values.

Copy link

Hi @atarix83,
Conflicts have been detected against the base branch.
Please resolve these conflicts as soon as you can. Thanks!

@tdonohue tdonohue requested a review from pnbecker January 23, 2025 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high priority merge conflict performance / caching Related to performance, caching or embedded objects
Projects
Status: 👀 Under Review
Development

Successfully merging this pull request may close these issues.

Unable to route network requests to different DSpace REST API URL during SSR
4 participants