Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Currently the frontend sends it's own user agent to the server API requests #2902

Closed
misilot opened this issue Apr 5, 2024 · 6 comments
Closed
Assignees
Milestone

Comments

@misilot
Copy link
Contributor

misilot commented Apr 5, 2024

Describe the bug
On DSpace 7.6.1 when the frontend makes requests to the backend server, many of the requests have the user agent string: "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36" instead of the browser or system interacting with it.

It looks like some requests do this, and others are using the node.js UA string.

To Reproduce
Steps to reproduce the behavior:

  1. Load frontend and look at the logs for the frontend server in apache or nginx.
  2. Look for "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36"

Expected behavior
I believe the expected behavior would be the originating user agent would pass through the frontend.

Example Logs

192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/authz/authorizations/search/object?uri=https://frontend.site.com/server/api/core/sites/1550aa4a-1417-4d04-a649-d2752fd452e6&feature=canViewUsageStatistics&embed=feature HTTP/1.1" 200 1778 "-" "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/core/communities/search/top?page=0&size=5&sort=dc.title,ASC HTTP/1.1" 200 8570 "-" "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/config/properties/websvc.opensearch.svccontext HTTP/1.1" 200 247 "-" "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/config/properties/websvc.opensearch.enable HTTP/1.1" 200 226 "-" "Mozilla/5.0 (Linux x64) node.js/18.20.0 v8/10.2.154.26-node.36"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/authn/status HTTP/1.1" 200 432 "https://frontend.site.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/config/properties/google.analytics.key HTTP/1.1" 200 226 "https://frontend.site.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api HTTP/1.1" 200 7103 "https://frontend.site.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0"
192.168.0.1 - "" [05/Apr/2024:15:42:28 -0500] "GET /server/api/config/properties/registration.verification.enabled HTTP/1.1" 200 245 "https://frontend.site.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0"
@misilot misilot added bug needs triage New issue needs triage and/or scheduling labels Apr 5, 2024
@github-project-automation github-project-automation bot moved this to 🆕 Triage in DSpace Backlog Apr 5, 2024
@mark-cooper
Copy link
Contributor

@tdonohue unless another volunteer wants it more I can look into this one. Very directly, it may be as simple as:

diff --git a/src/app/core/forward-client-ip/forward-client-ip.interceptor.ts b/src/app/core/forward-client-ip/forward-client-ip.interceptor.ts
index 037f69005..7fe4263d4 100644
--- a/src/app/core/forward-client-ip/forward-client-ip.interceptor.ts
+++ b/src/app/core/forward-client-ip/forward-client-ip.interceptor.ts
@@ -27,6 +27,14 @@ export class ForwardClientIpInterceptor implements HttpInterceptor {
    */
   intercept(httpRequest: HttpRequest<any>, next: HttpHandler): Observable<HttpEvent<any>> {
     const clientIp = this.req.get('x-forwarded-for') || this.req.connection.remoteAddress;
-    return next.handle(httpRequest.clone({ setHeaders: { 'X-Forwarded-For': clientIp } }));
+    const headers = { 'X-Forwarded-For': clientIp };
+
+    // if the request has a user-agent retain it
+    const userAgent = this.req.get('user-agent');
+    if (userAgent) {
+      headers['User-Agent'] = userAgent;
+    }
+
+    return next.handle(httpRequest.clone({ setHeaders: headers }));
   }
 }

But for a PR what would be the preference: renaming forward-client-ip to forward-client-headers, or creating a new set of files for forward-client-user-agent, or just keeping it maximally simple and doing it like in the diff?

@tdonohue
Copy link
Member

@mark-cooper : I'm unaware of any volunteer on this ticket, and I hadn't had time to try to reproduce it myself. So, assuming you are able to reproduce this bug, then I'd be glad to assign it to you. I honestly don't have a strong preference on the proposed fix options... but it seems like either renaming to forward-client-headers or keeping it as forward-client-ip is reasonable.

@tdonohue tdonohue moved this to 🏗 In Progress in DSpace 8.0 Release Apr 22, 2024
@tdonohue tdonohue added code task and removed needs triage New issue needs triage and/or scheduling labels Apr 22, 2024
@alanorth
Copy link
Contributor

alanorth commented Jun 21, 2024

I've tested @mark-cooper's patch on DSpace 7.6.2-SNAPSHOT and it seems to work. I need to look and see if there are any side effects, for example in statistics or other web server logging stuff... but it seems to be fine. Maybe @misilot can also check.

Regarding the file name, perhaps forward-client-headers is cleaner... changing the file name isn't optimal for a minor change like this, but maybe that's fine.

@mark-cooper
Copy link
Contributor

Just to add to what @alanorth has seen so far we've been running this in a test environment, and more recently in production, for at least a few weeks now. We use it primarily with AWS WAF for categorizing traffic and as an additional layer for blocking unwanted bots and it's been doing the job for that purpose.

For a PR I'd personally favor the direct approach (as in the diff) given how compact and simple this is, but would consider changing things in the future if any additional updates are made to this functionality.

@alanorth
Copy link
Contributor

This works well. I've been running it for a few days.

On a DSpace 7.6.1 server without the patch I get this in the server logs when loading the front page:

41.18.204.190 - - [25/Jun/2024:08:55:52 +0200] "GET /server/api HTTP/1.1" 200 7180 "-" "Mozilla/5.0 (Linux x64) node.js/18.17.1 v8/10.2.154.26-node.26"
41.18.204.190 - - [25/Jun/2024:08:55:52 +0200] "GET /server/api/authn/status HTTP/1.1" 200 447 "-" "Mozilla/5.0 (Linux x64) node.js/18.17.1 v8/10.2.154.26-node.26"

On a server with the patch, I get this:

41.18.204.190 - - [25/Jun/2024:08:55:55 +0200] "GET /server/api HTTP/1.1" 200 7381 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"        
41.18.204.190 - - [25/Jun/2024:08:55:55 +0200] "GET /server/api/authn/status HTTP/1.1" 200 456 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"

I think a pull request is in order. Shame it couldn't make it into 8.0, but there's still hope for 7.6.2!

@tdonohue
Copy link
Member

tdonohue commented Jul 3, 2024

Closing, fixed by #3152

@tdonohue tdonohue closed this as completed Jul 3, 2024
@github-project-automation github-project-automation bot moved this from 🏗 In Progress to ✅ Done in DSpace 8.x and 7.6.x Maintenance Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

4 participants