Handle search keywords containing characters outside the latin alphabet #125

Open
3 tasks done
E5rael opened this issue Aug 7, 2023 · 2 comments

@E5rael

E5rael commented Aug 7, 2023

  • This is not a feature request
  • This is not an image-provider request
  • I have searched the issue tracker and the issue has not already been reported.

Describe the bug:
Images containing umlaut characters (e.g. Ä or Ö) in their tags aren't displaying.

To Reproduce:
Steps to reproduce the behavior:

  1. Go to Splash settings
  2. List only tags containing umlaut characters, for example jönköping, hämeenlinna, jyväskylä
  3. Refresh the page, and an error background image is displayed.

Expected behavior:
Images containing these tags should appear.

Server:

  • Nextcloud version: 27.0.1
  • Splash-App version: 2.2.1

Additional Information:

  • The bug can be circumvented by omitting the umlauts, i.e. "jonkoping, hameenlinna, jyvaskyla".
  • Some tags containing umlauts DO display images other than the error image, such as öland and häme. Note, however, that the tag häme only displays an image of a cat, completely different from the 23 images you find when you use the same tag at unsplash.com.
  • I also tried adding cyrillic tags, such as Москва, and it seemed to work.
@E5rael E5rael added the bug label Aug 7, 2023
@timonsky
Contributor

Line 116 in lib/ProviderHandler/Provider.php is responsible for that behaviour:
$term = preg_replace('/[^a-z]/i','', $term);

It does not only affect umlauts, but any character outside the ranges U+0041 to U+005A and U+0061 to U+007A (the basic Latin letters A-Z and a-z; the i flag makes the match case-insensitive). Digits and spaces are stripped as well.

> I also tried adding cyrillic tags, such as Москва, and it seemed to work.

That's because the filter reduces it to an empty search string, so you just got a completely random image, not necessarily related to Moscow.
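
To make the effect concrete, here is a small standalone sketch that applies the same pattern as the quoted line to a few of the tags from this issue. The function name stripNonLetters is made up for illustration and is not part of the app. Without the u modifier PCRE works on bytes, so both bytes of a UTF-8 "ö" are removed.

```php
<?php
// Hypothetical wrapper around the pattern quoted from Provider.php.
// No /u modifier: the subject is treated as bytes, so every byte of a
// multi-byte UTF-8 character falls outside a-z/A-Z and is deleted.
function stripNonLetters(string $term): string
{
    return preg_replace('/[^a-z]/i', '', $term);
}

foreach (['jönköping', 'häme', 'öland', 'Москва', 'New York'] as $tag) {
    printf("%s => '%s'\n", $tag, stripNonLetters($tag));
}
// jönköping => 'jnkping'
// häme => 'hme'
// öland => 'land'
// Москва => ''
// New York => 'NewYork'
```

This would also be consistent with öland and häme returning unrelated images: the search terms actually sent are "land" and "hme".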

 

I take the code comment "// only allow letters as searchterm" to mean that this was a deliberate design choice with a rationale.

@joshtrichards joshtrichards changed the title Tags containing umlaut characters not working Handle search keywords containing characters outside the latin alphabet Jan 24, 2025
@joshtrichards
Member

Yeah, at the moment this is a deliberate design choice (probably for simplicity of implementation at the time).

So this is mostly an enhancement request, plus probably some robustness improvements to better handle keywords containing characters outside the standard Latin alphabet.

So I'd say this is a two-parter:

  • bug: handle unacceptable characters more gracefully
  • enhancement: decide how to handle characters outside the Latin alphabet (taking into account possible differences across providers)
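
One possible direction for both parts, sketched under assumptions: the helper name sanitizeSearchTerm is hypothetical, and whether each image provider accepts non-Latin keywords is not established in this thread. The sketch keeps letters from any script via Unicode properties and reports "no usable keyword" instead of passing on an empty string.

```php
<?php
// Hypothetical helper, not the app's current code.
// Returns the cleaned keyword, or null when nothing usable remains.
function sanitizeSearchTerm(string $term): ?string
{
    // \p{L} = letters of any script, \p{M} = combining marks (for decomposed
    // input such as "a" + U+0308). The /u modifier makes PCRE read UTF-8.
    $clean = preg_replace('/[^\p{L}\p{M}]/u', '', $term);

    // With /u, preg_replace returns null for invalid UTF-8 input.
    if ($clean === null || $clean === '') {
        return null;
    }
    return $clean;
}

var_dump(sanitizeSearchTerm('jönköping')); // string "jönköping"
var_dump(sanitizeSearchTerm('Москва'));    // string "Москва"
var_dump(sanitizeSearchTerm('123'));       // NULL
```

A caller could then skip a keyword that comes back as null rather than sending an empty query, which is what currently produces a random or error image.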
