feat!: Improve file attribute caching (#854)
BREAKING CHANGE: `S3FileSystem#getCache()` method was removed.
BREAKING CHANGE: `S3Path#setFileAttributes()` was removed.
BREAKING CHANGE: `S3FileSystemProvider#getCache()` was removed.
BREAKING CHANGE: `S3FileSystemProvider#setCache()` was removed.
steve-todorov committed Oct 29, 2024
1 parent 54b0d03 commit 08d14c2
Showing 14 changed files with 837 additions and 262 deletions.
3 changes: 3 additions & 0 deletions build.gradle.kts
@@ -95,6 +95,9 @@ dependencies {
         exclude("org.slf4j", "slf4j-api")
     }
     api("com.google.code.findbugs:jsr305:3.0.2")
+    api("com.github.ben-manes.caffeine:caffeine:2.9.3") {
+        because("Last version to support JDK 8.")
+    }

     testImplementation("ch.qos.logback:logback-classic:1.5.12")
     testImplementation("org.junit.jupiter:junit-jupiter:5.11.3")
8 changes: 5 additions & 3 deletions docs/content/contributing/developer-guide/index.md
@@ -10,7 +10,7 @@ Before you start writing code, please read:
 ## System requirements

 1. Gradle 8.1, or higher
-2. `JDK8`, `JDK11` or `JDK17`
+2. `JDK8`, `JDK11`, `JDK17` or `JDK21`

 ## Finding issues to work on

@@ -85,7 +85,7 @@ s3fs.proxy.url=https://my.local.domain/path/to/repository
 ### Build
 Builds the entire code and runs unit and integration tests.
-It is assumed you already have the `amazon-test.properties` configuration in place.
+It is assumed you already have the `amazon-test.properties` configuration in place under the `src/test/resources` or `src/testIntegration/resources`.
 ```
 ./gradlew build
@@ -100,9 +100,11 @@ It is assumed you already have the `amazon-test.properties` configuration in pla
 ### Run only integration tests
 ```
-./gradlew it-s3
+./gradlew testIntegration
 ```
+You can also use `./gradlew build -x testIntegration` to skip the integration tests.
 ### Run all tests
 ```
52 changes: 27 additions & 25 deletions docs/content/reference/configuration-options.md
@@ -4,28 +4,30 @@

 A complete list of environment variables which can be set to configure the client.

-| Key | Default | Description |
-|-------------------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------|
-| s3fs.access.key | none | <small>AWS access key, used to identify the user interacting with AWS</small> |
-| s3fs.secret.key | none | <small>AWS secret access key, used to authenticate the user interacting with AWS</small> |
-| s3fs.request.metric.collector.class | TODO | <small>Fully-qualified class name to instantiate an AWS SDK request/response metric collector</small> |
-| s3fs.connection.timeout | TODO | <small>Timeout (in milliseconds) for establishing a connection to a remote service</small> |
-| s3fs.max.connections | TODO | <small>Maximum number of connections allowed in a connection pool</small> |
-| s3fs.max.retry.error | TODO | <small>Maximum number of times that a single request should be retried, assuming it fails for a retryable error</small> |
-| s3fs.protocol | TODO | <small>Protocol (HTTP or HTTPS) to use when connecting to AWS</small> |
-| s3fs.proxy.domain | none | <small>For NTLM proxies: The Windows domain name to use when authenticating with the proxy</small> |
-| s3fs.proxy.protocol | none | <small>Proxy connection protocol.</small> |
-| s3fs.proxy.host | none | <small>Proxy host name either from the configured endpoint or from the "http.proxyHost" system property</small> |
-| s3fs.proxy.password | none | <small>The password to use when connecting through a proxy</small> |
-| s3fs.proxy.port | none | <small>Proxy port either from the configured endpoint or from the "http.proxyPort" system property</small> |
-| s3fs.proxy.username | none | <small>The username to use when connecting through a proxy</small> |
-| s3fs.proxy.workstation | none | <small>For NTLM proxies: The Windows workstation name to use when authenticating with the proxy</small> |
-| s3fs.region | none | <small>The AWS Region to configure the client</small> |
-| s3fs.socket.send.buffer.size.hint | TODO | <small>The size hint (in bytes) for the low level TCP send buffer</small> |
-| s3fs.socket.receive.buffer.size.hint | TODO | <small>The size hint (in bytes) for the low level TCP receive buffer</small> |
-| s3fs.socket.timeout | TODO | <small>Timeout (in milliseconds) for each read to the underlying socket</small> |
-| s3fs.user.agent.prefix | TODO | <small>Prefix of the user agent that is sent with each request to AWS</small> |
-| s3fs.amazon.s3.factory.class | TODO | <small>Fully-qualified class name to instantiate a S3 factory base class which creates a S3 client instance</small> |
-| s3fs.signer.override | TODO | <small>Fully-qualified class name to define the signer that should be used when authenticating with AWS</small> |
-| s3fs.path.style.access | TODO | <small>Boolean that indicates whether the client uses path-style access for all requests</small> |
-| s3fs.request.header.cache-control | blank | <small>Configures the `cacheControl` on request builders (i.e. `CopyObjectRequest`, `PutObjectRequest`, etc) |
+| Key | Default | Description |
+|-------------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------|
+| s3fs.access.key | none | <small>AWS access key, used to identify the user interacting with AWS</small> |
+| s3fs.secret.key | none | <small>AWS secret access key, used to authenticate the user interacting with AWS</small> |
+| s3fs.request.metric.collector.class | TODO | <small>Fully-qualified class name to instantiate an AWS SDK request/response metric collector</small> |
+| s3fs.cache.attributes.ttl | `60000` | <small>TTL for the cached file attributes (in millis)</small> |
+| s3fs.cache.attributes.size | `5000` | <small>Total size of cached file attributes</small> |
+| s3fs.connection.timeout | TODO | <small>Timeout (in milliseconds) for establishing a connection to a remote service</small> |
+| s3fs.max.connections | TODO | <small>Maximum number of connections allowed in a connection pool</small> |
+| s3fs.max.retry.error | TODO | <small>Maximum number of times that a single request should be retried, assuming it fails for a retryable error</small> |
+| s3fs.protocol | TODO | <small>Protocol (HTTP or HTTPS) to use when connecting to AWS</small> |
+| s3fs.proxy.domain | none | <small>For NTLM proxies: The Windows domain name to use when authenticating with the proxy</small> |
+| s3fs.proxy.protocol | none | <small>Proxy connection protocol.</small> |
+| s3fs.proxy.host | none | <small>Proxy host name either from the configured endpoint or from the "http.proxyHost" system property</small> |
+| s3fs.proxy.password | none | <small>The password to use when connecting through a proxy</small> |
+| s3fs.proxy.port | none | <small>Proxy port either from the configured endpoint or from the "http.proxyPort" system property</small> |
+| s3fs.proxy.username | none | <small>The username to use when connecting through a proxy</small> |
+| s3fs.proxy.workstation | none | <small>For NTLM proxies: The Windows workstation name to use when authenticating with the proxy</small> |
+| s3fs.region | none | <small>The AWS Region to configure the client</small> |
+| s3fs.socket.send.buffer.size.hint | TODO | <small>The size hint (in bytes) for the low level TCP send buffer</small> |
+| s3fs.socket.receive.buffer.size.hint | TODO | <small>The size hint (in bytes) for the low level TCP receive buffer</small> |
+| s3fs.socket.timeout | TODO | <small>Timeout (in milliseconds) for each read to the underlying socket</small> |
+| s3fs.user.agent.prefix | TODO | <small>Prefix of the user agent that is sent with each request to AWS</small> |
+| s3fs.amazon.s3.factory.class | TODO | <small>Fully-qualified class name to instantiate a S3 factory base class which creates a S3 client instance</small> |
+| s3fs.signer.override | TODO | <small>Fully-qualified class name to define the signer that should be used when authenticating with AWS</small> |
+| s3fs.path.style.access | TODO | <small>Boolean that indicates whether the client uses path-style access for all requests</small> |
+| s3fs.request.header.cache-control | blank | <small>Configures the `cacheControl` on request builders (i.e. `CopyObjectRequest`, `PutObjectRequest`, etc) |
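
The two new keys, `s3fs.cache.attributes.ttl` and `s3fs.cache.attributes.size`, can presumably be overridden like any other option in the table. A minimal sketch, assuming the options are supplied as string entries of the environment map handed to `FileSystems.newFileSystem` (that call is only mentioned in a comment and not executed here); the class name `CacheConfigExample` and the helper `cacheEnv` are hypothetical and not part of the library:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheConfigExample {

    // Property keys copied from the configuration table above.
    static final String CACHE_ATTRIBUTES_TTL = "s3fs.cache.attributes.ttl";
    static final String CACHE_ATTRIBUTES_SIZE = "s3fs.cache.attributes.size";

    /** Builds an environment map that overrides both cache defaults (hypothetical helper). */
    static Map<String, String> cacheEnv(int ttlMillis, int maxEntries) {
        Map<String, String> env = new HashMap<>();
        env.put(CACHE_ATTRIBUTES_TTL, Integer.toString(ttlMillis));
        env.put(CACHE_ATTRIBUTES_SIZE, Integer.toString(maxEntries));
        return env;
    }

    public static void main(String[] args) {
        // Cache attributes for 30 seconds and keep at most 1000 entries.
        Map<String, String> env = cacheEnv(30_000, 1_000);
        System.out.println(env);
        // Assumption: the map would then be passed to the provider, for example
        // FileSystems.newFileSystem(URI.create("s3:///"), env);
    }
}
```

Both values are kept as strings because the constructor in this commit reads them with `Integer.parseInt`, so they need to fit in an `int`.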
12 changes: 12 additions & 0 deletions src/main/java/org/carlspring/cloud/storage/s3fs/S3Factory.java
@@ -5,6 +5,8 @@
 import java.time.Duration;
 import java.util.Properties;

+import org.carlspring.cloud.storage.s3fs.attribute.S3BasicFileAttributes;
+import org.carlspring.cloud.storage.s3fs.attribute.S3PosixFileAttributes;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
@@ -41,6 +43,16 @@ public abstract class S3Factory

     public static final String SECRET_KEY = "s3fs.secret.key";

+    /**
+     * Maximum TTL in millis to cache {@link S3BasicFileAttributes} and {@link S3PosixFileAttributes}.
+     */
+    public static final String CACHE_ATTRIBUTES_TTL = "s3fs.cache.attributes.ttl";
+
+    /**
+     * Total size of {@link S3BasicFileAttributes} and {@link S3PosixFileAttributes} cache.
+     */
+    public static final String CACHE_ATTRIBUTES_SIZE = "s3fs.cache.attributes.size";
+
     public static final String REQUEST_METRIC_COLLECTOR_CLASS = "s3fs.request.metric.collector.class";

     public static final String CONNECTION_TIMEOUT = "s3fs.connection.timeout";
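
The `S3FileAttributesCache` class itself is not part of this excerpt, so the following only illustrates what a cache bounded by both a TTL and a maximum entry count does with the two settings above. It deliberately uses plain JDK collections instead of Caffeine, and every name in it (`TtlBoundedCache`, `Timed`, the injected clock) is hypothetical and not the library's API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical stand-in: a cache bounded by entry count and by time-to-live. */
public class TtlBoundedCache<K, V> {

    /** A cached value together with the instant at which it stops being valid. */
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock;
    private final Map<K, Timed<V>> entries;

    public TtlBoundedCache(long ttlMillis, final int maxEntries, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // Insertion-ordered map that drops its oldest entry once maxEntries is exceeded.
        this.entries = new LinkedHashMap<K, Timed<V>>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, or null if it is absent or older than the TTL. */
    public synchronized V get(K key) {
        Timed<V> hit = entries.get(key);
        if (hit == null) {
            return null;
        }
        if (clock.getAsLong() >= hit.expiresAtMillis) {
            entries.remove(key); // expired: behave as if it was never cached
            return null;
        }
        return hit.value;
    }

    public synchronized void put(K key, V value) {
        entries.remove(key); // re-insert so the key becomes the newest entry
        entries.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
    }

    public synchronized void invalidateAll() {
        entries.clear();
    }

    public synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        final long[] now = {0L}; // fake clock so the example is deterministic
        TtlBoundedCache<String, String> cache = new TtlBoundedCache<>(60_000L, 2, () -> now[0]);
        cache.put("a", "attrs-a");
        cache.put("b", "attrs-b");
        cache.put("c", "attrs-c");          // capacity is 2, so "a" is evicted
        System.out.println(cache.get("a")); // null
        System.out.println(cache.get("b")); // attrs-b
        now[0] = 60_000L;                   // advance the fake clock to the TTL
        System.out.println(cache.get("b")); // null
    }
}
```

Injecting the clock keeps the expiry behaviour testable without sleeping; a production cache would more likely rely on a library such as the Caffeine dependency this commit adds.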
31 changes: 19 additions & 12 deletions src/main/java/org/carlspring/cloud/storage/s3fs/S3FileSystem.java
@@ -1,22 +1,22 @@
 package org.carlspring.cloud.storage.s3fs;

+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableSet;
+import org.carlspring.cloud.storage.s3fs.cache.S3FileAttributesCache;
+import org.carlspring.cloud.storage.s3fs.util.S3Utils;
+import software.amazon.awssdk.services.s3.S3Client;
+import software.amazon.awssdk.services.s3.model.Bucket;
+
 import java.io.IOException;
 import java.nio.file.FileStore;
 import java.nio.file.FileSystem;
 import java.nio.file.Path;
 import java.nio.file.PathMatcher;
 import java.nio.file.WatchService;
 import java.nio.file.attribute.UserPrincipalLookupService;
 import java.util.List;
 import java.util.Properties;
 import java.util.Set;
-
-import com.google.common.base.Joiner;
-import com.google.common.collect.ImmutableList;
-import com.google.common.collect.ImmutableSet;
-import org.carlspring.cloud.storage.s3fs.util.S3Utils;
-import software.amazon.awssdk.services.s3.S3Client;
-import software.amazon.awssdk.services.s3.model.Bucket;
 import static org.carlspring.cloud.storage.s3fs.S3Path.PATH_SEPARATOR;

 /**
@@ -37,7 +37,7 @@ public class S3FileSystem

     private final String endpoint;

-    private final int cache;
+    private S3FileAttributesCache fileAttributesCache;

     private final Properties properties;

@@ -51,8 +51,12 @@ public S3FileSystem(final S3FileSystemProvider provider,
         this.key = key;
         this.client = client;
         this.endpoint = endpoint;
-        this.cache = 60000; // 1 minute cache for the s3Path
         this.properties = properties;
+
+        int cacheTTL = Integer.parseInt(String.valueOf(properties.getOrDefault(S3Factory.CACHE_ATTRIBUTES_TTL, "60000")));
+        int cacheSize = Integer.parseInt(String.valueOf(properties.getOrDefault(S3Factory.CACHE_ATTRIBUTES_SIZE, "5000")));
+
+        this.fileAttributesCache = new S3FileAttributesCache(cacheTTL, cacheSize);
     }

     public S3FileSystem(final S3FileSystemProvider provider,
@@ -78,6 +82,7 @@ public String getKey()
     public void close()
             throws IOException
     {
+        this.fileAttributesCache.invalidateAll();
         this.provider.close(this);
     }

@@ -184,12 +189,14 @@ public String[] key2Parts(String keyParts)
         return S3Utils.key2Parts(keyParts);
     }

-    public int getCache()
+    /**
+     * @return The {@link S3FileAttributesCache} instance holding the path attributes cache for this file provider.
+     */
+    public S3FileAttributesCache getFileAttributesCache()
     {
-        return cache;
+        return fileAttributesCache;
     }

-
     /**
      * @return The value of the {@link S3Factory#REQUEST_HEADER_CACHE_CONTROL} property. Default is empty.
      */
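
The constructor change above reads both cache settings with the pattern `Integer.parseInt(String.valueOf(properties.getOrDefault(key, default)))`. The sketch below isolates that pattern to show how it behaves for string values, non-string values and missing keys; the class `CachePropertyParsing` and the helper `intProperty` are hypothetical names introduced only for this illustration:

```java
import java.util.Properties;

public class CachePropertyParsing {

    /** Same expression as in the constructor, wrapped in a hypothetical helper. */
    static int intProperty(Properties properties, String key, String defaultValue) {
        // String.valueOf lets the lookup accept values stored as String or as Integer.
        return Integer.parseInt(String.valueOf(properties.getOrDefault(key, defaultValue)));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("s3fs.cache.attributes.ttl", "30000"); // stored as a String
        props.put("s3fs.cache.attributes.size", 1000);           // stored as an Integer

        System.out.println(intProperty(props, "s3fs.cache.attributes.ttl", "60000"));  // 30000
        System.out.println(intProperty(props, "s3fs.cache.attributes.size", "5000"));  // 1000

        // A missing key falls back to the default.
        System.out.println(intProperty(new Properties(), "s3fs.cache.attributes.ttl", "60000")); // 60000
    }
}
```

A value that is not a valid integer (for example `"1m"`) makes `Integer.parseInt` throw a `NumberFormatException`, so with this pattern a malformed setting would fail when the file system is constructed rather than being silently ignored.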