Skip to content

BulkInsert configuration

Jeronimo López edited this page Jan 7, 2019 · 2 revisions

Load Data and Copy methods are based on serializing a batch of rows to a CSVlike StringBuilder, and when serialized information reach a limit of characters, flush it to the database. Depending on the available memory and the size of each row you can tune this limit.

In the JDBC batch insert method you can configure the numbers of rows of each batch operation.

You can also configure how transactions are managed persisting your Stream or Collection:

  • Let JFleet commit to database each time a batch of rows is flushed.
  • Join to the existing transaction in the provided connection, and deciding on your own code when to commit or rollback it.

If you override the default values (10MB and autocommit), you must use a different BulkInsert constructor.

For LoadDataBulkInsert version, with 5MB batch size and no autocommit:

import org.jfleet.mysql.LoadDataConfiguration;
import static org.jfleet.mysql.LoadDataConfiguration.LoadDataConfigurationBuilder;

LoadDataConfiguration config = from(Employee.class)
        .batchSize(5 * 1024 * 1024)
        .autocommit(false)
        .build();
BulkInsert<Employee> bulkInsert = new LoadDataBulkInsert<>(config);
bulkIsert.insertAll(connection, stream);

For PgCopyBulkInsert version, with 30MB batch size and autocommit after each batch:

import org.jfleet.postgres.PgCopyConfiguration;
import static org.jfleet.postgres.PgCopyConfiguration.PgCopyConfigurationBuilder

PgCopyConfiguration config = from(Employee.class)
        .batchSize(30 * 1024 * 1024)
        .autocommit(true)
        .build();
BulkInsert<Employee> bulkInsert = new PgCopyBulkInsert<>(config);
bulkInsert.insertAll(connection, stream);

For JdbcBulkInsert version, with 1000 rows batch size and autocommit after each batch:

import org.jfleet.jdbc.JdbcConfiguration
import static org.jfleet.jdbc.JdbcConfiguration.JdbcConfigurationBuilder;

JdbcConfiguration config = from(Employee.class)
        .batchSize(1000)
        .autocommit(true)
        .build();
BulkInsert<Employee> bulkInsert = new JdbcBulkInsert<>(config);
bulkInsert.insertAll(connection, stream);

MySQL LOAD DATA error handling

In MySQL LOAD DATA command, data-interpretation, duplicate-key errors or foreign key errors become warnings and the operation continues until finish the whole data. Rows with errors are discarded and no SQLException is thrown by database or JDBC driver.

If your business logic is sensitive to these errors you can configure JFleet to detect when some row is missing and throw an exception:

import org.jfleet.mysql.LoadDataConfiguration;
import static org.jfleet.mysql.LoadDataConfiguration.LoadDataConfigurationBuilder;

LoadDataConfiguration config = from(Employee.class)
        .errorOnMissingRow(true);
        .build();

BulkInsert<Employee> bulkInsert = new LoadDataBulkInsert<>(config);
try {
    bulkInsert.insertAll(connection, employeesWithForeignKeyError);
} catch (JFleetException e) {
    logger.info("Expected error on missed FK");
}
Clone this wiki locally