Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

useCopy doesn't support JSONB or arrays #446

Open
AntonOfTheWoods opened this issue Nov 27, 2024 · 4 comments
Open

useCopy doesn't support JSONB or arrays #446

AntonOfTheWoods opened this issue Nov 27, 2024 · 4 comments

Comments

@AntonOfTheWoods
Copy link

I am unable to build the project after trying for a couple of hours (wasm docker won't build with the scripts, at least on my machine) and it will require a few tests, but the following does what I need it to - it allows for arrays of simple types and JSONB. I spent far, far too long trying to get JSONB[] to import but in the end abandoned it. JSONB now probably has all the methods you might want that previously only had array versions, so maybe left as a challenge to the reader...

I tried to create a nix flake for building the project but wasn't able to get wasm building using the scripts.

function toCopyable(value: any, surround = '"') {
  // Escape double quotes and wrap in quotes if necessary
  if (typeof value === "string" && (value.includes(",") || value.includes('"') || value.includes("\n"))) {
    return `${surround}${value.replace(/"/g, '""')}${surround}`;
  }
  return value === null ? "\\N" : value;
}

function baseValue(value: any) {
  return typeof value === "object" && value !== null && !Array.isArray(value) ? JSON.stringify(value) : value;
}

async function applyMessagesToTableWithCopy({
  pg,
  table,
  schema = "public",
  messages,
  mapColumns,
  debug,
}: ApplyMessagesToTableWithCopyOptions) {
  if (debug) console.log("applying messages with COPY");

  // Map the messages to the data to be inserted
  const data: Record<string, any>[] = messages.map((message) =>
    mapColumns ? doMapColumns(mapColumns, message) : message.value,
  );

  // Get column names from the first message
  const columns = Object.keys(data[0]);

  // Create CSV data
  const csvData = data
    .map((message) => {
      return columns
        .map((column) => {
          const value = baseValue(message[column]);
          if (Array.isArray(value)) {
            const vals: any[] = [];
            let hasObjects = false;
            for (const val of value) {
              vals.push(toCopyable(baseValue(val), ""));
              if (typeof val === "object") {
                hasObjects = true;
              }
            }
            return `"${hasObjects ? "[" : "{"}${vals.join(",")}${hasObjects ? "]" : "}"}"`;
          }
          return toCopyable(value);
        })
        .join(",");
    })
    .join("\n");
  const csvBlob = new Blob([csvData], { type: "text/csv" });
  const sql = `
      COPY "${schema}"."${table}" (${columns.map((c) => `"${c}"`).join(", ")})
      FROM '/dev/blob'
      WITH (FORMAT csv, NULL '\\N')
    `;
  // Perform COPY FROM
  await pg.query(sql, [], {
    blob: csvBlob,
  });

  if (debug) console.log(`Inserted ${messages.length} rows using COPY`);
}
@copiltembel
Copy link
Contributor

I am unable to build the project after trying for a couple of hours (wasm docker won't build with the scripts, at least on my machine)

Just to make sure, what are the steps that you followed to build the project?

This works:

$ git clone [email protected]:electric-sql/pglite.git pglite
$ cd pglite
$ pnpm install
$ pnpm build:all

@AntonOfTheWoods
Copy link
Author

I am unable to build the project after trying for a couple of hours (wasm docker won't build with the scripts, at least on my machine)

Just to make sure, what are the steps that you followed to build the project?

Ok, sorry I should have updated this... I have a difficult network situation and I recently migrated to nixos and am still just learning... It was an issue on my side getting to the international internet from inside the container. I can build it fine now!

The copy issue is still there however. I am almost at the end of something else for pglite and will follow that up by doing some copy to/from a postgres using psql to try and reverse engineer the escaping algorithm. I'll then try and submit a PR.

Unless this is easily exposable from somewhere in the wasm?

@samwillis
Copy link
Collaborator

Hey @AntonOfTheWoods

The useCopy option is quite new, and mostly implemented to speed up the demo I'm working on, I'm not surprised it doesn't work with all types yet. This first version uses CSV, but I'm considering implementing it using the Postgres binary copy format. We can then annotate all types quite easily with the appropriate Postgres types.

Having said that your code above has prompted my thinking, we need to consider what the best implementation is 🤔

@samwillis
Copy link
Collaborator

Unless this is easily exposable from somewhere in the wasm

I used the Postgres 'format' function previously for this. Take a look at the live query extension, it formats the provided query using it in order to create the view.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants