dbGetQuery() Returns Incorrect NULL Values When Using R ODBC Package on Linux #864

Open
ddl-chris-mutono opened this issue Dec 2, 2024 · 2 comments
Labels
bug an unexpected problem or unintended behavior db2 IBM DB2

ddl-chris-mutono commented Dec 2, 2024

Overview:
When connecting to a DB2 database using the R ODBC package on Linux, null values in a specific column are returned as unreadable characters instead of being correctly identified as NULL.

Problem Description:

  • A specific column in the database contains null values.
  • When querying the table using the R odbc package on Linux, null values are returned as unreadable characters (e.g., @ONjU) instead of NA.

The same query works correctly with:

  • The Python pyodbc package on Linux.
  • Both R odbc and Python pyodbc on Windows.

Reproducible Example:

library(tidyverse)
library(odbc)

Sys.setenv(CODEPAGE = "819") 
Sys.setenv(DB2CODEPAGE = "819")

connection <- odbc::dbConnect(odbc::odbc(),
 .connection_string = paste0('DRIVER=', "{IBM DB2 ODBC DRIVER}",
    ';UID=', Sys.getenv("USER"),
    ';PWD=', Sys.getenv("PASSWORD"),
    ';HOSTNAME=', "placeholder_hostname",
    ';DATABASE=', "placeholder_db",
    ';PORT=', "45000",
    ';COMMITONEOF=0;',
    ';NULLS=YES;'),
 encoding = "latin1")

df_test <- odbc::dbGetQuery(connection, "SELECT column_name FROM table_name WHERE column_name IS NULL FETCH FIRST 10 ROWS ONLY")

df_test %>% head()

Expected Behavior:

Null values in the column should be represented as NA in the resulting R data frame.

Observed Behavior:

Null values are returned as unreadable characters, as shown below:

  column_name
1       @ONjU
2       @ONjU
3       @ONjU
...

Version info:

  • R version: 4.3.2
  • R ODBC version: 1.5.0
  • ODBC driver: IBM Data Server Driver for ODBC and CLI (64-bit) 11.5.9

simonpcouch added the db2 and bug labels Jan 4, 2025

@simonpcouch
Collaborator

This smells like an encoding issue to me. A few assorted ideas:

  1. Do you still see this issue with encoding = "UTF-8" or encoding = "UTF-16"?
  2. This SO post seems to suggest setting CCSID = 1252 may have resolved their issue.
  3. odbc 1.5.0 introduced a couple of encoding-related changes, and the development version of the package has had more since then. Could you run this same query with odbc 1.4.2 (install via pak::pak("r-dbi/odbc@v1.4.2")) and also with the dev version (pak::pak("r-dbi/odbc")), restarting R after installing in both cases?

@detule
Collaborator

detule commented Jan 10, 2025

Dovetailing onto @simonpcouch's note:

I am unable to replicate this on my Debian box, though possibly with a slightly older driver.

   > conn <- dbConnect(
                     odbc::odbc(),
                     dsn = "***", UID="***", PWD='**')
   > dbGetQuery(conn, "
              SELECT CAST(NULL AS FLOAT) AS NULL_FLOAT,
              CAST(NULL AS DATE) AS NULL_DATE,
              CAST(NULL AS INT) AS NULL_INT,
              CAST(NULL AS VARCHAR(10)) AS NULL_VARCHAR
              FROM SYSIBM.SYSTABLES FETCH FIRST 2 ROWS ONLY")
     NULL_FLOAT NULL_DATE NULL_INT NULL_VARCHAR
   1         NA      <NA>       NA         <NA>
   2         NA      <NA>       NA         <NA>
   > conn@info
   $dbname
   [1] "SAMPLE"
 
   $dbms.name
   [1] "DB2/LINUXX8664"
 
   $db.version
   [1] "11.05.0800"
...
   $servername
   [1] "DB2"
 
   $drivername
   [1] "libdb2.a"
 
   $odbc.version
   [1] "03.52"
 
   $driver.version
   [1] "11.05.0800"
 
   $odbcdriver.version
   [1] "03.51"

In addition to Simon's suggestions, can you:

  • Try a similarly minimal approach (as few connection parameters as possible and a simple query) and provide the output?
  • Provide the contents of your odbcinst.ini under [IBM DB2 ODBC DRIVER]?
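
A minimal sketch of what such a stripped-down check might look like, assuming the driver name, host, database, and port placeholders from the original report. The helper names `build_conn_string` and `run_minimal_check` are hypothetical and not part of odbc. Building the string with `collapse = ";"` also avoids doubled separators such as `;;`, which hand-written `paste0()` concatenation can introduce; whether that matters to the driver is not established here.

```r
# Build a connection string from named parts, joined with single ";" separators.
build_conn_string <- function(parts) {
  paste0(names(parts), "=", unlist(parts), collapse = ";")
}

# Only the essentials: no COMMITONEOF, NULLS, or encoding overrides.
# Host, database, and port are the placeholders from the original report.
conn_string <- build_conn_string(list(
  DRIVER   = "{IBM DB2 ODBC DRIVER}",
  HOSTNAME = "placeholder_hostname",
  DATABASE = "placeholder_db",
  PORT     = "45000",
  UID      = Sys.getenv("USER"),
  PWD      = Sys.getenv("PASSWORD")
))

# A literal NULL, independent of any user table.
minimal_query <- paste(
  "SELECT CAST(NULL AS VARCHAR(10)) AS NULL_VARCHAR",
  "FROM SYSIBM.SYSTABLES FETCH FIRST 2 ROWS ONLY"
)

# Connect, run the query, and report which cells R sees as NA.
run_minimal_check <- function(conn_string, query) {
  conn <- DBI::dbConnect(odbc::odbc(), .connection_string = conn_string)
  on.exit(DBI::dbDisconnect(conn))
  res <- DBI::dbGetQuery(conn, query)
  print(res)
  print(is.na(res))  # TRUE everywhere if NULLs arrive as NA
  invisible(res)
}

# Requires a reachable DB2 server, so it is not run here:
# run_minimal_check(conn_string, minimal_query)
```

If `is.na()` reports `FALSE` for the `NULL_VARCHAR` column with this setup, the problem is likely independent of the extra connection parameters and of the specific table.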
