
Cannot use ResultSet.getObject to retrieve java.sql.Date (JDBC specification violation?) #1409

Open
LDVSOFT opened this issue Aug 1, 2023 · 11 comments · May be fixed by #1886

Comments


LDVSOFT commented Aug 1, 2023

Describe the bug

At some point the ClickHouse JDBC driver switched to java.time types to represent temporal values. While I welcome this change and would also prefer to abandon java.sql.Date, the way it was done unfortunately breaks end consumers that expect the legacy types to be returned by default. We originally detected this while reading data from ClickHouse into Spark (via the latter's JDBC support), but I was able to construct a more direct counterexample.

Steps to reproduce

  1. Set up a JDBC connection using driver version 0.4.6,
  2. Perform a SELECT d FROM some_table where d has type Date or similar (nullable or array),
  3. Access it via resultSet.getObject("d") or resultSet.getObject("d", java.sql.Date.class).
    • The first returns java.time.LocalDate, which could be correct under some type maps but not under the default one,
    • The second fails with a ClassCastException, which in my eyes directly violates the JDBC specification.
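On the consumer side, the mismatch between the two access paths can be absorbed with a small coercion helper. This sketches the hotfix idea; the class and method names here are my own illustration, not part of any driver:

```java
import java.sql.Date;
import java.time.LocalDate;

public class DateCoercion {
    // Accept whatever getObject() returned and coerce it to the
    // spec-mandated java.sql.Date. Hypothetical helper, not driver code.
    static Date toSqlDate(Object value) {
        if (value == null) return null;
        if (value instanceof Date) return (Date) value;                          // spec-compliant driver
        if (value instanceof LocalDate) return Date.valueOf((LocalDate) value);  // 0.4.x behaviour
        throw new IllegalArgumentException("Unexpected DATE type: " + value.getClass());
    }

    public static void main(String[] args) {
        LocalDate ld = LocalDate.of(2022, 1, 1);
        System.out.println(toSqlDate(ld));               // 2022-01-01
        System.out.println(toSqlDate(Date.valueOf(ld))); // 2022-01-01
    }
}
```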

To be exact, I am referring to the JDBC 4.3 specification, Appendix B (quoted with my formatting):

B.3 JDBC Types Mapped to Java Object Types

ResultSet.getObject and CallableStatement.getObject use the mapping shown in TABLE B-3 for standard mappings.

| JDBC Type | Java Object Type |
| --- | --- |
| *skipped* | *skipped* |
| DATE | java.sql.Date |
| *skipped* | *skipped* |

All ResultSet.getObject methods that don't take a type map state in their documentation that they follow the mapping above.

Related:

  • ResultSet.getDate works as expected.
  • Arrays of dates (accessed via resultSet.getArray("ds").getArray()) are broken the same way (java.time.LocalDate[] instead of java.sql.Date[]).
  • I haven't tested this yet, but I'd expect the same problem for JDBC TIMESTAMP (java.time.Instant vs java.sql.Timestamp).
  • I haven't tested other JDBC methods, like setting PreparedStatement parameters, but I would check their behaviour too.
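For reference, the TIMESTAMP analog can be checked the same way; the JDK itself provides bridges between the legacy and java.time types, so a compliant driver loses nothing by returning java.sql.Timestamp by default. A standalone sketch, independent of any driver:

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;

public class TimestampBridges {
    public static void main(String[] args) {
        // Instant <-> Timestamp: lossless and zone-independent.
        Instant instant = Instant.parse("2022-01-01T12:00:00Z");
        Timestamp fromInstant = Timestamp.from(instant);
        System.out.println(fromInstant.toInstant().equals(instant)); // true

        // LocalDateTime <-> Timestamp: interpreted in the JVM default zone,
        // matching the semantics of Timestamp.valueOf(String).
        LocalDateTime ldt = LocalDateTime.of(2022, 1, 1, 12, 0);
        Timestamp fromLdt = Timestamp.valueOf(ldt);
        System.out.println(fromLdt.toLocalDateTime().equals(ldt));   // true
    }
}
```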

I've checked the behaviour of other JDBC drivers against the respective databases we use, including PostgreSQL, and they work as expected. The older ClickHouse JDBC driver (around 0.3.1, I believe) worked for us.

Expected behaviour

I guess the right thing to do would be one or both of:

  • Add java.time support to the getObject(<label or number>, java.lang.Class<T>) and getObject(<label or number>, <map>) overloads while preserving the old defaults,
  • Add an explicit opt-in for java.time types to be returned by default (I guess it could be achieved by having this option change the connection-default type map).

Either way, don't deviate from the specified behaviour without an explicit opt-in.
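A driver-side dispatch along these lines could look as follows. This is a hypothetical sketch (the class and convertDate are illustrative names, not the actual driver API), showing how the spec default survives while an explicit java.time request is still honoured:

```java
import java.sql.Date;
import java.time.LocalDate;

public class GetObjectDispatch {
    // Hypothetical driver-side conversion: plain getObject maps to
    // Object.class and keeps the spec default (java.sql.Date), while
    // getObject(col, LocalDate.class) still yields the java.time type.
    @SuppressWarnings("unchecked")
    static <T> T convertDate(LocalDate internal, Class<T> type) {
        if (type == LocalDate.class) return (T) internal;
        if (type == Date.class || type == Object.class) return (T) Date.valueOf(internal);
        throw new IllegalArgumentException("Conversion to " + type + " not supported");
    }

    public static void main(String[] args) {
        LocalDate stored = LocalDate.of(2022, 1, 1);
        System.out.println(convertDate(stored, Object.class).getClass());    // class java.sql.Date
        System.out.println(convertDate(stored, LocalDate.class).getClass()); // class java.time.LocalDate
    }
}
```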

Code example

When we detected the issue and applied a hotfix, I worked on a reproducer to understand whether the problem was consistent across other API usages; here it is. It uses JUnit 5, Testcontainers and Kotlin.

package com.jetbrains.jetstat.oneshot

import org.junit.jupiter.api.assertAll
import org.testcontainers.containers.ClickHouseContainer
import org.testcontainers.junit.jupiter.Container
import org.testcontainers.junit.jupiter.Testcontainers
import java.sql.Connection
import java.sql.DriverManager
import java.time.LocalDate as JavaTimeDate
import java.sql.Date as JavaSqlDate
import kotlin.test.*

@Testcontainers
class ClickhouseJdbcTest {
    @Container
    private val container = ClickHouseContainer("clickhouse/clickhouse-server:22.2")

    private inline fun withConnection(block: (Connection) -> Unit) =
        DriverManager
            .getConnection(container.jdbcUrl, container.username, container.password)
            .use(block)

    @BeforeTest
    fun `setup database`() {
        withConnection { conn ->
            conn.createStatement().use { statement ->
                statement.execute("CREATE TABLE example (a_date Date, b_dates Array(Date)) ENGINE = Log")
                statement.executeUpdate("INSERT INTO example(a_date, b_dates) VALUES ('2022-01-01', ['2022-01-02', '2022-01-03'])")
            }
        }
    }

    @Test
    fun `check that ClickHouse driver reports DATEs consistently`() {
        val expectedDateA = JavaTimeDate.of(2022, 1, 1)
        val expectedDatesB = listOf(
            JavaTimeDate.of(2022, 1, 2),
            JavaTimeDate.of(2022, 1, 3)
        )

        withConnection { conn ->
            conn.createStatement().use { statement ->
                val resultSet = statement.executeQuery("SELECT * FROM example")
                resultSet.next() // First
                val aDate = resultSet.getObject("a_date")
                val aDateDirect = resultSet.getDate("a_date")
                val bDatesArr = resultSet.getArray("b_dates").array
                val bDatesObj = resultSet.getObject("b_dates")

                println("Type of bDatesArr: ${bDatesArr.javaClass}")
                println("Type of bDatesObj: ${bDatesObj.javaClass}")

                assertEquals(JavaSqlDate.valueOf(expectedDateA), aDateDirect)

                when (aDate) {
                    is JavaSqlDate -> {
                        println("single date returned as JavaSqlDate")

                        assertEquals(JavaSqlDate.valueOf(expectedDateA), aDate)

                        val expected = expectedDatesB.map(JavaSqlDate::valueOf)
                        assertAll({
                            assertTrue(bDatesArr is Array<*> && bDatesArr.isArrayOf<JavaSqlDate>())
                            assertContentEquals(expected, bDatesArr.asList())                             
                        }, {
                            assertTrue(bDatesObj is Array<*> && bDatesObj.isArrayOf<JavaSqlDate>())
                            assertContentEquals(expected, bDatesObj.asList())
                        })
                    }

                    is JavaTimeDate -> {
                        println("single date returned as JavaTimeDate")
                        assertEquals(expectedDateA, aDate)
                        assertAll({
                            assertTrue(bDatesArr is Array<*> && bDatesArr.isArrayOf<JavaTimeDate>())
                            assertContentEquals(expectedDatesB, bDatesArr.asList())
                        }, {
                            assertTrue(bDatesObj is Array<*> && bDatesObj.isArrayOf<JavaTimeDate>())
                            assertContentEquals(expectedDatesB, bDatesObj.asList())
                        })
                    }

                    else -> {
                        error("Returned date as a weird type ${aDate.javaClass.name}")
                    }
                }

                val aDateExplicit = resultSet.getObject("a_date", JavaSqlDate::class.java)
                assertEquals(JavaSqlDate.valueOf(expectedDateA), aDateExplicit)
            }
        }
    }
}

Technically it only fails on the last lines, but I believe consistently returning java.time.LocalDate is not the behaviour I expect.

Configuration

Environment

  • Client version: com.clickhouse:clickhouse-jdbc:0.4.6 (http variant).
  • Language version: Java 17 (from Ubuntu LTS)
  • OS: Linux (Ubuntu 22.04 LTS, up-to-date), amd64

ClickHouse server

  • ClickHouse Server version: tested on Docker image version 22.2, but I believe all supported versions are affected
  • ClickHouse Server non-default settings, if any: irrelevant
@LDVSOFT LDVSOFT added the bug label Aug 1, 2023
Contributor

zhicwu commented Aug 1, 2023

Thanks @LDVSOFT, very well explained! Agreed, it's an issue to fix. We'll add an option for explicit opt-in and change the default mapping back to the legacy Date/Timestamp.

@mzitnik mzitnik added the module-jdbc JDBC driver label Aug 2, 2023
Author

LDVSOFT commented Aug 2, 2023

Thanks @zhicwu! That should work. I think that even with the new mapping the driver should still allow fetching java.sql types via .getObject(<id>, java.sql.Date.class).

I've noticed that my test case also checks that resultSet.getObject returns a plain Java array, which is not actually required; Postgres uses PgArray there.
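Whatever container type the driver returns for arrays, the element coercion can be done the same way as for scalars once getArray().getArray() has been called. A consumer-side sketch (toSqlDates is an illustrative helper, not a driver method):

```java
import java.sql.Date;
import java.time.LocalDate;
import java.util.Arrays;

public class ArrayCoercion {
    // Coerce whatever element type the driver returned into java.sql.Date[].
    // Hypothetical helper mirroring the scalar workaround, element-wise.
    static Date[] toSqlDates(Object[] values) {
        return Arrays.stream(values)
                .map(v -> v instanceof LocalDate ? Date.valueOf((LocalDate) v) : (Date) v)
                .toArray(Date[]::new);
    }

    public static void main(String[] args) {
        LocalDate[] in = { LocalDate.of(2022, 1, 2), LocalDate.of(2022, 1, 3) };
        System.out.println(Arrays.toString(toSqlDates(in))); // [2022-01-02, 2022-01-03]
    }
}
```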

Author

LDVSOFT commented Aug 2, 2023

Actually, the JDBC spec, in the same Table B-3, requires getObject to return java.sql.Array for ARRAY values. Hm.

@mshustov mshustov added this to the v0.5.1 milestone Aug 29, 2023
Contributor

jnd77 commented Jan 19, 2024

To add more information on a point raised by @LDVSOFT: there is an issue with PreparedStatement.setObject(int parameterIndex, Object x, int targetSqlType) when x is of type java.sql.Timestamp and targetSqlType is java.sql.Types.TIMESTAMP.

As expected, it updates the value in ClickHouseDateTimeValue, but since java.sql.Types.TIMESTAMP is not handled, it falls back to the default behaviour of the String representation, which is deprecated (and incorrect when the Java time zone is not UTC).
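Until java.sql.Types.TIMESTAMP is handled, one consumer-side workaround is to convert the Timestamp to a java.time value before binding, so the driver takes its java.time path rather than the deprecated String path. A sketch (toBindable is an illustrative name, not part of any API):

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

public class SetObjectWorkaround {
    // Convert before binding; toLocalDateTime() is the exact inverse of
    // Timestamp.valueOf(LocalDateTime) within the same JVM.
    static LocalDateTime toBindable(Timestamp ts) {
        return ts.toLocalDateTime();
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2022-01-01 12:00:00");
        // Round-trips exactly for this value.
        System.out.println(Timestamp.valueOf(toBindable(ts)).equals(ts)); // true
    }
}
```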


dolfinus commented Oct 2, 2024

Contributor

chernser commented Oct 2, 2024

Good day, @dolfinus! Thank you for reporting this!
I will take a look at it.
By the way, we do have a ClickHouse Spark connector (https://clickhouse.com/docs/en/integrations/apache-spark); maybe it would help.

Thanks!


dolfinus commented Oct 2, 2024

Btw, we do have ClickHouse Spark connector

Sorry, but the way this connector is implemented is not suitable for my use cases.

Contributor

chernser commented Oct 2, 2024

@dolfinus ok (if possible, would you please describe what would make our connector work for you?)


dolfinus commented Oct 3, 2024

would you please describe what would make our connector work for you

What I expect from any Spark connector is that it is implemented as a custom DataFrameReader/DataFrameWriter. In that case I can add all required .jars to the Spark session and then pass a table name or raw SQL query (in ClickHouse syntax) to the DataFrameReader to get the data I need. The same goes for writing: I just pass a dataframe to the DataFrameWriter with a set of options, and it's done.

The Spark connector for ClickHouse does not implement DataFrameReader/DataFrameWriter; instead it is implemented as a custom catalog. In this case the user cannot create a dataframe on top of a ClickHouse SQL query; they have to convert the query to Spark syntax (where some function names are not supported at all). Users also have to convert CREATE TABLE DDL from ClickHouse syntax to Spark syntax, resolving different type names, different ways of passing table properties, and so on. If I understand correctly, there are no plans to add DataFrameReader/DataFrameWriter support: ClickHouse/spark-clickhouse-connector#229.

Contributor

chernser commented Oct 3, 2024

Thanks a lot for the feedback, @dolfinus!
I've passed it on to the Spark connector developers.
I don't currently know their plans, but I will ask them for an update.

Contributor

Moved to the next release because of too many breaking changes.

@chernser chernser modified the milestones: 0.7.1, Priority Backlog Oct 29, 2024