mikey.bike . archive . other writings . lists . about
Postgresql’s AT TIME ZONE
is one of those functions that seems
intuitive at first but can bite you very easily. If you’re going to
use it, you should carefully read the documentation and verify what
the input types are, because it’s a function that’s both overloaded
and polymorphic in its return type:
select '2019-08-10T8:51:00'::timestamp AT TIME ZONE 'PDT';
-- #=> 2019-08-10 11:51:00-04
select '2019-08-10T8:51:00'::timestamptz AT TIME ZONE 'PDT';
-- #=> 2019-08-10 05:51:00
In the first example we interpret the time as if it were in
PDT
. Human interpretation: It’s 8:51 in California. Postgres then
prints out the time for my system time, which is EDT
.
In the second, it’s the inverse: we interpret the time at EDT
, and
then print out the time it would be in PDT
. Human interpretation:
It’s 8:51 in Boston, what time is it in California? The reason 8:51
is interpreted as an EDT
timestamp is because Postgres coerces to
timestamptz
using the system time zone, and on my system that’s
EDT
. So for me the following two queries are equivalent:
The double AT TIME ZONE
looks weird, but it is useful for dealing
with timestamp without time zone
columns, because really, AT TIME
ZONE
does two jobs. One (when given a timestamp without a time zone)
is to assert that the time is in such-and-such time zone. The other
job (when given a timestamp with a time zone) is to query, what time
is it in such-and-such time zone?
select '2019-08-10T8:51:00'::timestamptz
AT TIME ZONE 'PDT';
select '2019-08-10T8:51:00'::timestamp
AT TIME ZONE 'EDT'
AT TIME ZONE 'PDT';
Part of the problem is that “At” is an overloaded word, and you could
probably blame the English language for some of this. AT TIME ZONE
may have been better named GO AHEAD AND DO A THING
, because at least
then we don’t think we know what it’s doing.
AT TIME ZONE
is both overloaded and has a polymorphic return type. I
can’t immediately summon what an accurate type signature for this
function might be, so I wonder: could we define it in Haskell?
First, let’s define some dummy datatypes to work with. The timestamp
types will just be wrappers around String
; we won’t actually do any
real conversion. For now, we’re just interested in the types.
type TimeZone = String
data Timestamp = Timestamp String deriving Show
data TimestampTz = TimestampTz String deriving Show
Now let’s define the class of types that can be converted using “at
time zone”. While normal, boring type classes are generic over one
type variable, an instance of this class must be defined for two
type variables: the input type and the output type. This kind of
craziness wasn’t allowed in the original Haskell standard, and we need
to enable the MultiParamTypeClasses
language extension for this to
compile.
From A History of Haskell: Being Lazy With Class:
“We [the Haskell Committee] felt that single-parameter type classes were already a big step beyond our initial conservative design goals, and they solved the problem we initially addressed. Going beyond that would be an unforced step into the dark, and we were anxious about questions of overlap, confluence, and decidability of type inference. […] As time went on, however, user pressure grew to adopt multi-parameter type classes, and GHC adopted them in 1997 (version 3.00). However, multi-parameter type classes did not really come into their own until the advent of functional dependencies.”
{-# LANGUAGE MultiParamTypeClasses #-}
class AtTimeZoneConvertible input output where
atTimeZone :: TimeZone -> input -> output
Now let’s define the instances. One for Timestamp -> TimestampTz
,
and one for TimestampTz -> Timestamp
.
instance AtTimeZoneConvertible Timestamp TimestampTz where
Timestamp timestamp) =
atTimeZone timezone (TimestampTz $ timestamp ++ " " ++ timezone
instance AtTimeZoneConvertible TimestampTz Timestamp where
TimestampTz timestamp) =
atTimeZone timezone (-- do some time calculations...
Timestamp $ "10:10"
And now we can use them like so:
> atTimeZone "EDT" (Timestamp "10:10") :: TimestampTz
ghciTimestampTz "10:10 EDT"
> atTimeZone "EDT" (TimestampTz "14:10 UTC") :: Timestamp
ghciTimestamp "10:10"
Disregarding the fact that atTimeZone
only ever returns "10:10"
when given a timestamp with time zone, this looks good! One annoying
thing is that we need to specify the return type, even though we’ve
only defined one instance for each input type. The problem is that
there isn’t anything preventing us from defining more instances and
having multiple possible output types for, say, converting a
Timestamp
.
If we try to evaluate atTimeZone
without specifying the return type,
we end up with this error:
ghci> atTimeZone "EDT" (TimestampTz "14:10 UTC")
<interactive>:53:1: error:
• Non type-variable argument
in the constraint: AtTimeZoneConvertible TimestampTz output
GHC is saying something like: I can infer the types as far as
AtTimeZoneConvertible TimestampTz output
, and that’s just not enough
to decide what instance to use, because output
is a type variable,
not a concrete type.
What we want to say is that the input
type implies the output
type. This is exactly what the FunctionalDependencies
language
extension lets us do. It looks like this:
{-# LANGUAGE FunctionalDependencies #-}
class AtTimeZoneConvertible input output | input -> output where
atTimeZone :: TimeZone -> input -> output
Now the compiler will prevent us from defining more than one instance for a
given input
type, and we no longer need to specify the output type:
> atTimeZone "EDT" (TimestampTz "14:10 UTC")
ghciTimestamp "10:10"
We can even call it multiple times, just like we did with AT TIME
ZONE
:
> atTimeZone "UTC" $ atTimeZone "EDT" (TimestampTz "14:10 UTC")
ghciTimestampTz "10:10 UTC"
If we try to define another instance for the Timestamp
input type,
for say a String
output type:
instance AtTimeZoneConvertible Timestamp String where
Timestamp timestamp) =
atTimeZone timezone (++ " " ++ timezone timestamp
We’ll get an error like this:
AtTimeZone.hs:14:10: error:
Functional dependencies conflict between instance declarations:
instance AtTimeZoneConvertible Timestamp TimestampTz
-- Defined at AtTimeZone.hs:14:10
instance AtTimeZoneConvertible Timestamp String
-- Defined at AtTimeZone.hs:18:10
I was also curious if this is possible in Rust. I am much less familiar with Rust, but I’ve at least heard a few times that Rust’s traits are like Haskell’s type classes. Let’s see how it might work. First, some data types:
type TimeZone = String;
#[derive(Debug)]
struct Timestamp {
: String,
ts}
#[derive(Debug)]
struct TimestampTz {
: String,
ts}
As in Haskell, we’ll define a trait for AtTimeZoneConvertible
:
trait AtTimeZoneConvertible<Output> {
fn at_time_zone(&self, time_zone: TimeZone) -> Output;
}
One difference with Haskell already is that Rust has a more
object-oriented approach: a trait
is defined in terms of some self
type. In Haskell, this was just another type variable,
input
. Practically, there isn’t really a difference, as far as I can
tell.
Now let’s define some instances:
impl AtTimeZoneConvertible<TimestampTz> for Timestamp {
fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz {
{
TimestampTz : self.ts.to_string() + " " + &time_zone,
ts}
}
}
impl AtTimeZoneConvertible<Timestamp> for TimestampTz {
fn at_time_zone(&self, _time_zone: TimeZone) -> Timestamp {
{
Timestamp : "10:10".to_string(),
ts}
}
}
This is similar to our approach in Haskell without functional dependencies. So I assumed the following code wouldn’t work:
fn main() {
println!(
"{:?}",
{
Timestamp : "14:10".to_string()
ts}
.at_time_zone("UTC".to_string())
.at_time_zone("EDT".to_string())
;
)}
// $ cargo run
// Timestamp { ts: "10:10" }
Surprisingly, it does! Rust seems to be saying, you’ve only given me
one instance for AtTimeZoneConvertible
for your type, so I’ll use
it, even though multiple instances could exist.
I’m not quite sure why the Rust compiler allows this. It seems like a
reasonable thing to disallow, because there is no guarantee that the
compiler can infer the types. Remember that the trait is generic over
the Output
type. If we add another instance, there is indeed a
failure to compile that same code:
impl AtTimeZoneConvertible<String> for Timestamp {
fn at_time_zone(&self, time_zone: TimeZone) -> String {
self.ts.to_string() + " " + &time_zone
}
}
$ cargo build
Compiling rust-at-time-zone v0.1.0 (/Users/mjhoy/proj/rust-at-time-zone)
error[E0282]: type annotations needed
--> src/main.rs:45:9
|
45 | / Timestamp {
46 | | ts: "14:10".to_string()
47 | | }
48 | | .at_time_zone("UTC".to_string())
| |__________^ cannot infer type for `Output`
|
= note: type must be known at this point
error: aborting due to previous error
But perhaps in the real world, this isn’t such a problem, and the benefits of making life easier when there is just one instance are too good to pass up.
All that said, we can prevent this issue by enforcing only one
Output
type per instance for a given type using an associated
type. It looks like this:
trait AtTimeZoneConvertible {
type Output;
fn at_time_zone(&self, time_zone: TimeZone) -> Self::Output;
}
This is a lot like the input -> output
functional dependency for
Haskell. The AtTimeZoneConvertible
trait is no longer generic over
the Output
type; instead, one Output
type must be chosen for a
given instance. Our instances now look like this:
impl AtTimeZoneConvertible for Timestamp {
type Output = TimestampTz;
fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz { ... }
}
impl AtTimeZoneConvertible for TimestampTz {
type Output = Timestamp;
fn at_time_zone(&self, _time_zone: TimeZone) -> Timestamp { ... }
}
If we try to define another instance for Timestamp
, we now get an
error:
impl AtTimeZoneConvertible for Timestamp {
type Output = String;
fn at_time_zone(&self, time_zone: TimeZone) -> String {
self.ts.to_string() + " " + &time_zone
}
}
$ cargo build
error[E0119]: conflicting implementations of trait `AtTimeZoneConvertible`
for type `Timestamp`:
--> src/main.rs:30:1
Spend extra time reviewing code that uses AT TIME ZONE
or coercions
such as ::timestamp
or ::timestamptz
. The behavior is often
surprising.
At work, we have data that moves from a production database into a
warehouse. In this process, for some reason, timestamp
columns are
coerced to timestamptz
. This means any query that uses AT TIME
ZONE
is semantically different depending on whether you run it in the
warehouse or in production, and was the source of some subtle bugs.
Also, both Haskell and Rust have good support for representing functions that are overloaded and polymorphic in their return types. GHC is a bit more strict than the Rust compiler, though; you may want to use associated types in Rust to enforce functional dependencies.