3 packages you need to know about before processing timestamps in Haskell

Are you trying to process a large amount of timestamped data? Need to manipulate events in CSV or log files? Your choice of time library could be slowing you down.

Most people new to this problem will do the obvious thing and google for "haskell time". Unfortunately, the first thing to come up is the worst possible choice.

Give me the benchmarks already!

Time

The time package ships with GHC and, unluckily for you, it has the canonical, SEO-friendly name. It may be perfectly fine for some situations, but it’s the slowest of the three packages covered here by a large margin. If you’re doing a lot of logging or building a data processing app, this is going to have a huge negative impact.

You know how painful profiling in Haskell can be, so why put yourself behind before you’ve even started?

Pros

  • Comes with GHC
  • Picosecond resolution
  • Extensive capabilities (like timezones)

Cons

  • Uses Integer to store timestamps
    • Akin to using String instead of ByteString/Text
    • Can’t be packed tightly into vectors
  • Parsing and formatting is slow, uses String
  • String formatting not type safe
  • Complicated to use (capabilities come at a cost)
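
To make the String-based, format-string API concrete, here is a minimal sketch of parsing and formatting the benchmark timestamp with time. The helper names (isoFormat, parseIso, formatIso) are made up for illustration; parseTimeM, formatTime and defaultTimeLocale come from Data.Time.Format in time 1.5 or later.

```haskell
import Data.Time (UTCTime)
import Data.Time.Format (defaultTimeLocale, formatTime, parseTimeM)

-- The format is a plain String, so the compiler cannot check it.
-- (isoFormat, parseIso and formatIso are illustrative names.)
isoFormat :: String
isoFormat = "%Y-%m-%dT%H:%M:%S"

-- Parse a String timestamp; Nothing means the input did not match.
parseIso :: String -> Maybe UTCTime
parseIso = parseTimeM True defaultTimeLocale isoFormat

-- Render a UTCTime back to a String with the same format.
formatIso :: UTCTime -> String
formatIso = formatTime defaultTimeLocale isoFormat

main :: IO ()
main = do
  -- Round trip: String -> UTCTime -> String
  print (formatIso <$> parseIso "2020-01-22T11:34:29")
  -- A format that does not cover the whole input still compiles,
  -- and only fails when it runs.
  print (parseTimeM True defaultTimeLocale "%Y-%m-%d" "2020-01-22T11:34:29" :: Maybe UTCTime)
```

The second call is the "not type safe" point in practice: the mismatch between format and input is only discovered at runtime, as a Nothing.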

Thyme

thyme is a performance focused rewrite of the time library that
has more or less the same API. This can be really useful if you accidentally
picked time for your timestamp heavy app before getting the lay of the
land, and now you want a quick fix for your performance problems.

Pros

  • Much faster than time
  • Mostly compatible with time
  • Includes attoparsec parser, for high performance parsing
  • vector Unbox support, can be laid out extremely efficiently in memory
  • In-memory representation is just 64 bits

Cons

  • Microsecond resolution
    • Too coarse for analyzing high frequency stuff like market data
  • Default parsing/formatting via String
  • Complicated to use (same API as time)
  • QuickCheck dependency??
  • Last uploaded to Hackage in 2014
    • It’s quite stable and complete so this may not be a problem
    • Still compiles and works with a reasonably modern toolchain (e.g. GHC 8.6.5)

Chronos

chronos, like thyme, has a strong focus on performance, but it offers a simpler, safer interface that doesn’t try to replicate the time API.

If you want the fastest time library on Hackage, this is it.

Pros

  • Blazing fast
  • Nanosecond resolution
    • Good enough resolution for basically everything
    • Still fits in 64 bits
  • Fast parsing via attoparsec
  • Fast formatting via bytestring builder
  • vector Unbox support, can be laid out extremely efficiently in memory
  • Simpler interface
  • Type safe, no format strings
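
As a quick sanity check on the "nanoseconds still fit in 64 bits" point, here is a small back-of-the-envelope calculation. It assumes the timestamp is a signed 64-bit count of nanoseconds relative to an epoch and uses 365-day years for simplicity; nanosPerYear and yearsEitherSide are illustrative names, not part of any library.

```haskell
import Data.Int (Int64)

-- Nanoseconds in one 365-day year (leap days ignored for this estimate).
nanosPerYear :: Int64
nanosPerYear = 365 * 24 * 60 * 60 * 1000000000

-- Whole years a signed 64-bit nanosecond count can reach
-- on either side of its epoch.
yearsEitherSide :: Int64
yearsEitherSide = maxBound `div` nanosPerYear

main :: IO ()
main = print yearsEitherSide -- prints 292
```

With a 1970 epoch that is roughly the years 1678 to 2262, which covers most practical timestamp data.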

Cons

  • Simpler means less expressive
  • Time offsets, but no time zone support
  • If you need a fancy time format you need to build it up yourself
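
For comparison with format strings, here is a hedged sketch of parsing and re-encoding the same timestamp with chronos. It assumes a recent chronos release in which the Chronos module exports both the types and the functions used here (decode_YmdHMS, encode_YmdHMS, w3c, datetimeToTime, timeToDatetime, getTime); check the names against the version you install.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Chronos
  ( Datetime
  , SubsecondPrecision (SubsecondPrecisionAuto)
  , datetimeToTime
  , decode_YmdHMS
  , encode_YmdHMS
  , getTime
  , timeToDatetime
  , w3c
  )
import Data.Text (Text)

-- The layout is a value (w3c: '-' between date parts, 'T' before the
-- time, ':' between time parts), not a format string.
parseW3c :: Text -> Maybe Datetime
parseW3c = decode_YmdHMS w3c

main :: IO ()
main =
  case parseW3c "2020-01-22T11:34:29" of
    Nothing -> putStrLn "parse failed"
    Just dt -> do
      -- Time wraps a single Int64: nanoseconds since the Unix epoch.
      let t = datetimeToTime dt
      print (getTime t)
      -- Convert back and encode as Text with the same layout.
      print (encode_YmdHMS SubsecondPrecisionAuto w3c (timeToDatetime t))
```

There is no format string to get wrong here; the trade-off is that anything beyond the built-in layouts has to be assembled from the library's smaller parsers and builders.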

Benchmarks

chronos smokes the competition: in the parsing test it is over 30x faster than time and 3x faster than thyme.

Time taken to parse and to format a single ISO 8601 timestamp (e.g. 2020-01-22T11:34:29) on an Intel i9-9900K:

The parsing numbers for both thyme and chronos use their attoparsec parsers. chronos also has an attoparsec Zepto parser, which was twice as fast for this benchmark, at around 100 ns.

view benchmark code

Verdict

  • Use chronos for performance and simplicity.
  • Use thyme if you want to speed up a time codebase without a rewrite.
  • Stick with time if you’re comfortable and don’t care about performance.

Packages

  • time: A time library
  • thyme: A faster time library
  • chronos: A performant time library
  • attoparsec: Fast combinator parsing for bytestrings and text
  • bytestring: Fast, compact, strict and lazy byte strings with a list interface
  • vector: Efficient Arrays

Credits

Photo by Fabrizio Verrecchia on Unsplash

jacobstanley.io