Sunday, January 20, 2019

Quick backtest update [ STRATEGY-TREND-1: XBTUSD @ 2018-11-01 -> 2019-01-17 ]

STRATEGY-TREND-1: XBTUSD @ 2018-11-01 -> 2019-01-17 { git: a623d60742e96d35b8f7365496ffd99911aa7dbb }

NOTE: I'll post future backtests on this page ==> https://quantoga.blogspot.com/p/backtests.html

  • 10000 USD fixed position size (perpetual futures contracts).
  • Fees are included.
  • Limit orders are used for both entries and exits.
Huge improvements are still possible, e.g. by adding basic Kelly % for position sizing, and more.
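
For reference, a minimal sketch of what "basic Kelly %" sizing could look like -- this is not the strategy's actual sizing code, and the win rate and win/loss ratio below are made up:

(defn kelly-fraction
  "Kelly criterion: f* = p - (1 - p) / b, where p is the win probability
  and b is the average win/loss ratio."
  [p b]
  (- p (/ (- 1.0 p) b)))

(defn position-size [base-usd p b]
  (* base-usd (max 0.0 (kelly-fraction p b))))

;; E.g. a 55% win rate and a 1.5 win/loss ratio => 2500.0 (25% of the 10000 USD base).
(position-size 10000.0 0.55 1.5)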

NOTE: the candle coloring isn't a very good representation of what's going on. The color is based only on the average position at a single point in time (the open). The candle at 2019-01-10 (2nd image) is shown as long (green), but that's only true for the first part of the candle, before it goes short. It then goes long again before the next candle -- which is therefore green "again". In the end the initial loss for that duration is cancelled out by the gain from the brief short taken in the middle of the drop.


Don't pay too much attention to the colors in this; they are not correct or accurate.


...this is based on recent improvements to the strategy I did a screen capture of here: https://www.youtube.com/watch?v=JnCQ3qOKou4. The biggest change is probably the move from trading based on discrete time to continuous time, plus the addition of a signal system with dedicated grace durations and so on.

Tuesday, January 15, 2019

Clojure for fast processing of streams of data via LAZY-SEQ and SEQUE

UPDATE: After some back and forth I think clojure.core.async, with its buffers at the source, computation/transformation and sink stages, is a better fit for what I'm doing!
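
For context, a rough core.async sketch of that shape -- the channel sizes, worker count and transform below are just placeholders, not the real pipeline:

(require '[clojure.core.async :as async])

(defn transform [x] ;; stand-in for the real computation
  (Thread/sleep 1)
  (inc x))

(let [src     (async/chan 64)                        ;; buffer at the source
      out     (async/chan 64)                        ;; buffer at the sink
      workers 4]
  (async/pipeline workers out (map transform) src)   ;; buffered, parallel middle stage
  (async/onto-chan src (range 1000))                 ;; feed the source, then close it
  (async/<!! (async/into [] out)))                   ;; collect everything at the sink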

LAZY-SEQ and SEQUE are useful in many contexts where you need to speed up processing of streams of data (SEQ) from e.g. another computation, a filesystem or a web or database server -- anything really.

The key idea is that SEQUE will continuously fetch the next few elements from the stream in advance and in the background (thread) -- while the code that consumes or handles the data keeps on working on the current or already fetched element(s) in the foreground.

A quick video to demonstrate the effect:


SEQUE uses a LinkedBlockingQueue behind the scenes, but you can pass it anything that implements the BlockingQueue interface as needed.
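
For example, a bounded ArrayBlockingQueue can be passed instead of a buffer size; the queue type, capacity and some-slow-seq here are just placeholders:

(seque (java.util.concurrent.ArrayBlockingQueue. 256) some-slow-seq)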

Clojure makes this simple and fun. All of this might be pretty basic and trivial for many, but a small "trick" is needed to set it up correctly -- like this:
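
A minimal sketch of this kind of setup (the producer, consumer, delays and buffer size are made up for illustration) -- the important bit is keeping the source fully lazy via LAZY-SEQ so that SEQUE's background thread is what realizes it:

(defn slow-producer ;; stand-in for a slow source: another computation, disk, DB, ...
  [n]
  (lazy-seq ;; fully lazy, so realization happens on SEQUE's background thread
    (when (pos? n)
      (Thread/sleep 100) ;; simulate a slow fetch
      (cons n (slow-producer (dec n))))))

(defn slow-consumer ;; stand-in for the real handling of each element
  [x]
  (Thread/sleep 100) ;; simulate slow processing
  x)

;; Without SEQUE: fetch and handling run serially, ~200 ms per element.
(time (dorun (map slow-consumer (slow-producer 20))))

;; With SEQUE: a background thread keeps a buffer of 10 elements ready while the
;; foreground handles data, ~100 ms per element once the buffer is warm.
(time (dorun (map slow-consumer (seque 10 (slow-producer 20)))))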

Thursday, January 10, 2019

Big data: from compressed text (e.g. CSV) to compressed binary format -- or why Nippy (Clojure) and java.io.DataOutputStream are awesome

Say you have massive amounts of historical market data in a common, gzipped CSV format or similar, and you have these data types, which represent instances of the data in your system:

(defrecord OFlow ;; Order flow; true trade and volume data!
    [^double trade ;; Positive = buy, negative = sell.
     ^double price ;; Average fill price.
     ^Keyword tick-direction ;; :plus | :zero-plus | :minus | :zero-minus
     ^long timestamp ;; We assume this is the ts for when the order executed in full.

     ^IMarketEvent memeta]) ;; An MEMeta instance (see below).

(defrecord MEMeta
    [^Keyword exchange-id
     ^String symbol
     ^long local-timestamp])


A good way to store and access this would be to use a binary format and a modern, fast compression algorithm. The key issue is fast decompression, and LZ4HC is the best here as far as I'm aware -- apparently approaching the limits of what RAM speed allows. To do this we'll use https://github.com/ptaoussanis/nippy, which exposes the DataOutputStream class nicely and enables us to express a simple binary protocol for reading and writing our data types, like this:

(nippy/extend-freeze OFlow :QA/OFlow [^OFlow oflow output]
                     (.writeDouble output (.trade oflow))
                     (.writeDouble output (.price oflow))
                     ;; NOTE: hyphenated record fields get munged names on the Java side,
                     ;; so keyword access is used for those instead of field interop.
                     (.writeByte output (case (:tick-direction oflow)
                                          :plus 0, :zero-plus 1, :minus 2, :zero-minus 3))
                     (.writeLong output (.timestamp oflow))
                     ;; MEMeta
                     (let [^MEMeta memeta (.memeta oflow)]
                       (.writeUTF output (name (:exchange-id memeta)))
                       (.writeUTF output (.symbol memeta))
                       (.writeLong output (:local-timestamp memeta))))

(nippy/extend-thaw :QA/OFlow [input]
                   (->OFlow (.readDouble input)
                            (.readDouble input)
                            (case (.readByte input)
                              0 :plus, 1 :zero-plus, 2 :minus, 3 :zero-minus)
                            (.readLong input)
                            (->MEMeta (keyword (.readUTF input))
                                      (.readUTF input)
                                      (.readLong input))))


..to write out the binary data to a file, you'd do something like (oflow-vector is a vector containing OFlow instances):

(nippy/freeze-to-file "data.dat" oflow-vector                                   
                      {:compressor nippy/lz4hc-compressor, :encryptor nil, :no-header? true})


..and to read it back in to get a vector of OFlow instances as the result you'd do something like this:

(nippy/thaw-from-file "data.dat"
                      {:compressor nippy/lz4hc-compressor, :encryptor nil, :no-header? true})


...it's so simple, and the result is very, very good in terms of speed and space savings [I'll add some numbers here later]. Of course you'd still want to use something like PostgreSQL for indexed views of or access to the data, but this is very nice for fast access to massive amounts of sequential, high-resolution data. I've split things up so that each file contains one day's worth of data; this way it is possible to make fast requests for ranges of the data at any location without doing long, linear searches. 👍
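
For what it's worth, a hypothetical sketch of that one-file-per-day layout -- the path scheme and helper names here are made up, not the actual code:

(defn day-file ^String [symbol date]
  ;; One file per instrument per day, e.g. "XBTUSD/2019-01-10.dat".
  (str symbol "/" date ".dat"))

(defn thaw-day [symbol date]
  (nippy/thaw-from-file (day-file symbol date)
                        {:compressor nippy/lz4hc-compressor, :encryptor nil, :no-header? true}))

(defn thaw-range [symbol dates] ;; dates: a seq of "YYYY-MM-DD" strings
  (into [] (mapcat #(thaw-day symbol %)) dates))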