Let's write a function to determine phases, that is, the start and end of our graph's cycles. We'll do this by finding the x intercepts of both the polynomial and sine graphs. Recall that these two functions appear as follows:
(defn polynomial [a b c x]
  (-> (+ (* a (Math/pow x 3))
         (* b (Math/pow x 2)))
      (- (* c x))))

(defn sine [a b d x]
  (- (* a (Math/sin (* b (- x (/ Math/PI 2)))))
     d))
Looking again at the representative graphs, the first thing to note is a constant x intercept at x=0. Next, if we move away from x=0 in either direction, the graph heads off one way, returns to cross the x axis, and then winds back up again:
Algorithmically, we can use this to first determine the direction (or side) of the curve. Next, when we cross the x axis, the direction (or side) will change. If the side does change, we go back in the other direction until the side changes again. We can keep doing this in increasingly smaller steps until we have an effectively zero value. For our purposes, I've chosen a value that is zero to seven decimal places. Anything beyond this can be considered negligible for our phase.
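To make the search concrete before we wire it into the graphing code, here's a toy sketch of the same step-and-halve idea, finding a root of x² − 2. The names (step-and-halve, tol, and so on) are hypothetical and not part of the chapter's code:

```clojure
;; Toy step-and-halve root search: walk in one direction and, whenever the
;; sign of (f x) flips, reverse direction and halve the step. Stops once the
;; step drops below the tolerance. Illustration only.
(defn step-and-halve [f x0 step tol]
  (loop [x x0, step step, dir +, sign (pos? (f x0))]
    (if (< step tol)
      x
      (let [x' (dir x step)
            flipped? (not= (pos? (f x')) sign)]
        (recur x'
               (if flipped? (/ step 2) step)
               (if flipped? (if (= dir +) - +) dir)
               (pos? (f x')))))))

(step-and-halve (fn [x] (- (* x x) 2.0)) 0.0 1.0 1e-9)
;; converges on 1.4142135... (the square root of 2)
```

The real find-xintercept that follows uses exactly this shape, except that the "value" it drives to zero is the y value of our polynomial or sine function.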
To implement our x-intercept function, we'll need to track a few variables. At each step, we need to know the following: the current x position, the distance to jump, the current direction (or side) of the y value, and the direction (+ or -) in which we're stepping. With all these variables (or states) to track, loop/recur looks to be a useful construct. We can track multiple values and recursively evaluate them until we reach our desired result:
(defn polynomial-xintercept [x]
  (polynomial 2 2 3 x))

(defn sine-xintercept [x]
  (sine 2 2 0 x))

(defn ydirection [ypos]
  (if (pos? ypos) :positive :negative))

(defn direction-changed? [ypos dirn]
  (not (= (ydirection ypos) dirn)))

(defn get-opposite-direction-key [ydir]
  (if (= ydir :positive) :negative :positive))

(defn get-opposite-direction-fn [dirn]
  (if (= dirn +) - +))

(defn effectively-zero? [xval]
  (= 0.0 (Double. (format "%.7f" xval))))

(defn find-xintercept [direction mfn]
  (loop [start-point 0.0
         distance 1.0
         ydir (ydirection (mfn (direction 0 0.1)))
         dirn direction]
    (let [next-point (dirn start-point distance)]
      (if (effectively-zero? (mfn next-point))
        next-point
        (let [dc? (direction-changed? (mfn next-point) ydir)]
          (recur next-point
                 (if dc? (/ distance 2) distance)
                 (if dc? (get-opposite-direction-key ydir) ydir)
                 (if dc? (get-opposite-direction-fn dirn) dirn)))))))
So, this is one implementation that works. Focus on the loop and recur bindings. The let and if forms revolve around setting up each step and deciding whether to continue. We also have some helper functions that let us split up the code; these make some of the bits reusable and the main function more readable. The x-intercept algorithm basically follows these steps:
1. Use ydirection to determine the graph's initial direction (or side). This is where we set up and use the sine-xintercept and polynomial-xintercept helper functions, which simplify the parameters passed to the underlying functions.
2. Step along the x axis until the y value at the current point is effectively zero, as decided by the effectively-zero? helper function.
3. Whenever the side changes (direction-changed?), reverse direction, halve the step distance, and continue.

Now, if we want to find the polynomial's or sine's x intercept, we can use find-xintercept as follows. Also, note that we can find x intercepts on both the left and right-hand sides of x=0:
(find-xintercept - polynomial-xintercept)  ;; -1.8228756561875343
(find-xintercept + polynomial-xintercept)  ;; 0.8228756487369537
(find-xintercept - sine-xintercept)        ;; -1.570796325802803
(find-xintercept + sine-xintercept)        ;; 1.570796325802803
In order to randomize vertical and horizontal dilations, we'll need a function that generates a random double value within a specified range. The Clojure core namespace doesn't have such a function. However, we can use one from an older project, lazytest, which is no longer under active development. The following code is taken from https://github.com/stuartsierra/lazytest/blob/master/modules/lazytest/src/main/clojure/lazytest/random.clj:
(defn rand-double-in-range
  "Returns a random double between min and max."
  [min max]
  {:pre [(<= min max)]}
  (+ min (* (- max min) (Math/random))))
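As a quick sanity check (the function is repeated here so the snippet is self-contained), the result always lands in the half-open range [min, max), and the :pre condition rejects inverted bounds:

```clojure
(defn rand-double-in-range
  "Returns a random double between min and max."
  [min max]
  {:pre [(<= min max)]}
  (+ min (* (- max min) (Math/random))))

;; The exact value is random, but it's always within the requested bounds.
(let [v (rand-double-in-range 0.5 2.0)]
  (and (<= 0.5 v) (< v 2.0)))
;; => true

;; (rand-double-in-range 2 1) throws an AssertionError via the :pre condition.
```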
Taking a first pass at randomizing the vertical dilation for the polynomial and sine equations, we get the following functions:
(defn randomize-vertical-dilation-P [x]
  (let [a (rand-double-in-range 0.5 2)]
    (polynomial a 2 3 x)))

(defn randomize-vertical-dilation-S [x]
  (let [a (rand-double-in-range 0.5 2.7)]
    (sine a 2 0 x)))
However, both shapes (polynomial and sine) are handled very similarly. These can, therefore, be generalized into one function called randomize-vertical-dilation. With this function derived, we can reuse the pattern to create randomize-horizontal-dilation. Also, notice how adjusting the vertical and horizontal dilations means adjusting the a and b values as well:
(defn randomize-vertical-dilation [mathfn min' max']
  (let [a (rand-double-in-range min' max')]
    (partial mathfn a)))

(defn randomize-horizontal-dilation [mathfn-curried min' max']
  (let [b (rand-double-in-range min' max')]
    (partial mathfn-curried b)))
This is a good use case for partially applying a function with its first argument (the a value). We take this partially applied function and then partially apply it again to b and any successive values. We'll then find the x intercepts, select a granularity, and iteratively generate x values by stepping (iterate) forward by the granularity increment, starting with the left-most x intercept. Finally, we can map our partially applied polynomial or sine functions over the granular sequence of x's, yielding y's at each of those x points.
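To make the currying concrete, here's a small sketch using fixed (non-random) coefficients, so that each partial application visibly pins down one more of a, b, and c:

```clojure
;; polynomial computes a*x^3 + b*x^2 - c*x, as defined earlier in the chapter.
(defn polynomial [a b c x]
  (-> (+ (* a (Math/pow x 3))
         (* b (Math/pow x 2)))
      (- (* c x))))

(def poly-a   (partial polynomial 2))  ;; a fixed at 2
(def poly-ab  (partial poly-a 2))      ;; b fixed at 2
(def poly-abc (partial poly-ab 3))     ;; c fixed at 3

(poly-abc 1.0)
;; => 1.0  (2*1 + 2*1 - 3*1)
```

The randomizing functions do the same thing; they just draw a and b from rand-double-in-range instead of taking fixed values.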
For each feature, we're selecting from a range of random values that works in this context. You can play with these numbers and try whatever works for you:

Vertical dilation: between 0.5 and 2 for the polynomial, and between 0.5 and 2.7 for the sine
Horizontal dilation: between 0.5 and 2 for the polynomial, and between 0.3 and 2.7 for the sine
Granularity: between 0.1 and 1
Let's take a look at the following code which demonstrates the implementation of what we just covered:
(def one (randomize-vertical-dilation polynomial 0.5 2))
(def two (randomize-horizontal-dilation one 0.5 2))
(def polyn-partial (partial two 3))
(def xinterc-polyn-left (find-xintercept - polynomial-xintercept))
(def xinterc-polyn-right (find-xintercept + polynomial-xintercept))
(def granularityP (rand-double-in-range 0.1 1))
(def xsequenceP (iterate (partial + granularityP) xinterc-polyn-left))
(map polyn-partial xsequenceP)
Our sequence generation functions now look like this:
(defn generate-polynomial-sequence []
  (let [one (randomize-vertical-dilation polynomial 0.5 2)
        two (randomize-horizontal-dilation one 0.5 2)
        polyn-partial (partial two 3)
        xinterc-polyn-left (find-xintercept - polynomial-xintercept)
        xinterc-polyn-right (find-xintercept + polynomial-xintercept)
        granularityP (rand-double-in-range 0.1 1)
        xsequenceP (iterate (partial + granularityP) xinterc-polyn-left)]
    (map polyn-partial xsequenceP)))

(defn generate-sine-sequence []
  (let [ein (randomize-vertical-dilation sine 0.5 2.7)
        zwei (randomize-horizontal-dilation ein 0.3 2.7)
        sine-partial (partial zwei 0)
        xinterc-sine-left (find-xintercept - sine-xintercept)
        xinterc-sine-right (find-xintercept + sine-xintercept)
        granularityS (rand-double-in-range 0.1 1)
        xsequenceS (iterate (partial + granularityS) xinterc-sine-left)]
    (map sine-partial xsequenceS)))

(defn generate-oscillating-sequence []
  (analytics/generate-prices-without-population 5 15))
Apache Commons Math is a Java library that, among other things, implements the Beta probability distribution. Clojure runs atop the JVM and can leverage this library for its own use. Remember that we now want to combine all our price-generating functions into one long stream of price data. This means that we can distribute the polynomial, sine, and stochastic oscillating functions under a beta curve. If we create a beta distribution with a=2 and b=4.1 (x is implicit in the library), the curve looks like this:
Using this, we'll sample evenly in thirds from the pool of our price functions. If we sample from the pool 100 times, the probability distribution determines how often each third gets sampled. For example, in your REPL, try calling test-beta 100 times, which will yield results like the following (sorted):
(defn test-beta [beta-distribution]
  (let [sample-val (.sample beta-distribution)]
    (cond
      (< sample-val 0.33) :a
      (< sample-val 0.66) :b
      :else :c)))

(def beta-distribution
  (org.apache.commons.math3.distribution.BetaDistribution. 2.0 4.1))

(def result (repeatedly #(test-beta beta-distribution)))

(sort (take 100 result))
'(:a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a
  :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a :a
  :a :a :a :a :a :a :a :a :a :a :a :a :a
  :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b
  :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b :b
  :b
  :c :c :c :c :c :c)
Here's a sample graph output (your data may vary):
With this in mind, it's now trivial to create a simple function that samples from the beta distribution. If a sample falls within a specified range, we generate a sequence with a random length (itself within a given range):
(defn sample-dispatcher [sample-type sample-length sample-fn]
  (take sample-length (sample-fn)))

(defn sample-prices [beta-distribution]
  (let [sample-val (.sample beta-distribution)]
    (cond
      (< sample-val 0.33) (sample-dispatcher :sine (rand-double-in-range 10 15)
                                             generate-sine-sequence)
      (< sample-val 0.66) (sample-dispatcher :polynomial (rand-double-in-range 4 6)
                                             generate-polynomial-sequence)
      :else (sample-dispatcher :oscillating (rand-double-in-range 8 10)
                               generate-oscillating-sequence))))
We can now generate an infinite sequence of price samples. The (repeatedly #(sample-prices beta-distribution)) expression will give us a list of lists, each sublist being a sample of either a polynomial graph, a sine graph, or a sequence of oscillating stochastic points. This is where we'll begin the next step. We could just concat them all together (recall that concat acts lazily) and be done with it. However, the remaining problem is that each list has its own y starting point. For the entire sequence of prices to make sense, we have to normalize the price levels of each subsequent sequence:
(defn generate-prices [beta-distribution]
  (reduce (fn [^clojure.lang.LazySeq rslt ^clojure.lang.LazySeq each-sample-seq]
            (let [beginning-price (if (empty? rslt)
                                    (rand-double-in-range 5 15)
                                    (last rslt))
                  sample-seq-head (first each-sample-seq)
                  price-difference (math/abs (- sample-seq-head beginning-price))]
              (if (< sample-seq-head beginning-price)
                (concat rslt (map #(+ % price-difference) each-sample-seq))
                (concat rslt (map #(- % price-difference) each-sample-seq)))))
          '()
          (repeatedly #(sample-prices beta-distribution))))
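The normalization step inside that reducer can be isolated: shift the entire new list by the gap between its head and the previous list's tail. A minimal sketch (stitch is a hypothetical helper name):

```clojure
;; Shift next-seq so that its first element lines up with the last element
;; of prev-seq, then append it. Hypothetical helper for illustration.
(defn stitch [prev-seq next-seq]
  (let [gap (- (last prev-seq) (first next-seq))]
    (concat prev-seq (map #(+ % gap) next-seq))))

(stitch [5 6 7] [10 11 12])
;; => (5 6 7 7 8 9)
```

A single signed gap replaces the abs-plus-branch formulation used above; the effect is the same.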
You're probably familiar with Clojure's basic functions by now, and your first instinct may be to reduce over the infinite sequence, adjusting all the prices of each subsequent list based on the last price of the previous list. However, this approach will fail. The reduce function is a core Clojure function that does not act lazily. It is meant to be used when we have finished accumulating data into a list; it then folds over all the elements of that list to arrive at a final value, realizing any lazy values along the way. So, if we hand it an infinite sequence, as the preceding generate-prices does, it will run indefinitely. If we run (generate-prices beta-distribution), it will never return a list of generated prices as it will never finish reducing. Clearly, we need another approach:
(defn generate-prices-iterate [beta-distribution]
  (let [sample-seq (repeatedly #(sample-prices beta-distribution))
        iterfn (fn [[^clojure.lang.LazySeq rslt ^clojure.lang.LazySeq remaining-sample-seq]]
                 (let [each-sample-seq (first remaining-sample-seq)
                       beginning-price (if (empty? rslt)
                                         (rand-double-in-range 5 15)
                                         (last rslt))
                       sample-seq-head (first each-sample-seq)
                       price-difference (math/abs (- sample-seq-head beginning-price))]
                   ;; only raise the price if below the beginning price
                   (if (< sample-seq-head beginning-price)
                     [(concat rslt (map #(+ % price-difference) each-sample-seq))
                      (rest remaining-sample-seq)]
                     [(concat rslt (map #(- % price-difference) each-sample-seq))
                      (rest remaining-sample-seq)])))]
    (map first (iterate iterfn ['() sample-seq]))))
Instead of reduce, we can use iterate to successively call a function with a single argument. This works. However, because we need to maintain state between lists, we end up manually threading the accumulated result and the remaining lists between calls. This results in code that is unclear and unintuitive, since we have to handle the rslt and remaining-sample-seq bindings in the preceding code:
(defn generate-prices-for [beta-distribution]
  (def previous-price nil)
  (let [adjusted-samples
        (for [each-sample-seq (repeatedly #(sample-prices beta-distribution))
              :let [beginning-price (if (nil? previous-price)
                                      (rand-double-in-range 5 15)
                                      previous-price)
                    sample-seq-head (first each-sample-seq)
                    price-difference (math/abs (- sample-seq-head beginning-price))
                    adjusted-sample (if (< sample-seq-head beginning-price)
                                      (map #(+ % price-difference) each-sample-seq)
                                      (map #(- % price-difference) each-sample-seq))
                    _ (alter-var-root #'previous-price (fn [x] (last adjusted-sample)))]]
          adjusted-sample)]
    (apply concat adjusted-samples)))
So, we need to remain lazy while cleanly handling the intermediate state shared between a list and its predecessor. The for macro is Clojure's way of implementing list comprehensions. It lets us manipulate many lists at a time in order to produce a single result list. Again, (repeatedly #(sample-prices beta-distribution)) creates a list of lists, which will be the input to our for comprehension. Each sublist is bound to each-sample-seq, and the :let modifier (along with the :while and :when modifiers) lets us transform or filter each value in the list. This means that there's much less state for us to maintain. We simply adjust each-sample-seq based on whether its first price is above or below the last price. While this approach is lazy, we have once again not been able to avoid a little hack, the previous-price Var, to maintain state:
(defn generate-prices-partition [beta-distribution]
  (let [samples-sequence (repeatedly #(sample-prices beta-distribution))
        partitioned-sequences (partition 2 1 samples-sequence)
        mapping-fn (fn [[fst snd]]
                     (let [beginning-price (last fst)
                           sample-seq-head (first snd)
                           price-difference (math/abs (- sample-seq-head beginning-price))]
                       (if (< sample-seq-head beginning-price)
                         (concat fst (map #(+ % price-difference) snd))
                         (concat fst (map #(- % price-difference) snd)))))]
    (apply concat (map mapping-fn partitioned-sequences))))
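As a reminder, partition with a step of 1 produces overlapping pairs, which is what mapping-fn consumes:

```clojure
;; A window of 2, stepped by 1, yields each element paired with its successor.
(partition 2 1 [:a :b :c :d])
;; => ((:a :b) (:b :c) (:c :d))
```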
The partition function is a good idea, as it lets us divide our list of lists into successive overlapping pairs (stepped by 1). But each modified list can't be seen by the successive pair. So generate-prices-partition, while much cleaner, produces a price stream with large and arbitrary price jumps:
(defn generate-prices-reductions [beta-distribution]
  (reductions (fn [^clojure.lang.LazySeq rslt ^clojure.lang.LazySeq each-sample-seq]
                (let [beginning-price (if (empty? rslt)
                                        (rand-double-in-range 5 15)
                                        (last rslt))
                      sample-seq-head (first each-sample-seq)
                      price-difference (math/abs (- sample-seq-head beginning-price))]
                  ;; only raise the price if below the beginning price
                  (if (< sample-seq-head beginning-price)
                    (concat rslt (map #(+ % price-difference) each-sample-seq))
                    (concat rslt (map #(- % price-difference) each-sample-seq)))))
              '()
              (repeatedly #(sample-prices beta-distribution))))
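To see why reductions copes with an infinite input where reduce cannot, a minimal REPL check:

```clojure
;; reductions yields each intermediate accumulator lazily, so we can take
;; from it even though the input (iterate inc 1) is infinite.
(take 5 (reductions + 0 (iterate inc 1)))
;; => (0 1 3 6 10)
```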
Finally, we arrive at a way of maintaining successive states lazily through our infinite sequence of lists. The reductions function produces a lazy sequence of all the intermediate values that reduce would compute. More importantly, during processing, the previous intermediate result is available to the current iteration. We now have a clean solution that is very close to our original reduce version. We've also had a chance to see the utility and trade-offs of the partition, for, and iterate functions in this situation. We can now stitch these into an updated generate-prices function, with a simple guard condition to ensure that our prices aren't negative:
(defn generate-prices
  ([]
   (generate-prices (BetaDistribution. 2.0 4.1)))
  ([beta-distribution]
   (map (fn [x] (if (neg? x) (* -1 x) x))
        (distinct (apply concat (generate-prices-reductions beta-distribution))))))