ribelo/danzig

danzig

an easy-to-use, transducer-based data analysis tool for the clojure programming language.

rationale

any finitely complicated problem can be made infinitely complicated by a finite number of macros, so why not write macros that write (macro-based) meander code that in turn generates transducer functions?

…wait, but why not just use meander?

because meander.epsilon/scan is slow, and because transducers are super composable and can be combined into endless sequences

usage examples

(require '[taoensso.encore :as enc])
(require '[ribelo.danzig :as dz :refer [=>>]])

(def data (vec (repeatedly 1000000 (fn [] {:a (* (rand-int 100) (if (enc/chance 0.5) 1 -1))
                                           :b (* (rand-int 100) (if (enc/chance 0.5) 1 -1))
                                           :c (* (rand-int 100) (if (enc/chance 0.5) 1 -1))}))))

why should you care?

because it is super concise and pleasing to the eye

(defn q1 []
  (into []
        (comp
         (map (fn [{:keys [a b] :as m}] (assoc m :d (+ a b))))
         (filter (fn [{:keys [c]}] (pos? c)))
         (map (fn [m] (update m :a inc)))
         (filter (fn [{:keys [a b c]}] (= a b c))))
        data))

(defn q2 []
  (=>> data
       (dz/with :d [+ :a :b])
       (dz/where :c pos?)
       (dz/with :a [+ :a 1])
       (dz/where [= :a :b :c])))

(= (q1) (q2))
;; => true

because it can be much faster than handwritten code (about 2x in the single run below)

(enc/qb 1 (q1) (q2))
;; => [309.46 145.71] - in ms

fat arrow

the most basic building block is the fat arrow =>>, a macro that replaces thread-last (->>)

(=>> [1 2 3 4 5]
     (map inc)
     (filter even?))
;; => [2 4 6]

(macroexpand '(=>> [1 2 3 4 5] (map inc) (filter even?)))
;; => (clojure.core/into [] (ribelo.danzig/comp-some (map inc) (filter even?)) [1 2 3 4 5])

(=>> [1 2 3 4 5]
     (map inc)
     (when false
       (filter even?)))
;; => [2 3 4 5 6]
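
the macroexpansion above shows the steps being passed to comp-some, and the when example shows that a nil step is simply skipped. a minimal sketch of how such a helper could behave, using only clojure.core; this is a hypothetical stand-in written for illustration, the real ribelo.danzig/comp-some may be implemented differently

```clojure
;; hypothetical stand-in for ribelo.danzig/comp-some (an assumption, not the library source)
(defn comp-some
  "composes transducers left to right, skipping any step that is nil"
  [& xforms]
  (apply comp (remove nil? xforms)))

;; (when false ...) evaluates to nil, so only (map inc) remains in the pipeline
(into []
      (comp-some (map inc)
                 (when false (filter even?)))
      [1 2 3 4 5])
;; => [2 3 4 5 6]
```

when every step is nil, (apply comp []) returns identity, which passes the reducing function through unchanged, so the input comes back as is.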

fat arrows can be mixed with other arrows

(=>> [1 2 3 4 5]
     (map inc)
     (->> (mapv inc)))
;; => [3 4 5 6 7]

you can also end the pipeline with first or last

(=>> [1 2 3 4 5]
     (map inc)
     first)
;; => 2

(=>> [1 2 3 4 5]
     (map inc)
     (last))
;; => 6

where

where can take a predicate function

(=>> data
     (dz/where (fn [{:keys [a]}] (= a 1)))
     (take 1))
;; => [{:a 1, :b 87, :c -27}]

the assumption is that the collection contains maps, so we can query by key and value

(=>> data (dz/where :a 1) (take 1))
;; => [{:a 1, :b 87, :c -27}]

if the value we are searching for is itself a keyword, we must quote it with '

(=>> [{:a :some/key} {:a :other/key}] (dz/where [= :a ':other/key]))
;; => [{:a :other/key}]

or a map of keys and values

(=>> data (dz/where {:a 1 :b 1}) (take 1))
;; => [{:a 1, :b 1, :c 74}]

or a map of keys and predicates

(=>> data (dz/where {:a even? :b odd?}) (take 1))
;; => [{:a 40, :b 39, :c -76}]
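
for comparison, a hand-written version of the two map forms above, using only clojure.core. where-rows is a small made-up sample (not the random data defined earlier), and the names are chosen here for illustration; the comments describe what the dz/where forms appear to express, not how the library implements them

```clojure
;; made-up sample rows, so the results are deterministic
(def where-rows
  [{:a 1 :b 1 :c 74}
   {:a 40 :b 39 :c -76}
   {:a 3 :b 8 :c 0}])

;; roughly (dz/where {:a 1 :b 1}): every listed key must equal its value
(def a-and-b-equal-one
  (filter (fn [m] (and (= 1 (:a m)) (= 1 (:b m))))))

;; roughly (dz/where {:a even? :b odd?}): every listed key must pass its predicate
(def a-even-b-odd
  (filter (fn [m] (and (even? (:a m)) (odd? (:b m))))))

(into [] a-and-b-equal-one where-rows)
;; => [{:a 1, :b 1, :c 74}]

(into [] a-even-b-odd where-rows)
;; => [{:a 40, :b 39, :c -76}]
```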

we can use a vector whose first element is the function

(=>> data (dz/where [= :a :b :c]) (take 1))
;; => [{:a 27, :b 27, :c 27}]
(=>> data (dz/where [= :a 1]) (take 1))
;; => [{:a 1, :b 87, :c -27}]

ask for rows whose key meets a condition; both argument orders work

(=>> data (dz/where even? :a) (take 1))
;; => [{:a -96, :b -84, :c -76}]

(=>> data (dz/where :a even?) (take 1))
;; => [{:a -96, :b -84, :c -76}]

square clojure is still clojure: vector forms can be nested like ordinary expressions

(=>> data (dz/where [= [+ :a :b] :c]) (take 1))
;; => [{:a 0, :b 2, :c 2}]
(=>> data (dz/where [= [+ :a :b] [+ :c :a]]) (take 1))
;; => [{:a 75, :b -43, :c -43}]
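
a nested vector form reads like the s-expression it stands for, with keywords replaced by the row's values. a plain clojure.core sketch of the first query above, over made-up rows (nested-rows and sum-ab-equals-c are names invented for this example)

```clojure
;; made-up sample rows, so the result is deterministic
(def nested-rows
  [{:a 5 :b 5 :c 1}
   {:a 0 :b 2 :c 2}
   {:a 75 :b -43 :c -43}])

;; roughly (dz/where [= [+ :a :b] :c]): keep rows where a + b equals c
(def sum-ab-equals-c
  (filter (fn [{:keys [a b c]}] (= (+ a b) c))))

(into [] sum-ab-equals-c nested-rows)
;; => [{:a 0, :b 2, :c 2}]
```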

meander just works

(=>> data (dz/where {:a ?x :b ?x :c ?x}) (take 1))
;; => [{:a -32, :b -32, :c -32}]

(require '[meander.epsilon :as m])
(=>> data (dz/where {:a (m/pred pos?)}) (take 1))
;; => [{:a 92, :b -64, :c -96}]

and it is about as fast as fine-tuned hand-written code

(enc/qb 1
  (=>> data (filter (fn [{:keys [a]}] (= a 1))))
  (=>> data (filter (fn [m] (= 1 (:a m)))))
  (=>> data (dz/where :a 1))
  (=>> data (dz/where {:a 1})))
;; => [81.88 54.14 48.77 52.16]

with

you can change an individual value in the element at index i

(=>> data (dz/with 0 :a 999) (take 1))
;; => [{:a 999, :b 23, :c 32}]

a map can be used

(=>> data (dz/with 0 {:a 999 :b -999}) (take 1))
;; => [{:a 999, :b -999, :c 32}]
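
the index form can be pictured as a map-indexed transducer that touches only one row. a minimal sketch with clojure.core, assuming that reading of dz/with is right; indexed-rows and patch-first-row are hypothetical names and the rows are made up

```clojure
;; made-up sample rows, so the result is deterministic
(def indexed-rows
  [{:a 24 :b 23 :c 32}
   {:a 53 :b 69 :c -99}])

;; roughly (dz/with 0 {:a 999 :b -999}): merge new values into the row at index 0
(def patch-first-row
  (map-indexed (fn [i m]
                 (if (zero? i)
                   (merge m {:a 999 :b -999})
                   m))))

(into [] patch-first-row indexed-rows)
;; => [{:a 999, :b -999, :c 32} {:a 53, :b 69, :c -99}]
```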

or a function

(=>> data (dz/with :d (fn [{:keys [a b]}] (+ a b 10))) (take 1))
;; => [{:a 24, :b 23, :c 32, :d 57}]

square clojure still behaves like clojure

(=>> data (dz/with :d [+ :a :b [- :c 10]]) (take 1))
;; => [{:a 92, :b -64, :c -96, :d -78}]
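
the same computed column written by hand with clojure.core, as a sketch of what the vector form appears to stand for. column-rows and add-d are names invented for this example; the single row is the one shown in the output above

```clojure
;; one row taken from the output above, so the arithmetic can be checked
(def column-rows
  [{:a 92 :b -64 :c -96}])

;; roughly (dz/with :d [+ :a :b [- :c 10]]): d = a + b + (c - 10)
(def add-d
  (map (fn [{:keys [a b c] :as m}]
         (assoc m :d (+ a b (- c 10))))))

(into [] add-d column-rows)
;; => [{:a 92, :b -64, :c -96, :d -78}]
```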

a whole column can be added

(=>> data (dz/with :d 5) (take 3))
;; => [{:a 24, :b 23, :c 32, :d 5}
;;     {:a 53, :b 69, :c -99, :d 5}
;;     {:a -4, :b 80, :c -16, :d 5}]

many things in one go

(=>> data (dz/with {:a 5 :b 10}) (take 1))
;; => [{:a 5, :b 10, :c -69}]

(=>> data (dz/with {:a [+ :a 1000] :b [+ :b 1000]}) (take 1))
;; => [{:a 927, :b 905, :c -69}]

conditional with

(=>> data
     (dz/with 0 :a -999)
     (dz/with :when [= :a -999] {:a 999 :b 999 :c 999})
     (dz/where :a 999)
     (dz/row-count))
;; => [1]
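
a hand-written sketch of the conditional form with clojure.core: rows that satisfy the condition get the new values merged in, all other rows pass through untouched. this assumes :when works as the example above suggests; when-rows and patch-when are hypothetical names and the rows are made up

```clojure
;; made-up sample rows; only the first one matches the condition
(def when-rows
  [{:a -999 :b 1 :c 2}
   {:a 5 :b 6 :c 7}])

;; roughly (dz/with :when [= :a -999] {:a 999 :b 999 :c 999})
(def patch-when
  (map (fn [m]
         (if (= -999 (:a m))
           (merge m {:a 999 :b 999 :c 999})
           m))))

(into [] patch-when when-rows)
;; => [{:a 999, :b 999, :c 999} {:a 5, :b 6, :c 7}]

;; counting the patched rows, similar in spirit to where + row-count
(count (into [] (comp patch-when (filter #(= 999 (:a %)))) when-rows))
;; => 1
```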

aggregate

wip

group-by

wip

io

wip
