Cascalog spike #28

jxa commented Mar 4, 2016

This is a WIP of an idea we had: precompute the geographical aggregation of species sightings. It is at a very early stage, and I am still learning how to use Cascalog. The idea is to skip the Datomic layer entirely to save on hosting fees now that we have lost Neo as our hosting sponsor.

When the sighting data is updated, we can spin up an AWS EMR cluster (Amazon's hosted Hadoop) and run the new file through the analysis. The result will be a set of files on S3 containing the query results for each possible query.
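To make the precompute step concrete, here is a minimal sketch in plain Python (not Cascalog) of summing counts per geographic cell and species. The record fields (`species`, `count`, `lat`, `lon`), the one-degree grid, and the sample coordinates are all assumptions made up for illustration; the real input format and aggregation granularity are not decided here.

```python
from collections import defaultdict
import math


def cell_for(lat, lon, cell_deg=1.0):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (math.floor(lat / cell_deg) * cell_deg,
            math.floor(lon / cell_deg) * cell_deg)


def aggregate(sightings, cell_deg=1.0):
    """Sum sighting counts per (grid cell, species) pair."""
    totals = defaultdict(int)
    for s in sightings:
        key = (cell_for(s["lat"], s["lon"], cell_deg), s["species"])
        totals[key] += s["count"]
    return dict(totals)


# Hypothetical sample records; the coordinates are invented for the example.
sightings = [
    {"species": "Bald Eagle", "count": 2, "lat": 57.8, "lon": -152.4},
    {"species": "Bald Eagle", "count": 1, "lat": 57.2, "lon": -152.9},
    {"species": "Mallard", "count": 2, "lat": 58.3, "lon": -134.4},
]
totals = aggregate(sightings)
```

Each key in `totals` corresponds to one precomputed answer, so a lookup at request time would need no database.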

Open questions:

  • How can we add a JSON / EDN / Transit tap to Cascalog?
  • How much will the new scheme cost (ballpark)?
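On the first question, whichever tap ends up being used, it may help to pin down what one result file would look like. A sketch in plain Python (not Cascalog) of one possible key layout and JSON body; the `results` prefix, the `lat=`/`lon=` naming, and the row fields are all hypothetical.

```python
import json


def result_key(cell, prefix="results"):
    """Build a path-like key for one grid cell (layout is an assumption)."""
    lat, lon = cell
    return f"{prefix}/lat={lat}/lon={lon}.json"


def result_body(species_counts):
    """Serialize one cell's species totals as a JSON array, sorted by name."""
    rows = [{"species": name, "count": n}
            for name, n in sorted(species_counts.items())]
    return json.dumps(rows)


key = result_key((57, -153))
body = result_body({"Mallard": 2, "Bald Eagle": 90})
```

EDN or Transit would follow the same shape with a different serializer.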

jxa commented Mar 4, 2016

Currently outputs the following to stdout:

RESULTS
-----------------------
American Dipper 4
American Tree Sparrow   1
Bald Eagle  90
Black-legged Kittiwake  10
Black Oystercatcher 10
Black Scoter    148
Bufflehead  1
Common Goldeneye    1
Common Loon 1
Common Merganser    2
Common Raven    51
Common Redpoll  10
Double-crested Cormorant    1
Emperor Goose   455
Gray-crowned Rosy-Finch 3
Glaucous-winged Gull    68
Green-winged Teal   87
Greater Scaup   205
Harlequin Duck  227
Long-tailed Duck    150
Mallard 2
Mew Gull    300
Pelagic Cormorant   65
Peregrine Falcon    1
Pigeon Guillemot    68
Red-breasted Merganser  5
Red-faced Cormorant 2
Rock Ptarmigan  36
Rock Sandpiper  17
Song Sparrow    7
Steller's Eider 32
White-winged Scoter 103
American Dipper 1
American Dipper 1
American Dipper 1
American Dipper 2
American Dipper 1
American Robin  4
American Robin  4
American Robin  10
American Wigeon 1
Bald Eagle  1
Bald Eagle  2
Bald Eagle  2
Bald Eagle  2
Bald Eagle  3
Bald Eagle  1
Bald Eagle  1
Bald Eagle  1
Bald Eagle  2
Bald Eagle  1
Bald Eagle  2
Bald Eagle  1
Bald Eagle  2
Bald Eagle  1
Bald Eagle  1
Bald Eagle  1
Bald Eagle  1
Bald Eagle  1
Bald Eagle  1
Black-billed Magpie 1
Black-billed Magpie 17
Black-billed Magpie 20
Black-billed Magpie 12
Black-billed Magpie 5
Black-billed Magpie 4
Black-billed Magpie 2
Black-billed Magpie 4
Black-billed Magpie 8
Black-billed Magpie 2
Black-billed Magpie 1
Black-billed Magpie 7
Black-billed Magpie 2
Black-billed Magpie 2
Black-billed Magpie 2
Black-billed Magpie 10
Black-billed Magpie 2
Black-billed Magpie 5
Black-billed Magpie 5
Black-billed Magpie 3
Black-billed Magpie 14
Black-billed Magpie 1
Black-billed Magpie 2
Black-billed Magpie 2
Black-billed Magpie 2
Black-billed Magpie 30
Black-billed Magpie 29
Black-billed Magpie 29
Black-capped Chickadee  6
Black-capped Chickadee  20
Black-capped Chickadee  40
Black-capped Chickadee  25
Black-capped Chickadee  28
Black-capped Chickadee  9
Black-capped Chickadee  28
Black-capped Chickadee  7
Black-capped Chickadee  20
Black-capped Chickadee  22
Black-capped Chickadee  15
-----------------------
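Several species appear on more than one row above (American Dipper, for example). If a single total per species is the eventual goal, the rows would need to be summed. A sketch in plain Python of doing that over output like this, assuming each data line is a species name followed by whitespace and an integer count:

```python
import re
from collections import Counter

# A data line is a name, then whitespace, then a trailing integer.
ROW = re.compile(r"^(.*\S)\s+(\d+)$")


def sum_by_species(lines):
    """Sum the trailing counts per species, skipping non-data lines."""
    totals = Counter()
    for line in lines:
        m = ROW.match(line.strip())
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals


sample = [
    "RESULTS",
    "-----------------------",
    "American Dipper 4",
    "Bald Eagle  90",
    "American Dipper 1",
]
totals = sum_by_species(sample)
```

Header and separator lines fail the pattern and are ignored.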
