nitowa a614991ff0 working graph implementation and improved shell scripts 2 years ago
config/db progress on mapping data, finding clusters, probably inefficient 2 years ago
spark-packages working graph implementation and improved shell scripts 2 years ago
src/spark working graph implementation and improved shell scripts 2 years ago
.gitignore working graph implementation and improved shell scripts 2 years ago
README.md working graph implementation and improved shell scripts 2 years ago
clean.py progress on mapping data, finding clusters, probably inefficient 2 years ago
settings.json working graph implementation and improved shell scripts 2 years ago
setup.py progress on mapping data, finding clusters, probably inefficient 2 years ago
small_test_data.csv progress on mapping data, finding clusters, probably inefficient 2 years ago
start_services.sh working graph implementation and improved shell scripts 2 years ago
submit.sh working graph implementation and improved shell scripts 2 years ago
submit_graph.sh working graph implementation and improved shell scripts 2 years ago

README.md

Project Description

TODO

Installation

Prerequisites:

For the graph implementation specifically, GraphFrames must be installed manually, because the official release is incompatible with Spark 3.x (a pull request is pending). A prebuilt copy is supplied in the spark-packages directory.

Setting up

  • Modify settings.json to reflect your setup. If you are running everything locally, you can use start_services.sh to start all required services in one go.
  • Load the development database by running python3 setup.py from the project root.
  • Start the Spark workload by running either submit.sh (slow) or submit_graph.sh (faster).
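
The steps above can be chained in a small helper script. The following is only a sketch under several assumptions: the function names (load_settings, build_steps, run_all) are hypothetical and not part of this repository, the shell scripts are assumed to be runnable with bash from the project root, and start_services.sh is assumed to return once the services are up. The only check made on settings.json is that it parses as JSON, since its keys are not documented here.

```python
import json
import subprocess
from pathlib import Path


def load_settings(path):
    """Parse settings.json so a missing or malformed file fails early."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def build_steps(use_graph=True, local=True):
    """Return the setup commands in the order the README lists them."""
    steps = []
    if local:
        # Assumption: the script can be invoked through bash.
        steps.append(["bash", "start_services.sh"])
    steps.append(["python3", "setup.py"])
    # submit_graph.sh is the faster variant, submit.sh the slow one.
    steps.append(["bash", "submit_graph.sh" if use_graph else "submit.sh"])
    return steps


def run_all(project_root=".", use_graph=True, local=True, dry_run=False):
    """Check settings.json, then run (or only print) every step."""
    root = Path(project_root)
    load_settings(root / "settings.json")
    steps = build_steps(use_graph=use_graph, local=local)
    for cmd in steps:
        print("running:", " ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, cwd=root, check=True)
    return steps
```

Calling run_all(dry_run=True) from the project root prints the commands without executing them, which is a cheap way to confirm the order before starting anything. Passing use_graph=False selects submit.sh instead of submit_graph.sh.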