Skip to content

GSoC 2024 Brandes Betweenness Centrality

bedupako12mas edited this page Aug 28, 2024 · 53 revisions

Table of Contents

Proposal

Brief Description

The project aims to add the following algorithm to in pgRouting during GSoC 2024:

  • Brandes Betweenness Centrality

    • Betweenness centrality is a measure of node importance based on the number of shortest paths between other node pairs that pass through that node.
    • Brandes's algorithm exploits the sparse nature of typical real-world graphs, and computes the betweenness centrality score for all vertices in the graph
    • It is implemented in the Boost Graph Library as Boost::brandes_betweenness_centrality
    • It works O(VE) for unweighted graphs and O(VE + V(V+E) log V) for weighted graphs. The space complexity is O(VE). [V = number of vertices of the graph; E = number of edges of the graph]

State of the Project Before GSoC

Currently there are no algorithms in pgRouting which compute Graph Metrics like betweenness centrality, bandwidth etc.

Deliverables

  • Implementation of pgr_brandesBetweennessCentrality()
  • Code with detailed and essential comments following the style guide and best practices
  • User’s documentation of the function
  • Basic pgTap test cases
  • A wiki page detailing weekly progress
  • Detailed report for the first and final evaluations

Detailed Proposal

Detailed Proposal Link (Google Doc)

Participants

Title GitHub Handle Name
1st Mentor @cvvergara Vicky Vergara
2nd Mentor @krashish8 Ashish Kumar
Student Developer @bedupako12mas Arun Thakur

Timeline

Community Bonding Period

  • Introduce myself to the pgRouting mentor(s) and other community members
  • Set up my development environment
  • Familiarize myself with the pgRouting codebase and development practices
  • Learn pgTap and Boost Graph Library to help with the project
  • Setup wiki page for tracking weekly progress of the project
  • Participate in community discussions, meetings, or forums to engage with the project
  • Engage with PostgreSQL and PostGIS to gain more insight and comfort for working on the project

Week -1

  • Introduce myself to the pgRouting mentor(s) and other community members
  • Set up my development environment
  • Setup wiki page for tracking weekly progress of the project'

Week 0

  • Familiarize myself with the pgRouting codebase, documentation and development practices
  • Study the Boost Graph Library and Brandes algorithm implementation to help with the project
  • Go through training videos provided during the application period

First Coding Period

Week 1

  • Start designing pgr_brandesBetweennessCentrality()
  • Prepare basic framework for C,C++,sql code

Weekly Report

Pull Request of the week: #378

  • Attended weekly meeting on Meeting on May 27th 2024
  • Changed the name of the function from pgr_bradesBetweennessCentrality to just pgr_centrality
  • Added some of the necessary files for implementing pgr_centrality.
    • doc/metrics/ - CMakeLists.txt, metrics-family.rst, pgr_centrality.rst
    • docqueries/metrics - CMakeLists.txt, doc-centrality.result, doc-centrality.test.sql, test.conf
    • sql/metrics - CMakeLists.txt, _centrality.sql, centrality.sql
    • src/metrics - CMakeLists.txt, centrality.c, centrality_driver.cpp
    • include/drivers - Was not sure about what to add since there were two allpairs folders.
    • pgtap/metrics - Did not add as I wont be working on it just yet and I was unsure about what files I had to add.
  • Modified the files in the sql/ and src/ directories (licenses and function names)
  • Added metrics to configuration.conf
  • Studied BGL's Brandes Algorithm for Betweenness Centrality and wrote a small code snippet for it on the sample data.(Link)
  • What I Plan on doing next week?

    • Finishing up the basic code skeleton by adding the relevant driver files and pgtap files.
    • Start working on sample test cases for the documentation and on the documentation
  • Am I Blocked on anything?

    • The job that I have right now has me working 12 hour shifts (can be night shifts later on) on a 20 days on and 10 days off split. So the hours that I put in might not be enough right now as I am getting used to the work culture and schedule. I will restructure the deliverables such that I do most of the work during those 10 days off and hit the mid term evaluation goals on time. Luckily, I will have my holidays from June 15th and will be able to dedicate more time to cover up for the previous weeks.

Week 2

  • Create basic framework for tests and documentation
  • Start implementing pgr_brandesBetweennessCentrality()

Weekly Report

Pull Request of the week: #383

  • Attended weekly meeting on 3rd June 2024
  • Worked on and added documenation in pgr_centrality.rst
  • Studied the codebase to make docqueries work.
  • Merged remote tracking branch pgrouting develop into centrality-week2
  • Added remaining necessary file to make the code compile
    • include/drivers/metrics/centrality_driver.h
  • Adjusted centrality.c
  • worked on documentation specifically on pgr_centrality.rst
  • Start implementing pgr_centrality()
    • This week I couldn't work a lot due to frequent power cuts in Pune due to the heavy rainfall.
  • What I Plan on doing next week?

    • Finishing up the basic code skeleton by adding the relevant driver files and pgtap files.
    • Start working on sample test cases for the documentation and on the documentation
    • Get the docqueries to work
    • Fix the problems in the code from the first two pull requests
  • Am I Blocked on anything?

    • The job that I have right now has me working 12 hour shifts (can be night shifts later on) on a 20 days on and 10 days off split. So the hours that I put in might not be enough right now as I am getting used to the work culture and schedule. All the time lost and work not done will be compensated in the following two weeks as I have days off from work starting from 13th June 2024 till 22nd June 2024.

Week 3

  • Read data from user (postgresql)
  • Start creating pipeline to process SQL data to C function

Weekly Report

Pull Request of the week: [386]

  • Attended weekly meeting on 10th June 2024
  • Created an include/metrics/centrailty.hpp
  • Worked on include/metrics/centrality.hpp
  • What I Plan on doing next week?

    • Finish up centrality.hpp
    • Start working on the C and SQL portions of the code
    • Work on docqueries and docs if possible
  • Am I Blocked on anything?

    • I am free this week as I am on vacation. I will be leaving for Abu Dhabi on June 23rd (Sunday) and apart from that there are no other commitments.

Week 4

  • Finish data pipeline from SQL to C function
  • Write helper class and wrappers
  • C driver to use Boost C++ function

Weekly Report

Pull Request of the week: [388]

  • Attended weekly meeting on 17th June 2024
  • remove any reference to floydWarshall or johnson in the betweenness centrality code
  • work on include/metrics/centrality.hpp to make the code compile
  • change centrality to betweennesscentrality The code seemed to break everytime and I couldnt figure out the reason yet
  • prepare docqueries with expected output
  • adjust the C, C++ and sql code to generate the expected output
  • code should compile
  • What I Plan on doing next week?

    • Comeplete whatever part remains from the above checklist and discuss the further course of action with my mentor
  • Am I Blocked on anything?

    • I now have regular "school" from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out.

Week 5

  • Create sql query examples
  • Work on tasks for midterm evaluation submission

Weekly Report

Pull Request of the week: [390]

  • changed centrality to betweennesscentrality
  • prepared docqueries with expected output
  • adjusted the C, C++ and sql code to generate the expected output
  • What I Plan on doing next week?

    • minor changes to code to fix expected output problems
    • work on docqueries and documentation
    • clean and finish up code to provide basic pgRouting function _pgr_betweennessCentrality() with documentation as planned in the proposal.
  • Am I Blocked on anything?

    • I now have regular "school" from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out.

Week 6

  • Finalize the function and documentation
  • Prepare report for first evaluation

Weekly Report

Pull Request of the week: [392]

  • Attended meeting on 2nd July 2024
  • change betweenness_centrality to centrality
  • change the name of docs from pgr_centrality.rst to pgr_betweennessCentrality.rst
  • Fix the code for directed graphs
  • change betweenesscentrality to betweennessCentrality
  • Finish the docs and pgr_betweennessCentrality to experimental.rst
  • finish docqueries
    • cleanup test.conf
    • write explicit, and implicit test cases for directed and undirected graphs
    • for all docqueries test provide by-hand calculation
  • What I Plan on doing next week?

    • work on feedback from the first evaluation
  • Am I Blocked on anything?

    • I now have regular "school" from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out.

Second Coding Period

Week 7

  • Work on feedback from the first evaluation

Weekly Report

Pull Request of the week: [393]

  • made minor changes to docs
  • I was extremely sick for the past week and was barely able to work.
  • What I Plan on doing next week?

    • work on feedback from the first evaluation
    • start working on pgTap tests.
  • Am I Blocked on anything?

    • I now have regular "school" from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am also recovering from food poisoning and the flu. Things will get back on track by next week.

Week 8

  • Learn how to make pgTap tests
  • Write meaningful pgTap tests

Weekly Report

Pull Request of the week: [396]

  • added pgtap folder and types_check.pg
  • I was extremely sick for the past week and was barely able to work. I am still recovering from the flu and food poisoning. Things will get back on track from Week 9 onwards

  • What I Plan on doing next week?

    • start working on pgTap tests.
  • Am I Blocked on anything?

    • I now have regular “school” from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am also recovering from food poisoning and the flu. Things will get back on track by next week.

Week 9

  • Fix all bugs and documentation problems
  • Create example queries for documentation using pgRouting Sample Data

Weekly Report

Pull Request of the week: [398]

  • Report Email [OSGeo Discourse]

  • What I got done this week?

    • changed floydwarshall to betweennesscentrality in types_check.pg
    • added pgtap/metrics/betweenessCentrality/edge_cases.pg
    • added pgtap/metrics/betweenessCentrality/no_crash_test.pg
    • added pgtap/metrics/betweenessCentrality/inner_query.pg
    • all added pgTap tests pass (81/81)
    • attended weekly meeting on July 22 2024
  • What I Plan on doing next week?

    • Attend the weekly meeting with my mentor
    • Plan out further tests for my function
    • Update the wiki by the end of the week upto week 10
  • Am I Blocked on anything?

    • I now have regular “school” from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am also recovering from food poisoning and the flu. Things will get back on track by next week.

Week 10

  • Fix any remaining bugs and documentation problems

Weekly Report

Pull Request of the week: [399]

  • Report Email [OSGeo Discourse]

  • What I got done this week?

    • Added all mandatory pgTap tests
    • Removed all linting errors from include/metrics/betweennessCentrality.hpp
    • attended weekly meeting on July 29 2024
  • What I Plan on doing next week?

    • Attend the weekly meeting with my mentor
    • Plan out further tests for my function
    • Update the wiki by the end of the week upto week 11
  • Am I Blocked on anything?

    • I now have regular “school” from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am also recovering from food poisoning and the flu. Things will get back on track by next week.

Week 11

  • Prepare PR to the develop branch to the main repository
  • Prepare for finalizing the project and review the work

Weekly Report

Pull Request of the week: [402]

  • added comments
  • fixed all linting errors in driver code
  • all pgTap tests pass (90/90) I was unable to communicate with my mentors due to an error with my gitter account and heavy workload in my job due to 10 hour oil rig drilling practicals and assessments.
  • What I Plan on doing next week?
    • Attend the weekly meeting with my mentor
    • Plan out further course of action
    • Merge the function to the main pgRouting repository as an experimental function
  • Am I Blocked on anything?
    • I now have regular “school” from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am free this week and plan on finishing up the project this week

Week 12

  • Prepare final evaluation report and finish any remaining tasks
  • Submit Final evaluation report

Weekly Report

Pull Request of the week: [403]

  • added comments
  • fixed all linting errors in driver code
  • all pgTap tests pass (90/90) I was unable to communicate with my mentors due to an error with my gitter account and heavy workload in my job due to 10 hour oil rig drilling practicals and assessments.
  • What I Plan on doing next week?
    • Attend the weekly meeting with my mentor
    • Plan out further course of action
    • Merge the function to the main pgRouting repository as an experimental function
  • Am I Blocked on anything?
    • I now have regular “school” from 9 am to 5pm in Abu Dhabi (UTC +4:00). Apart from that I have no other commitments and will be able to work consistently from here on out. I am free this week and plan on finishing up the project this week

Log of Pull Requests

Link to all the Pull Requests made in GSoC-pgRouting repository

Pull Request Description Date Status
#378 Week 1 May 27 2024 Merged
#383 Week 2 June 03 2024 Merged
#386 Week 3 June 10 2024 Merged
#388 Week 4 June 17 2024 Merged
#390 Week 5 June 24 2024 Merged
#392 Week 6 July 01 2024 Merged
#393 Week 7 July 15 2024 Merged
#396 Week 8 July 15 2024 Merged
#398 Week 9 July 15 2024 Merged
#399 Week 10 July 15 2024 Merged
#402 Week 11 July 15 2024 Merged
#403 Week 12 July 15 2024 WIP

Final Report

Report Mail - [OSGeo Discourse]

Hello everyone,

With GSoC coming to an end, I hereby present my final report of the work I have done over the past three months. It has been an amazing learning experience and great time working with the pgRouting community and mentors.

Title: Implement Brandes Betweenness Centrality Algorithm for pgRouting by the Boost Graph Library

Organisation: pgRouting under the umbrella of OSGeo

Abstract: This GSoC project dealt with the implementation of one new Graph algorithm in pgRouting. The algorithm is described below:

  • Betweenness Centrality: It is an algorithm used to compute the betweenness centrality of all vertices of a graph. Betweenness centrality is a measure that quantifies the importance of a vertex based on its occurrence on shortest paths between other vertices. Higher the betweenness centrality score, higher its importance to the graph as most of the routing will occur through it.

It is implemented in Boost Graph Library (BGL) as boost::brandes_betweenness_centrality. Brandes's algorithm computes the betweenness centrality score for all vertices in the graph in O(VE) for unweighted graphs and O(VE + V(V+E) log V) for weighted graphs. The space complexity is O(VE)

State of the Project Before GSoC: pgRouting currently does not have this algorithm implemented. A single standard function does not exist for the betweenness centrality algorithm. This algorithm will be the first one from the Metrics family.

State of the Project After GSoC: The deliverables are code, documentation, documentation tests, and the pgTAP tests of the function pgr_betweennessCentrality.

Potential Future Work:

  • I would like to provide more tests for my function to promote it from experimental to proposed status.

  • My implementation of Betweenness Centrality in pgRouting is based on older pgRouting function templates. I aim to refactor the code into a header-only file, similar to the Boost Library's approach which is the case for some of the pgRouting functions.

  • The implementation of Brandes Betweenness Centrality is just one of the several metrics family algorithms in the BGL. I would love to implement all the remaining algorithms in this family after GSoC.

Links:

I am so grateful to be a part of the amazing GSoC and OSGeo communities. I have learned a lot during this program and It has helped me kick start my software development journey. I would be happy if my code proves beneficial to the community. Last but not the least, thank you to my mentors and the entire community for the support!

Thanks and Regards,

Arun Thakur

Community Bonding Period

Week 0

  • Report Email [SoC] [pgRouting-dev]

  • What I got done this week?

    • Finished Wiki setup
    • Went through brandes_betweenness_centrality documentation and ran code samples
    • Went through OSGeo pgRouting training videos
    • Setup the development environment
  • What I Plan on doing next week?

    • Prepare basic framework for the C, C++, SQL code
    • start designing pgr_betweennessCentrality() (name to be finalized soon)
  • Am I Blocked on anything?

    • The job that I have right now has me working 12 hour shifts (can be night shifts later on) on a 20 days on and 10 days off split. So the hours that I put in might not be enough right now as I am getting used to the work culture and schedule. I will restructure the deliverables such that I do most of the work during those 10 days off and hit the mid term evaluation goals on time. I will keep the mentors posted on any updates I have which may affect my work with feasible alternatives for both the mentors and I. The work will not be compromised upon and all the deliverables will be completed on time.
  • Meetings attended this week:

    • May 19th 2024: File structure and implementation details, basic output testing and pgtap tests were discussed

Week -1

  • Report Email [SoC] [pgRouting-dev]

  • What I got done this week?

    • Finished up cleaning up the Wiki template and started filling in the relevant fields.
    • Finished filling up the OSGeos wiki for OSGeo-GSoC 2024.
  • What I Plan on doing next week?

    • Finish up and streamline the wiki and weekly reports.
    • Go through and study BGL documentation for brandes betweenness centrality and run samples.
    • Go through pgRouting docs and the entire study material videos provided during the applications period. (Link to Training Videos)
    • Start planning how to implement pgr_brandesBetweennessCentrality() and the work for the coding period starting from next week
    • Create my own branch (from develop/upstream)
    • Plan out how to structure my weeks and daily routine to meet the Mid-Term Evaluation deliverables
  • Am I Blocked on anything?

    • The job that I have right now has me working 12 hour shifts (can be night shifts later on) on a 20 days on and 10 days off split. So the hours that I put in might not be enough right now as I am getting used to the work culture and schedule. I will restructure the deliverables such that I do most of the work during those 10 days off and hit the mid term evaluation goals on time. I will keep the mentors posted on any updates I have which may affect my work with feasible alternatives for both the mentors and I. The work and meetings will not be compromised upon and all the deliverables will be completed on time.
  • Meetings attended this week:

    • May 13th 2024: Introductory meeting and briefing upon basic tasks to do after missing the first meeting on May 6th (missed by me)

References

  1. Boost brandes_betweenness_centrailty

  2. A faster algorithm for Betweenness Centrality

Clone this wiki locally