PR #8 made certain types more polymorphic, but it also reshuffled some of the existing traits a bit. In particular, we added two new methods to the `Node` and `Edge` traits, respectively `fn node_type(&self) -> types::NodeType` and `fn edge_type(&self) -> types::EdgeType`, which return the concrete types.

This is something that @cloudhead wasn't very fond of, but it was initially justified by the fact that for Osrank it is very convenient to be able to use the concrete types. The same concrete types are also used in the `EdgeRef` struct.

Here I am going to explain in full detail why I have done this, to leave a testament behind of my thought process so that this design can be improved.

The main reason I came up with those extra methods/fields in the first place is that I was under the impression the Engineering team had settled on a design where we were going to use the concrete types as the "meeting point" between the Registry & Osrank; in such a scenario it is extremely useful for Osrank & Registry to share the same types. Not only that, but let's take as an example a real piece of code from Osrank:
```rust
impl<W, R> DynamicWeights for Network<W, R>
where
    W: Clone + Mul<Output = W> + Div<Output = W> + From<Weight>,
    R: Clone + Zero,
{
    fn dynamic_weight(
        &self,
        edge: &impl Edge<Self::Weight, <Self::Node as GraphObject>::Id, Self::EdgeData>,
        hyperparams: &types::HyperParameters<Self::Weight>,
    ) -> Self::Weight {
        let e_type = edge.edge_type();

        // Let's start by assigning this edge the stock default value, by
        // reading it from the hyperparams.
        let mut weight: Self::Weight = (hyperparams.get_param(&e_type.to_tag())).clone();

        // others can't be zero as there is at least one edge, i.e. the
        // input one.
        let others = edges_of_same_type(self, edge, Direction::Outgoing, e_type);

        let source_node = self
            .get_node(edge.source())
            .expect("dynamic_weight: source node not found.");

        // Then we need to do something different based on the type of edge.
        match e_type.to_tag() {
            types::EdgeTypeTag::ProjectToUserContribution => {
                // contrib is multiplied by the number of contributions of
                // the account to the project, divided by the total number
                // of contributions in the project.
                let total_project_contrib = source_node.node_type().total_contributions();
                let user_contribs = edge.edge_type().total_contributions();
                weight = weight * Weight::new(user_contribs, total_project_contrib).into()
            }
            types::EdgeTypeTag::UserToProjectContribution => {
                // contrib* and maintain* are multiplied by the number of
                // contributions of the account to the project, divided by
                // the total number of contributions of the account.
                let total_account_contrib = source_node.node_type().total_contributions();
                let user_contribs = edge.edge_type().total_contributions();
                weight = weight * Weight::new(user_contribs, total_account_contrib).into()
            }
            types::EdgeTypeTag::UserToProjectMembership => {
                // The weight is divided by the corresponding count of
                // outgoing edges of the same type on the node.
                weight = weight / others.into()
            }
            types::EdgeTypeTag::ProjectToUserMembership => {
                // contrib* and maintain* are multiplied by the number of
                // contributions of the account to the project, divided by
                // the total number of contributions of the account.
                let total_account_contrib = source_node.node_type().total_contributions();
                let user_contribs = edge.edge_type().total_contributions();
                weight = weight * Weight::new(user_contribs, total_account_contrib).into()
            }
            types::EdgeTypeTag::Dependency => {
                // The weight is divided by the corresponding count of
                // outgoing edges of the same type on the node.
                weight = weight / others.into()
            }
        }

        weight
    }
}
```
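Here I was able to write this trait implementation in a fairly polymorphic way because I could rely on calling `.edge_type()` and `.node_type()` and be sure they would return what I was expecting. In particular, I was able to write `impl Edge<Self::Weight, <Self::Node as GraphObject>::Id, Self::EdgeData>` and pass any type which implements that trait. If we were to remove the `node_type`/`edge_type` methods (because the concrete types will probably live inside the `NodeData`/`EdgeData`), we would need, at the very minimum, proper `std::convert::Into` instances plus an explicit conversion in the code wherever a concrete type is required. This is still doable, albeit unfortunate.

As regards the `EdgeRef` type, note that we cannot have it mention `EdgeData` directly, because an `EdgeData` exists only in the context of a `Graph`. We have two options here:

1. We make `EdgeRef` polymorphic over the graph.
2. We make `EdgeData` (or even `EdgeType` at this point?) an extra type parameter: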
```rust
pub struct EdgeRef<'a, NodeId, EdgeId, EdgeType> {
    pub from: &'a NodeId,
    pub to: &'a NodeId,
    pub id: &'a EdgeId,
    pub edge_type: &'a EdgeType, // This is not the concrete one but a free variable
}
```
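For comparison, the other option (making `EdgeRef` polymorphic over the graph) could look roughly like the sketch below. This is only an illustration under assumptions: the `Graph` trait with its associated types, `ToyGraph`, `ToyEdgeData` and `is_dependency` are hypothetical names invented here, not the actual graph-api definitions.

```rust
// Hypothetical trait: a graph exposes its id and edge-data types as
// associated types, so nothing else needs to be threaded through as a
// free type parameter.
pub trait Graph {
    type NodeId;
    type EdgeId;
    type EdgeData;
}

// `EdgeRef` is parameterised by the graph only; every field type is
// derived from `G`, so an `EdgeRef` is always "an edge reference over G".
pub struct EdgeRef<'a, G: Graph> {
    pub from: &'a G::NodeId,
    pub to: &'a G::NodeId,
    pub id: &'a G::EdgeId,
    pub data: &'a G::EdgeData,
}

// A toy graph type, used only to show how the associated types get fixed.
pub struct ToyGraph;

#[derive(Debug, PartialEq)]
pub enum ToyEdgeData {
    Dependency,
    Contribution(u32),
}

impl Graph for ToyGraph {
    type NodeId = u32;
    type EdgeId = u32;
    type EdgeData = ToyEdgeData;
}

// Code written against a specific graph sees the concrete edge data again.
pub fn is_dependency(eref: &EdgeRef<ToyGraph>) -> bool {
    matches!(eref.data, ToyEdgeData::Dependency)
}
```

The trade-off in this sketch is that generic code only ever sees `G::EdgeData`, so any code that needs the concrete edge type either fixes `G` (as `is_dependency` does) or adds a conversion bound on `G::EdgeData`.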
I don't have an intuition on which method is better, but option 2 feels unnatural, as it makes sense to talk about a specific `EdgeRef` over a graph `G`. Last but not least, the reason why I added `edge_type` to `EdgeRef` in the first place is that it's very handy to have it "pre-computed" in a situation like this:
```rust
for eref in network.edges_directed(&current_node_id, Direction::Outgoing) {
    possible_edge_types.insert(eref.edge_type);
}
```
If we didn't have this, we would have to fetch the info from the graph, which is very inefficient:
```rust
for eref in network.edges_directed(&current_node_id, Direction::Outgoing) {
    let edge_type = network.get_edge(eref.id()).unwrap().data().into();
    possible_edge_types.insert(edge_type);
}
```
And that obviously also requires the `Into<types::EdgeType>` constraint.
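To make the comparison concrete, here is a minimal, self-contained sketch of the two variants side by side. Everything in it is a simplified stand-in and should be read as an assumption: `Network`, `StoredEdge`, `outgoing`, `get_edge` and the three-variant `EdgeType` are hypothetical and much smaller than the real graph-api types (for instance, `get_edge` is a plain linear scan here).

```rust
use std::collections::HashSet;

// Hypothetical, simplified edge type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum EdgeType {
    Contribution,
    Membership,
    Dependency,
}

// Hypothetical stored edge: the edge type is kept next to the payload `D`.
pub struct StoredEdge<D> {
    pub id: usize,
    pub source: usize,
    pub edge_type: EdgeType,
    pub data: D,
}

// The reference handed out while iterating, with `edge_type` "pre-computed".
pub struct EdgeRef<'a> {
    pub id: &'a usize,
    pub edge_type: &'a EdgeType,
}

pub struct Network<D> {
    pub edges: Vec<StoredEdge<D>>,
}

impl<D> Network<D> {
    // Outgoing edges of `node`, as lightweight references.
    pub fn outgoing(&self, node: usize) -> impl Iterator<Item = EdgeRef<'_>> + '_ {
        self.edges
            .iter()
            .filter(move |e| e.source == node)
            .map(|e| EdgeRef { id: &e.id, edge_type: &e.edge_type })
    }

    // Full lookup of an edge by id (a linear scan in this toy version).
    pub fn get_edge(&self, id: &usize) -> Option<&StoredEdge<D>> {
        self.edges.iter().find(|e| e.id == *id)
    }
}

// Variant 1: read the type straight off the `EdgeRef`; no bound on `D`.
pub fn types_precomputed<D>(net: &Network<D>, node: usize) -> HashSet<EdgeType> {
    net.outgoing(node).map(|eref| *eref.edge_type).collect()
}

// Variant 2: go back to the graph for every edge and convert its payload,
// which costs one extra lookup per edge and needs `D: Into<EdgeType>`.
pub fn types_via_lookup<D>(net: &Network<D>, node: usize) -> HashSet<EdgeType>
where
    D: Clone + Into<EdgeType>,
{
    net.outgoing(node)
        .map(|eref| -> EdgeType { net.get_edge(eref.id).unwrap().data.clone().into() })
        .collect()
}
```

Both functions return the same set; the second one simply pays for an additional lookup per edge and pushes the `Into<EdgeType>` bound onto every caller.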
I hope this is useful as a testament for @MeBrei and the rest of the crew :)
Something else to consider: at the moment, we have the following types:
```rust
pub enum NodeType {
    /// A user, eg. contributor, project member etc.
    User { contributions_to_all_projects: u32 },
    /// A project with users as members and contributors.
    Project { contributions_from_all_users: u32 },
}
```
The idea here is to store the number of contributions at the graph-node level, to later use this information in the dynamic weight calculation in osrank. However, having this static information is not so great when we actively prune the graph, namely during the TrustRank phase.

In the "real world" it is conceivable that we will prune the graph (via the TrustRank phase, based on the seed set) and yield some sort of `GraphView` which holds references to just the nodes we are retaining. At this point we have a problem, though: the TrustRank phase might eliminate neighbouring nodes of one or more nodes contained in the `GraphView`, which means the total number of contributions has changed and it would be wrong to use the static information. We also cannot mutate the graph from the algorithm, because we are passed a `&'a G`.

Now, truth be told, we could always clone the graph (as an interim hack) and mutate it inside the algorithm, but this is not very scalable. Rather, we should do what @MeBrei suggested in another comment: somehow (read: in an efficient way) we should compute the "total number of contributions for project X" and "all contributions of user U to all projects" on the fly, as in both cases this is just a matter of grabbing the outgoing/incoming edges of the right `EdgeType` and summing the contribution value of each `Edge`.

In such a scenario it is perhaps convenient to have the concrete `EdgeType` and `NodeType` in scope (to perform such filtering on the graph-api side instead of leaving the burden of implementing it on the osrank side), which might tip the scale towards one or the other of the designs proposed above.
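As a rough illustration of the "on the fly" computation, the sketch below sums the contributions carried by a project's outgoing contribution edges while skipping edges whose target was pruned. The types are assumptions made up for the example: `Edge`, the payload-carrying `EdgeType` variants, the `retained` set standing in for a `GraphView`, and `total_project_contributions` are all hypothetical.

```rust
use std::collections::HashSet;

// Hypothetical, simplified edge type: contribution edges carry their count.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EdgeType {
    ProjectToUserContribution(u32),
    UserToProjectContribution(u32),
    Dependency,
}

// Hypothetical edge with plain integer node ids.
pub struct Edge {
    pub source: usize,
    pub target: usize,
    pub edge_type: EdgeType,
}

/// Sums the contributions on the outgoing `ProjectToUserContribution`
/// edges of `project`, ignoring edges whose target node is not in
/// `retained` (i.e. was pruned). The underlying edge list is never mutated.
pub fn total_project_contributions(
    edges: &[Edge],
    retained: &HashSet<usize>,
    project: usize,
) -> u32 {
    edges
        .iter()
        // Keep only the outgoing edges of this project that survive pruning.
        .filter(|e| e.source == project && retained.contains(&e.target))
        // Keep only edges of the right concrete type, extracting the count.
        .filter_map(|e| match e.edge_type {
            EdgeType::ProjectToUserContribution(n) => Some(n),
            _ => None,
        })
        .sum()
}
```

Because the total is derived from whichever edges are still visible, it stays consistent with the pruned view without cloning or mutating the graph. The filtering step is also where having the concrete `EdgeType` in scope pays off.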