Merge the two tutorials on simplicial complex construction #27

Open
astamm opened this issue Sep 18, 2020 · 2 comments
astamm (Collaborator) commented Sep 18, 2020

The tutorials "Simplicial complexes from data points" and "Simplicial complexes from distance matrix" are almost identical.

I propose that we merge them. The Rips complex takes one of three possible main arguments:

  • a pairwise distance matrix, which it uses directly to build the complex;
  • a point cloud, from which Gudhi computes the pairwise distance matrix before proceeding;
  • an OFF file containing the point cloud, which is read first.

So building a Rips complex from data points or from a distance matrix is immediate.
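To make the equivalence of the first two inputs concrete, here is a minimal pure-Python sketch. It is not Gudhi's implementation; the helper names `pairwise_distances`, `rips_from_distance_matrix` and `rips_from_points` are hypothetical and only illustrate that the point-cloud route reduces to the distance-matrix route.

```python
from itertools import combinations
from math import dist


def pairwise_distances(points):
    """Return the full symmetric Euclidean distance matrix of a point cloud."""
    return [[dist(p, q) for q in points] for p in points]


def rips_from_distance_matrix(distance_matrix, max_edge_length, max_dimension=2):
    """List the simplices of the Rips complex up to max_dimension.

    A set of vertices spans a simplex when every pairwise distance
    between them is at most max_edge_length.
    """
    n = len(distance_matrix)
    simplices = []
    for size in range(1, max_dimension + 2):
        for vertices in combinations(range(n), size):
            if all(distance_matrix[i][j] <= max_edge_length
                   for i, j in combinations(vertices, 2)):
                simplices.append(vertices)
    return simplices


def rips_from_points(points, max_edge_length, max_dimension=2):
    """Point-cloud input: compute the distance matrix, then proceed as above."""
    return rips_from_distance_matrix(
        pairwise_distances(points), max_edge_length, max_dimension
    )


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
matrix = pairwise_distances(points)

from_points = rips_from_points(points, max_edge_length=1.5)
from_matrix = rips_from_distance_matrix(matrix, max_edge_length=1.5)
same_complex = from_points == from_matrix
```

Both calls return the same list of simplices, which is why a single tutorial could cover both inputs for the Rips complex.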

The case of the Alpha complex is less trivial. In theory, it needs a point cloud on which the Delaunay triangulation is computed, so it is natural that the class has a data_points main argument.

Gudhi currently does not allow a distance matrix to be passed as an argument to the Alpha complex constructor. The reason seems to be that, for a purely metric space, we would have to resort to statistical techniques (such as MDS) that map the data from that metric space into R^p, producing a point cloud that only approximately respects the original distances between points, and then build the Alpha complex on that point cloud.
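As a small illustration of why such an embedding is in general only approximate, the sketch below checks a 4-point metric that cannot be realized exactly in any Euclidean space. It relies on the standard fact that distances between Euclidean points give a positive semidefinite Gram matrix. The function names and the example metric are illustrative assumptions, not part of Gudhi.

```python
def gram_matrix(distance_matrix, base=0):
    """Gram matrix of the points relative to a base point.

    If the distances come from points x_i in a Euclidean space, then
    G[i][j] = <x_i - x_base, x_j - x_base>
            = (d(base, i)^2 + d(base, j)^2 - d(i, j)^2) / 2,
    so G has to be positive semidefinite.
    """
    d = distance_matrix
    others = [i for i in range(len(d)) if i != base]
    return [[(d[base][i] ** 2 + d[base][j] ** 2 - d[i][j] ** 2) / 2
             for j in others] for i in others]


def quadratic_form(matrix, vector):
    """Evaluate v^T M v."""
    n = len(vector)
    return sum(vector[i] * matrix[i][j] * vector[j]
               for i in range(n) for j in range(n))


# Shortest-path metric of a star graph: vertex 0 is the center,
# vertices 1..3 are leaves at distance 1 from it and 2 from each other.
star_metric = [
    [0, 1, 1, 1],
    [1, 0, 2, 2],
    [1, 2, 0, 2],
    [1, 2, 2, 0],
]

gram = gram_matrix(star_metric, base=0)
value = quadratic_form(gram, [1, 1, 1])

# In a Euclidean space this value would be a squared norm, hence >= 0.
is_euclidean_obstructed = value < 0
```

Here `value` is negative, so no point cloud in R^p reproduces these distances exactly, and any embedding (MDS or otherwise) has to distort them.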

However, we could add the argument distance_matrix to the Alpha complex constructor, with proper documentation. In detail, we would acknowledge that it is an approximation and that Gudhi uses sklearn.manifold.MDS to compute that approximation.

What are your thoughts on this?

astamm added the enhancement label Sep 18, 2020

mglisse (Member) commented Oct 7, 2020

> The Rips complex takes one of three possible main arguments:
>
> * a pairwise distance matrix, which it uses directly to build the complex;
> * a point cloud, from which Gudhi computes the pairwise distance matrix before proceeding;
> * an OFF file containing the point cloud, which is read first.

I think the last one was a mistake: it would be cleaner to have 2 steps, reading from the file and then calling Rips on the result. Even the second one is a convenience and not really necessary. Handling sparse distance matrices could be useful though.

> However, we could add the argument distance_matrix to the Alpha complex constructor, with proper documentation. In detail, we would acknowledge that it is an approximation and that Gudhi uses sklearn.manifold.MDS to compute that approximation.

While making the interface more uniform sounds nice, I don't really like that idea. Constructions like the Rips or the intrinsic Cech work on any finite metric space, and naturally take a metric (distance matrix) as input. The ambient Cech depends on some ambient space (R^d in our implementation), so it depends on more than just the distance matrix. And the alpha-complex is a construction equivalent to the ambient Cech (from a topological point of view, although it is smaller combinatorially) in some particular cases (essentially the Euclidean case). I believe this difference (Rips in metric spaces, alpha-complex in Euclidean space) is important and I don't want to confuse users about it. A notebook/example/doc seems like the right place to show that fun constructions are possible, like embedding a metric space in a Euclidean one (there are several ways to do that) and then computing an alpha-complex there, which should remain 2 separate steps.
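A small sketch of the difference between the two kinds of constructions, for a single triangle in the plane. The Rips test only looks at pairwise distances, while the ambient Cech test asks whether balls around the three points share a common point of R^2, which amounts to comparing their radius with the radius of the smallest enclosing disk. The function names are hypothetical and this is not how Gudhi computes either complex.

```python
from math import dist, sqrt


def min_enclosing_radius(p, q, r):
    """Radius of the smallest disk of the plane containing p, q and r."""
    a, b, c = sorted([dist(q, r), dist(p, r), dist(p, q)])
    if c * c >= a * a + b * b:
        # Right, obtuse or degenerate triangle: the longest edge is a diameter.
        return c / 2
    # Acute triangle: the smallest enclosing disk is the circumscribed disk.
    area = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2
    return a * b * c / (4 * area)


def triangle_in_rips(p, q, r, max_edge_length):
    """Rips criterion: every pairwise distance is at most max_edge_length."""
    return max(dist(p, q), dist(p, r), dist(q, r)) <= max_edge_length


def triangle_in_ambient_cech(p, q, r, radius):
    """Ambient Cech criterion in R^2: the three closed balls of the given
    radius have a common point, i.e. the smallest enclosing disk fits."""
    return min_enclosing_radius(p, q, r) <= radius


# Equilateral triangle with side length 2.
p, q, r = (0.0, 0.0), (2.0, 0.0), (1.0, sqrt(3.0))

# Balls of radius 1.05 intersect pairwise (edge length 2 <= 2.1) ...
in_rips = triangle_in_rips(p, q, r, max_edge_length=2.1)
# ... but no point of the plane lies in all three of them.
in_cech = triangle_in_ambient_cech(p, q, r, radius=1.05)
```

At this scale the triangle belongs to the Rips complex but not to the ambient Cech complex, because the latter needs a center in the ambient space and not only the three pairwise distances.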

Does my comment make sense? (you are allowed to disagree 😉)

mglisse (Member) commented Oct 7, 2020

> The tutorials "Simplicial complexes from data points" and "Simplicial complexes from distance matrix" are almost identical.
>
> I propose that we merge them.

About this, I would like the opinion of @bertrandmichel on what he would find clearer. If we think about things in terms of constructions, it makes sense to group the various inputs you can pass to Rips in one notebook. If we think in terms of data, it may make sense to keep separate notebooks for "what to do with coordinates" and "what to do with a distance matrix", although of course the first one can mention "compute a distance matrix and see 2". If Bertrand thinks the merge is a good idea, I am fine with that, thanks.
