
Convert pose sequence to trajectory #36

Closed
iariav opened this issue Jun 24, 2020 · 12 comments · May be fixed by #72

Comments

@iariav

iariav commented Jun 24, 2020

Hi,
Can you please share the code you used in one of your experiments for constructing the traversed route from the sequence of pose estimations?
For some reason I'm having trouble with that.
Thanks!

@VitorGuizilini-TRI
Collaborator

We are planning to add pose evaluation support in one of our next releases, so please stay tuned!
In the meantime, what are the issues you are having?

@cbaus

cbaus commented Aug 1, 2020

Maybe this is helpful. Here is a script for pose inference given a folder of images:
https://github.com/cbaus/packnet-sfm/blob/add_pose_inference/scripts/infer_pose.py

The pose_net returns 6 numbers: the first 3 are translations and the last 3 are Euler angles. I am not 100% sure about the convention of the angles used in the trained checkpoint file, but I think this is correct. If somebody could verify this, I am happy to make a PR for it.
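
For reference, a minimal sketch of how such a 6-vector could be turned into a 4x4 homogeneous transform (illustrative only; it assumes an X-Y-Z Euler convention, which, as noted above, has not been verified against the trained checkpoints):

    import numpy as np

    def pose_vec_to_matrix(vec):
        """Turn [tx, ty, tz, rx, ry, rz] into a 4x4 homogeneous transform.

        Assumes the last three entries are Euler angles in radians composed
        as Rz @ Ry @ Rx; the convention of the trained model may differ.
        """
        tx, ty, tz, rx, ry, rz = vec
        cx, sx = np.cos(rx), np.sin(rx)
        cy, sy = np.cos(ry), np.sin(ry)
        cz, sz = np.cos(rz), np.sin(rz)
        Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
        Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
        T = np.eye(4)
        T[:3, :3] = Rz @ Ry @ Rx   # rotation from the three Euler angles
        T[:3, 3] = [tx, ty, tz]    # translation
        return T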

@iariav
Author

iariav commented Aug 11, 2020

@cbaus - thanks. I tested your script on my own data and it seems to be working OK.
I'm attaching the output from the pose estimation (left) compared with the actual trajectory the car traversed (right).

[Figure: estimated trajectory (left) vs. actual traversed trajectory (right)]

You can see that the overall path looks the same, but for some reason there is a factor of ~2 between the estimate I get from the network and the actual path.
@VitorGuizilini-TRI - any idea where this could be coming from?

Another thing - I also tried replacing your euler_angles_to_matrix function with pose_vec2mat from pose_utils.py, and the results were a bit different...

Thanks to you both.
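
For anyone debugging a similar scale mismatch, a rough way to quantify it is to compare the total path lengths of the estimated and ground-truth trajectories. A minimal sketch (not part of the repo; est_positions and gt_positions are assumed to be (N, 3) arrays of accumulated positions):

    import numpy as np

    def path_length(positions):
        """Total length of a trajectory given as an (N, 3) array of positions."""
        steps = np.diff(positions, axis=0)
        return np.linalg.norm(steps, axis=1).sum()

    def scale_ratio(est_positions, gt_positions):
        """Ratio of ground-truth to estimated path length.

        A value near 2 would match the factor observed above; a value near 1
        means the two trajectories already agree in scale.
        """
        return path_length(gt_positions) / path_length(est_positions)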

@VitorGuizilini-TRI
Collaborator

This looks pretty good! I am a little tight on deadlines right now, but I will check @cbaus's fork as soon as possible so we can perhaps PR it to master; that would be a great contribution!

@iariav Are you training a self-supervised model? If that is the case, the poses will be unscaled; that might be the reason behind this factor.

@iariav
Author

iariav commented Aug 11, 2020

Hi,
I trained the SemiSupervised model.
But I do resize the input before inference. Perhaps that could be the reason?

@iariav
Author

iariav commented Aug 23, 2020

Another follow-up on that, please -
can you help me understand why, in line 48 of pose_decoder.py, you multiply the output by 0.01?

out = 0.01 * out.view(-1, self.num_frames_to_predict_for, 1, 6)

Does this have something to do with how you scale the depth information, and thus, if I scale it differently, should I also multiply the output by a different factor?

Thanks.

@VitorGuizilini-TRI
Collaborator

That's from the original PoseNet implementation. To my knowledge, it is included so that the initial pose estimates are small enough not to produce empty synthesized images in the first stages of training.
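
As a rough illustration of why the factor helps: with untrained weights the raw decoder output is on the order of 1, and multiplying by 0.01 keeps the implied translation and rotation very small, so the first synthesized views stay close to the source images. A toy sketch (not the repo's code):

    import numpy as np

    rng = np.random.default_rng(0)

    # Pretend this is the raw 6-dim output of an untrained pose decoder
    # for one pair of frames: 3 translation and 3 rotation components.
    raw = rng.standard_normal(6)      # roughly unit-scale values
    scaled = 0.01 * raw               # the factor applied in pose_decoder.py

    print("translation norm:", np.linalg.norm(scaled[:3]))                 # on the order of 0.01
    print("rotation norm (deg):", np.degrees(np.linalg.norm(scaled[3:])))  # on the order of one degree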

@cbaus

cbaus commented Aug 31, 2020

I also noticed that I need to apply a scale factor of a bit more than 2 to get the right pose.

In another similar version, there is a factor of 0.1 multiplied in (see here). My guess is that whether the factor is 0.1 or 0.01 doesn't matter, because the same factor is applied during both training and inference. Maybe @ClementPinard could help out with an explanation? Thank you.

PS I also opened the PR as a draft.

@ClementPinard

@VitorGuizilini-TRI and @cbaus are right. This is just for training stability, especially at the beginning, when neither the depth nor the pose makes sense yet. The important thing is to keep it the same throughout training and testing. The factor can also be mitigated through the network initialization: if you get a normalized network with e.g. Kaiming initialization, you already have a grasp of what distribution the random output of the untrained PoseNet will follow, and you can change the factor accordingly.

For my repo, I used a higher scale so that the gradients of my network weights would be larger, and I could afford the instability at startup that this caused. I could have just changed the learning rate parameter, though.
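
As a rough illustration of that last point, explicitly initializing the final layer of a pose decoder could look something like the following (a sketch using standard PyTorch utilities, not code from this repo; the layer sizes are assumptions):

    import torch.nn as nn

    # Hypothetical final layer of a pose decoder: 256 feature channels in,
    # 6 pose parameters (3 translation + 3 rotation) out.
    final_conv = nn.Conv2d(256, 6, kernel_size=1)

    # Kaiming (He) initialization with zero bias: the untrained output is
    # then centred at zero with a predictable spread, which makes it easier
    # to pick the output scaling factor (0.01, 0.1, ...) deliberately.
    nn.init.kaiming_normal_(final_conv.weight, nonlinearity='relu')
    nn.init.zeros_(final_conv.bias)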

@TiffanyTseng54

Congratulations on the great work, and thanks for providing this well-documented code!

Also, thank you @cbaus for kindly sharing the useful script for pose inference.
I have a question about the code and would appreciate it if you could explain it to me.
At L164 of your code, why should we use the updated orientation to calculate the current position?

I would have expected the translation to be applied before updating the rotation, which can be written as follows:

        rot_matrix, translation = poses[key]
        # apply the translation using the previous orientation...
        position += orientation.dot(translation.tolist())
        # ...and only then update the orientation with the new rotation
        orientation = orientation.dot(rot_matrix.tolist())

Look forward to your reply and thank you for your time!
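
For what it's worth, the two update orders generally give different trajectories; a minimal sketch with toy data (not the script's actual poses) makes the difference easy to check:

    import numpy as np

    def accumulate(poses, translate_first):
        """Chain relative (rotation, translation) pairs into positions."""
        orientation = np.eye(3)
        position = np.zeros(3)
        track = [position.copy()]
        for rot, trans in poses:
            if translate_first:
                # translate with the previous orientation, then rotate
                position = position + orientation @ trans
                orientation = orientation @ rot
            else:
                # rotate first, then translate with the updated orientation
                orientation = orientation @ rot
                position = position + orientation @ trans
            track.append(position.copy())
        return np.array(track)

    # A single 90-degree yaw followed by a unit forward step ends up in
    # different places under the two conventions.
    yaw90 = np.array([[0.0, -1.0, 0.0],
                      [1.0,  0.0, 0.0],
                      [0.0,  0.0, 1.0]])
    step = np.array([1.0, 0.0, 0.0])
    print(accumulate([(yaw90, step)], translate_first=True)[-1])   # [1. 0. 0.]
    print(accumulate([(yaw90, step)], translate_first=False)[-1])  # [0. 1. 0.]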

@pjckoch

pjckoch commented May 28, 2021

Hi,

please correct me if I'm wrong, but I think you are only taking the pose transformation from image t-1 to t and ignoring the pose transformation from t to t+1:

https://github.com/cbaus/packnet-sfm/blob/8abdc83e10b505f506081e416a5b424561ce5eba/scripts/infer_pose.py#L94

So, if you have e.g. images [0, 1, 2, 3, 4, 5], you would only keep the following pose transformations: [0->1, 2->3, 4->5],
and you would be dropping every second pose transformation: [1->2, 3->4].

That would explain why the qualitative output still looks reasonable, but you are off by a factor of 2.
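
If that is indeed the cause, pairing consecutive frames with a stride of 1 instead of 2 would keep every relative transform. A minimal sketch of the two pairings (illustrative only, not a patch to the script):

    # Suppose `frames` is the ordered list of image filenames.
    frames = ["000000.png", "000001.png", "000002.png", "000003.png", "000004.png"]

    # Stride-2 pairing keeps only 0->1 and 2->3 here, dropping 1->2 and 3->4.
    stride2_pairs = list(zip(frames[0::2], frames[1::2]))

    # Stride-1 pairing keeps every consecutive relative transform.
    stride1_pairs = list(zip(frames[:-1], frames[1:]))

    print(stride2_pairs)  # 2 pairs
    print(stride1_pairs)  # 4 pairs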

@oconnor127

oconnor127 commented Jun 12, 2021

Hey,
did any of you actually have a look at the odometry in the y-direction (upwards) and the pitch angle as well? I am trying to recover a 3D pose (instead of looking only at the x, z components), but the results in the y-direction look very weird (huge drift, although the car is driving in a very planar area). The results in the x-z plane look reasonable. I attached two plots: the first with the overall calculated position in x, y, z order, and the second with the accumulated pitch, roll and yaw. I used the KITTI sequence 2011_09_30_0018_sync (and _0028 as well). I used the referenced pose_inference script from the PR as well as the pose output during training (with frozen weights).
[Plots: accumulated x, y, z positions and accumulated pitch/roll/yaw for the KITTI sequences]

Did anyone face the same problem? Did anyone have a look and find that the results looked reasonable (if so, which sequence did you use)? I appreciate any help.
