Getting key error when training on external dataset #32

Open
tristankpka opened this issue Jun 22, 2020 · 5 comments

Comments

@tristankpka

tristankpka commented Jun 22, 2020

I am using the same data format as you, but I get errors when running train (default params) on an external dataset formatted the way you suggest (dataset attached):
07_tracks.txt

Creating pre-processed validation data from raw data
Now processing:  ./data/validation/highd/07_tracks.txt
Creating pre-processed training data from raw data
Now processing:  ./data/train/highd/07_tracks.txt
Loading train or test dataset:  ./data/train/trajectories_train.cpkl
Sequence size(frame) ------> 20
One batch size (frame)--->- 100
Training data from training dataset(name, # frame, #sequence)-->  07_tracks.txt : 40282 : 2014
Validation data from training dataset(name, # frame, #sequence)-->  07_tracks.txt : 0 : 0
Total number of training batches: 402
Total number of validation batches: 0
****************Training epoch beginning******************
0/12060 (epoch 0), train_loss = 18.527, time/batch = 0.945
1/12060 (epoch 0), train_loss = 6.513, time/batch = 0.877
Traceback (most recent call last):
  File "train.py", line 626, in <module>
    main()
  File "train.py", line 94, in main
    train(args)
  File "train.py", line 218, in train
    target_id_values = x_seq[0][lookup_seq[target_id], 0:2]
KeyError: 14
@tristankpka
Author

tristankpka commented Jun 23, 2020

The error was caused by a mismatch between the LSTM sequence length and the length of the sequences in the data.
That is to say, if you're training an LSTM with a sequence length of X, your data files must contain only sequences that are X long.
It would be good to include this in the data format description file.
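
A minimal sketch of how such a mismatch could be spotted before training. It assumes the tracks have already been parsed into (frame_id, agent_id) pairs; the function name `short_tracks` and the toy records are hypothetical and not part of the repository.

```python
from collections import defaultdict


def short_tracks(records, seq_length):
    """Return {agent_id: n_frames} for agents seen in fewer than seq_length frames.

    `records` is an iterable of (frame_id, agent_id) pairs. This layout is an
    assumption made for illustration, not the repository's file format.
    """
    frames_per_agent = defaultdict(set)
    for frame_id, agent_id in records:
        frames_per_agent[agent_id].add(frame_id)
    return {
        agent_id: len(frames)
        for agent_id, frames in frames_per_agent.items()
        if len(frames) < seq_length
    }


# Toy data: agent 1 is visible for 20 frames, agent 14 for only 7.
records = [(f, 1) for f in range(20)] + [(f, 14) for f in range(5, 12)]
print(short_tracks(records, seq_length=20))  # {14: 7}
```

Any agent reported here has a track shorter than the chosen sequence length, so either the sequence length should be lowered or those tracks should be filtered out.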

@sammyjojo9

We are facing the same issue with our custom dataset.
Can you please guide us through the steps you undertook to resolve this?

@tristankpka
Author

@sammyjojo9 It's been a while since I used this implementation, but I remember that you should make sure your custom dataset contains lots of sequences that match the sequence length parameter you use during training.
I got the error above by setting the sequence length parameter too high during training: when the train and test sets were created (randomly), I didn't have enough sufficiently long sequences to populate the test set (see the 0 sizes for the validation data in the log above).
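
The counts in the log are consistent with simple integer division, which makes it easy to check a dataset before a run. The sketch below is only an illustration of that arithmetic; `count_batches` is a hypothetical helper, and whether train.py computes the numbers exactly this way is an assumption.

```python
def count_batches(num_frames, seq_length, frames_per_batch):
    """Return (num_sequences, num_batches) using integer division.

    Assumes a batch holds frames_per_batch // seq_length whole sequences.
    """
    num_sequences = num_frames // seq_length
    sequences_per_batch = frames_per_batch // seq_length
    if sequences_per_batch == 0:
        # The sequence length exceeds the batch size: no batch can be formed.
        return num_sequences, 0
    return num_sequences, num_sequences // sequences_per_batch


# Values taken from the log: 40282 training frames, 0 validation frames,
# sequence size 20 frames, batch size 100 frames.
train_sequences, train_batches = count_batches(40282, 20, 100)
val_sequences, val_batches = count_batches(0, 20, 100)
print(train_sequences, train_batches)  # 2014 402
print(val_sequences, val_batches)      # 0 0

if val_batches == 0:
    print("Warning: the validation set is empty; lower the sequence length or add data.")
```

A validation count of 0 is the warning sign to look for in the log before the training loop starts.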

@sammyjojo9

Thanks a lot!
We still have some issues with the dataset: it uses pixel values that need to be converted into real-world coordinates using a homography matrix. Could you shed some light on how to work out the matrix and its values?

@tristankpka
Author

@sammyjojo9 The homography matrix should be estimated using DLT (Direct Linear Transform) methods, as explained in
https://docs.opencv.org/master/d9/dab/tutorial_homography.html#lecture_16
A homography only gives real-world coordinates for points on a plane (2D coordinates). A quick way to convert pixel coordinates to real-world coordinates is given in the link above.
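
Once a 3x3 homography H is known, applying it to a pixel is a matrix product in homogeneous coordinates followed by a division by the third component. The sketch below shows only that step in plain Python; `pixel_to_world` and the example matrix are hypothetical, and the matrix values are made up for illustration. Estimating H itself from at least four pixel/world point pairs is what OpenCV's `cv2.findHomography` is typically used for.

```python
def pixel_to_world(H, u, v):
    """Map pixel (u, v) to planar world coordinates (x, y) using a 3x3 homography H.

    H is a nested list [[h11, h12, h13], [h21, h22, h23], [h31, h32, h33]].
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get Cartesian coordinates.
    return x / w, y / w


# Made-up example: 0.5 world units per pixel, with the origin shifted.
H_example = [
    [0.5, 0.0, -10.0],
    [0.0, 0.5, -5.0],
    [0.0, 0.0, 1.0],
]
print(pixel_to_world(H_example, 40, 30))  # (10.0, 10.0)
```

The result is only meaningful for points that lie on the plane the homography was estimated for, typically the ground plane.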
