Issues parsing OpenXML SDK generated xlsx files #266

Open
visnup opened this issue Oct 8, 2021 · 7 comments

@visnup
Member

visnup commented Oct 8, 2021

Upstream issue with ExcelJS: exceljs/exceljs#1437
Originally reported to us by @aaronkyle.

I'll try opening a PR to fix in ExcelJS…

@visnup visnup self-assigned this Oct 8, 2021
@Gayathri-Senthilkumar

@visnup, I have run into the same issue when opening and parsing an .xlsx file. What is the current status of this issue?

@visnup
Member Author

visnup commented Apr 27, 2022

@Gayathri-Senthilkumar I stalled out when I ran into hurdles trying to come up with a minimal method to deal with the namespaces OpenXML uses. Would love help trying to fix it upstream, if someone has time.

@Gayathri-Senthilkumar

@visnup, we would love to work with you on fixing this issue. We cannot allocate dedicated bandwidth to it right now, but we plan to work on a fix with your guidance whenever we find time. Please share any input you have on how to resolve it.

@mbostock mbostock added the bug Something isn't working label Jun 7, 2022
@Gayathri-Senthilkumar

@visnup Please let us know if there are any updates on this issue.

@visnup
Member Author

visnup commented Jun 29, 2022

@Gayathri-Senthilkumar sadly I haven't been able to find time to work on this. As a workaround, you could use SheetJS to parse and access the file. It's a bit more verbose, but should be able to handle these files for now.

@Gayathri-Senthilkumar

@visnup Understood, but our application uses the ExcelJS library to parse .xlsx files and for several other features. Moving to a different library (SheetJS) would take considerable effort: we would need to check that SheetJS supports every function we use and then migrate. Is this bug fix in the backlog, and is there a plan to address it later? If so, could you please share a rough timeline?

@JonasLukasczyk

My colleagues (@HLWeil, @Freymaurer) and I ran into the same issue. Instead of switching to SheetJS we monkey-patched ExcelJS. Specifically, we overrode the parse function of the BaseXform class located in exceljs/lib/xlsx/xform/base-xform.js:

async parse(saxParser) {
  for await (const events of saxParser) {
    for (const {eventType, value} of events) {
      // Added line: strip a leading 'x:' namespace prefix from the tag name.
      // Values without a name (e.g. text) are left untouched.
      if (value.name && value.name.startsWith('x:')) value.name = value.name.slice(2);

      if (eventType === 'opentag') {
        this.parseOpen(value);
      } else if (eventType === 'text') {
        this.parseText(value);
      } else if (eventType === 'closetag') {
        // parseClose returns false once this xform's own element has closed.
        if (!this.parseClose(value.name)) {
          return this.model;
        }
      }
    }
  }
  return this.model;
}

The new parse function is identical to the original except for the additional if statement at the start of the inner for loop. That statement checks whether a tag name starts with a namespace prefix (here x:) and strips the prefix before parsing continues.
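For anyone who wants to apply this without editing files under node_modules, one option is to replace the method on the class prototype at startup. The following is only a rough sketch of that idea, not the code we actually run: `patchParse` and `stripPrefix` are hypothetical helper names, the configurable prefix is an assumption, and whether the require path quoted above resolves depends on your installed ExcelJS version. Subclasses that define their own parse would presumably not pick up the change.

```javascript
// Hypothetical helper: remove a namespace prefix such as 'x:' from a tag name.
function stripPrefix(name, prefix = 'x:') {
  return typeof name === 'string' && name.startsWith(prefix)
    ? name.slice(prefix.length)
    : name;
}

// Hypothetical helper: install the prefix-stripping parse() on a class that
// exposes parseOpen/parseText/parseClose and a `model`, as BaseXform does.
function patchParse(XformClass, prefix = 'x:') {
  XformClass.prototype.parse = async function parse(saxParser) {
    for await (const events of saxParser) {
      for (const {eventType, value} of events) {
        // Only tag events carry an object with a `name`; text is a plain string.
        if (value && typeof value === 'object' && value.name) {
          value.name = stripPrefix(value.name, prefix);
        }
        if (eventType === 'opentag') {
          this.parseOpen(value);
        } else if (eventType === 'text') {
          this.parseText(value);
        } else if (eventType === 'closetag') {
          if (!this.parseClose(value.name)) return this.model;
        }
      }
    }
    return this.model;
  };
  return XformClass;
}

// Assumed usage in a real project (run once, before loading any workbook):
// const BaseXform = require('exceljs/lib/xlsx/xform/base-xform');
// patchParse(BaseXform);
```

Note that this only handles a single, fixed prefix. Files that bind the spreadsheet namespace to a different prefix would need that prefix passed in instead.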
