Automatically choose flavor based on type of table in PDF #19

Open
vinayak-mehta opened this issue Jul 4, 2019 · 2 comments
Labels
enhancement New feature or request

Comments

@vinayak-mehta
Member

Continuing the conversation from #102.

@imri:

When you say that lattice should work perfectly: I would like a generic way to detect and extract tables without having to know which detection method (lattice / stream) is best for a given document. I want to decouple them as much as possible.

@vinayak-mehta

I understand your use case, but the library does not currently support it. I see two approaches that could be implemented (both heuristics):

  1. As far as I can tell from NurminenDetectionAlgorithm.java, Tabula first filters out all Lattice-type tables from the document and then looks for Stream-type tables until it cannot find any more. Similarly, we could "couple" both flavors into a single one inside Camelot.
  2. We could create a flavor called guess that automatically chooses between Lattice and Stream.
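
A rough sketch of what the second option could look like from the caller's side, outside the library. It is not part of Camelot: the function name `read_with_guessed_flavor` and the lattice-first ordering are assumptions made for illustration. The reader is passed in as a callable that accepts `pages` and `flavor` keyword arguments, as `camelot.read_pdf` does, so the sketch runs without Camelot installed.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def read_with_guessed_flavor(
    path: str,
    read_pdf: Callable[..., Any],
    pages: str = "1",
    order: Sequence[str] = ("lattice", "stream"),
) -> Tuple[Optional[str], Any]:
    """Try each flavor in `order` and return the first one that finds tables.

    `read_pdf` is any callable with a camelot.read_pdf-like signature:
    read_pdf(path, pages=..., flavor=...) returning a sized collection.
    This helper is hypothetical; Camelot itself has no "guess" flavor.
    """
    for flavor in order:
        tables = read_pdf(path, pages=pages, flavor=flavor)
        # A non-empty result is taken as "this flavor fits the document".
        if len(tables) > 0:
            return flavor, tables
    # No flavor produced any table.
    return None, []
```

With Camelot installed, this could be called as `read_with_guessed_flavor("foo.pdf", camelot.read_pdf)`, where `foo.pdf` is a placeholder path. Lattice goes first because Stream tends to report a table on most pages that contain text, so trying Stream first would rarely fall through to Lattice.
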
@dpranav1988

Hi @vinayak-mehta,

Could you please let me know how I can automatically detect when to use lattice and when to use stream? I don't want to check manually; I need an automated way to identify the right flavor.

Thanks in advance.

@javiqm12

javiqm12 commented Jan 3, 2021

Hi, this enhancement would be highly appreciated.
Thanks.

3 participants