A Data Engineer needs to ingest invoice data in PDF format into Snowflake so that the data can be queried and used in a forecasting solution.
What is the recommended way to ingest this data?
To ingest PDF data into Snowflake and make it queryable, the recommended approach is to create a Java User-Defined Function (UDF) that uses a Java-based PDF parser library. The UDF parses each PDF into structured data, which can then be loaded into Snowflake tables for querying and use in a forecasting solution. The other options do not parse the PDF into a structured form that can be queried easily.
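As a rough illustration of the "parse into structured data" step, here is a minimal sketch of the kind of logic a Java UDF handler might contain. It assumes the raw text has already been extracted from the PDF by a Java PDF parser library (that step is not shown), and the class name `InvoiceTextParser`, the method `parseFields`, and the field labels "Invoice Number:" and "Total:" are all hypothetical; real invoices will need their own patterns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Snowflake or any PDF library.
// Input: plain text already extracted from an invoice PDF.
// Output: a map of field names to values, ready to be returned as structured data.
class InvoiceTextParser {

    // Assumed label formats; each must start its own line (so "Subtotal:" is not matched).
    private static final Pattern INVOICE_NUMBER =
        Pattern.compile("(?im)^\\s*Invoice Number:\\s*(\\S+)");
    private static final Pattern TOTAL =
        Pattern.compile("(?im)^\\s*Total:\\s*([0-9]+(?:\\.[0-9]{1,2})?)");

    static Map<String, String> parseFields(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (text == null) {
            return fields; // nothing to parse
        }
        putIfFound(fields, "invoice_number", INVOICE_NUMBER, text);
        putIfFound(fields, "total", TOTAL, text);
        return fields;
    }

    // Stores the first capture group under the given key if the pattern matches.
    private static void putIfFound(Map<String, String> fields, String key,
                                   Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        if (m.find()) {
            fields.put(key, m.group(1));
        }
    }
}
```

In a real UDF, the handler method would first read the staged PDF and extract its text with the chosen parser library, then pass that text to something like `parseFields` and return the result to Snowflake.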
Using Python or Java libraries, so I guess D.