Describe the bug
INFO:__main__:Detected file type: application/pdf
INFO:__main__:Sending request to https://api.unstructured.io/general/v0/general
INFO:__main__:Headers: {'Accept': 'application/json', 'unstructured-api-key': 'xxx'}
INFO:__main__:File being sent: Mahmoud_Gamal_Resume.pdf
INFO:__main__:Response status code: 500
INFO:__main__:Response headers: {'Date': 'Sun, 29 Sep 2024 21:51:15 GMT', 'Content-Type': 'application/json', 'Content-Length': '47', 'Connection': 'keep-alive', 'server': 'uvicorn'}
ERROR:__main__:500 Internal Server Error: {"detail":"'6114cee903d6a72fa0370b97d042b71c'"}
ERROR:__main__:Error details: {
  "detail": "'6114cee903d6a72fa0370b97d042b71c'"
}
HTTP error occurred: 500 Server Error: Internal Server Error for url: https://api.unstructured.io/general/v0/general
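For triage, here is a small standard-library sketch that looks at the shape of the `detail` value. The body is copied from the log above (it is 47 bytes, matching the reported `Content-Length`), and `describe_detail` is a hypothetical helper name, not part of any Unstructured API. The value is a single-quoted 32-character hex string, which is the same shape Python produces when a `KeyError` is converted to a string. That might point to an unhandled lookup failure on the server, but this is only a guess from the client side.

```python
import json
import re

# Response body copied from the log output in this report.
body = '{"detail":"\'6114cee903d6a72fa0370b97d042b71c\'"}'


def describe_detail(raw_body: str) -> dict:
    """Extract the `detail` field and report what its value looks like.

    Hypothetical helper for triage only; it makes no request to the API.
    """
    detail = json.loads(raw_body)["detail"]
    inner = detail.strip("'")  # drop the surrounding single quotes, if any
    return {
        "detail": detail,
        "is_quoted": len(detail) >= 2 and detail.startswith("'") and detail.endswith("'"),
        "is_32_char_hex": re.fullmatch(r"[0-9a-f]{32}", inner) is not None,
        # str(KeyError("abc")) is "'abc'", i.e. the repr of the missing key.
        "matches_str_of_keyerror": detail == str(KeyError(inner)),
    }


print(describe_detail(body))
```

If the same hex string comes back on every retry with this file, that would be worth mentioning, since it would suggest the value is derived from the document rather than from the request.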
To Reproduce
A simple script to reproduce the error:
import requests
import json
import logging
import magic

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def is_pdf(file_path):
    mime = magic.Magic(mime=True)
    file_type = mime.from_file(file_path)
    logger.info(f"Detected file type: {file_type}")
    return file_type == "application/pdf"


def parse_pdf(api_key, file_path):
    if not is_pdf(file_path):
        return "Error: The provided file is not a PDF."

    url = "https://api.unstructured.io/general/v0/general"
    headers = {
        "Accept": "application/json",
        "unstructured-api-key": api_key
    }
    try:
        with open(file_path, "rb") as file:
            files = {"files": (file_path, file, "application/pdf")}
            logger.info(f"Sending request to {url}")
            logger.info(f"Headers: {headers}")
            logger.info(f"File being sent: {file_path}")
            response = requests.post(url, headers=headers, files=files)
            logger.info(f"Response status code: {response.status_code}")
            logger.info(f"Response headers: {response.headers}")
            response.raise_for_status()
            return response.json()
    except requests.exceptions.HTTPError as http_err:
        if response.status_code == 500:
            logger.error(f"500 Internal Server Error: {response.text}")
            try:
                error_details = response.json()
                logger.error(f"Error details: {json.dumps(error_details, indent=2)}")
            except json.JSONDecodeError:
                logger.error("Could not parse error response as JSON")
        return f"HTTP error occurred: {http_err}"
    except requests.exceptions.RequestException as err:
        return f"An error occurred: {err}"
    except Exception as e:
        return f"An unexpected error occurred: {e}"


def main():
    api_key = "YOUR_API_KEY_HERE"
    file_path = "Mahmoud_Gamal_Resume.pdf"
    result = parse_pdf(api_key, file_path)
    print(result)


if __name__ == "__main__":
    main()
Filetype: PDF
Any additional API parameters: No
Environment:
Ubuntu 22.04
SDK
Additional context
I attached one of the PDFs that produces this error. As a side note, LlamaParse processed this same PDF without problems: Mahmoud_Gamal_Resume.pdf