R interface to Mistral AI's OCR to convert PDF to markdown

Mistral AI offers an OCR service that transforms PDF files to markdown and extracts the images they contain; see https://mistral.ai/news/mistral-ocr

This R package is a simple interface to this OCR functionality. It is a new package under development, and its interface may change.

Installation

The package should soon be available on r-universe. Once it is, install it with:

install.packages('rmistral', repos = c('https://skranz.r-universe.dev', 'https://cloud.r-project.org'))

If that does not work yet, download the ZIP of the repository and build the R package locally.
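As a rough sketch of the local route: assuming the ZIP was downloaded from GitHub and unzipped into a folder named `rmistral-main` in the working directory (the folder name is an assumption; adjust the path to wherever you unzipped it), the package can be installed from source. Installing directly from GitHub with the `remotes` package is shown as an alternative to the ZIP download.

```r
# Option A: install from the unzipped source folder.
# "rmistral-main" is an assumed folder name; change it to your actual path.
install.packages("rmistral-main", repos = NULL, type = "source")

# Option B (alternative to the ZIP): install directly from GitHub.
# install.packages("remotes")
# remotes::install_github("skranz/rmistral")
```

With `repos = NULL`, R does not fetch dependencies, so the packages listed in DESCRIPTION need to be installed beforehand.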

Usage Example

Examples

Load library and specify Mistral API key.

# 1. Load library and set API key
library(rmistral)
set_mistral_api_key("<YOUR MISTRAL AI API KEY>")


# 2. Specify URL to your pdf file
#    To convert a local PDF file use instead the file argument
#    in your mistral_ocr call.

url = "https://raw.githubusercontent.com/skranz/rmistral/main/pdf_example/paper_excerpt.pdf"

# 3. Convert the PDF at the URL to a result object.
#    The most relevant information is in ocr$pages.

ocr = mistral_ocr(url = url, timeout_sec = 120, include_images = TRUE)

# 4. Save the results as a markdown file and all
#    extracted images as separate files

md_file = "mydir/myfile.md"
mistral_ocr_save_md(ocr, md_file, by_page = FALSE, overwrite = TRUE, save_images = TRUE)

# 5. To convert to a different format, you can use e.g. pandoc.
#    Here is an example that converts to an HTML file

library(rmarkdown)
html_file = "mydir/myfile.html"
pandoc_convert(md_file, output = html_file,
  options = c("--mathjax","--standalone"))
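Step 2 above mentions the `file` argument for local PDFs. Here is a minimal sketch of that variant, assuming `mistral_ocr` accepts `file` together with the same `timeout_sec` and `include_images` arguments used in the URL example. The helper names `md_path_for` and `convert_local_pdf` and the path `mydir/paper.pdf` are hypothetical and not part of rmistral.

```r
# Hypothetical helper: derive the markdown path from the PDF path,
# e.g. "mydir/paper.pdf" -> "mydir/paper.md"
md_path_for = function(pdf_file) {
  paste0(tools::file_path_sans_ext(pdf_file), ".md")
}

# Hypothetical wrapper around the calls from the example above.
# Assumes set_mistral_api_key() has already been called and that
# mistral_ocr() takes a local path via its file argument.
convert_local_pdf = function(pdf_file, timeout_sec = 120) {
  stopifnot(file.exists(pdf_file))
  ocr = rmistral::mistral_ocr(file = pdf_file, timeout_sec = timeout_sec,
    include_images = TRUE)
  md_file = md_path_for(pdf_file)
  rmistral::mistral_ocr_save_md(ocr, md_file, by_page = FALSE,
    overwrite = TRUE, save_images = TRUE)
  invisible(md_file)
}

# Usage (needs a valid API key and network access):
# convert_local_pdf("mydir/paper.pdf")
```

Writing the markdown file next to the PDF keeps the extracted images and the text together in one folder.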
