MarkItDown tweets

MarkItDown: A Python library by Microsoft that converts various file formats (Word, PowerPoint, Excel, PDF, images, audio, HTML, JSON, XML, CSV, ZIP) into Markdown, using OCR for images and speech recognition for audio. → GitHub
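
A minimal usage sketch, assuming the package is installed with `pip install markitdown` and that `MarkItDown().convert(...)` returns a result object exposing `text_content`, as shown in the project README. The helper names `markdown_path` and `convert_to_markdown` are hypothetical and only for illustration.

```python
from pathlib import Path


def markdown_path(source):
    """Return the path where the converted Markdown would be saved (same name, .md suffix)."""
    return Path(source).with_suffix(".md")


def convert_to_markdown(source):
    """Convert one file to Markdown and write the result next to the source file.

    Assumes the `markitdown` package is installed; it is imported lazily so the
    helpers above can be used without it.
    """
    from markitdown import MarkItDown  # assumption: installed via `pip install markitdown`

    converter = MarkItDown()
    result = converter.convert(str(source))  # result.text_content holds the Markdown string
    target = markdown_path(source)
    target.write_text(result.text_content, encoding="utf-8")
    return target
```

For example, `convert_to_markdown("report.docx")` (a hypothetical file) would write the converted text to `report.md` in the same folder.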


Also available as a website tool → here

<aside> 💡

Microsoft has released its own document parser for LLM use!

</aside>