MarkItDown: a Python library from Microsoft that converts many file formats (Word, PowerPoint, Excel, PDF, images, audio, HTML, JSON, XML, CSV, ZIP) into Markdown, using OCR for images and speech transcription for audio. → GitHub
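
A minimal sketch of what a conversion looks like in Python, assuming the package is installed (`pip install markitdown`) and that the `MarkItDown().convert(...)` call and `text_content` attribute from the project README are available in the installed version. The helper name `convert_to_markdown` and the file `report.xlsx` are hypothetical, used only for illustration.

```python
from pathlib import Path


def convert_to_markdown(path):
    """Return the Markdown text MarkItDown produces for one file.

    `convert_to_markdown` is a hypothetical helper for this note,
    not part of the library itself.
    """
    # Imported here so this module still loads when the package is missing;
    # the call then raises ImportError instead.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(str(path))
    return result.text_content


if __name__ == "__main__":
    # "report.xlsx" is a placeholder; any supported format should work.
    source = Path("report.xlsx")
    if source.exists():
        print(convert_to_markdown(source))
    else:
        print(f"{source} not found; point this at a real file.")
```

Some formats (audio, for example) may need optional extras to be installed, so treat this as a starting point rather than a complete setup.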

Also available as a website tool → here
<aside> 💡
Microsoft has released its own document parser for LLM use!
</aside>